#include "unicode/utypes.h"
#include "unicode/ustring.h"
Defines
#define U_TITLECASE_NO_LOWERCASE 0x100
    Do not lowercase non-initial parts of words when titlecasing.
#define U_TITLECASE_NO_BREAK_ADJUSTMENT 0x200
    Do not adjust the titlecasing indexes from BreakIterator::next() indexes; titlecase exactly the characters at breaks from the iterator.
Typedefs
typedef struct UCaseMap UCaseMap
    C typedef for struct UCaseMap.
Functions
UCaseMap * ucasemap_open (const char *locale, uint32_t options, UErrorCode *pErrorCode)
    Open a UCaseMap service object for a locale and a set of options.
void ucasemap_close (UCaseMap *csm)
    Close a UCaseMap service object.
const char * ucasemap_getLocale (const UCaseMap *csm)
    Get the locale ID that is used for language-dependent case mappings.
uint32_t ucasemap_getOptions (const UCaseMap *csm)
    Get the options bit set that is used for case folding and string comparisons.
void ucasemap_setLocale (UCaseMap *csm, const char *locale, UErrorCode *pErrorCode)
    Set the locale ID that is used for language-dependent case mappings.
void ucasemap_setOptions (UCaseMap *csm, uint32_t options, UErrorCode *pErrorCode)
    Set the options bit set that is used for case folding and string comparisons.
const UBreakIterator * ucasemap_getBreakIterator (const UCaseMap *csm)
    Get the break iterator that is used for titlecasing.
void ucasemap_setBreakIterator (UCaseMap *csm, UBreakIterator *iterToAdopt, UErrorCode *pErrorCode)
    Set the break iterator that is used for titlecasing.
int32_t ucasemap_toTitle (UCaseMap *csm, UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode)
    Titlecase a UTF-16 string.
int32_t ucasemap_utf8ToLower (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)
    Lowercase the characters in a UTF-8 string.
int32_t ucasemap_utf8ToUpper (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)
    Uppercase the characters in a UTF-8 string.
int32_t ucasemap_utf8ToTitle (UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)
    Titlecase a UTF-8 string.
int32_t ucasemap_utf8FoldCase (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)
    Case-fold the characters in a UTF-8 string.
The service object takes care of memory allocations, data loading, and setup for the attributes, as usual.
Currently, the functionality provided here does not overlap with uchar.h and ustring.h, except for ucasemap_toTitle().
ucasemap_utf8XYZ() functions operate directly on UTF-8 strings.
Definition in file ucasemap.h.
#define U_TITLECASE_NO_BREAK_ADJUSTMENT 0x200
Do not adjust the titlecasing indexes from BreakIterator::next() indexes; titlecase exactly the characters at breaks from the iterator.
Option bit for titlecasing APIs that take an options bit set.
By default, titlecasing will take each break iterator index, adjust it by looking for the next cased character, and titlecase that one. Other characters are lowercased.
This follows Unicode 4 & 5 section 3.13 Default Case Operations:
    R3 toTitlecase(X): Find the word boundaries based on Unicode Standard Annex #29, "Text Boundaries." Between each pair of word boundaries, find the first cased character F. If F exists, map F to default_title(F); then map each subsequent character C to default_lower(C).
Definition at line 166 of file ucasemap.h.
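Rule R3 and the effect of the two titlecasing option bits can be illustrated with a deliberately simplified, ASCII-only sketch. This is not ICU's implementation: word boundaries are approximated by runs of non-space characters instead of UAX #29, "cased" is taken to mean an ASCII letter, and sketch_title(), SKETCH_NO_LOWERCASE and SKETCH_NO_BREAK_ADJUSTMENT are hypothetical names that only reuse the values 0x100 and 0x200.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins that reuse the values of the ICU option bits. */
#define SKETCH_NO_LOWERCASE        0x100u
#define SKETCH_NO_BREAK_ADJUSTMENT 0x200u

/* Titlecase an ASCII, NUL-terminated string in place.
 * Assumption: a "word" is a maximal run of non-space characters. */
void sketch_title(char *s, uint32_t options) {
    size_t i = 0;
    while (s[i] != '\0') {
        /* Skip whitespace between words. */
        while (s[i] != '\0' && isspace((unsigned char)s[i])) {
            ++i;
        }
        if (s[i] == '\0') {
            break;
        }
        /* i is now a break index: the start of a word. */
        size_t f = i;
        if ((options & SKETCH_NO_BREAK_ADJUSTMENT) == 0) {
            /* Default: adjust forward to the first cased character F. */
            while (s[f] != '\0' && !isspace((unsigned char)s[f]) &&
                   !isalpha((unsigned char)s[f])) {
                ++f;
            }
        }
        if (s[f] != '\0' && !isspace((unsigned char)s[f])) {
            /* Stands in for default_title(F). */
            s[f] = (char)toupper((unsigned char)s[f]);
            ++f;
        }
        /* Remaining characters of the word: lowercase unless suppressed. */
        for (i = f; s[i] != '\0' && !isspace((unsigned char)s[i]); ++i) {
            if ((options & SKETCH_NO_LOWERCASE) == 0) {
                s[i] = (char)tolower((unsigned char)s[i]);
            }
        }
    }
}
```

Under these assumptions, "(hello) WORLD" becomes "(Hello) World" with no options, "(hello) World" with SKETCH_NO_BREAK_ADJUSTMENT (the character at the break, "(", has no case), and "(Hello) WORLD" with SKETCH_NO_LOWERCASE.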
#define U_TITLECASE_NO_LOWERCASE 0x100
Do not lowercase non-initial parts of words when titlecasing.
Option bit for titlecasing APIs that take an options bit set.
By default, titlecasing will titlecase the first cased character of a word and lowercase all other characters. With this option, the other characters will not be modified.
Definition at line 141 of file ucasemap.h.
typedef struct UCaseMap UCaseMap
C typedef for struct UCaseMap.
Definition at line 44 of file ucasemap.h.
void ucasemap_close (UCaseMap *csm)
Close a UCaseMap service object.
const UBreakIterator * ucasemap_getBreakIterator (const UCaseMap *csm)
Get the break iterator that is used for titlecasing. Do not modify the returned break iterator.
const char * ucasemap_getLocale (const UCaseMap *csm)
Get the locale ID that is used for language-dependent case mappings.
uint32_t ucasemap_getOptions (const UCaseMap *csm)
Get the options bit set that is used for case folding and string comparisons.
UCaseMap * ucasemap_open (const char *locale, uint32_t options, UErrorCode *pErrorCode)
Open a UCaseMap service object for a locale and a set of options. The locale ID and options are preprocessed so that functions using the service object need not process them in each call.
void ucasemap_setBreakIterator (UCaseMap *csm, UBreakIterator *iterToAdopt, UErrorCode *pErrorCode)
Set the break iterator that is used for titlecasing.
The UCaseMap service object releases a previously set break iterator and "adopts" this new one, taking ownership of it. It will be released in a subsequent call to ucasemap_setBreakIterator() or ucasemap_close().
Break iterator operations are not thread-safe. Therefore, titlecasing functions use non-const UCaseMap objects. It is not possible to titlecase strings concurrently using the same UCaseMap.
void ucasemap_setLocale (UCaseMap *csm, const char *locale, UErrorCode *pErrorCode)
Set the locale ID that is used for language-dependent case mappings.
void ucasemap_setOptions (UCaseMap *csm, uint32_t options, UErrorCode *pErrorCode)
Set the options bit set that is used for case folding and string comparisons. The titlecasing option bits, such as U_TITLECASE_NO_LOWERCASE, are set through the same bit set.
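int32_t ucasemap_toTitle (UCaseMap *csm, UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode)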
Titlecase a UTF-16 string.
This function is almost a duplicate of u_strToTitle(), except that it takes ucasemap_setOptions() into account and has performance advantages from being able to use a UCaseMap object for multiple case mapping operations, saving setup time.
Casing is locale-dependent and context-sensitive. Titlecasing uses a break iterator to find the first characters of words that are to be titlecased. It titlecases those characters and lowercases all others. (This can be modified with ucasemap_setOptions().)
The titlecase break iterator can be provided to customize for arbitrary styles, using rules and dictionaries beyond the standard iterators. It may be more efficient to always provide an iterator to avoid opening and closing one for each string. The standard titlecase iterator for the root locale implements the algorithm of Unicode TR 21.
This function uses only the setText(), first() and next() methods of the provided break iterator.
The result may be longer or shorter than the original. The source string and the destination buffer must not overlap.
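int32_t ucasemap_utf8FoldCase (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)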
Case-fold the characters in a UTF-8 string.
Case-folding is locale-independent and not context-sensitive, but there is an option for whether to include or exclude mappings for dotted I and dotless i. These mappings are marked with 'T' in CaseFolding.txt (with 'I' in versions of the file before Unicode 3.2).
The result may be longer or shorter than the original. The source string and the destination buffer must not overlap.
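int32_t ucasemap_utf8ToLower (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)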
Lowercase the characters in a UTF-8 string. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. The source string and the destination buffer must not overlap.
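int32_t ucasemap_utf8ToTitle (UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)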
Titlecase a UTF-8 string.
Casing is locale-dependent and context-sensitive. Titlecasing uses a break iterator to find the first characters of words that are to be titlecased. It titlecases those characters and lowercases all others. (This can be modified with ucasemap_setOptions().)
The titlecase break iterator can be provided to customize for arbitrary styles, using rules and dictionaries beyond the standard iterators. It may be more efficient to always provide an iterator to avoid opening and closing one for each string. The standard titlecase iterator for the root locale implements the algorithm of Unicode TR 21.
This function uses only the setText(), first() and next() methods of the provided break iterator.
The result may be longer or shorter than the original. The source string and the destination buffer must not overlap.
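int32_t ucasemap_utf8ToUpper (const UCaseMap *csm, char *dest, int32_t destCapacity, const char *src, int32_t srcLength, UErrorCode *pErrorCode)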
Uppercase the characters in a UTF-8 string. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. The source string and the destination buffer must not overlap.