Collation Registry

Collation Registry for Internet Application Protocols

[RFC4790] creates an abstraction framework so that application protocols can precisely identify a comparison function and the repertoire of comparison functions can be extended in the future. This document defines an IANA-maintained registry of collations for comparing, searching, and sorting international strings.

Registration Procedures: New collations are added to this registry through a process of expert review. Proposals for new collations are to be formatted using the template defined in [RFC4790] and sent to iana&iana.org. Documents are then passed to the designated expert for review.

The following collations are registered:

Collation: i;ascii-numeric
Description: The "i;ascii-numeric" collation is a simple collation intended for use with arbitrarily sized unsigned decimal integer numbers stored as octet strings. US-ASCII digits (0x30 to 0x39) represent digits of the numbers. Before converting from string to integer, the input string is truncated at the first non-digit character. All input is valid; strings that do not start with a digit represent positive infinity.
Reference: [RFC4790]

Collation: i;ascii-casemap
Description: The "i;ascii-casemap" collation is a simple collation that operates on octet strings and treats US-ASCII letters case-insensitively. It provides equality, substring, and ordering operations. All input is valid. Note that letters outside ASCII are not treated case-insensitively.
Reference: [RFC4790]

Collation: i;octet
Description: The "i;octet" collation is a simple and fast collation intended for use on binary octet strings rather than on character data. Protocols that want to make this collation available have to do so by explicitly allowing it. If not explicitly allowed, it MUST NOT be used. It never returns an "undefined" result. It provides equality, substring, and ordering operations.
Reference: [RFC4790]

Collation: i;unicode-casemap
Description: The "i;unicode-casemap" collation is a simple collation that is case-insensitive in its treatment of characters. It provides equality, substring, and ordering operations. The validity test operation returns "valid" for any input.

This collation allows strings in arbitrary (and mixed) character sets, as long as the character set for each string is identified and it is possible to convert the string to Unicode. Strings that have an unidentified character set and/or cannot be converted to Unicode are not rejected, but are treated as binary.
Reference: [RFC5051]
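
The three [RFC4790] collations listed above can be illustrated with a short Python sketch. This is a minimal illustration and not a normative implementation. The function names (octet_key, ascii_casemap_key, ascii_numeric_key, order) are hypothetical. The sketch assumes that "i;ascii-casemap" folds the US-ASCII letters a-z to upper case and then compares as "i;octet", which is how [RFC4790] defines it; that detail is not stated in the registry entry itself. The "i;unicode-casemap" collation is not covered here.

```python
from typing import Callable, Tuple


def octet_key(value: bytes) -> bytes:
    """i;octet: compare the raw octets, with no interpretation as characters."""
    return value


def ascii_casemap_key(value: bytes) -> bytes:
    """i;ascii-casemap: fold US-ASCII letters only, then compare as octets.

    bytes.upper() changes only the octets for a-z, so letters outside ASCII
    (for example UTF-8 encoded accented letters) are left untouched.
    """
    return value.upper()


def ascii_numeric_key(value: bytes) -> Tuple[int, int]:
    """i;ascii-numeric: truncate at the first non-digit, then compare as integers.

    Returns (0, n) for a string with leading digits and (1, 0) for a string
    that does not start with a digit, so that such strings sort after every
    number (positive infinity) and compare equal to each other.
    """
    end = 0
    while end < len(value) and 0x30 <= value[end] <= 0x39:
        end += 1
    if end == 0:
        return (1, 0)  # no leading digit: positive infinity
    return (0, int(value[:end].decode("ascii")))


def order(a: bytes, b: bytes, key: Callable[[bytes], object]) -> str:
    """Ordering operation: returns "less", "equal", or "greater"."""
    ka, kb = key(a), key(b)
    if ka < kb:
        return "less"
    if ka > kb:
        return "greater"
    return "equal"


if __name__ == "__main__":
    print(order(b"a", b"B", octet_key))                  # greater (0x61 > 0x42)
    print(order(b"a", b"B", ascii_casemap_key))          # less ("A" < "B")
    print(order(b"4294967298b", b"04294967298", ascii_numeric_key))  # equal
    print(order(b"04294967298", b"", ascii_numeric_key))  # less (finite < infinity)
```

Reducing each collation to a key function keeps the ordering operation generic, and the same keys can be passed to sorted(). The arbitrary-precision int type in Python fits the "arbitrarily sized" numbers that "i;ascii-numeric" is intended for.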

 



Page updated 2007-10-23
(c) 1999-2007 The Internet Corporation for Assigned Names and Numbers All rights reserved.