7. Tabular sources (RFC 1345)

An important part of the tabular charset knowledge in recode comes from RFC 1345 or, alternatively, from the chset tools, both maintained by Keld Simonsen. The RFC 1345 document:

"Character Mnemonics & Character Sets", K. Simonsen, Request for Comments no. 1345, Network Working Group, June 1992.

defines many character mnemonics and character sets. The recode library implements most, but not all, of RFC 1345.

Keld Simonsen <keld@dkuug.dk> did most of RFC 1345 himself, with some funding from Danish Standards and the Nordic standards (INSTA) project. He also did the character set design work, with substantial input from Olle Jaernefors. Keld typed in almost all of the tables; some were contributed by others. A number of people have checked the tables in various ways. The RFC lists a number of people who helped.

Keld and the recode maintainer have an arrangement by which any newly discovered information about tabular charsets submitted by recode users is forwarded to Keld, eventually merged into Keld's work, and only then reimported into recode. Neither the recode program nor its library tries to compete, or even to establish itself as an alternate or diverging reference: RFC 1345 and its new drafts remain the genuine source for most tabular information conveyed by recode. Keld has been more than collaborative so far, so there is no reason for us to act otherwise. In a word, recode should be perceived as the application of external references, but not as a reference in itself.

Internally, RFC 1345 associates with each character an unambiguous mnemonic of a few characters, taken from ISO 646, which is a minimal ASCII subset of 83 characters. The charset made up by these mnemonics is available in recode under the name RFC1345. It has mnemonic and 1345 for aliases. As implemented, this charset exactly corresponds to mnemonic+ascii+38, using RFC 1345 nomenclature. Roughly said, ISO 646 characters represent themselves, except for the ampersand (&), which appears doubled. A prefix of a single ampersand introduces a mnemonic. For mnemonics using two characters, the prefix is immediately followed by the mnemonic. For longer mnemonics, the prefix is followed by an underline (_), the mnemonic, and another underline. Conversions to this charset are usually reversible.
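
The quoting rules just described can be illustrated with a minimal Python sketch. It is not the recode implementation: the function names encode and decode are hypothetical, the three-entry mnemonic table is an assumed sample that should be checked against RFC 1345, and every ASCII character is passed through unchanged rather than only the 83-character ISO 646 subset.

```python
# Tiny illustrative table: character -> mnemonic. Assumed entries only;
# check them against RFC 1345 before relying on them.
MNEMONICS = {
    "\u00e9": "e'",       # e with acute (two-character mnemonic)
    "\u00e4": "a:",       # a with diaeresis (two-character mnemonic)
    "\u2180": "1000RCD",  # an example of a longer mnemonic
}
CHARACTERS = {mnemonic: char for char, mnemonic in MNEMONICS.items()}


def encode(text):
    """Render text in the ampersand-prefixed mnemonic form."""
    pieces = []
    for char in text:
        if char == "&":
            pieces.append("&&")                   # ampersand appears doubled
        elif char in MNEMONICS:
            mnemonic = MNEMONICS[char]
            if len(mnemonic) == 2:
                pieces.append("&" + mnemonic)     # prefix, then the mnemonic
            else:
                pieces.append("&_" + mnemonic + "_")  # underlines around longer ones
        elif ord(char) < 128:
            pieces.append(char)                   # simplification: all ASCII passes through
        else:
            raise ValueError("no mnemonic known for %r" % char)
    return "".join(pieces)


def decode(text):
    """Reverse encode(), using the same sample table."""
    pieces = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            pieces.append(text[i])
            i += 1
        elif text.startswith("&&", i):
            pieces.append("&")
            i += 2
        elif text.startswith("&_", i):
            end = text.index("_", i + 2)          # closing underline
            pieces.append(CHARACTERS[text[i + 2:end]])
            i = end + 1
        else:
            pieces.append(CHARACTERS[text[i + 1:i + 3]])
            i += 3
    return "".join(pieces)


print(encode("caf\u00e9 & th\u00e9"))  # caf&e' && th&e'
```

Under these assumptions, decode(encode(text)) returns the original text for any input the sample table covers, which mirrors the remark that conversions to this charset are usually reversible.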

Currently, recode does not offer any of the many other possible variations of this family of representations. They will likely be implemented in some future version, however.

ANSI_X3.4-1968
367, ANSI_X3.4-1986, ASCII, CP367, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, iso-ir-6 and us are aliases for this charset. Source: ISO 2375 registry.

ASMO_449
ISO_9036, arabic7 and iso-ir-89 are aliases for this charset. Source: ISO 2375 registry.

BS_4730
ISO646-GB, gb, iso-ir-4 and uk are aliases for this charset. Source: ISO 2375 registry.

BS_viewdata
iso-ir-47 is an alias for this charset. Source: ISO 2375 registry.

CP1250
1250, ms-ee and windows-1250 are aliases for this charset. Source: UNICODE 1.0.

CP1251
1251, ms-cyrl and windows-1251 are aliases for this charset. Source: UNICODE 1.0.

CP1252
1252, ms-ansi and windows-1252 are aliases for this charset. Source: UNICODE 1.0.

CP1253
1253, ms-greek and windows-1253 are aliases for this charset. Source: UNICODE 1.0.

CP1254
1254, ms-turk and windows-1254 are aliases for this charset. Source: UNICODE 1.0.

CP1255
1255, ms-hebr and windows-1255 are aliases for this charset. Source: UNICODE 1.0.

CP1256
1256, ms-arab and windows-1256 are aliases for this charset. Source: UNICODE 1.0.

CP1257
1257, WinBaltRim and windows-1257 are aliases for this charset. Source: CEN/TC304 N283.

CSA_Z243.4-1985-1
ISO646-CA, ca, csa7-1 and iso-ir-121 are aliases for this charset. Source: ISO 2375 registry.

CSA_Z243.4-1985-2
ISO646-CA2, csa7-2 and iso-ir-122 are aliases for this charset. Source: ISO 2375 registry.

CSA_Z243.4-1985-gr
iso-ir-123 is an alias for this charset. Source: ISO 2375 registry.

CSN_369103
KOI-8_L2, iso-ir-139 and koi8l2 are aliases for this charset. Source: ISO 2375 registry.

CWI
CWI-2 and cp-hu are aliases for this charset. Source: Computerworld Sza'mita'stechnika vol 3 issue 13 1988-06-29.

DEC-MCS
dec is an alias for this charset. Source: VAX/VMS User's Manual, Order Number: AI-Y517A-TE, April 1986.

DIN_66003
ISO646-DE, de and iso-ir-21 are aliases for this charset. Source: ISO 2375 registry.

DS_2089
DS2089, ISO646-DK and dk are aliases for this charset. Source: Danish Standard, DS 2089, February 1974.

EBCDIC-AT-DE
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-AT-DE-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-CA-FR
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-DK-NO
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-DK-NO-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES-S
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FI-SE
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FI-SE-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FR
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-IS-FRISS
friss is an alias for this charset. Source: Skyrsuvelar Rikisins og Reykjavikurborgar, feb 1982.

EBCDIC-IT
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-PT
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-UK
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-US
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

ECMA-cyrillic
ECMA-113, ECMA-113:1986 and iso-ir-111 are aliases for this charset. Source: ISO 2375 registry.

ES
ISO646-ES and iso-ir-17 are aliases for this charset. Source: ISO 2375 registry.

ES2
ISO646-ES2 and iso-ir-85 are aliases for this charset. Source: ISO 2375 registry.

GB_1988-80
ISO646-CN, cn and iso-ir-57 are aliases for this charset. Source: ISO 2375 registry.

GOST_19768-87
ST_SEV_358-88 and iso-ir-153 are aliases for this charset. Source: ISO 2375 registry.

IBM037
037, CP037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us and ebcdic-cp-wt are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM038
038, CP038 and EBCDIC-INT are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM1004
1004, CP1004 and os2latin1 are aliases for this charset. Source: CEN/TC304 N283, 1994-02-04.

IBM1026
1026 and CP1026 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM1047
1047 and CP1047 are aliases for this charset. Source: IBM Character Data Representation Architecture. Registry SC09-1391-00 p 150.

IBM256
256, CP256 and EBCDIC-INT1 are aliases for this charset. Source: IBM Registry C-H 3-3220-050.

IBM273
273 and CP273 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM274
274, CP274 and EBCDIC-BE are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM275
275, CP275 and EBCDIC-BR are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM277
EBCDIC-CP-DK and EBCDIC-CP-NO are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM278
278, CP278, ebcdic-cp-fi and ebcdic-cp-se are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM280
280, CP280 and ebcdic-cp-it are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM281
281, CP281 and EBCDIC-JP-E are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM284
284, CP284 and ebcdic-cp-es are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM285
285, CP285 and ebcdic-cp-gb are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM290
290, CP290 and EBCDIC-JP-kana are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM297
297, CP297 and ebcdic-cp-fr are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM420
420, CP420 and ebcdic-cp-ar1 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990. IBM NLS RM p 11-11.

IBM423
423, CP423 and ebcdic-cp-gr are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM424
424, CP424 and ebcdic-cp-he are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM437
437 and CP437 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM500
500, 500V1, CP500, ebcdic-cp-be and ebcdic-cp-ch are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM850
850 and CP850 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990. Source: UNICODE 1.0.

IBM851
851 and CP851 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM852
852, CP852, pcl2 and pclatin2 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM855
855 and CP855 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM857
857 and CP857 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM860
860 and CP860 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM861
861, CP861 and cp-is are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM862
862 and CP862 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM863
863 and CP863 are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM864
864 and CP864 are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM865
865 and CP865 are aliases for this charset. Source: IBM DOS 3.3 Ref (Abridged), 94X9575 (Feb 1987).

IBM868
868, CP868 and cp-ar are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM869
869, CP869 and cp-gr are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM870
870, CP870, ebcdic-cp-roece and ebcdic-cp-yu are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM871
871, CP871 and ebcdic-cp-is are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM875
875, CP875 and EBCDIC-Greek are aliases for this charset. Source: UNICODE 1.0.

IBM880
880, CP880 and EBCDIC-Cyrillic are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM891
891 and CP891 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM903
903 and CP903 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM904
904 and CP904 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM905
905, CP905 and ebcdic-cp-tr are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM918
918, CP918 and ebcdic-cp-ar2 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IEC_P27-1
iso-ir-143 is an alias for this charset. Source: ISO 2375 registry.

INIS
iso-ir-49 is an alias for this charset. Source: ISO 2375 registry.

INIS-8
iso-ir-50 is an alias for this charset. Source: ISO 2375 registry.

INIS-cyrillic
iso-ir-51 is an alias for this charset. Source: ISO 2375 registry.

INVARIANT
iso-ir-170 is an alias for this charset.

ISO-8859-1
819, CP819, IBM819, ISO8859-1, ISO_8859-1, ISO_8859-1:1987, iso-ir-100, l1 and latin1 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-10
ISO8859-10, ISO_8859-10, ISO_8859-10:1993, L6, iso-ir-157 and latin6 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-13
ISO8859-13, ISO_8859-13, ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7 and latin7 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-14
ISO8859-14, ISO_8859-14, ISO_8859-14:1998, iso-celtic, iso-ir-199, l8 and latin8 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-15
ISO8859-15, ISO_8859-15, ISO_8859-15:1998, iso-ir-203, l9 and latin9 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-2
912, CP912, IBM912, ISO8859-2, ISO_8859-2, ISO_8859-2:1987, iso-ir-101, l2 and latin2 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-3
ISO8859-3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3 and latin3 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-4
ISO8859-4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4 and latin4 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-5
ISO8859-5, ISO_8859-5, ISO_8859-5:1988, cyrillic and iso-ir-144 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-6
ASMO-708, ECMA-114, ISO8859-6, ISO_8859-6, ISO_8859-6:1987, arabic and iso-ir-127 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-7
ECMA-118, ELOT_928, ISO8859-7, ISO_8859-7, ISO_8859-7:1987, greek, greek8 and iso-ir-126 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-8
ISO8859-8, ISO_8859-8, ISO_8859-8:1988, hebrew and iso-ir-138 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-9
ISO8859-9, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5 and latin5 are aliases for this charset. Source: ISO 2375 registry.

ISO_10367-box
iso-ir-155 is an alias for this charset. Source: ISO 2375 registry.

ISO_2033-1983
e13b and iso-ir-98 are aliases for this charset. Source: ISO 2375 registry.

ISO_5427
iso-ir-37 is an alias for this charset. Source: ISO 2375 registry.

ISO_5427-ext
ISO_5427:1981 and iso-ir-54 are aliases for this charset. Source: ISO 2375 registry.

ISO_5428
ISO_5428:1980 and iso-ir-55 are aliases for this charset. Source: ISO 2375 registry.

ISO_646.basic
ISO_646.basic:1983 and ref are aliases for this charset. Source: ISO 2375 registry.

ISO_646.irv
ISO_646.irv:1983, irv and iso-ir-2 are aliases for this charset. Source: ISO 2375 registry.

ISO_6937-2-25
iso-ir-152 is an alias for this charset. Source: ISO 2375 registry.

ISO_8859-supp
iso-ir-154 and latin1-2-5 are aliases for this charset. Source: ISO 2375 registry.

IT
ISO646-IT and iso-ir-15 are aliases for this charset. Source: ISO 2375 registry.

JIS_C6220-1969-jp
JIS_C6220-1969, iso-ir-13, katakana and x0201-7 are aliases for this charset. Source: ISO 2375 registry.

JIS_C6220-1969-ro
ISO646-JP, iso-ir-14 and jp are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-a
jp-ocr-a is an alias for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-b
ISO646-JP-OCR-B and jp-ocr-b are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-b-add
iso-ir-93 and jp-ocr-b-add are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-hand
iso-ir-94 and jp-ocr-hand are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-hand-add
iso-ir-95 and jp-ocr-hand-add are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-kana
iso-ir-96 is an alias for this charset. Source: ISO 2375 registry.

JIS_X0201
X0201 is an alias for this charset.

JUS_I.B1.002
ISO646-YU, iso-ir-141, js and yu are aliases for this charset. Source: ISO 2375 registry.

JUS_I.B1.003-mac
iso-ir-147 and macedonian are aliases for this charset. Source: ISO 2375 registry.

JUS_I.B1.003-serb
iso-ir-146 and serbian are aliases for this charset. Source: ISO 2375 registry.

KOI-7
Source: Andrey A. Chernov <ache@nagual.pp.ru>.

KOI-8
GOST_19768-74 is an alias for this charset. Source: Andrey A. Chernov <ache@nagual.pp.ru>.

KOI8-R
Source: RFC 1489 via Gabor Kiss <kissg@sztaki.hu>, and Andrey A. Chernov <ache@nagual.pp.ru>.

KOI8-RU
Source: http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/.

KOI8-U
Source: RFC 2319. Mibenum: 2088. Source: http://www.net.ua/KOI8-U/.

KSC5636
ISO646-KR is an alias for this charset.

Latin-greek-1
iso-ir-27 is an alias for this charset. Source: ISO 2375 registry.

MSZ_7795.3
ISO646-HU, hu and iso-ir-86 are aliases for this charset. Source: ISO 2375 registry.

NATS-DANO
iso-ir-9-1 is an alias for this charset. Source: ISO 2375 registry.

NATS-DANO-ADD
iso-ir-9-2 is an alias for this charset. Source: ISO 2375 registry.

NATS-SEFI
iso-ir-8-1 is an alias for this charset. Source: ISO 2375 registry.

NATS-SEFI-ADD
iso-ir-8-2 is an alias for this charset. Source: ISO 2375 registry.

NC_NC00-10
ISO646-CU, NC_NC00-10:81, cuba and iso-ir-151 are aliases for this charset. Source: ISO 2375 registry.

NF_Z_62-010
ISO646-FR, fr and iso-ir-69 are aliases for this charset. Source: ISO 2375 registry.

NF_Z_62-010_(1973)
ISO646-FR1 and iso-ir-25 are aliases for this charset. Source: ISO 2375 registry.

NS_4551-1
ISO646-NO, iso-ir-60 and no are aliases for this charset. Source: ISO 2375 registry.

NS_4551-2
ISO646-NO2, iso-ir-61 and no2 are aliases for this charset. Source: ISO 2375 registry.

NeXTSTEP
next is an alias for this charset. Source: Peter Svanberg - psv@nada.kth.se.

PT
ISO646-PT and iso-ir-16 are aliases for this charset. Source: ISO 2375 registry.

PT2
ISO646-PT2 and iso-ir-84 are aliases for this charset. Source: ISO 2375 registry.

SEN_850200_B
FI, ISO646-FI, ISO646-SE, SS636127, iso-ir-10 and se are aliases for this charset. Source: ISO 2375 registry.

SEN_850200_C
ISO646-SE2, iso-ir-11 and se2 are aliases for this charset. Source: ISO 2375 registry.

T.61-7bit
iso-ir-102 is an alias for this charset. Source: ISO 2375 registry.

baltic
iso-ir-179 is an alias for this charset. Source: ISO 2375 registry. &g1esc x2d56 &g2esc x2e56 &g3esc x2f56.

greek-ccitt
iso-ir-150 is an alias for this charset. Source: ISO 2375 registry.

greek7
iso-ir-88 is an alias for this charset. Source: ISO 2375 registry.

greek7-old
iso-ir-18 is an alias for this charset. Source: ISO 2375 registry.

hp-roman8
r8 and roman8 are aliases for this charset. Source: LaserJet IIP Printer User's Manual, HP part no 33471-90901, Hewlett-Packard, June 1989.

latin-greek
iso-ir-19 is an alias for this charset. Source: ISO 2375 registry.

mac-is

macintosh
mac is an alias for this charset. Source: The Unicode Standard ver 1.0, ISBN 0-201-56788-1, Oct 1991.

macintosh_ce
macce is an alias for this charset. Source: Macintosh CE fonts.

sami
iso-ir-158, lap and latin-lap are aliases for this charset. Source: ISO 2375 registry.
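
The entries above follow a regular pattern: a charset name on one line, then a description that usually begins "X, Y and Z are aliases for this charset." A reader who wants a machine-readable alias lookup could parse that wording with a sketch like the following. The helper names parse_aliases and build_lookup are hypothetical, and lowercasing the keys assumes that names should be matched without regard to case, which this section does not state.

```python
import re

# Matches the leading "... is an alias / are aliases for this charset." sentence.
ALIAS_RE = re.compile(
    r"^(?P<names>.+?) (?:is an alias|are aliases) for this charset\."
)


def parse_aliases(description):
    """Return the alias names listed at the start of a description line."""
    match = ALIAS_RE.match(description)
    if match is None:
        return []  # entries such as EBCDIC-US list no aliases
    names = match.group("names")
    # "A, B and C" -> ["A", "B", "C"]; a single name has no " and ".
    head, _, last = names.rpartition(" and ")
    parts = head.split(", ") if head else []
    parts.append(last)
    return parts


def build_lookup(entries):
    """Map each charset name and alias (lowercased) to its charset name."""
    lookup = {}
    for name, description in entries:
        lookup[name.lower()] = name
        for alias in parse_aliases(description):
            lookup[alias.lower()] = name
    return lookup


entries = [
    ("CP1250", "1250, ms-ee and windows-1250 are aliases for this charset. Source: UNICODE 1.0."),
    ("BS_viewdata", "iso-ir-47 is an alias for this charset. Source: ISO 2375 registry."),
    ("EBCDIC-US", "Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987."),
]
lookup = build_lookup(entries)
print(lookup["windows-1250"])  # CP1250
```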



This document was generated by root on June, 20 2004 using texi2html