The Perl INTERCAL compiler
... Character Sets
Normally, the compiler requires the program source to be in EBCDIC, although
there are compiler options to translate from ASCII or Baudot. Since there is no
such thing as a standard EBCDIC, we have designed our own non-standard one.
The principle is simple: for each character, we selected a code which was
used for that character by at least one IBM terminal. However, to guarantee
incompatibility, our set differs in at least one character from any IBM
hardware for which we have been able to find documentation.
Here's the character table:
+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f |
00 | | | | | | | | | BSP | TAB | LF | | | CR | | |
10 | | | | | | | | | | | | | | | | |
20 | | | | | | | | | | | | | | | | |
30 | | | | | | | | | | | | | | | | |
40 | SP | | | | | | | | | | ¢ | . | < | ( | + | ! |
50 | & | | | | | | | | | | ] | $ | * | ) | ; | ¬ |
60 | - | / | | | | xor | | | | | | | , | % | _ | > | ? |
70 | | | | | | | | | | | : | # | @ | ' | = | " |
80 | | a | b | c | d | e | f | g | h | i | | | | | | |
90 | | j | k | l | m | n | o | p | q | r | | | { | | [ | |
a0 | | ~ | s | t | u | v | w | x | y | z | | | | | | ® |
b0 | ^ | £ | | | © | | | | | | | | | | | |
c0 | | A | B | C | D | E | F | G | H | I | | | | | | |
d0 | | J | K | L | M | N | O | P | Q | R | | | } | | | |
e0 | | | S | T | U | V | W | X | Y | Z | | | | | | |
f0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | | | | | DEL |
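As a rough illustration of how the table above might be used, the following Python sketch builds a partial decoding map with code points read off the table. It covers only the space, the letters and the digits; punctuation is deliberately left out. The names `EBCDIC_SUBSET` and `decode_subset` are made up for this example and are not part of CLC-INTERCAL.

```python
import string


def _run(start: int, chars: str) -> dict:
    """Map consecutive codes starting at `start` to the characters in `chars`."""
    return {start + i: c for i, c in enumerate(chars)}


# Partial map read off the table above (hypothetical name).
# Only space, letters and digits are included.
EBCDIC_SUBSET = {0x40: " "}
EBCDIC_SUBSET.update(_run(0x81, "abcdefghi"))
EBCDIC_SUBSET.update(_run(0x91, "jklmnopqr"))
EBCDIC_SUBSET.update(_run(0xA2, "stuvwxyz"))
EBCDIC_SUBSET.update(_run(0xC1, "ABCDEFGHI"))
EBCDIC_SUBSET.update(_run(0xD1, "JKLMNOPQR"))
EBCDIC_SUBSET.update(_run(0xE2, "STUVWXYZ"))
EBCDIC_SUBSET.update(_run(0xF0, string.digits))


def decode_subset(data: bytes) -> str:
    """Decode bytes to text; raises KeyError for codes outside the subset."""
    return "".join(EBCDIC_SUBSET[b] for b in data)
```

For instance, `decode_subset(bytes([0xC8, 0x85, 0x93, 0x93, 0x96]))` looks up H, e, l, l and o in turn. A complete translator would need every populated cell of the table, not just these runs.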
While the compiler and runtime accept ASCII and EBCDIC for input/output,
internally everything is represented in extended Baudot. The "letters"
and "figures" sets are identical to the standard Baudot, but we have a
nonstandard convention that shifting to letters while already in letters
causes a shift to lowercase letters, and shifting to figures while
already in figures causes a shift to a set containing special characters.
Thus, to guarantee uppercase letters, one would first shift to figures and
then to letters. If this extended Baudot is sent to a
teletype which understands standard Baudot, the result will be a text in
ALL CAPS with some of the symbols it cannot print replaced with others
it can.
Here's the character table:
Code | Uppercase | Lowercase | Figures | Symbols |
00 | Invalid code |
01 | E | e | 3 | ¢ |
02 | Line Feed |
03 | A | a | - | + |
04 | Space |
05 | S | s | Bell | \ |
06 | I | i | 8 | # |
07 | U | u | 7 | = |
08 | Carriage Return |
09 | D | d | $ | * |
10 | R | r | 4 | { |
11 | J | j | ' | ~ |
12 | N | n | , | xor |
13 | F | f | ! | | |
14 | C | c | : | ^ |
15 | K | k | ( | < |
16 | T | t | 5 | [ |
17 | Z | z | " | } |
18 | W | w | ) | > |
19 | L | l | 2 | ] |
20 | H | h | Invalid | backspace |
21 | Y | y | 6 | @ |
22 | P | p | 0 | Invalid |
23 | Q | q | 1 | £ |
24 | O | o | 9 | ¬ |
25 | B | b | ? | delete |
26 | G | g | & | Invalid |
27 | Figures | Symbols |
28 | M | m | . | % |
29 | X | x | / | _ |
30 | V | v | ; | Invalid |
31 | Lowercase | Uppercase |
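The shift convention can be sketched as a small state machine. The Python below is only an illustration under stated assumptions: it reads rows 27 and 31 of the table as "27 selects figures from either letters set and symbols from figures or symbols" and "31 selects lowercase from either letters set and uppercase from figures or symbols". The initial state (uppercase) is a guess, only a few rows of the table are copied, and all names are hypothetical.

```python
# Shift codes from the table above.
FIGURES_SHIFT = 27
LETTERS_SHIFT = 31

# Excerpt of the table: code -> (uppercase, lowercase, figures, symbols).
# Only a few rows are copied here; a full decoder would need all of them.
ROWS = {
    1: ("E", "e", "3", "¢"),
    6: ("I", "i", "8", "#"),
    19: ("L", "l", "2", "]"),
    23: ("Q", "q", "1", "£"),
    24: ("O", "o", "9", "¬"),
}

# Codes that mean the same thing in every set.
CONTROLS = {2: "\n", 4: " ", 8: "\r"}

UPPER, LOWER, FIGS, SYMS = range(4)


def decode_baudot(codes, state=UPPER):
    """Decode 5-bit codes; the initial state is an assumption of this sketch."""
    out = []
    for code in codes:
        if code == LETTERS_SHIFT:
            # Letters shift while already in letters selects lowercase,
            # otherwise it selects uppercase.
            state = LOWER if state in (UPPER, LOWER) else UPPER
        elif code == FIGURES_SHIFT:
            # Figures shift while already in figures (or symbols) selects
            # symbols, otherwise it selects figures.
            state = SYMS if state in (FIGS, SYMS) else FIGS
        elif code in CONTROLS:
            out.append(CONTROLS[code])
        else:
            out.append(ROWS[code][state])
    return "".join(out)
```

In this sketch the sequence 27, 31 forces uppercase regardless of the previous state, which matches the "figures, then letters" recipe described above.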
CLC-INTERCAL 1.-94 introduces support for the "Hollerith" character
set, for compatibility with punched card devices and similar. A column in
a punched card corresponds to 12 bits, so tail registers can store one
character per element (with 4 bits wasted); similarly, a Hollerith file
requires two bytes per character. The first byte contains punch lines
12, 0, 2, 4, 6, 8; the second byte contains lines 11, 1, 3, 5, 7, 9. The
12 bit number corresponding to one column is therefore the interleave
of the two bytes. The two most significant bits in each byte are ignored;
when producing Hollerith, CLC-INTERCAL will clear bit 7 and set bit 6 to
the complement of bit 5: the result will be printable on an ASCII
terminal, although it is unlikely to be easy to read.
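The two-byte layout can be sketched in Python as follows. This is an illustration, not CLC-INTERCAL's own code. It assumes that within each byte the lines are stored from bit 5 down to bit 0 in the order listed (so line 12 and line 11 occupy bit 5), and that the first byte supplies the more significant bit of each interleaved pair. The text above does not state the bit order, and the function names are hypothetical.

```python
def _make_printable(six_bits: int) -> int:
    """Clear bit 7 and set bit 6 to the complement of bit 5."""
    six_bits &= 0x3F
    bit5 = (six_bits >> 5) & 1
    return six_bits | ((bit5 ^ 1) << 6)


def pack_column(column: int) -> bytes:
    """Split a 12-bit column into two bytes by de-interleaving its bits."""
    first = second = 0
    for i in range(6):
        # Odd-numbered bits of the column go to the first byte,
        # even-numbered bits to the second (assumed ordering).
        first |= ((column >> (2 * i + 1)) & 1) << i
        second |= ((column >> (2 * i)) & 1) << i
    return bytes([_make_printable(first), _make_printable(second)])


def unpack_column(pair: bytes) -> int:
    """Interleave two bytes back into a 12-bit column, ignoring bits 7 and 6."""
    first, second = pair[0] & 0x3F, pair[1] & 0x3F
    column = 0
    for i in range(6):
        column |= ((first >> i) & 1) << (2 * i + 1)
        column |= ((second >> i) & 1) << (2 * i)
    return column
```

With bit 7 cleared and bit 6 set to the complement of bit 5, every byte produced by `pack_column` falls in the range 0x20 to 0x5F, which is why the output stays printable.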
The Hollerith encoding used by CLC-INTERCAL is an extension of one of
the many character sets used for punched cards; lowercase letters are added by
overpunching the corresponding uppercase character with a single extra
hole. Some extra characters useful for INTERCAL programs have also
been added.
Overpunches, where two different characters are punched on the same column,
are fully supported: when converting from Hollerith to another character
set, these may result in sequences of characters.
The following three cards summarise the encoding. The third card shows
two examples of overpunch and some control characters which do not
exist in real punched cards (a carriage return, newline sequence would
correspond to changing card; other control characters are meaningless
for punched cards, but are useful when storing virtual punched cards
in a file).
| ' | | ! | " | # | $ | % | & | ( | ) | * | + | , | - | . | / | : | ; | < | = | > | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ~ | ¢ | ¥ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |
12 | | | | * | | | | * | | * | | * | | | * | | | | | | * | | | | | * | | * | | | | * | | * | | | | | | | | | | | | 12 |
11 | | | | | | * | | | | | * | | | * | | | | | * | | * | * | | | | | * | * | | | * | | * | | * | | | | | | | | | | | 11 |
0 | | | * | | | | * | | * | | | | * | | | * | * | * | * | | | | | * | | | | | | * | | | | * | * | * | | | | | | | | | | 0 |
1 | | | | | | | | | | | | | | | | * | | | | | | | | | | | | | | | | | | * | | | * | | | | | | | | | 1 |
2 | * | | | * | | | * | | | | | | | | | | | | | | | * | | | | | | | | | | | | | | | | * | | | | | | | | 2 |
3 | | | | | * | * | | | | | | | * | | * | | | | | | | | | | | | | | | | | | | * | | | | | * | | | | | | | 3 |
4 | | | | | | | | | * | * | * | | | | | | | | * | | * | | * | * | | * | | | | * | | * | | | | | | | | * | | | | | | 4 |
5 | | | | | | | | * | | | | | | | | | * | | | * | | | | | | | | | | | * | | | | * | | | | | | * | | | | | 5 |
6 | | | | | | | | | | | | | | | | | | * | | | | | | | | | * | | * | * | | * | | | | | | | | | | * | | | | 6 |
7 | | | * | | | | | | | | | | | | | | | | | | | | | * | * | * | | | | | | | * | | | | | | | | | | * | | | 7 |
8 | * | | | * | * | * | * | * | * | * | * | | * | | * | | * | * | * | * | * | * | * | | * | | * | | * | | * | | | | | | | | | | | | | * | | 8 |
9 | | | * | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * | | | | | | | | | | | | * | 9 |
|
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | |
12 | * | * | * | * | * | * | * | * | * | | | | | | | | | | | | | | | | | | * | * | * | * | * | * | * | * | * | | | | | | | | | | | | | | | | | | 12 |
11 | | | | | | | | | | * | * | * | * | * | * | * | * | * | | | | | | | | | | | | | | | | | | * | * | * | * | * | * | * | * | * | | | | | | | | | 11 |
0 | | | | | | | | | | | | | | | | | | | * | * | * | * | * | * | * | * | * | | | | | | | | | * | | | | | | | | | * | * | * | * | * | * | * | * | 0 |
1 | * | | | | | | | | | * | | | | | | | | | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | | * | | | | | | | | 1 |
2 | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | | | | | | 2 |
3 | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | | | | | 3 |
4 | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | | | | 4 |
5 | | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | | | 5 |
6 | | | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | | 6 |
7 | | | | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | | 7 |
8 | | | | | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | * | | | | | | | | * | * | | | | | | | * | * | 8 |
9 | | | | | | | | | * | | | | | | | | | * | | | | | | | | * | | | | | | | | | * | | | | | | | | | * | | | | | | | | * | 9 |
|
| [] | ". | NL | CR | HT | |
12 | * | * | * | | | 12 |
11 | | | | * | | 11 |
0 | * | | | | * | 0 |
1 | | | * | * | * | 1 |
2 | | * | * | * | * | 2 |
3 | | * | * | * | * | 3 |
4 | * | | * | * | * | 4 |
5 | | | * | * | * | 5 |
6 | | | * | * | * | 6 |
7 | * | | * | * | * | 7 |
8 | | * | * | * | * | 8 |
9 | | | * | * | * | 9 |
|