Lexical Analysis

Canned Symbol Descriptions

For many applications, the exact structure of the symbols that must be recognized is not important, or the problem description specifies that the symbols should be the same as the symbols used in some other situation (e.g. identifiers might be specified to use the same format as C identifiers). To cover this common situation, Eli provides a set of canned symbol descriptions.

To use a canned description, simply write the canned description's identifier in a specification instead of writing a regular expression. For example, the following type-`gla' file tells Eli that the input text will contain C-style identifiers and strings, Ada-style comments, and Pascal-style integers:
Identifier: C_IDENTIFIER
            ADA_COMMENT
String:     C_STRING_LIT
Integer:    PASCAL_INTEGER

The available canned descriptions are defined later in this section.
All of these definitions include a regular expression, and some include
auxiliary scanners and/or token processors.
An auxiliary scanner or token processor specified by a canned description
can be overridden by nominating a different one in the specification that
names the canned description.
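For example, the following specification names the canned description PASCAL_STRING and nominates the token processor mkidn, which replaces any token processor given in the definition of PASCAL_STRING:

Str: PASCAL_STRING [mkidn]

Because this specification nominates only a token processor, any auxiliary scanner specified by the canned description remains in effect.

The remainder of this section characterizes the canned descriptions that are available in the Eli library, and also gives their definitions.
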
Available Descriptions

Each of the identifiers in the following list is the name of a canned description specifying the lexical structure of some component of an existing programming language. Here they are simply characterized by the role they play in that language. A complete definition of each, consisting of a regular expression, possibly an auxiliary scanner name, and possibly a token processor name, is given in the next section.

When building a new language, it is a good idea to use canned descriptions for lexical components: time is not wasted in deciding on their form, mistakes are not made in their implementation, and users are familiar with them.

The list also provides canned descriptions for spaces, tabs and newlines. These white space characters are treated as comments by default. If, however, you define any pattern that will accept a white space character in its first position, this pattern overrides the default treatment and that white space character will be accepted only in contexts that are specified explicitly (see Spaces, Tabs and Newlines). For example, suppose that the following pattern were defined and that no other patterns contain spaces:

Separator: $\040+#\040+
In that situation, a space will be accepted only if it is part of a Separator. To have spaces that are not part of a Separator treated as comments again, the canned description SPACES must be specified explicitly:

Separator: $\040+#\040+
           SPACES

Note that only a white space character that appears at the beginning of a pattern loses its default interpretation in this way. In this example, neither the tab nor the newline appeared at the beginning of a pattern and therefore tabs and newlines continue to be treated as comments.
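
The rule for losing the default treatment can be modeled in a few lines of Python. This is only an illustrative sketch, not part of Eli: the function name `default_comment_whitespace` is hypothetical, and the caller is assumed to supply, for each pattern, the set of characters that can appear in its first position (the sketch does not parse regular expressions).

```python
# Illustrative model only; Eli does not expose any such function.
WHITE_SPACE = {" ": "space", "\t": "tab", "\n": "newline"}


def default_comment_whitespace(first_chars_by_pattern):
    """Return the names of the white space characters that are still
    treated as comments by default.

    first_chars_by_pattern maps a pattern name to the set of characters
    that the pattern accepts in its first position (supplied by hand,
    which is an assumption of this sketch).
    """
    claimed = set()
    for first_chars in first_chars_by_pattern.values():
        # A white space character that can start some pattern loses its
        # default interpretation.
        claimed |= set(first_chars) & set(WHITE_SPACE)
    return {name for ch, name in WHITE_SPACE.items() if ch not in claimed}


# The Separator pattern from the text begins with a space (\040).
patterns = {"Separator": {" "}, "Identifier": set("abc_")}
still_default = default_comment_whitespace(patterns)
```

With the Separator pattern as the only one that starts with a space, the model reports that tabs and newlines keep their default treatment while spaces do not, which matches the behavior described above.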
Definitions of Canned Descriptions

Eli textually replaces a reference to a canned description with its definition. If a user nominates an auxiliary scanner and/or a token processor for a canned description, that overrides the corresponding nomination appearing in the definition of the canned description.
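
The replacement-with-override rule can be sketched as a small Python model. This is a hedged illustration rather than Eli's implementation: the `Description` class, the `expand` function, and the entries in `CANNED` (including the names `DEMO_STRING`, `demoAuxScanner` and `demoProcessor`) are hypothetical placeholders and are not the real library definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Description:
    """A definition: a regular expression plus optional nominations."""
    regex: str
    aux_scanner: Optional[str] = None
    processor: Optional[str] = None


# Hypothetical entries for illustration; NOT the actual Eli definitions.
CANNED = {
    "DEMO_STRING": Description("'", aux_scanner="demoAuxScanner",
                               processor="demoProcessor"),
    "DEMO_IDENT": Description("[a-z]+", processor="demoProcessor"),
}


def expand(name, aux_scanner=None, processor=None):
    """Replace a canned name by its definition, letting each user
    nomination override only the corresponding part of that definition."""
    base = CANNED[name]
    return Description(
        regex=base.regex,
        aux_scanner=aux_scanner if aux_scanner is not None else base.aux_scanner,
        processor=processor if processor is not None else base.processor,
    )


# Nominating only a token processor leaves the auxiliary scanner untouched.
overridden = expand("DEMO_STRING", processor="mkidn")
```

In this model, nominating a token processor changes only the `processor` field; the regular expression and the auxiliary scanner still come from the canned definition.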
The following is an alphabetized list of the canned descriptions
available in the Eli library, with their definitions.
Use this list as a formal definition, and as an example for constructing
specifications.