ANTLR 2.7.1
Release Notes
October 1, 2000
The ANTLR 2.7.1 release is a bug fix
release, brought to you by those hip cats at jGuru.com.
One of the bug fixes, however, allows UNICODE characters to be recognized for the
first time. :)
Enhancements
ANTLR 2.7.1 has a few enhancements:
- ANTLR now allows UNICODE characters because Terence made
case-statement expressions more efficient ;) See the unicode example in the
distribution and the brief blurb in the documentation.
- Massively improved C++ code generator (see below).
- Added automatic column setting support. See updated doc and new examples/java/columns directory.
- Ter added throws to tree and regular parsers.
- Added an antlr/extras directory, currently containing only antlr-emacs.el by Christoph.Wedler@sap-ag.de. Thanks,
Christoph!
C++ Code Generation
Pete Wells and Ric Klaren have pretty much gutted the C++
code generator to use templates and so on. Here are a few notes (with
lib/cpp/Changelog having more goodies). Ric has totally worked his ass off to make
the C++ what it is now! :)
Enhancements to the C++ code generator:
- #line generation for easier debugging of action code. Turn it on or off with the grammar option genHashLines.
- Cleaner generated code, by providing options to specify namespace prefixes. Grammar options namespaceAntlr and namespaceStd can
be set to "antlr::" and "std::" or to blank if your compiler
doesn't support namespaces.
- Generate comments to explain what the bitsets represent.
- Fix bug with -traceTreeParser code.
- Avoid warnings about unused variable _saveIndex.
- Remove final, illegal comma in token types enum.
Enhancements to the C++ support library:
- Performance enhancements. Thanks to several people for
suggestions/patches here. Improvements to memory management for
building strings, and buffering of tokens.
- Support for Metrowerks Codewarrior and Sun CC 5.0.
- Fix problem with multi-threaded lexers using static variable.
- Slight tidy up (more planned).
Additionally, there have been enhancements made to the C++
side to mirror the Java side changes.
Ric Klaren (2.7.1a3 C++ changes) says:
- action.g: allow ':' in the ID rule so C++ namespace qualifiers work.
- CppCodeGenerator: the '::' fix for the namespaceXXX options, as
requested by Michael Schmitt.
- Several cleanups in the Exception classes (basically a
hoisting of code) and one or two new constructors with more line/column params.
- Default value for column in LexerInputState set to 1, as
suggested by someone on the list (name I would have to look up).
- A makefile for the C++ lib directory. Not yet the autoconf
stuff posted by someone (whose name I would also have to look up); it would imply a bigger
workover of the lib/cpp directory, which is harder to do with diffs.
- Some changes I made after enabling the Effective C++
warnings on g++ (minor drivel basically; in most places not really needed).
- Several virtuals added to methods, based on a suggestion
also by Ernest Pasour. It makes the error messages from the thrown exceptions a lot better.
Bug Fixes
In no particular order, here are the improvements/fixes made
to 2.7.0 to arrive at 2.7.1 (via 2.7.1a1..a4):
- Columns started at 0 for line 1. Fixed.
- Bob McWhirter added -o fix so that antlr looks for import
vocab stuff in the -o directory if not found in $CWD (current working directory).
- Added optimization so that large unicode ranges don't result
in giant switch case expressions. For example, added charVocabulary='\u0003'..'\uffff' to
java.g. Took antlr 24s to generate 51k lexer file vs 9sec without. New 2.7.1 did it with
big vocab in 14 sec. Oh, and the interesting thing is that with the big vocab and new
optimization, it's actually smaller than with vocab set to ASCII. :)
- added a build script.
- Robert Colquhoun (rjc@trump.net.au)
gave me a patch to pull stuff out of Tool.java that was causing it to be required even at
runtime.
- Jerry James (james@eecs.ukans.edu)
gave me a patch to make the labels for heterogeneous tree nodes match the specific AST
type rather than plain AST.
- ANTLR didn't like curlies in quotes (preproc.g was hosed).
It now parses:
class A extends Parser;
tokens {
// hi |}
/*
fds
*}*/
TOK_LBRACE="{";
TOK_RBRACE="}";
}
a : "{" B "}";
- Fixed C++ code generator to allow ~(Z|G)
- Parser.getInputState called setInputState.
- ANTLR now allows comments between the header, options, and tokens
keywords and the '{' that follows. Examples:
options //fdkjfds
{
k = 1;
}
tokens //testing
{
A = "a";
}
- Made fields of CommonToken protected (open to subclasses) and
added col. Added column tracking support; tabs are counted as 1 unless you override tab(),
which is called from consume() and bumps the column by one by default. Overhead is minimal:
tab() is only called on tabs, and the cost is an extra increment for all consume()s plus an
extra int in CommonToken.
/**
 * Advance the current column number by an appropriate amount. If you do not override
 * this to specify how much to jump for a tab, then tabs are counted as one char. This method
 * is called from consume().
 */
public void tab() {
    // update inputState.column as function of
    // inputState.column and tab stops.
    // For example, if tab stops are columns 1
    // and 5 etc... and column is 3, then add 2
    // to column.
    inputState.column++;
}
- added CharScanner.setColumn
- Warnings were going to stdout; they now go to stderr.
- added check for unterminated rules. Labels in column 1 result
in a warning.
- JavaCharFormatter.escapeChar wasn't always providing exactly 4 digits
for \u chars.
- Fixed that nasty follow cycle grammar analysis bug Tom Moog
and others found.
- C++: CharScanner.cpp toLower, changed arg from char to int.
- added column support to C++ output
- Sather fixes put in, brought up to snuff with Java/C++.
- ANTLR continued on after discovering a duplicate grammar,
which caused a later exception.
- Bug fix: $setType( w ); didn't work because of the leading
space.
- For the java.tree.g grammar: the NEW operator didn't allow an
optional (objBlock)?
- HTML: Added lots of tweaks to html.g. Made blockquote handle
nested content. Fixed bug in COMMENT_DATA that wouldn't let '-' appear in comment.
Made COMMENT scarf WS after comment.
- Added to runtime jars (bigger but too lazy to weed out
unnecessary var refs that force inclusion):
antlr/DefineGrammarSymbols.class
antlr/ANTLRGrammarParseBehavior.class
antlr/MakeGrammar.class
antlr/ANTLRParser.class
antlr/ANTLRTokenTypes
antlr/LLkGrammarAnalyzer
antlr/GrammarAnalyzer
public CommonASTWithHiddenTokens() {
    super();
}
public CommonASTWithHiddenTokens(Token tok) {
    super(tok);
}
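The comment in tab() above describes jumping to the next tab stop rather than bumping the column by one. As a rough sketch of that arithmetic only: the class TabStops and the method nextTabStop below are hypothetical names, not part of ANTLR, and the sketch assumes 1-based columns with tab stops every tabSize columns starting at column 1 (so 1, 5, 9, ... for a tab size of 4).

```java
// Hypothetical helper, not part of ANTLR: computes where a tab lands
// when tab stops sit at columns 1, 1 + tabSize, 1 + 2*tabSize, ...
public class TabStops {

    /** Returns the first tab stop strictly after the given 1-based column. */
    public static int nextTabStop(int column, int tabSize) {
        // (column - 1) / tabSize is the index of the tab stop at or before
        // the column; the next stop is one index further along.
        return ((column - 1) / tabSize + 1) * tabSize + 1;
    }

    public static void main(String[] args) {
        // Tab stops at 1, 5, 9, ...: a tab at column 3 adds 2 and lands on 5.
        System.out.println(nextTabStop(3, 4)); // prints 5
    }
}
```

In a lexer subclass, an overridden tab() could presumably assign the result back, along the lines of inputState.column = TabStops.nextTabStop(inputState.column, 4); in place of the default inputState.column++. Treat that as an assumption to check against your own lexer rather than a documented recipe.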
ANTLR Installation
ANTLR comes as a single zip or compressed tar file. Unzipping
the file you receive will produce a directory called antlr-2.7.1 with
subdirectories antlr, doc, examples, cpp, and examples.cpp. You
need to place the antlr-2.7.1 directory in your CLASSPATH environment
variable. For example, if you placed antlr-2.7.1 in directory /tools,
you need to append
/tools/antlr-2.7.1
to your CLASSPATH, or
\tools\antlr-2.7.1
if you work on an NT or Win95 box.
References to antlr.* will map to /tools/antlr-2.7.1/antlr/*.class.
You must have at least JDK 1.1 installed properly on your
machine. The ASTFrame AST viewer uses Swing 1.1.
JAR FILE
Try using the runtime library antlr.jar file. Place it in your CLASSPATH
instead of the antlr-2.7.1 directory. The jar includes all parse-time
files needed (if it is missing a file, email parrt@jguru.com).
You cannot run the antlr tool itself with the jar, but your parsers should run with just
this jar file. It's pretty small, around 75k uncompressed.
RUNNING ANTLR
ANTLR is a command line tool (although many development
environments let you run ANTLR on grammar files from within the environment). The main
method within antlr.Tool is the ANTLR entry point.
java antlr.Tool file.g
One command-line option is -diagnostic, which generates a
text file for each output parser class that describes the lookahead sets. Note that there
are a number of options that you can specify at the grammar class and rule level.
Options -trace, -traceParser, -traceTreeParser may be used to
track the lexer, parser, and tree parser invocations.
Try the new -html option to generate HTML output of your
grammar(s); this is only partially done.
If you have trouble running ANTLR, ensure that you have Java
installed correctly and then ensure that you have the appropriate CLASSPATH set.
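One quick way to tell a CLASSPATH problem from a Java installation problem is to ask the JVM whether it can load the ANTLR entry point at all. The class below is a hypothetical diagnostic, not something shipped with ANTLR; it is a minimal sketch that only assumes the entry point is named antlr.Tool, as stated above.

```java
// Hypothetical diagnostic, not shipped with ANTLR: reports whether a class
// can be loaded from the current CLASSPATH.
public class ClasspathCheck {

    /** Returns true if the named class can be loaded, false otherwise. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // The class is not reachable through the current CLASSPATH.
            return false;
        } catch (LinkageError e) {
            // The class was found but could not be linked or initialized.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the ANTLR entry point; pass another class name to test it.
        String name = args.length > 0 ? args[0] : "antlr.Tool";
        if (isOnClasspath(name)) {
            System.out.println(name + " was found on the CLASSPATH");
        } else {
            System.out.println(name + " was NOT found; check your CLASSPATH setting");
        }
    }
}
```

If the program itself runs but reports that antlr.Tool was not found, Java is installed and the CLASSPATH is the likely culprit.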
Version: $Id: //depot/code/org.antlr/release/antlr-2.7.1/doc/antlr271release.html#2 $