Puma Reference Manual



Puma::Syntax Class Reference

#include <Puma/Syntax.h>


Detailed Description

Syntactic analysis base class.

Implements the top-down parsing algorithm (recursive descent parser). Intended to be derived from in order to implement parsers for specific grammars. Provides infinite look-ahead.

This class uses a tree builder object (see Builder) to create the syntax tree, and a semantic analysis object (see Semantic) to perform required semantic analyses of the parsed code.

The parse process is started by calling Syntax::run() with a token provider as argument. Using the token provider, this method reads the first core language token from the input source code and tries to parse it by applying the top grammar rule. In effect, it evaluates:

 return parse(&Puma::Syntax::trans_unit) ? builder().Top() : (Puma::CTree*)0; 

The top grammar rule has to be provided by reimplementing method Syntax::trans_unit(). It may call sub-rules according to the implemented language-specific grammar. Example:

 Puma::CTree *MySyntax::trans_unit() {
   return parse(&MySyntax::block_seq) ? builder().block_seq() : (Puma::CTree*)0;
 }

For context-sensitive grammars it may be necessary to perform initial semantic analyses of the parsed code within the grammar rules themselves (to differentiate ambiguous syntactic constructs, resolve names, detect errors, and so on). Example:

 Puma::CTree *MySyntax::block() {
   // '{' instruction instruction ... '}'
   if (parse(TOK_OPEN_CURLY)) {             // parse '{'
     semantic().enter_block();              // enter block scope
     seq(&MySyntax::instruction);           // parse sequence of instructions
     semantic().leave_block();              // leave block scope
     if (parse(TOK_CLOSE_CURLY)) {          // parse '}'
       return builder().block();            // build syntax tree for the block
     }
   }
   return (CTree*)0;                        // rule failed
 }

If a rule could be parsed successfully the tree builder is used to create a CTree based syntax tree (fragment) for the parsed rule. Failing grammar rules shall return NULL. The result of the top grammar rule is the root node of the abstract syntax tree for the whole input source code.

Public Member Functions

CTree * run (TokenProvider &tp)
 Start the parse process.
template<class T>
CTree * run (TokenProvider &tp, CTree *(T::*rule)())
 Start the parse process at a specific grammar rule.
virtual void configure (Config &c)
 Configure the syntactic analysis object.
TokenProvider * provider () const
 Get the token provider from which the parsed tokens are read.
Token * problem () const
 Get the last token that could not be parsed.
bool error () const
 Check if errors occurred during the parse process.
bool look_ahead (int token_type, unsigned n=1)
 Look-ahead n core language tokens and check if the n-th token has the given type.
bool look_ahead (int *token_types, unsigned n=1)
 Look-ahead n core language tokens and check if the n-th token has one of the given types.
int look_ahead () const
 Look-ahead one core language token.
bool consume ()
 Consume all tokens until the next core language token.

Public Attributes

TokenProvider * token_provider
 Token provider for getting the tokens to parse.

Protected Member Functions

 Syntax (Builder &b, Semantic &s)
 Constructor.
virtual ~Syntax ()
 Destructor.
template<class T>
bool parse (CTree *(T::*rule)())
 Parse the given grammar rule.
template<class T>
bool seq (CTree *(T::*rule)())
 Parse a sequence of the given grammar rule.
template<class T>
bool seq (bool(T::*rule)())
 Parse a sequence of the given grammar rule.
template<class T>
bool list (CTree *(T::*rule)(), int separator, bool trailing_separator=false)
 Parse a sequence of rule-separator pairs.
template<class T>
bool list (CTree *(T::*rule)(), int *separators, bool trailing_separator=false)
 Parse a sequence of rule-separator pairs.
template<class T>
bool list (bool(T::*rule)(), int separator, bool trailing_separator=false)
 Parse a sequence of rule-separator pairs.
template<class T>
bool list (bool(T::*rule)(), int *separators, bool trailing_separator=false)
 Parse a sequence of rule-separator pairs.
template<class T>
bool catch_error (CTree *(T::*rule)(), const char *msg, int *finish_tokens, int *skip_tokens)
 Parse a grammar rule automatically catching parse errors.
bool parse (int token_type)
 Parse a token with the given type.
bool parse (int *token_types)
 Parse a token with one of the given types.
bool parse_token (int token_type)
 Parse a token with the given type.
bool opt (bool dummy) const
 Optional rule parsing.
Builder & builder () const
 Get the syntax tree builder.
Semantic & semantic () const
 Get the semantic analysis object.
virtual CTree * trans_unit ()
 Top parse rule to be reimplemented for a specific grammar.
virtual void handle_directive ()
 Handle a compiler directive token.
State save_state ()
 Save the current parser state.
void forget_state ()
 Forget the saved parser state.
void restore_state ()
 Restore the saved parser state.
void restore_state (State state)
 Restore the saved parser state to the given state.
void set_state (State state)
 Overwrite the parser state with the given state.
bool accept (CTree *tree, State state)
 Accept the given syntax tree node.
Token * locate_token ()
 Skip all non-core language tokens until the next core-language token is read.
void skip ()
 Skip the current token.
void skip_block (int start, int end)
 Skip all tokens between start and end, including start and end token.
void skip_curly_block ()
 Skip all tokens between '{' and '}', including '{' and '}'.
void skip_round_block ()
 Skip all tokens between '(' and ')', including '(' and ')'.
void parse_block (int start, int end)
 Parse all tokens between start and end, including start and end token.
void parse_curly_block ()
 Parse all tokens between '{' and '}', including '{' and '}'.
void parse_round_block ()
 Parse all tokens between '(' and ')', including '(' and ')'.
bool skip (int stop_token, bool inclusive=true)
 Skip all tokens until a token with the given type is read.
bool skip (int *stop_tokens, bool inclusive=true)
 Skip all tokens until a token with one of the given types is read.
bool is_in (int token_type, int *token_types) const
 Check if the given token type is in the set of given token types.

Classes

class  State
 Parser state, the current position in the token stream. More...


Constructor & Destructor Documentation

Puma::Syntax::Syntax ( Builder & b,
Semantic & s 
) [inline, protected]

Constructor.

Parameters:
b The syntax tree builder.
s The semantic analysis object.

virtual Puma::Syntax::~Syntax (  )  [inline, protected, virtual]

Destructor.


Member Function Documentation

CTree* Puma::Syntax::run ( TokenProvider & tp  ) 

Start the parse process.

Parameters:
tp The token provider from where to get the tokens to parse.
Returns:
The resulting syntax tree.

template<class T>
CTree * Puma::Syntax::run ( TokenProvider & tp,
CTree *(T::*)()  rule 
) [inline]

Start the parse process at a specific grammar rule.

Parameters:
tp The token provider from where to get the tokens to parse.
rule The grammar rule where to start.
Returns:
The resulting syntax tree.

virtual void Puma::Syntax::configure ( Config & c  )  [inline, virtual]

Configure the syntactic analysis object.

Parameters:
c The configuration object.

Reimplemented in Puma::CCSyntax, and Puma::CSyntax.

TokenProvider* Puma::Syntax::provider (  )  const [inline]

Get the token provider from which the parsed tokens are read.

Token * Puma::Syntax::problem (  )  const [inline]

Get the last token that could not be parsed.

bool Puma::Syntax::error (  )  const [inline]

Check if errors occurred during the parse process.

bool Puma::Syntax::look_ahead ( int  token_type,
unsigned  n = 1 
)

Look-ahead n core language tokens and check if the n-th token has the given type.

Parameters:
token_type The type of the n-th token.
n The number of tokens to look-ahead.
Returns:
True if the n-th token has the given type.
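
The look-ahead check can be pictured with a small standalone model. This is only a sketch and not Puma code: the Tok struct, the token vector, and the free function are made up for illustration, and the sketch assumes that n = 1 refers to the first core language token at or after the current position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token model: core == false stands for a non-core token
// such as a comment or white space.
struct Tok { int type; bool core; };

// Return true if the n-th core token at or after `pos` has type `type`.
// Non-core tokens are stepped over and not counted; nothing is consumed.
bool look_ahead(const std::vector<Tok>& toks, std::size_t pos,
                int type, unsigned n = 1) {
  assert(n >= 1);
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (!toks[i].core) continue;                 // ignore non-core tokens
    if (--n == 0) return toks[i].type == type;   // reached the n-th core token
  }
  return false;                                  // ran out of tokens
}
```

Because the position is passed by value, the caller's place in the token stream is untouched, which is what makes look-ahead safe to use for choosing between rules.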

bool Puma::Syntax::look_ahead ( int *  token_types,
unsigned  n = 1 
)

Look-ahead n core language tokens and check if the n-th token has one of the given types.

Parameters:
token_types The possible types of the n-th token.
n The number of tokens to look-ahead.
Returns:
True if the n-th token has one of the given types.

int Puma::Syntax::look_ahead (  )  const [inline]

Look-ahead one core language token.

Returns:
The type of the next core language token.

bool Puma::Syntax::consume (  )  [inline]

Consume all tokens until the next core language token.

template<class T>
bool Puma::Syntax::parse ( CTree *(T::*)()  rule  )  [inline, protected]

Parse the given grammar rule.

Saves the current state of the builder, semantic, and token provider objects.

Parameters:
rule The rule to parse.
Returns:
True if parsed successfully.
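
The save-and-restore behaviour around a rule can be sketched with a self-contained model. MiniParser, its integer tokens, and the rules two() and three() are invented for this illustration; the real class also saves builder and semantic state, which the sketch reduces to a single stream position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone model, not Puma code: tokens are plain ints and the
// "parser state" is just an index into the token vector.
struct MiniParser {
  std::vector<int> tokens;
  std::size_t pos = 0;

  // Parse a single token of the given type.
  bool parse(int type) {
    if (pos < tokens.size() && tokens[pos] == type) { ++pos; return true; }
    return false;
  }

  // Parse a rule given as a member function pointer. The position is
  // saved first and restored if the rule fails, so a failed attempt
  // consumes no input.
  bool parse(bool (MiniParser::*rule)()) {
    assert(pos <= tokens.size());
    std::size_t saved = pos;            // save state
    if ((this->*rule)()) return true;   // keep progress on success
    pos = saved;                        // backtrack on failure
    return false;
  }

  // Hypothetical grammar rules: two := 1 2, three := 1 2 3
  bool two()   { return parse(1) && parse(2); }
  bool three() { return parse(1) && parse(2) && parse(3); }
};
```

With the input 1 2 4, trying three() fails after two tokens and rewinds to the start, after which two() can still succeed. This is the property that lets alternative rules be tried one after another.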

template<class T>
bool Puma::Syntax::seq ( CTree *(T::*)()  rule  )  [inline, protected]

Parse a sequence of the given grammar rule.

Parameters:
rule The rule to parse at least once.
Returns:
True if parsed successfully.
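
As a rough sketch of "at least once" semantics, the following standalone model applies a rule until it stops matching. It is not Puma code: the rule is passed as a std::function rather than a member function pointer, and Cursor is a made-up stand-in for the token stream.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Apply `rule` repeatedly; succeed only if it matched at least once.
// `rule` must return false without consuming input when it fails,
// and must consume input when it succeeds (otherwise the loop never ends).
bool seq(const std::function<bool()>& rule) {
  if (!rule()) return false;   // first occurrence is mandatory
  while (rule()) {}            // further occurrences are optional
  return true;
}

// Hypothetical token cursor used to drive the example.
struct Cursor {
  std::vector<int> tokens;
  std::size_t pos = 0;
  bool match(int type) {
    if (pos < tokens.size() && tokens[pos] == type) { ++pos; return true; }
    return false;
  }
};
```

For the input 7 7 7 8, a sequence of 7s consumes three tokens; a second attempt at the same position fails and leaves the cursor where it was.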

template<class T>
bool Puma::Syntax::seq ( bool(T::*)()  rule  )  [inline, protected]

Parse a sequence of the given grammar rule.

Parameters:
rule The rule to parse at least once.
Returns:
True if parsed successfully.

template<class T>
bool Puma::Syntax::list ( CTree *(T::*)()  rule,
int  separator,
bool  trailing_separator = false 
) [inline, protected]

Parse a sequence of rule-separator pairs.

Parameters:
rule The rule to parse at least once.
separator The separator token.
trailing_separator True if a trailing separator token is allowed.
Returns:
True if parsed successfully.
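
The shape "item (separator item)* [separator]" can be sketched as follows. This is a standalone model, not Puma code: items are single tokens instead of grammar rules, Cursor is a made-up helper, and the sketch assumes that a dangling separator makes the list fail unless trailing_separator is set, which the description above does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token cursor used to drive the example.
struct Cursor {
  std::vector<int> tokens;
  std::size_t pos = 0;
  bool match(int type) {
    if (pos < tokens.size() && tokens[pos] == type) { ++pos; return true; }
    return false;
  }
};

// Parse: item (separator item)* and, if allowed, one trailing separator.
bool list(Cursor& c, int item, int separator,
          bool trailing_separator = false) {
  if (!c.match(item)) return false;        // at least one item is required
  while (c.match(separator)) {
    // A separator with no item after it is only fine when a trailing
    // separator is allowed (assumption, see the lead-in).
    if (!c.match(item)) return trailing_separator;
  }
  return true;
}
```

In this model a failed list leaves the cursor advanced; a caller that wants backtracking would wrap the call in its own save and restore of the position.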

template<class T>
bool Puma::Syntax::list ( CTree *(T::*)()  rule,
int *  separators,
bool  trailing_separator = false 
) [inline, protected]

Parse a sequence of rule-separator pairs.

Parameters:
rule The rule to parse at least once.
separators The separator tokens.
trailing_separator True if a trailing separator token is allowed.
Returns:
True if parsed successfully.

template<class T>
bool Puma::Syntax::list ( bool(T::*)()  rule,
int  separator,
bool  trailing_separator = false 
) [inline, protected]

Parse a sequence of rule-separator pairs.

Parameters:
rule The rule to parse at least once.
separator The separator token.
trailing_separator True if a trailing separator token is allowed.
Returns:
True if parsed successfully.

template<class T>
bool Puma::Syntax::list ( bool(T::*)()  rule,
int *  separators,
bool  trailing_separator = false 
) [inline, protected]

Parse a sequence of rule-separator pairs.

Parameters:
rule The rule to parse at least once.
separators The separator tokens.
trailing_separator True if a trailing separator token is allowed.
Returns:
True if parsed successfully.

template<class T>
bool Puma::Syntax::catch_error ( CTree *(T::*)()  rule,
const char *  msg,
int *  finish_tokens,
int *  skip_tokens 
) [inline, protected]

Parse a grammar rule automatically catching parse errors.

Parameters:
rule The rule to parse.
msg The error message to show if the rule fails.
finish_tokens Set of token types that abort parsing the rule.
skip_tokens If the rule fails skip all tokens until a token is read that has one of the types given here.
Returns:
False if at EOF or a finish_token is read, true otherwise.

bool Puma::Syntax::parse ( int  token_type  )  [inline, protected]

Parse a token with the given type.

Parameters:
token_type The token type.
Returns:
True if a corresponding token was parsed.

bool Puma::Syntax::parse ( int *  token_types  )  [protected]

Parse a token with one of the given types.

Parameters:
token_types The token types.
Returns:
True if a corresponding token was parsed.

bool Puma::Syntax::parse_token ( int  token_type  )  [protected]

Parse a token with the given type.

Parameters:
token_type The token type.
Returns:
True if a corresponding token was parsed.

bool Puma::Syntax::opt ( bool  dummy  )  const [inline, protected]

Optional rule parsing.

Always succeeds regardless of the argument.

Parameters:
dummy Dummy parameter, is not evaluated.
Returns:
True.
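
A function that ignores its argument and returns true is useful inside a chain of && terms: the argument expression is still evaluated by the caller, so an optional part is consumed when present and skipped when absent. The sketch below is a standalone model with a made-up Cursor type and a hypothetical rule "1 [2] 3"; it is not taken from Puma.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token cursor used to drive the example.
struct Cursor {
  std::vector<int> tokens;
  std::size_t pos = 0;
  bool match(int type) {
    if (pos < tokens.size() && tokens[pos] == type) { ++pos; return true; }
    return false;
  }
};

// The argument has already been evaluated for its side effect
// (consuming input) by the time opt() runs; its value is ignored.
inline bool opt(bool /*dummy*/) { return true; }

// Hypothetical rule: token 1, optionally token 2, then token 3.
bool rule(Cursor& c) {
  return c.match(1) && opt(c.match(2)) && c.match(3);
}
```

Both 1 2 3 and 1 3 are accepted by this rule, while 1 2 is rejected because the mandatory final token is missing.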

Builder & Puma::Syntax::builder (  )  const [inline, protected]

Get the syntax tree builder.

Reimplemented in Puma::CCSyntax.

Semantic & Puma::Syntax::semantic (  )  const [inline, protected]

Get the semantic analysis object.

Reimplemented in Puma::CCSyntax.

CTree * Puma::Syntax::trans_unit (  )  [inline, protected, virtual]

Top parse rule to be reimplemented for a specific grammar.

Returns:
The root node of the syntax tree, or NULL.

Reimplemented in Puma::CSyntax.

void Puma::Syntax::handle_directive (  )  [inline, protected, virtual]

Handle a compiler directive token.

The default handling is to skip the compiler directive.

Reimplemented in Puma::CSyntax.

State Puma::Syntax::save_state (  )  [protected]

Save the current parser state.

Calls save_state() on the builder, semantic, and token provider objects.

Returns:
The current parser state.

void Puma::Syntax::forget_state (  )  [protected]

Forget the saved parser state.

void Puma::Syntax::restore_state (  )  [protected]

Restore the saved parser state.

Triggers restoring the syntax and semantic trees to the saved state.

void Puma::Syntax::restore_state ( State  state  )  [protected]

Restore the saved parser state to the given state.

Triggers restoring the syntax and semantic trees.

Parameters:
state The state to which to restore.

void Puma::Syntax::set_state ( State  state  )  [protected]

Overwrite the parser state with the given state.

Parameters:
state The new parser state.

bool Puma::Syntax::accept ( CTree * tree,
State  state 
) [protected]

Accept the given syntax tree node.

If the node is NULL then the parser state is restored to the given state. Otherwise all saved states are discarded.

Parameters:
tree Tree to accept.
state The saved state.
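
The two branches described above can be modelled in a few lines. This is a sketch under assumptions, not Puma's implementation: Node and MiniParser are invented, the state is reduced to a stream position, and the return value (true for a non-NULL tree) is a guess, since the description does not state what accept() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {};                        // stand-in for a syntax tree node

struct MiniParser {
  std::size_t pos = 0;                 // current position in the token stream
  std::vector<std::size_t> saved;      // stack of saved positions

  // Remember the current position and hand it back to the caller.
  std::size_t save_state() { saved.push_back(pos); return pos; }

  // Non-null tree: keep the progress and discard all saved states.
  // Null tree: rewind to `state`.
  bool accept(Node* tree, std::size_t state) {
    if (tree) { saved.clear(); return true; }
    pos = state;
    return false;
  }
};
```

A rule written in this style saves the state on entry, tries to build its node, and passes both to accept() so that success and failure are handled in one place.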

Token* Puma::Syntax::locate_token (  )  [protected]

Skip all non-core language tokens until the next core-language token is read.

Returns:
The next core-language token.

void Puma::Syntax::skip (  )  [protected]

Skip the current token.

void Puma::Syntax::skip_block ( int  start,
int  end 
) [protected]

Skip all tokens between start and end, including start and end token.

Parameters:
start The start token type.
end The end token type.
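
One plausible way to skip a delimited block is to count nesting depth. The standalone sketch below does that on a vector of integer token types. Whether Puma tracks nested start/end pairs, and what it does when the current token is not a start token, is not stated here, so both behaviours in the sketch are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance `pos` past a block delimited by `start` and `end` token types,
// counting nested start/end pairs. Does nothing if `pos` is not at `start`.
// If the block is never closed, `pos` ends up at the end of the input.
void skip_block(const std::vector<int>& toks, std::size_t& pos,
                int start, int end) {
  if (pos >= toks.size() || toks[pos] != start) return;
  int depth = 0;
  while (pos < toks.size()) {
    int t = toks[pos++];
    if (t == start) {
      ++depth;                              // entering a (nested) block
    } else if (t == end && --depth == 0) {
      return;                               // closed the outermost block
    }
  }
}
```

With 1 as the start type and 2 as the end type, the input 1 9 1 9 2 2 9 is skipped up to and including the second 2, leaving the position at the final 9.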

void Puma::Syntax::skip_curly_block (  )  [protected]

Skip all tokens between '{' and '}', including '{' and '}'.

void Puma::Syntax::skip_round_block (  )  [protected]

Skip all tokens between '(' and ')', including '(' and ')'.

void Puma::Syntax::parse_block ( int  start,
int  end 
) [protected]

Parse all tokens between start and end, including start and end token.

Parameters:
start The start token type.
end The end token type.

void Puma::Syntax::parse_curly_block (  )  [protected]

Parse all tokens between '{' and '}', including '{' and '}'.

void Puma::Syntax::parse_round_block (  )  [protected]

Parse all tokens between '(' and ')', including '(' and ')'.

bool Puma::Syntax::skip ( int  stop_token,
bool  inclusive = true 
) [protected]

Skip all tokens until a token with the given type is read.

Parameters:
stop_token The type of the token at which to stop.
inclusive If true, the stop token is skipped too.
Returns:
False if the stop token is not found, true otherwise.
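
The effect of the inclusive flag is easiest to see on a concrete token sequence. The function below is a standalone sketch over a vector of integer token types, not Puma code; the explicit position parameter is an artifact of the model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance `pos` until a token of type `stop_token` is found.
// If `inclusive` is true the stop token itself is skipped as well.
// Returns false if the end of input is reached without finding it.
bool skip(const std::vector<int>& toks, std::size_t& pos,
          int stop_token, bool inclusive = true) {
  while (pos < toks.size()) {
    if (toks[pos] == stop_token) {
      if (inclusive) ++pos;     // step over the stop token too
      return true;
    }
    ++pos;
  }
  return false;
}
```

On the input 5 6 7 8 with stop token 7, the inclusive form leaves the position at 8, while the non-inclusive form leaves it at 7 so that a following rule can still parse the stop token.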

bool Puma::Syntax::skip ( int *  stop_tokens,
bool  inclusive = true 
) [protected]

Skip all tokens until a token with one of the given types is read.

Parameters:
stop_tokens The types of the tokens at which to stop.
inclusive If true, the stop token is skipped too.
Returns:
False if the stop token is not found, true otherwise.

bool Puma::Syntax::is_in ( int  token_type,
int *  token_types 
) const [protected]

Check if the given token type is in the set of given token types.

Parameters:
token_type The token type to check.
token_types The set of token types.
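
Since the set is passed as a bare int pointer without a length, some end marker is needed. The sketch below assumes the array is terminated by 0; that convention is an assumption made for this illustration and is not stated in the signature above.

```cpp
#include <cassert>

// Membership test on a set of token types.
// Assumption: `token_types` points to an int array terminated by 0,
// so 0 itself can never be a member of the set.
bool is_in(int token_type, const int* token_types) {
  assert(token_types != nullptr);
  for (; *token_types != 0; ++token_types) {
    if (*token_types == token_type) return true;
  }
  return false;
}
```

The same kind of array would be what the overloads taking int * (such as parse, look_ahead, list, and skip) receive as their set of token types, under the same assumed terminator.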


Member Data Documentation

TokenProvider* Puma::Syntax::token_provider

Token provider for getting the tokens to parse.




Puma Reference Manual. Created on 5 Nov 2008.