Patterns
The patterns in the input are written using an extended set of regular expressions. These are:
Note that inside of a character class, all regular expression operators lose their special meaning except escape (\) and the character class operators, -, ], and, at the beginning of the class, ^.
The regular expressions listed above are grouped according to precedence, from highest precedence at the top to lowest at the bottom. Those grouped together have equal precedence. For example,
foo|bar*
is the same as:
(foo)|(ba(r*))
since the * operator has higher precedence than concatenation, and concatenation higher than alternation (|). This pattern therefore matches either the string foo or the string ba followed by zero-or-more r's. To match foo or zero-or-more bar's, use:
foo|(bar)*
and to match zero-or-more foo's-or-bar's:
(foo|bar)*
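The grouping rules above can be checked with a small experiment. The sketch below uses Python's `re` module as a stand-in, on the assumption that `*`, concatenation and `|` have the same relative precedence there as in the patterns described here; it is not the scanner generator itself. The helper `accepted` and the list `SAMPLES` are names made up for this illustration.

```python
import re

# Illustration only: Python's re module, not the scanner generator.
# SAMPLES and accepted() are hypothetical names for this sketch.
SAMPLES = ["foo", "ba", "barrr", "barbar", "foobar", ""]


def accepted(pattern, samples=SAMPLES):
    """Return the samples that the pattern matches in their entirety."""
    regex = re.compile(pattern)
    return [s for s in samples if regex.fullmatch(s) is not None]


if __name__ == "__main__":
    for pattern in ["foo|bar*", "(foo)|(ba(r*))", "foo|(bar)*", "(foo|bar)*"]:
        print(f"{pattern!r:18} accepts {accepted(pattern)}")
```

The first two patterns accept the same samples, which is the equivalence stated above. Only the parenthesized forms accept repeated bar's, and only the last one accepts a mix of foo's and bar's.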
A negated character class such as the example [^A-Z] above will match a newline unless \n (or an equivalent escape sequence) is one of the characters explicitly present in the negated character class (e.g., [^A-Z\n]). This is unlike how many other regular expression tools treat negated character classes, but unfortunately the inconsistency is historically entrenched. Matching newlines means that a pattern like [^"]* can match the entire input unless there's another quote in the input.
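The effect of a newline-matching negated class can be seen in a short experiment. Python's `re` module happens to let a negated class match a newline as well, so it is used here purely as an illustration under that assumption; the helper `first_match` and the sample text `SOURCE` are hypothetical.

```python
import re

# Illustration only: Python's re module, whose negated classes also
# match newlines. SOURCE and first_match() are hypothetical names.
SOURCE = 'a = "unterminated\nb = 1\nc = "ok"\n'


def first_match(pattern, text=SOURCE):
    """Return the text of the first match of pattern, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None


# Without \n in the class, the match runs across lines up to the next quote.
across_lines = first_match(r'"[^"]*')

# With \n in the class, the match stops at the end of the first line.
single_line = first_match(r'"[^"\n]*')

if __name__ == "__main__":
    print(repr(across_lines))
    print(repr(single_line))
```

Adding \n to the negated class is the usual way to keep such a pattern from running past the end of the line.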
A rule can have at most one instance of trailing context (the / operator or the $ operator). Start conditions, ^, and <<EOF>> patterns can only occur at the beginning of a pattern and, like / and $, cannot be grouped inside parentheses. A ^ which does not occur at the beginning of a rule or a $ which does not occur at the end of a rule loses its special properties and is treated as a normal character.
The following are illegal:
foo/bar$
<sc1>foo<sc2>bar
Note that the first of these can be written as foo/bar\n. The following will result in $ or ^ being treated as a normal character:
foo|(bar$)
foo|^bar
If what's wanted is a foo or a bar-followed-by-a-newline, the following could be used (the special | action is explained in the Actions section):
foo    |
bar$   -- action goes here
A similar trick will work for matching a foo or a bar-at-the-beginning-of-a-line.
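To make the intent of these two-rule idioms concrete, the sketch below approximates them with Python's `re` module. This is an assumption-laden analogy, not the scanner generator's behaviour: the lookahead `(?=\n)` stands in for the $ trailing context, `^` with `re.MULTILINE` stands in for the beginning-of-line anchor, and Python (unlike the patterns described here) allows both inside an alternation. The names `FOO_OR_BAR_EOL`, `FOO_OR_BAR_BOL` and `find_all` are hypothetical.

```python
import re

# Illustration only: Python's re module, not the scanner generator.
# A foo, or a bar that is followed by a newline (the newline is not consumed).
FOO_OR_BAR_EOL = re.compile(r"foo|bar(?=\n)")

# A foo, or a bar at the beginning of a line.
FOO_OR_BAR_BOL = re.compile(r"foo|^bar", re.MULTILINE)

TEXT = "bar foo bar\nbar baz\n"


def find_all(pattern, text=TEXT):
    """Return (offset, matched text) for every match of a compiled pattern."""
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(find_all(FOO_OR_BAR_EOL))
    print(find_all(FOO_OR_BAR_BOL))
```

In both cases the matched text is just foo or bar; the newline or line start only constrains where the match may occur, which mirrors how trailing context and ^ are described above.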
Copyright © 1998, Eric Bezault
mailto:ericb@gobosoft.com
http://www.gobosoft.com
Last Updated: 4 August 1998