[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
16. Compatibility with other versions of m4

This chapter describes many of the differences between this
implementation of m4 and other implementations found under UNIX,
such as System V Release 3, Solaris, and BSD flavors. In particular,
it lists the known differences from, and extensions to, POSIX.
However, the list is not necessarily comprehensive.
At the time of this writing, POSIX 2001 (also known as IEEE
Std 1003.1-2001) is the latest standard, although a new version of
POSIX is under development and includes several proposals for
modifying what m4 is required to do. The requirements for m4 are
shared between SUSv3 and POSIX, and can be viewed at
http://www.opengroup.org/onlinepubs/000095399/utilities/m4.html.
16.1 Extensions in GNU M4
16.2 Facilities in System V m4 not in GNU m4
16.3 Other incompatibilities
16.1 Extensions in GNU M4
This version of m4 contains a few facilities that do not exist in
System V m4. These extra facilities are all suppressed by using the
‘-G’ command line option (see section Invoking m4), unless overridden
by other command line options.
In the $n notation for macro arguments, n can contain several digits,
while System V m4 accepts only one digit. This allows macros in
GNU m4 to take any number of arguments, and not only nine (see section
Arguments to macros).
This means that define(`foo', `$11') is ambiguous between
implementations. To portably choose between grabbing the first
parameter and appending 1 to the expansion, or grabbing the eleventh
parameter, you can do the following:

    define(`a1', `A1')
    ⇒
    dnl First argument, concatenated with 1
    define(`_1', `$1')define(`first1', `_1($@)1')
    ⇒
    dnl Eleventh argument, portable
    define(`_9', `$9')define(`eleventh', `_9(shift(shift($@)))')
    ⇒
    dnl Eleventh argument, GNU style
    define(`Eleventh', `$11')
    ⇒
    first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒A1
    eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒k
    Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒k

Also see the argn macro (see section Recursion in m4).
The divert (see section Diverting output) macro can manage more than 9
diversions. GNU m4 treats all positive numbers as valid diversions,
rather than discarding diversions greater than 9.
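As a minimal sketch of this extension, the following input parks text in diversion 12 and brings it back. It assumes GNU m4 running without ‘-G’; the diversion number and the sample text are arbitrary choices for illustration. As elsewhere in this manual, lines starting with ⇒ show the expected output.

```m4
divert(`12')dnl
This text is parked in diversion 12.
divert`'dnl
undivert(`12')dnl
⇒This text is parked in diversion 12.
```

An implementation limited to diversions 1 through 9 would be expected to discard the parked text instead.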
Files included with include and sinclude are sought in a user
specified search path, if they are not found in the working
directory. The search path is specified by the ‘-I’ option and the
M4PATH environment variable (see section Searching for include files).

Arguments to undivert can be non-numeric, in which case the named
file will be included uninterpreted in the output (see section
Undiverting output).
Formatted output is supported through the format builtin, which is
modeled after the C library function printf (see section Formatting
strings (printf-like)).
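A few calls give a feel for the printf-like behavior. This is only a sketch with made-up arguments, assuming GNU m4; other implementations may not provide format at all.

```m4
format(`%05d', `42')
⇒00042
format(`%s has %d items', `list', `3')
⇒list has 3 items
format(`%x', `255')
⇒ff
```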
Searches and text substitution through basic regular expressions are
supported by the regexp (see section Searching for regular
expressions) and patsubst (see section Substituting text by regular
expression) builtins. Some BSD implementations use extended regular
expressions instead.
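The sketch below illustrates both builtins under GNU m4. Note the basic-syntax grouping with ‘\(’ and ‘\)’ in the last call; an implementation using extended regular expressions would be expected to read that pattern differently. The sample strings are illustrative only.

```m4
regexp(`GNUs not Unix', `\<[a-z]\w+')
⇒5
patsubst(`GNUs not Unix', `\w+', `(\&)')
⇒(GNUs) (not) (Unix)
patsubst(`hello world', `\(o\)', `[\1]')
⇒hell[o] w[o]rld
```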
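The output of shell commands can be read into m4 with esyscmd (see
section Reading the output of commands).

There is indirect access to any builtin macro with builtin (see
section Indirect call of builtins).

Macros can be called indirectly through indir (see section Indirect
call of macros).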
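The following sketch shows the two indirect-call builtins side by side, assuming GNU m4. The macro name greet is hypothetical, and len is shadowed only to show that builtin still reaches the original definition.

```m4
define(`greet', `Hello, $1!')
⇒
indir(`greet', `world')
⇒Hello, world!
pushdef(`len', `shadowed')
⇒
len(`world')
⇒shadowed
builtin(`len', `world')
⇒5
```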
The name of the program, the current input file, and the current input
line number are accessible through the builtins __program__, __file__,
and __line__ (see section Printing current location).

The format of the output from dumpdef and macro tracing can be
controlled with debugmode (see section Controlling debugging output).

The destination of trace and debug output can be controlled with
debugfile (see section Saving debugging output).

The maketemp (see section Making temporary files) macro behaves like
mkstemp, creating a new file with a unique name on every invocation,
rather than following the insecure behavior of replacing the trailing
‘X’ characters with the m4 process id.
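POSIX only requires support for the command line options ‘-s’, ‘-D’,
and ‘-U’, so all other options accepted by GNU M4 are extensions.
See section Invoking m4, for a description of these options.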
The debugging and tracing facilities in GNU m4 are much more
extensive than in most other versions of m4.
16.2 Facilities in System V m4 not in GNU m4
The version of m4 from System V contains a few facilities that have
not been implemented in GNU m4 yet. Additionally, POSIX requires some
behaviors that GNU m4 has not implemented yet. Relying on these
behaviors is non-portable, as a future release of GNU m4 may change.
POSIX requires support for multiple arguments to defn, without any
clarification on how defn behaves when one of the multiple arguments
names a builtin. System V m4 and some other implementations allow
mixing builtins and text macros into a single macro. GNU m4 only
supports joining multiple text arguments, although a future
implementation may lift this restriction to behave more like
System V. The only portable way to join text macros with builtins is
via helper macros and implicit concatenation of macro results.
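POSIX requires an application to exit with non-zero status if it wrote
an error message to stderr. This has not yet been consistently
implemented for the various builtins that are required to issue an
error (such as eval (see section Evaluating integer expressions) when
an argument cannot be parsed).

Some traditional implementations only allow reading standard input
once, but GNU m4 correctly handles multiple instances of ‘-’ on the
command line.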
POSIX requires m4wrap (see section Saving text until end of input) to
act in FIFO (first-in, first-out) order, but GNU m4 currently uses
LIFO (last-in, first-out) order. Furthermore, POSIX states that only
the first argument to m4wrap is saved for later evaluation, but
GNU m4 saves and processes all arguments, with output separated by
spaces.
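The ordering difference can be seen with two wrapped strings. This sketch assumes a GNU m4 release that still uses LIFO order, as described above; a FIFO implementation would be expected to print the two lines in the opposite order. ^D marks end of input.

```m4
m4wrap(`first
')m4wrap(`second
')dnl
^D
⇒second
⇒first
```

Input that must behave identically everywhere can avoid the issue by calling m4wrap only once, with a single argument.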
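POSIX states that builtins that require arguments, but are called
without arguments, have undefined behavior. Traditional
implementations simply behave as though empty strings had been
passed. For example, a`'define`'b would expand to ab. But GNU m4
ignores certain builtins if they have missing arguments, giving
adefineb for the above example.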
Traditional implementations handle define(`f',`1') (see section
Defining a macro) by undefining the entire stack of previous
definitions, as if doing undefine(`f') first. GNU m4 replaces just
the top definition on the stack, as if doing popdef(`f') followed by
pushdef(`f',`1'). POSIX allows either behavior.
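The difference only becomes visible once the definition stack is popped. The sketch below shows the GNU m4 result; the definitions `old' and `newer' are placeholders. An implementation that discards the whole stack would be expected to leave f undefined after the popdef, so the final line would print f itself.

```m4
define(`f', `old')
⇒
pushdef(`f', `newer')
⇒
define(`f', `1')
⇒
f
⇒1
popdef(`f')
⇒
f
⇒old
```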
POSIX 2001 requires syscmd (see section Executing simple commands) to
evaluate command output for macro expansion, but this was a mistake
that is anticipated to be corrected in the next version of POSIX.
GNU m4 follows traditional behavior in syscmd, where output is not
rescanned, and provides the extension esyscmd that does scan the
output.
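A short comparison makes the rescanning difference concrete. This sketch assumes GNU m4 and a shell that provides echo; the macro foo is defined only for illustration.

```m4
define(`foo', `FOO')
⇒
syscmd(`echo foo')
⇒foo
⇒
esyscmd(`echo foo')
⇒FOO
⇒
```

The output of syscmd goes straight to standard output, while the output of esyscmd becomes the expansion and is therefore scanned for macros.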
At one point, POSIX required changequote(arg) (see section Changing
the quote characters) to use newline as the close quote, but this was
a bug, and the next version of POSIX is anticipated to state that
using empty strings or just one argument is unspecified. Meanwhile,
the GNU m4 behavior of treating an empty end-quote delimiter as ‘'’ is
not portable, as Solaris treats it as repeating the start-quote
delimiter, and BSD treats it as leaving the previous end-quote
delimiter unchanged. For predictable results, never call changequote
with just one argument, or with empty strings for arguments.

At one point, POSIX required changecom(arg,) (see section Changing the
comment delimiters) to make it impossible to end a comment, but this
is a bug, and the next version of POSIX is anticipated to state that
using empty strings is unspecified. Meanwhile, the GNU m4 behavior of
treating an empty end-comment delimiter as newline is not portable, as
BSD treats it as leaving the previous end-comment delimiter unchanged.
It is also impossible in BSD implementations to disable comments, even
though that is required by POSIX. For predictable results, never call
changecom with empty strings for arguments.
Most implementations of m4 give macros a higher precedence than
comments when parsing, meaning that if the start delimiter given to
changecom (see section Changing the comment delimiters) starts with a
macro name, comments are effectively disabled. POSIX does not specify
what the precedence is, so the parser in this version of GNU m4
recognizes comments, then macros, then quoted strings.
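Traditional implementations allow argument collection, but not string
and comment processing, to span file boundaries. Thus, if ‘a.m4’
contains ‘len(’, and ‘b.m4’ contains ‘abc)’, m4 a.m4 b.m4 outputs ‘3’
with traditional m4, but gives an error message that the end of file
was encountered inside a macro with GNU m4. On the other hand,
traditional implementations do end of file processing for files
included with include or sinclude (see section Including named files),
while GNU m4 seamlessly integrates the content of those files. Thus
include(`a.m4')include(`b.m4') will output ‘3’ instead of giving an
error.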
Traditional m4 treats traceon (see section Tracing macro calls)
without arguments as a global variable, independent of named macro
tracing. Also, once a macro is undefined, named tracing of that macro
is lost. On the other hand, when GNU m4 encounters traceon without
arguments, it turns tracing on for all existing definitions at the
time, but does not trace future definitions; traceoff without
arguments turns tracing off for all definitions regardless of whether
they were also traced by name; and tracing by name, such as with
‘-tfoo’ at the command line or traceon(`foo') in the input, is an
attribute that is preserved even if the macro is currently undefined.

Additionally, while POSIX requires trace output, it makes no demands
on the formatting of that output. Parsing trace output is not
guaranteed to be reliable, even between different releases of GNU M4;
however, the intent is that any future changes in trace output will
only occur under the direction of additional debugmode flags (see
section Controlling debugging output).
POSIX requires eval (see section Evaluating integer expressions) to
treat all operators with the same precedence as C. However, earlier
versions of GNU m4 followed the traditional behavior of other m4
implementations, where bitwise and logical negation (‘~’ and ‘!’) have
lower precedence than equality operators, and where equality
operators (‘==’ and ‘!=’) have the same precedence as relational
operators (such as ‘<’). Use explicit parentheses to ensure proper
precedence. As extensions to POSIX, GNU m4 gives well-defined
semantics to operations that C leaves undefined, such as when
overflow occurs, when shifting negative numbers, or when performing
division by zero. POSIX also requires ‘=’ to cause an error, but many
traditional implementations allowed it as an alias for ‘==’.
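The sketch below shows why explicit parentheses matter: each pair spells out the two possible groupings of ‘~0 == -1’ and ‘1 == 2 > 0’, and the groupings give different results. The expressions are illustrative; with the parentheses written out, any conforming eval should agree on the values.

```m4
eval(`(~0) == -1')
⇒1
eval(`~(0 == -1)')
⇒-1
eval(`1 == (2 > 0)')
⇒1
eval(`(1 == 2) > 0')
⇒0
```

In each pair the first form is the grouping C precedence would choose, and the second is the grouping the traditional precedence described above would choose.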
POSIX 2001 requires translit (see section Translating characters) to
treat each character of the second and third arguments literally.
However, it is anticipated that the next version of POSIX will allow
the GNU m4 behavior of treating ‘-’ as a range operator.
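As a brief illustration of the range behavior in GNU m4 (an implementation that treats ‘-’ literally would be expected to give different results):

```m4
translit(`GNUs not Unix', `A-Z')
⇒s not nix
translit(`GNUs not Unix', `a-z', `A-Z')
⇒GNUS NOT UNIX
```

Spelling out every character of both sets avoids relying on range support.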
POSIX requires m4 to honor the locale environment variables of LANG,
LC_ALL, LC_CTYPE, LC_MESSAGES, and NLSPATH, but this has not yet been
implemented in GNU m4.
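POSIX states that only unquoted leading newlines and blanks (that is,
space and tab) are ignored when collecting macro arguments. However,
this appears to be a bug in POSIX, since most traditional
implementations also ignore all whitespace (formfeed, carriage
return, and vertical tab). GNU m4 follows tradition and ignores all
leading unquoted whitespace.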
A strictly-compliant POSIX client is not allowed to use command-line
arguments not specified by POSIX. However, since this version of M4
ignores POSIXLY_CORRECT and enables the option --gnu by default (see
section Invoking m4), a client desiring to be strictly compliant has
no way to disable GNU extensions that conflict with POSIX when
directly invoking the compiled m4. A future version of GNU M4 will
honor the environment variable POSIXLY_CORRECT, implicitly enabling
‘--traditional’ if it is set, in order to allow a strictly-compliant
client. In the meantime, a client needing strict POSIX compliance can
use the workaround of invoking a shell script wrapper, where the
wrapper then adds ‘--traditional’ to the arguments passed to the
compiled m4.
16.3 Other incompatibilities
There are a few other incompatibilities between this implementation
of m4 and the System V version.
GNU m4 implements sync lines differently from System V m4 when text
is being diverted. GNU m4 outputs the sync lines when the text is
being diverted, and System V m4 when the diverted text is being
brought back.

The problem is which lines and file names should be attached to text
that is being, or has been, diverted. System V m4 regards all the
diverted text as being generated by the source line containing the
undivert call, whereas GNU m4 regards the diverted text as being
generated at the time it is diverted.

The sync line option is used mostly when using m4 as a front end to a
compiler. If a diverted line causes a compiler error, the error
messages should most probably refer to the place where the diversion
was made, and not where it was inserted again.

    divert(2)2
    divert(1)1
    divert`'0
    ⇒#line 3 "stdin"
    ⇒0
    ^D
    ⇒#line 2 "stdin"
    ⇒1
    ⇒#line 1 "stdin"
    ⇒2

The current m4 implementation has a limitation that the syncline
output at the start of each diversion occurs no matter what, even if
the previous diversion did not end with a newline. This goes contrary
to the claim that synclines appear on a line by themselves, so this
limitation may be corrected in a future version of m4. In the
meantime, when using ‘-s’, it is wisest to make sure all diversions
end with newline.
GNU m4 makes no attempt at prohibiting self-referential definitions
like:

    define(`x', `x')
    ⇒
    define(`x', `x ')
    ⇒

There is nothing inherently wrong with defining ‘x’ to return ‘x’.
The wrong thing is to expand ‘x’ unquoted, because that would cause an
infinite rescan loop.

In m4, one might use macros to hold strings, as we do for variables in
other programming languages, further checking them with:

    ifelse(defn(`holder'), `value', …)

In cases like this one, an interdiction for a macro to hold its own
name would be a useless limitation. Of course, this leaves more rope
for GNU m4 users to hang themselves! Rescanning hangs may be avoided
through careful programming, a little like for endless loops in
traditional programming languages.
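As a small sketch of the safe pattern, a macro that holds its own name can still be inspected through defn, because the quoted definition is compared without being expanded. The macro name holder follows the fragment above; the two result strings are made up for illustration.

```m4
define(`holder', `holder')
⇒
ifelse(defn(`holder'), `holder', `holds its own name', `holds something else')
⇒holds its own name
```

Writing holder unquoted at this point would start the infinite rescan described above.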
This document was generated by root on March 13, 2013 using texi2html 1.82.