8  Generating HTML constructs

Since the output language of HEVEA is HTML, it is natural for users to insert hypertext constructs into their documents, or to control colors.

8.1  High-Level Commands

HEVEA provides high-level commands for doing this. Users are advised to use these macros first, because it is easy to write incorrect HTML and because writing HTML directly may interfere in nasty ways with HEVEA internals.

8.1.1  Commands for Hyperlinks

A few commands for hyperlink management and included images are provided. All these commands have appropriate equivalents defined by the hevea package (see section 5.2). Hence, a document that relies on these high-level commands can still be typeset by LATEX, provided it loads the hevea package.

Macro    HEVEA    LATEX
\ahref{url}{text}    make text a hyperlink to url    echo text
\footahref{url}{text}    make text a hyperlink to url    make url a footnote to text, url is shown in typewriter font
\ahrefurl{url}    make url a hyperlink to url    typeset url in typewriter font
\ahrefloc{label}{text}    make text a hyperlink to label inside the document    echo text
\aname{label}{text}    make text a hyperlink target with label label    echo text
\mailto{address}    make address a “mailto” link to address    typeset address in typewriter font
\imgsrc[attr]{url}    insert url as an image, attr are attributes in the HTML sense    do nothing
\home{text}    produce a home-dir url both for output and links, output aspect is: “~text”
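
As an illustration of the table above, here is a small sketch that combines several of these commands. The label contact, the address someone@example.org and the url http://www.example.org/ are placeholders chosen for this example, not names taken from HEVEA:

\aname{contact}{Contact}: questions can be sent to \mailto{someone@example.org}.

% Elsewhere in the same document:
See the \ahrefloc{contact}{contact paragraph},
or visit \ahrefurl{http://www.example.org/}.

Under HEVEA the words “contact paragraph” become a link to the target defined by \aname. Under LATEX, with the hevea package loaded, the same source should simply echo the text and typeset the address and the url in typewriter font, as described in the LATEX column.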

It is important to notice that all arguments are processed. For instance, to insert a link to my home page (http://pauillac.inria.fr/~maranget/index.html), you should do something like this:

\ahref{http://pauillac.inria.fr/\home{maranget}/index.html}{his home page}

Given the frequency of ~, # etc. in urls, this is annoying. Moreover, the immediate solution, using \verb, \ahref{\verb" ... /~maranget/..."}{his home page} does not work, since LATEX forbids verbatim formatting inside command arguments.

Fortunately, the url package provides a very convenient \url command that acts like \verb and can appear in other command arguments (unfortunately, this is not the full story, see section B.17.10). Hence, provided the url package is loaded, a more convenient reformulation of the example above is:

\ahref{\url{http://pauillac.inria.fr/~maranget/index.html}}{his home page}

Or even better:

\urldef{\lucpage}{\url}{http://pauillac.inria.fr/~maranget/index.html}
\ahref{\lucpage}{his home page}

It may seem complicated, but this is a safe way to have a document processed both by LATEX and HEVEA. Drawing a line between url typesetting and hyperlinks is correct, because users may sometimes want urls to be processed and at other times not. Moreover, HEVEA (optionally) depends on only one third-party package: url, which is as correct as it can be and well written.

In case the \url command is undefined at the time \begin{document} is processed, the commands \url, \oneurl and \footurl are defined as synonyms for \ahref, \ahrefurl and \footahref, thereby ensuring some compatibility with older versions of HEVEA. Note that this usage of \url is deprecated.

8.1.2  HTML style colors

Specifying colors both for LATEX and HEVEA should be done using the color package (see section B.14.2). However, one can also specify text color using special type style declarations. The hevea.sty style file defines no equivalent for these declarations, which therefore are for HEVEA consumption only.

Those declarations follow HTML conventions for colors. There are sixteen predefined colors:

\black, \silver, \gray, \white, \maroon, \red, \fuchsia, \purple, \green, \lime, \olive, \yellow, \navy, \blue, \teal, \aqua

Additionally, the current text color can be changed by the declaration \htmlcolor{number}, where number is a six-digit hexadecimal number specifying a color in the RGB space. For instance, the declaration \htmlcolor{404040} changes font color to dark gray.
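
As a minimal sketch, the color declarations can be confined to a group, like other type style declarations. Since hevea.sty defines no LATEX equivalent, the example is wrapped in the htmlonly environment, on the assumption that the document loads the hevea package so that LATEX skips this part:

\begin{htmlonly}
{\red Warning:} this first word is shown in red.

{\htmlcolor{404040}This sentence is shown in dark gray.}
\end{htmlonly}

Each closing brace ends the scope of the corresponding declaration, so the text that follows returns to the previous color.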

8.2  More on included images

The \imgsrc command becomes handy when one has images both in Postscript and GIF format. As explained in section 6.3, Postscript images can be included in LATEX documents by using the \epsfbox command from the epsf package. For instance, if screenshot.ps is an encapsulated Postscript file, then a doc.tex document can include it by:

\epsfbox{screenshot.ps}

We may very well also have a GIF version of the screenshot image (or be able to produce one easily using image converting tools). Let us store it in a screenshot.ps.gif file. Then, for HEVEA to include a link to the GIF image in its output, it suffices to define the \epsfbox command in the macros.hva file as follows:

\newcommand{\epsfbox}[1]{\imgsrc{#1.gif}}

Then HEVEA has to be run as:

# hevea macros.hva doc.tex

Since it has its own definition of \epsfbox, HEVEA will silently include a link to the GIF image and not to the Postscript image.

If another naming scheme for image files is preferred, there are alternatives. For instance, assume that Postscript files are of the kind name.ps, while GIF files are of the kind name.gif. Then, images can be included using \includeimage{name}, where \includeimage is a specific user-defined command:

\newcommand{\includeimage}[1]{\ifhevea\imgsrc{#1.gif}\else\epsfbox{#1.ps}\fi}

Note that this method uses the hevea boolean register (see section 5.2.3). If one does not wish to load the hevea.sty file, one can adopt the slightly more verbose definition:

\newcommand{\includeimage}[1]{%
%HEVEA\imgsrc{#1.gif}%
%BEGIN LATEX
\epsfbox{#1.ps}
%END LATEX
}
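
For reference, here is a sketch of a complete document built around the first definition above, the one based on \ifhevea. It assumes that the files screenshot.ps and screenshot.gif both exist next to the source file:

\documentclass{article}
\usepackage{hevea} % defines the hevea boolean (\ifhevea) for LATEX
\usepackage{epsf}  % provides \epsfbox for LATEX
% GIF image for HEVEA, Postscript image for LATEX
\newcommand{\includeimage}[1]{\ifhevea\imgsrc{#1.gif}\else\epsfbox{#1.ps}\fi}
\begin{document}
Here is the screenshot:

\includeimage{screenshot}
\end{document}

The same source can then be given to both LATEX and HEVEA, with no additional macro file on the HEVEA command line.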

When the Postscript file has been produced by translating a bitmap file, this simple method of making a GIF image and using the \imgsrc command is the most adequate. It should be preferred over using the more automated image file mechanism (see section 6), which will translate the image back from Postscript to bitmap format and will thus degrade it.

8.3  Internal macros

In this section, a few of the internal macros of HEVEA are described. Internal macros occur at the final expansion stage of HEVEA and invoke Objective Caml code.

Normally, user source code should not use them, since their behavior may change from one version of HEVEA to another and because using them incorrectly easily crashes HEVEA. However:

The general principle of HEVEA is that LATEX environments \begin{env} ... \end{env} get translated into HTML block-level elements <block attributes> ... </block>. More specifically, such block-level elements are opened by the internal macro \@open and closed by the internal macro \@close. As a special case, LATEX groups { ... } get translated into HTML groups, which are shadow block-level elements with neither opening nor closing tag. In the following few paragraphs, we sketch the interaction of \@open and \@close with paragraphs and displays. Doing so, we intend to warn users about the complexity of the task of producing correct HTML, and to encourage them to use internal macros, which, most of the time, take nasty details into account.

Paragraphs are rendered by P elements, which are opened and closed automatically. More specifically, a first P is opened after \begin{document}, then paragraph breaks close the active P and open a new one. The final \end{document} closes the last P. In all cases, paragraphs consisting only of space characters are discarded silently.

Following HTML “normative reference [HTML-4.0]”, block-level elements cannot occur inside P; more precisely, block-level opening tags implicitly close any active P. As a consequence, HEVEA closes the active P element when it processes \@open and opens a new P when it processes the matching \@close. Generally, no P element is opened by default inside block-level elements, that is, HEVEA does not immediately open P after having processed \@open. However, if a paragraph break occurs later, then a new P element is opened, and will be closed automatically when the current block is closed. Thus, the first “paragraph” inside block-level elements that include several paragraphs is not a P element. That alone probably prevents the consistent styling of paragraphs with style sheets.

Groups behave differently: opening or closing them neither closes nor opens P elements. However, processing paragraph breaks inside groups involves temporarily closing all groups up to the nearest enclosing P, closing it, opening a new P and finally re-opening all groups. Opening a block-level element inside a group similarly involves closing the active P and opening a new P when the matching \@close is processed.

Finally, display mode (as introduced by $$) is also complicated. Displays are basically TABLE elements with one row (TR), and HEVEA manages to introduce table cells (TD) where appropriate. Processing \@open inside a display means closing the current cell, starting a new cell, opening the specified block, and then immediately opening a new display. Processing the matching \@close closes the internal display, then the specified block, then the cell and finally opens a new cell. On many occasions (in particular for groups), either the cell break or the internal display may get cancelled.

It is important to notice that primitive arguments are processed (except for the \@print primitive, and for some of the basic style primitives). Thus, some characters cannot be given directly (e.g. # and % must be given as \# and \%).

\@print{text}
Echo text verbatim. As a consequence, use only ASCII in text.
\@getprint{text}
Process text using a special output mode that strips off HTML tags. This macro is the one to use for processed attributes of HTML tags.
\@hr[attr]{width}{height}
Output an HTML horizontal rule, attr are attributes given directly (e.g. SIZE=3 NOSHADE), while width and height are length arguments given in the LATEX style (e.g. 2pt or .5\linewidth).
\@print@u{n}
Output the (Unicode) character “n”, which can be given either as a decimal number or a hexadecimal number prefixed by “X”.
\@open{BLOCK}{attributes}
Open HTML block-level element BLOCK with attributes attributes. The block name BLOCK must be uppercase. As a special case BLOCK may be the empty string, then an HTML group is opened.
\@close{BLOCK}
Close HTML block-level element BLOCK. Note that \@open and \@close must be properly balanced.
\@out@par{arg}
If occurring inside a P element, that is, if a <P> opening tag is active, \@out@par first closes it (by emitting </P>), then formats arg, and then re-opens a P element. Otherwise \@out@par simply formats arg. This command is adequate when formatting arg produces block-level elements. Besides, text-level elements should be managed properly (see below).

Text-level elements are managed differently. They are not seen as blocks that must be closed explicitly and they are replaced by the internal text-level declarations \@style (and \@styleattr), \@fontsize and \@fontcolor. Block-level elements (and HTML groups) delimit the effect of such declarations.

\@style{SHAPE}
Declare the text shape SHAPE (which must be uppercase) as active. Text shapes are known as font style elements (I, TT, etc.) or phrase elements (EM, etc.) in the HTML terminology; they are part of the more general class of text-level elements.

The text-level element SHAPE will get opened as soon as necessary and closed automatically, when the enclosing block-level elements get closed. Enclosed block-level elements are treated properly by closing SHAPE before them, and re-opening SHAPE inside them. The next text-level constructs exhibit similar behavior with respect to block-level elements.

\@styleattr{NAME}{attr}
Declare the text-level element NAME with attribute attr active. This primitive behaves as \@style, except that the opening tag has attributes. This primitive may prove useful for introducing SPAN elements. Note that both arguments are processed.
\@span{attr}
A shorthand for \@styleattr{SPAN}{attr}.
\@fontsize{int}
Declare the text-level element FONT with attribute SIZE=int as active. Note that int must be a small integer in the range 1, 2, …, 7.
\@fontcolor{color}
Declare the text-level element FONT with attribute COLOR=color as active. Note that color must be a color attribute value in the HTML style. That is, either one of the sixteen conventional colors black, silver, etc., or an RGB hexadecimal color specification of the form "#XXXXXX" (yes, quotes are needed). Note that the argument color is processed; as a consequence, numerical color arguments should be given as "\#XXXXXX".
\@nostyle
Close active text-level declarations and ignore further text-level declarations. The effect stops when the enclosing block-level element is closed.
\@clearstyle
Simply close active text-level declarations.
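
The following sketch shows how the text-level declarations above might be used in a macro file read by HEVEA, in the same style as the other definitions of this chapter. The command names \kbd and \darkgray are hypothetical and chosen for this example only; KBD is a standard HTML phrase element:

% A command with an argument: the inner group delimits the effect of \@style
\newcommand{\kbd}[1]{{\@style{KBD}#1}}
% A declaration: the color is given as "\#XXXXXX" because the argument is processed
\newcommand{\darkgray}{\@fontcolor{"\#404040"}}

With these definitions, \kbd{Ctrl-C} would render its argument inside a KBD element, while {\darkgray some text} would change the color up to the end of the enclosing group.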

8.4  The rawhtml environment

Any text enclosed between \begin{rawhtml} and \end{rawhtml} is echoed verbatim into the HTML output file. Similarly, \rawhtmlinput{file} echoes the contents of the file named file. In fact, rawhtml is the environment counterpart of the \@print command, but experience showed it to be much more error-prone.

When HEVEA was less sophisticated than it is now, rawhtml was quite convenient. But, as time went by, numerous pitfalls around rawhtml showed up. Here are a few:

In conclusion, do not use the rawhtml environment! A much safer option is to use the htmlonly environment and to write LATEX code. For instance, in place of writing:

\begin{rawhtml}
A list of links:
<UL>
<LI><A HREF="http://www.apple.com/">Apple</A>.
<LI><A HREF="http://www.sun.com/">Sun</A>.
</UL>
\end{rawhtml}

One can write:

\begin{htmlonly}
A list of links:
\begin{itemize}
\item \ahref{http://www.apple.com/}{Apple}.
\item \ahref{http://www.sun.com/}{Sun}.
\end{itemize}
\end{htmlonly}

A list of links:

If HEVEA is targeted to text or info files (see Section 11), the text inside rawhtml environments is ignored. However, there exists a rawtext environment (and a \rawtextinput command) to echo text verbatim in text or info output mode. Additionally, the raw environment and the \rawinput command echo their contents verbatim, regardless of HEVEA output mode. Of course, when HEVEA produces HTML, the latter environment and command suffer from the same drawbacks as rawhtml.

8.5  Examples

As a first example of using internal macros, consider the following excerpt from the hevea.hva file that defines the center environment:

\newenvironment{center}{\@open{DIV}{ALIGN=center}}{\@close{DIV}}

Notice that the code above is no longer present and is given here for explanatory purpose only. Now HEVEA uses style-sheets and the actual definition of the center environment is as follows:

\newstyle{.center}{text-align:center;margin-left:auto;margin-right:auto;}%
\setenvclass{center}{center}%
\newenvironment{center}
  {\@open{DIV}{\@getprint{CLASS="\getenvclass{center}"}}}
  {\@close{DIV}}%

Basically, environments \begin{center} ... \end{center} will, by default, be translated into blocks <DIV CLASS="center"> ... </DIV>. Additionally, the style class associated to center environments is managed through an indirection, using the commands \setenvclass and \getenvclass. See section 9.3 for more explanations.

Another example is the definition of the \purple color declaration (see section 8.1.2):

\newcommand{\purple}{\@fontcolor{purple}}

HEVEA does not feature all text-level elements by default. However one can easily use them with the internal macro \@style. For instance this is how you can make all emphasized text blink:

\renewcommand{\em}{\@style{EM}\@style{BLINK}}

Then, here is the definition of a simplified \imgsrc command (see section 8.1.1), without its optional argument:

\newcommand{\imgsrc}[1]
  {\@print{<IMG SRC="}\@getprint{#1}\@print{">}}

Here, \@print and \@getprint are used to output HTML text, depending upon whether this text requires processing or not. Note that \@open{IMG}{SRC="#1"} is not correct, because the element IMG consists of a single tag, without a closing tag.

Another interesting example is the definition of the command \@doaelement, which HEVEA uses internally to output A elements.

\newcommand{\@doaelement}[2]
  {{\@nostyle\@print{<A }\@getprint{#1}\@print{>}}{#2}{\@nostyle\@print{</A>}}}

The command \@doaelement takes two arguments: the first argument contains the opening tag attributes, while the second argument is the textual content of the A element. By contrast with the \imgsrc example above, tags are emitted inside groups where styles are canceled by using the \@nostyle declaration. Such a complication is needed to avoid breaking the proper nesting of text-level elements.

Here is another example of direct block opening. The bgcolor environment from the color package locally changes background color (see section B.14.2.1). This environment is defined as follows:

\newenvironment{bgcolor}[2][CELLPADDING=10]
  {\@open{TABLE}{#1}\@open{TR}{}\@open{TD}{BGCOLOR=\@getcolor{#2}}}
  {\@close{TD}\@close{TR}\@close{TABLE}}

The bgcolor environment operates by opening an HTML table (TABLE) with only one row (TR) and cell (TD) in its opening command, and closing all these elements in its closing command. In my opinion, such a style of opening block-level elements in environment opening commands and closing them in environment closing commands is good style. The background color of the single cell is forced with a BGCOLOR attribute. Note that the mandatory argument to \begin{bgcolor} is the background color expressed as a high-level color, which therefore needs to be translated into a low-level color by using the \@getcolor internal macro from the color package. Additionally, \begin{bgcolor} takes HTML attributes as an optional argument. These attributes are the ones of the TABLE element.

If you wish to output a given Unicode character whose value you know, the recommended technique is to define an ad-hoc command that simply calls the \@print@u command. For instance, “blackboard sigma” is Unicode U+2140 (hexadecimal). Hence you can define the command \bbsigma as follows:

\newcommand{\bbsigma}{\@print@u{X2140}}

Then, “\bbsigma” will output “⅀”.
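
Since \@print@u also accepts a decimal number (see section 8.3), the same character can presumably be obtained from its decimal value. Hexadecimal 2140 is 8512 in decimal, which gives the following equivalent sketch; the command name \bbsigmadec is hypothetical and used only to keep both variants apart:

% Same character as \@print@u{X2140}, given in decimal
\newcommand{\bbsigmadec}{\@print@u{8512}}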

8.6  The document charset

According to standards, as I understand them, HTML pages are made of Unicode (ISO 10646) characters. By contrast, a file in any operating system is usually considered as being made of bytes.

To account for that fact, HTML pages usually specify a document charset that defines a translation from a flow of bytes to a flow of characters. HEVEA easily manages 8-bit encodings that specify an interpretation of every byte as a character (see footnote 8). For instance, the byte 0xA4 means Unicode 0x00A4 (¤) in the ISO-8859-1 (or latin1) encoding, and 0x20AC (€) in the ISO-8859-15 (or latin9) encoding. Notice that HEVEA has no difficulty outputting both symbols; in fact, they are defined as Unicode characters:

\newcommand{\textcurrency}{\@print@u{XA4}}
\newcommand{\texteuro}{\@print@u{X20AC}}

But the \@print@u command may output the specified character as a byte, when possible, by means of the output translator. If this is not possible, \@print@u outputs a numeric character reference (for instance &#X20AC;).

Of course, the document charset and the output translator must be synchronized. The command \@def@charset takes a charset name as argument and performs the operation of specifying the document character set and the output translator. It should occur in the document preamble. Valid charset names are ISO-8859-n where n is a number in 1…15, US-ASCII (the default), windows-n where n is 1250, 1252 or 1257, or macintosh. In case those charsets do not suffice, you may ask the author for other document charsets. Notice however that the document charset is not that important: the default US-ASCII works everywhere!
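
As a minimal sketch, a document whose source file is encoded in latin9 could select the matching charset in its preamble as follows. The %HEVEA comment prefix is assumed here so that LATEX ignores the line while HEVEA processes it:

\documentclass{article}
% Seen by HEVEA only: sets both the document charset and the output translator
%HEVEA\@def@charset{ISO-8859-15}
\begin{document}
...
\end{document}

With this setting, a character such as the euro sign can be emitted as the single byte 0xA4 rather than as a numerical character reference.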

If desired, the charset can be extracted from the current locale environment, provided this yields a valid (to HEVEA) charset name. This operation is performed by a companion script: xxcharset.exe. It thus suffices to launch HEVEA as:

# hevea -exec xxcharset.exe other arguments

Footnote 8: Provided these encodings map ASCII to ASCII.
