1 sdc -- an extensible SGML formatter

1.1 SYNOPSIS

sdc [ -o outputfile ] [ -O format ] [ -L directory ] [ -D directory ] [ -i entityname ] [ -m file ] [ -R startup file ] [ -V level ] files ...

1.2 DESCRIPTION

sdc is an extensible formatter for documents. It transforms documents using SGML markup into various target formats.

sdc comes with a couple of document type definitions (DTD's).

The DTD's feature the reuse of text, minimization of markup and readability of the SGML source. They share their elements as much as possible.

The formatting differs according to the features available in the target format and the conventions common for the type of document. This includes the automated rearrangement of text and the insertion of standard parts such as a table of contents, a sorted index and a bibliography. The bibliography, for instance, is composed from those items of a database that are referenced in the document. For some formats the output may be spread over several files. See the target type documentation for details.

In keeping with the goal of text reuse and the aim of supporting many target formats, these DTD's don't attempt to cover each and every possible case. Instead, they try to provide all elements necessary for daily use and leave the implementation of special features to extensions.

Parts of a document may also use other notations, e.g., pictures drawn with tgif, xfig or the @Fig package of Lout, or Encapsulated PostScript.

It is fairly easy to coerce sdc to parse documents with other DTD's. This, however, requires either writing formatting rules for the desired target format(s) or fitting in another parsing stage that transforms the document into a form as if it had been marked up according to a supported DTD.

The transformation (formatting) is described by files of Scheme code related to both the document type and the target format. Only commonly useful combinations are supported by default. (For instance, for letters only PostScript output is defined.)

1.3 OPTIONS

-o filename
Set the name for the output file. If omitted or set to - the output goes to the standard output. (This can cause problems with some target formats if they split the document.)

Some target formats (e.g., HTML) hard-wire the given name of the output file into the file(s) created (for cross referencing). It is therefore wise to give only a file name for the output file, not a path that includes a directory.

-O type
Set the target format. type can be:

ps
create a PostScript document.
latex
create a LaTeX file.
html
create an HTML page.
info
create an Info file.
man
create a man page.
literate
create source files (literate programming).
rtf
(only partially supported) create an RTF file.
slide
create a PostScript file holding the slides from the document.

If the -O switch is omitted a guess is made from the extension of the output file name. If neither gives a target type this is an error.

-D directory
Add directory in front of the path searched for entities (files) of the documents. Each option can add only one directory. Multiple options are processed left to right, i.e., the last directory on the command line is searched first.

-i entityname
Ensure that a definition like

<!ENTITY % entityname "INCLUDE" >

precedes the processing of the documents. This is useful for optionally including marked sections. Refer to the manual page sgmls(1) for a detailed description. This option is passed to sgmls.

-m file
Extend the list of catalog files to search for some SGML entities. Refer to the manual page sgmls(1) for a detailed description. This option is passed to sgmls.

-L dirname
Set the name of the directory to use as the library of files to search for target format descriptions.

-R file
Set a startup file to load after the default ~/.typesetrc. Multiple -R options are allowed and processed in the given order. The file argument is taken to be either a path name of the file or a name relative to the rc directory of a directory in the library (see -L).

Startup files can have their own arguments. If the argument given with a -R option contains a colon, only the part before that colon gives the name of the file to be loaded. The rest of the argument (without the colon) is assigned to the variable *-R-option-argument* while the specified file is loaded. If there is no colon in the argument, #f is assigned.

-V level
Be verbose and don't delete temporary files (for debugging). level must be a number. The default for level is 1, which gives only warnings and (for historical reasons) a message upon success. Higher values give more messages.
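
The way a -R argument is split into a file name and an optional startup argument can be pictured with the following shell sketch. It is an illustration only: sdc does this internally, the function name split_startup_arg and the sample arguments are made up, and splitting at the first colon is an assumption.

```shell
# Sketch only: split a -R argument of the form FILE[:ARGUMENT].
# Assumption: the split happens at the first colon.
split_startup_arg() {
    case "$1" in
        *:*)
            # ${1%%:*} is everything before the first colon,
            # ${1#*:} is everything after it.
            printf 'file=%s argument=%s\n' "${1%%:*}" "${1#*:}"
            ;;
        *)
            # No colon: the variable *-R-option-argument* would be #f.
            printf 'file=%s argument=#f\n' "$1"
            ;;
    esac
}

split_startup_arg no-margin      # prints: file=no-margin argument=#f
split_startup_arg myrc:draft     # prints: file=myrc argument=draft
```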

With the -R option there are additional (long) options available to change the overall behavior. These are used by supplying one or more of the following file names to the -R option.

nidx
Behave as if the NIDX token were present in the face attribute.
1c, 2c, 1s, 2s
Like nidx, these modify the effective value of the face attribute in the top-level document.
no-margin
No page margin in the (ASCII) output. Only implemented for lout processing at the moment.
HTML2
Don't use HTML-3 features in formatting.
linuxdoc
Work around some hard-wired assumptions of the linuxdoc DTD when processing.

Attention! Be careful to supply the exact name to the -R option. The same policy as for dot files applies to those files: if they don't exist they are silently not loaded! There is no warning message.

1.4 RETURN CODE

sdc returns an exit value of 0 upon successful SGML parsing and 1 if the input does not conform to the DTD. Semantic errors such as unresolved references are currently not covered.
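
In a script or Makefile the exit value can be tested directly. The following is a minimal sketch; report_status is a hypothetical helper, not part of sdc, and it simply runs whatever command it is given.

```shell
# Hypothetical helper: run the given command and translate its exit
# status into the two outcomes described above (0 = parsed,
# non-zero = failed).
report_status() {
    if "$@"; then
        echo "ok: input parsed successfully"
    else
        echo "failed: input does not conform to the DTD (or the command failed)"
    fi
}

# With sdc installed, a call might look like:
#   report_status sdc -o text.html text.sgml
```

Since semantic errors are not reflected in the exit value, a status of 0 says only that the SGML parse succeeded.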

1.5 EXAMPLES

Set the environment variable DOCPATH to a useful value like:

setenv DOCPATH $HOME/text/

Call sdc to generate an HTML page from text.sgml:

sdc -o text.html text.sgml
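
The call above omits -O, so the target format is guessed from the extension of the output file name. The lookup might be pictured as follows. This is a sketch under assumptions: the function guess_format is hypothetical, and the table of extensions is invented for illustration rather than taken from sdc.

```shell
# Hypothetical helper: guess a target format from the output file name.
# The extension-to-format table is an assumption, not sdc's actual table.
guess_format() {
    case "$1" in
        *.ps)   echo ps ;;
        *.tex)  echo latex ;;
        *.html) echo html ;;
        *.info) echo info ;;
        *.rtf)  echo rtf ;;
        *)
            # Neither -O nor the extension gives a target type: an error.
            echo "cannot guess target format for '$1'; use -O" >&2
            return 1
            ;;
    esac
}

guess_format text.html    # prints: html
```

When the extension says nothing useful, or the output goes to the standard output, the format has to be named explicitly with -O.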

1.6 ENVIRONMENT

sdc recognizes the following environment variables:

DOCPATH
This path is used to find the entities of the document. sdc extends it (at the end) to include its own files. Directories given by -D options are prepended.

Usually a good value for DOCPATH is something like $HOME or $HOME/text:$HOME/doc.

SGML_CATALOG_FILES
The files mentioned by this variable are consulted by the underlying parser to find some SGML entities. For a detailed description refer to the manual page sgmls(1). sdc extends this variable to include one file of its own, the first file named CATALOG found in the library. As with sgmls, the value can be extended by the -m option, which is simply passed to sgmls.

Usually it's good to leave this variable alone.

TYPESETLIB
This variable is used by sdc to find the directories to search for formatting translation files and for the DTD's and CATALOG files of the underlying SGML parser. It may point to one directory or to a list of directories separated by colons. This value can be overridden by the -L option.

Usually it's good to leave this variable alone, unless you want to override some but not all files of the library.
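
The entity search order that results from DOCPATH, the -D options and sdc's own files can be sketched in shell. The function effective_path and all directory names below are hypothetical; the sketch only mirrors the rules stated for DOCPATH and -D and is not how sdc is implemented.

```shell
# Sketch: build the effective entity search path.
#   $1   value of DOCPATH
#   $2   directory holding sdc's own files (hypothetical location)
#   rest -D directories in command-line order
effective_path() {
    result="$1:$2"              # sdc appends its own files at the end
    shift 2
    for d in "$@"; do
        result="$d:$result"     # each -D directory goes in front
    done
    echo "$result"
}

effective_path /home/me/text /usr/lib/sdc first second
# prints: second:first:/home/me/text:/usr/lib/sdc
```

The directory given by the last -D option ends up first in the path, so it is searched first.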

1.7 FILES

personal.data
is used by the DTD's which come with sdc to find definitions for the SGML entities related to the author. The two entities used are myself and my-Inst, which supply default values for the name and the institution of the author; the file may define more. It is therefore a good idea to set the environment variable DOCPATH so that sdc will find this file. An example of how to set up this file comes with sdc.

~/.typesetrc
if any, is loaded after startup and command line evaluation. It may contain any Scheme code.

sdc uses the files and directory structure in its library to parse the document and determine the formatting. For a description of this refer to the developer documentation.

Furthermore, sdc uses the current working directory when writing temporary files. (See the developer documentation for the reasoning.) Make sure to change into a writable directory before invoking sdc.

1.8 CONFORMING TO

The underlying sgmls conforms to ISO 8879, and therefore so does sdc.

1.9 NOTES

At the moment there is no complete developer documentation available. Support for RTF output is still far from complete and uses the old mechanism.

1.10 DIAGNOSTICS

An error

SGML error at file, line number

comes from the underlying SGML parser and indicates text not conforming to the document type definition. Usually this will imply a lot of other errors.

*** ERROR:bigloo:eval:

Indicates a bug in the format translation files.

1.11 AUTHOR

This was written by Jörg Wittenberger.

1.12 SEE ALSO

sgmls(1), lout(1), info(5), latex(1), xfig(1).