Name

man.string.subst.map — Specifies a set of string substitutions

Synopsis

<xsl:param name="man.string.subst.map">

  <!-- ******************************************************************** -->
  <!-- *  -->
  <!-- * The backslash and dot (\, .) characters have special meaning -->
  <!-- * for roff, so we escape those characters when they appear in -->
  <!-- * the source content, and we use certain (arbitrarily -->
  <!-- * selected) Unicode characters as the internal representations -->
  <!-- * for those characters, then replace them with real -->
  <!-- * backslashes and dots in output. In addition, for certain -->
  <!-- * reasons, we do the same thing for dashes. The mappings of -->
  <!-- * those characters to the Unicode characters we use to -->
  <!-- * represent them are hard-coded: -->
  <!-- *  -->
  <!-- *   - U+2591 = dash -->
  <!-- *   - U+2593 = backslash -->
  <!-- *   - U+2302 = dot -->
  <!-- *  -->
  <!-- ******************************************************************** -->

  <!-- * escape backslashes in content; we use "\e" instead of "\\" -->
  <!-- * because the groff docs say that's the correct thing to do; also -->
  <!-- * because testing shows that "\\" doesn't always work as expected; -->
  <!-- * for example, "\\" within a table seems to mess things up -->
  <substitution oldstring="\" newstring="\e"></substitution>
  <!-- * fix bad font-request capitalization in .SH (stylesheet artifact) -->
  <substitution oldstring="▓FB" newstring="\fB"></substitution>
  <substitution oldstring="▓FI" newstring="\fI"></substitution>
  <substitution oldstring="▓FR" newstring="\fR"></substitution>
  <!-- * remove no-break marker at beginning of line (stylesheet artifact) --> 
  <substitution oldstring="▒▀" newstring="▒"></substitution>
  <!-- * replace U+2580 no-break marker (stylesheet-added) w/ no-break space -->
  <substitution oldstring="▀" newstring="\ "></substitution>
  <!-- * replace U+2593 marker with backslash --> 
  <substitution oldstring="▓" newstring="\"></substitution>
  <!-- * escape dots in content (only at line beginnings) -->
  <substitution oldstring="
." newstring="
\&."></substitution>
  <!-- * replace U+2302 marker with dot -->
  <substitution oldstring="⌂" newstring="."></substitution>
  <!-- * escape dashes in content -->
  <substitution oldstring="-" newstring="\-"></substitution>
  <!-- * replace U+2591 marker with dash -->
  <substitution oldstring="░" newstring="-"></substitution>

  <!-- ==================================================================== -->

  <!-- * squeeze multiple newlines before a roff request  -->
  <substitution oldstring="

." newstring="
."></substitution>
  <!-- * remove any .sp occurrences that directly follow a .PP  -->
  <substitution oldstring=".PP
.sp" newstring=".PP"></substitution>
  <!-- * squeeze multiple newlines after start of no-fill (verbatim) env. -->
  <substitution oldstring=".nf

" newstring=".nf
"></substitution>
  <!-- * squeeze multiple newlines after REstoring margin -->
  <substitution oldstring=".RE

" newstring=".RE
"></substitution>
  <!-- * an apostrophe at the beginning of a line gets interpreted as a -->
  <!-- * roff request (groff(7) says it is "the non-breaking control -->
  <!-- * character"); so we must add backslash before any apostrophe -->
  <!-- * found at the start of a line -->
  <substitution oldstring="
'" newstring="
\'"></substitution>
  <!-- * -->
  <!-- * non-breaking space -->
  <!-- * -->
  <!-- * A no-break space can be written two ways in roff; the difference, -->
  <!-- * according to the "Page Motions" node in the groff info page, is: -->
  <!-- * -->
  <!-- *   "\ " = -->
  <!-- *   An unbreakable and unpaddable (i.e. not expanded during filling) -->
  <!-- *   space. -->
  <!-- * -->
  <!-- *   "\~" = -->
  <!-- *   An unbreakable space that stretches like a normal -->
  <!-- *   inter-word space when a line is adjusted."  -->
  <!-- * -->
  <!-- * Unfortunately, roff seems to do some weird things with long -->
  <!-- * lines that only have words separated by "\~" spaces, so it's -->
  <!-- * safer just to stick with the "\ " space -->
  <substitution oldstring=" " newstring="\ "></substitution>
  <!-- * x2008 is a "punctuation space"; we must replace it here because, -->
  <!-- * for certain reasons, the stylesheets add it before and after -->
  <!-- * every Parameter in Funcprototype output -->
  <substitution oldstring=" " newstring=" "></substitution>
  <!-- * -->
  <!-- * Now deal with some other characters that are added by the -->
  <!-- * stylesheets during processing. -->
  <!-- * -->
  <!-- * bullet -->
  <substitution oldstring="•" newstring="\(bu"></substitution>
  <!-- * left double quote -->
  <substitution oldstring="“" newstring="\(lq"></substitution>
  <!-- * right double quote -->
  <substitution oldstring="”" newstring="\(rq"></substitution>
  <!-- * left single quote -->
  <substitution oldstring="‘" newstring="\(oq"></substitution>
  <!-- * right single quote -->
  <substitution oldstring="’" newstring="\(cq"></substitution>
  <!-- * copyright sign -->
  <substitution oldstring="©" newstring="\(co"></substitution>
  <!-- * registered sign -->
  <substitution oldstring="®" newstring="\(rg"></substitution>
  <!-- * servicemark... -->
  <!-- * There is no groff equivalent for it. -->
  <substitution oldstring="℠" newstring="(SM)"></substitution>
  <!-- * trademark... -->
  <!-- * We don't do "\(tm" because for console output, -->
  <!-- * groff just renders that as "tm"; that is: -->
  <!-- * -->
  <!-- *   Product&#x2122; -> Producttm -->
  <!-- * -->
  <!-- * So we just map it to "(TM)" instead; thus: -->
  <!-- * -->
  <!-- *   Product&#x2122; -> Product(TM) -->
  <substitution oldstring="™" newstring="(TM)"></substitution>

  <!-- ==================================================================== -->

  <!-- * we use U+2592 as a marker for the newline before output of <sbr>; -->
  <!-- * so we now need to replace U+2592 marker with a real newline -->
  <substitution oldstring="▒" newstring="
"></substitution>

</xsl:param>

Description

The man.string.subst.map parameter contains a map that specifies a set of string substitutions to perform over the entire roff source for each man page, either just before generating final man-page output (that is, before writing man-page files to disk) or, if the value of the man.charmap.enabled parameter is non-zero, before applying the roff character map.

You can use man.string.subst.map as a “lightweight” character map to perform “essential” substitutions -- that is, substitutions that are always performed, even if the value of the man.charmap.enabled parameter is zero. For example, you can use it to replace quotation marks or other special characters that are generated by the DocBook XSL stylesheets for a particular locale setting (as opposed to those characters that are actually in source XML documents), or to replace any special characters that may be automatically generated by a particular customization of the DocBook XSL stylesheets.

[Warning]

Do not change the value of the man.string.subst.map parameter unless you are sure of what you are doing. First consider adding your string-substitution mappings to either or both of the following parameters:

By default, both of those parameters contain no string substitutions. They are intended as a means for you to specify your own local string-substitution mappings.

If you remove any of the default mappings from the value of the man.string.subst.map parameter, you are likely to end up with broken output. And be very careful about adding anything to it: it is used for doing string substitution over the entire roff source of each man page, so it causes target strings to be replaced in roff requests and escapes, not just in the visible contents of the page.
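
As a rough sketch of the safer route, a customization layer can leave man.string.subst.map alone and put local mappings in a separate parameter. Several things here are assumptions rather than facts taken from this page: the parameter name man.string.subst.map.local.pre (the parameter list is not reproduced above), the import URL, and the choice of guillemets as the characters to replace. The substitution elements follow the unprefixed form shown in the synopsis; the installed stylesheets may declare them in a namespace, so check the parameter definition in your release before relying on this.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical customization layer; nothing in it is mandated by the stylesheets -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock manpages stylesheet; adjust to your installation -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Local mappings only; the default man.string.subst.map is not redefined -->
  <!-- Parameter name is an assumption (see the note above this example) -->
  <xsl:param name="man.string.subst.map.local.pre">
    <!-- * left guillemet (U+00AB), e.g. generated for a French locale -->
    <substitution oldstring="&#xAB;" newstring="\(Fo"></substitution>
    <!-- * right guillemet (U+00BB) -->
    <substitution oldstring="&#xBB;" newstring="\(Fc"></substitution>
  </xsl:param>

</xsl:stylesheet>
```

Keeping local mappings in their own parameter means the default escaping of backslashes, dots, and dashes stays intact.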

Contents of the substitution map

The string-substitution map contains one or more substitution elements, each of which has two attributes:

oldstring
string to replace
newstring
string with which to replace oldstring

The map may also include XML comments (that is, text delimited with "<!--" and "-->").
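
For illustration, two entries copied from the default map in the synopsis show the shape of a substitution element and its accompanying comment. The wording of the comments is ours; the attribute values are unchanged.

```xml
<!-- * copyright sign: each occurrence in the roff source becomes the groff escape \(co -->
<substitution oldstring="©" newstring="\(co"></substitution>
<!-- * servicemark: no groff equivalent, so a plain-text stand-in is used -->
<substitution oldstring="℠" newstring="(SM)"></substitution>
```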

About escaping and replacing backslash, dot, and dash

The backslash and dot (\, .) characters have special meaning for roff, so we:

  • escape backslashes and dots where they appear in the source content

  • use certain (arbitrarily selected) Unicode characters as “markers” – internal representations within the stylesheet – for backslashes and dots

  • replace the Unicode characters with real backslashes and dots before output gets serialized

In addition, for certain reasons, we do the same thing for dashes.

The mappings of dash, backslash, and dot to the Unicode characters we use to represent them are hard-coded in the stylesheet:

  • U+2591 = dash

  • U+2593 = backslash

  • U+2302 = dot

Those Unicode characters were chosen on the assumption that they are never used in content intended for output to man pages. It would be possible to provide a way for configuring the mappings (using XSLT parameters), but doing so would make the stylesheet code much more verbose and harder to read.
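
The escape-then-replace sequence can be traced with the pair of dash mappings from the default map. This is only a reading of the map order shown in the synopsis, and the sample string in the comments is hypothetical.

```xml
<!-- * Hypothetical text before this pair is applied:  a-b░c          -->
<!-- *   the "-" came from source content                              -->
<!-- *   the U+2591 marker stands for a dash added by the stylesheet   -->

<!-- * step 1: escape every literal dash from content                  -->
<!-- *   a-b░c   becomes   a\-b░c                                      -->
<substitution oldstring="-" newstring="\-"></substitution>

<!-- * step 2: only afterwards turn the marker into a real dash,       -->
<!-- * so the stylesheet's own dash is not escaped by step 1           -->
<!-- *   a\-b░c  becomes   a\-b-c                                      -->
<substitution oldstring="░" newstring="-"></substitution>
```

The same ordering appears for backslash (content "\" is rewritten to "\e" before U+2593 becomes "\") and for dot (line-initial dots are escaped before U+2302 becomes ".").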