org.ccil.cowan.tagsoup

Class XMLWriter

Implemented Interfaces:
LexicalHandler

public class XMLWriter
extends XMLFilterImpl
implements LexicalHandler

Filter to write an XML document from a SAX event stream.

This class can be used by itself or as part of a SAX event stream: it takes as input a series of SAX2 ContentHandler events and uses the information in those events to write an XML document. Since this class is a filter, it can also pass the events on down a filter chain for further processing (you can use the XMLWriter to take a snapshot of the current state at any point in a filter chain), and it can be used directly as a ContentHandler for a SAX2 XMLReader.
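The dual role described above (write the document, then forward the event) can be sketched with JDK classes alone. `EchoFilter` below is a hypothetical, heavily simplified stand-in and not this class's implementation: it ignores attributes, Namespaces, and escaping, and exists only to show how a filter can serialize an event and still pass it on down the chain.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical simplified filter: serializes element and text events,
// then forwards each one to the next handler in the chain.
class EchoFilter extends XMLFilterImpl {
    private final Writer out;

    EchoFilter(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        write("<" + localName + ">");
        super.startElement(uri, localName, qName, atts); // pass the event on
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        write("</" + localName + ">");
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        write(new String(ch, start, length)); // no escaping in this sketch
        super.characters(ch, start, length);
    }

    private void write(String text) throws SAXException {
        try {
            out.write(text);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // Drives the filter by hand; returns the text written plus the number of
    // start-element events that reached the downstream handler.
    static String demo() {
        StringWriter buffer = new StringWriter();
        EchoFilter filter = new EchoFilter(buffer);
        final int[] downstreamStarts = {0};
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                downstreamStarts[0]++;
            }
        });
        char[] text = "Hello, world!".toCharArray();
        try {
            filter.startDocument();
            filter.startElement("", "greeting", "greeting", new AttributesImpl());
            filter.characters(text, 0, text.length);
            filter.endElement("", "greeting", "greeting");
            filter.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return buffer + " | downstream starts: " + downstreamStarts[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the filter calls the superclass method after writing, any handler registered with `setContentHandler` still sees every event, which is what makes the "snapshot at any point in a filter chain" usage possible.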

The client creates a document by invoking the methods for standard SAX2 events, always beginning with the startDocument method and ending with the endDocument method. Convenience methods are provided so that clients do not have to create empty attribute lists or provide empty strings as parameters; for example, the method invocation

 w.startElement("foo");
 

is equivalent to the regular SAX2 ContentHandler method

 w.startElement("", "foo", "", new AttributesImpl());
 

except that it is more efficient because it does not allocate a new empty attribute list each time. The following code will send a simple XML document to standard output:

 XMLWriter w = new XMLWriter();

 w.startDocument();
 w.startElement("greeting");
 w.characters("Hello, world!");
 w.endElement("greeting");
 w.endDocument();
 

The resulting document will look like this:

 <?xml version="1.0" standalone="yes"?>

 <greeting>Hello, world!</greeting>
 

In fact, there is an even simpler convenience method, dataElement, designed for writing elements that contain only character data, so the code to generate the document could be shortened to

 XMLWriter w = new XMLWriter();

 w.startDocument();
 w.dataElement("greeting", "Hello, world!");
 w.endDocument();
 

Whitespace

According to the XML Recommendation, all whitespace in an XML document is potentially significant to an application, so this class never adds newlines or indentation. If you insert three elements in a row, as in

 w.dataElement("item", "1");
 w.dataElement("item", "2");
 w.dataElement("item", "3");
 

you will end up with

 <item>1</item><item>2</item><item>3</item>
 

You need to invoke one of the characters methods explicitly to add newlines or indentation. Alternatively, you can use DataWriter, which is derived from this class -- it is optimized for writing purely data-oriented (or field-oriented) XML, and does automatic linebreaks and indentation (but does not support mixed content properly).

Namespace Support

The writer contains extensive support for XML Namespaces, so that a client application does not have to keep track of prefixes and supply xmlns attributes. By default, the XML writer will generate Namespace prefixes of the form _NS1, _NS2, etc., and declare them wherever they are needed, as in the following example:

 w.startDocument();
 w.emptyElement("http://www.foo.com/ns/", "foo");
 w.endDocument();
 

The resulting document will look like this:

 <?xml version="1.0" standalone="yes"?>

 <_NS1:foo xmlns:_NS1="http://www.foo.com/ns/"/>
 

In many cases, document authors will prefer to choose their own prefixes rather than using the (ugly) default names. The XML writer allows two methods for selecting prefixes:

  1. the qualified name
  2. the setPrefix method.

Whenever the XML writer finds a new Namespace URI, it checks to see if a qualified (prefixed) name is also available; if so it attempts to use the name's prefix (as long as the prefix is not already in use for another Namespace URI).
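The selection rule in the preceding paragraph can be sketched with the JDK's `org.xml.sax.helpers.NamespaceSupport`. The `PrefixChooser` class and its `choose` method are hypothetical names used only for illustration, and the logic is an assumption reconstructed from the description above (it does not handle the default Namespace); it is not taken from this class's source.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical helper: picks a prefix for a Namespace URI, preferring the
// prefix of the supplied qualified name when that prefix is still free.
class PrefixChooser {
    private int counter = 0; // numbers the generated fallback prefixes

    String choose(NamespaceSupport ns, String uri, String qName) {
        // Reuse a prefix that is already bound to this URI.
        String existing = ns.getPrefix(uri);
        if (existing != null) {
            return existing;
        }
        // Take the qualified name's prefix as a hint, if it has one.
        int colon = qName.indexOf(':');
        String prefix = colon > 0 ? qName.substring(0, colon) : null;
        // Fall back to generated names while the candidate is missing or
        // already bound to some Namespace URI.
        while (prefix == null || ns.getURI(prefix) != null) {
            prefix = "_NS" + (++counter);
        }
        ns.declarePrefix(prefix, uri);
        return prefix;
    }

    static String demo() {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext(); // open a scope in which prefixes may be declared
        PrefixChooser chooser = new PrefixChooser();
        String first = chooser.choose(ns, "http://www.foo.com/ns/", "foo:item");
        // Same hint, different URI: "foo" is taken, so a name is generated.
        String second = chooser.choose(ns, "http://www.example.org/other/", "foo:item");
        // Known URI, no hint: the existing binding is reused.
        String again = chooser.choose(ns, "http://www.foo.com/ns/", "");
        return first + " " + second + " " + again;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```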

Before writing a document, the client can also pre-map a prefix to a Namespace URI with the setPrefix method:

 w.setPrefix("http://www.foo.com/ns/", "foo");
 w.startDocument();
 w.emptyElement("http://www.foo.com/ns/", "foo");
 w.endDocument();
 

The resulting document will look like this:

 <?xml version="1.0" standalone="yes"?>

 <foo:foo xmlns:foo="http://www.foo.com/ns/"/>
 

The default Namespace simply uses an empty string as the prefix:

 w.setPrefix("http://www.foo.com/ns/", "");
 w.startDocument();
 w.emptyElement("http://www.foo.com/ns/", "foo");
 w.endDocument();
 

The resulting document will look like this:

 <?xml version="1.0" standalone="yes"?>

 <foo xmlns="http://www.foo.com/ns/"/>
 

By default, the XML writer will not declare a Namespace until it is actually used. Sometimes, this approach will create a large number of Namespace declarations, as in the following example:

 <?xml version="1.0" standalone="yes"?>

 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description about="http://www.foo.com/ids/books/12345">
   <dc:title xmlns:dc="http://www.purl.org/dc/">A Dark Night</dc:title>
   <dc:creator xmlns:dc="http://www.purl.org/dc/">Jane Smith</dc:creator>
   <dc:date xmlns:dc="http://www.purl.org/dc/">2000-09-09</dc:date>
  </rdf:Description>
 </rdf:RDF>
 

The "rdf" prefix is declared only once, because the RDF Namespace is used by the root element and can be inherited by all of its descendants; the "dc" prefix, on the other hand, is declared three times, because no higher element uses the Namespace. To solve this problem, you can instruct the XML writer to predeclare Namespaces on the root element even if they are not used there:

 w.forceNSDecl("http://www.purl.org/dc/");
 

Now, the "dc" prefix will be declared on the root element even though it's not needed there, and can be inherited by its descendants:

 <?xml version="1.0" standalone="yes"?>

 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://www.purl.org/dc/">
  <rdf:Description about="http://www.foo.com/ids/books/12345">
   <dc:title>A Dark Night</dc:title>
   <dc:creator>Jane Smith</dc:creator>
   <dc:date>2000-09-09</dc:date>
  </rdf:Description>
 </rdf:RDF>
 

This approach is also useful for declaring Namespace prefixes that will be used by qualified names appearing in attribute values or character data.
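The scoping behaviour behind the repeated "dc" declarations can be reproduced with the JDK's `NamespaceSupport`, which models one context per open element. This is only an illustration of Namespace scoping under that model, not of how this class tracks declarations; `NamespaceScopeDemo` and its method names are hypothetical.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical demo: a prefix declared on one child is invisible to its
// sibling, while a prefix declared on the root is inherited by descendants.
class NamespaceScopeDemo {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC = "http://www.purl.org/dc/";

    // Declares "dc" only on the first child and asks a sibling for it.
    static boolean siblingSeesChildDeclaration() {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext();             // <rdf:RDF>
        ns.declarePrefix("rdf", RDF);
        ns.pushContext();             // <rdf:Description>
        ns.pushContext();             // <dc:title>
        ns.declarePrefix("dc", DC);
        ns.popContext();              // </dc:title>
        ns.pushContext();             // <dc:creator>
        return DC.equals(ns.getURI("dc"));
    }

    // Declares "dc" on the root and asks a grandchild for it.
    static boolean descendantSeesRootDeclaration() {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext();             // <rdf:RDF>
        ns.declarePrefix("rdf", RDF);
        ns.declarePrefix("dc", DC);   // the predeclared Namespace
        ns.pushContext();             // <rdf:Description>
        ns.pushContext();             // <dc:creator>
        return DC.equals(ns.getURI("dc"));
    }

    public static void main(String[] args) {
        System.out.println("sibling sees child declaration: " + siblingSeesChildDeclaration());
        System.out.println("descendant sees root declaration: " + descendantSeesRootDeclaration());
    }
}
```

The first method returns false, which is why each "dc" element needs its own declaration by default; the second returns true, which is the effect that predeclaring on the root element achieves.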

Version:
0.2
Author:
David Megginson, david@megginson.com
See Also:
org.xml.sax.XMLFilter, org.xml.sax.ContentHandler

Field Summary

static String
CDATA_SECTION_ELEMENTS
static String
DOCTYPE_PUBLIC
static String
DOCTYPE_SYSTEM
static String
ENCODING
static String
INDENT
static String
MEDIA_TYPE
static String
METHOD
static String
OMIT_XML_DECLARATION
static String
STANDALONE
static String
VERSION

Constructor Summary

XMLWriter()
Create a new XML writer.
XMLWriter(Writer writer)
Create a new XML writer.
XMLWriter(XMLReader xmlreader)
Create a new XML writer.
XMLWriter(XMLReader xmlreader, Writer writer)
Create a new XML writer.

Method Summary

void
characters(String data)
Write a string of character data, with XML escaping.
void
characters(char[] ch, int start, int len)
Write character data.
void
comment(char[] ch, int start, int length)
void
dataElement(String localName, String content)
Write an element with character data content but no attributes or Namespace URI.
void
dataElement(String uri, String localName, String content)
Write an element with character data content but no attributes.
void
dataElement(String uri, String localName, String qName, Attributes atts, String content)
Write an element with character data content.
void
emptyElement(String localName)
Add an empty element without a Namespace URI, qname or attributes.
void
emptyElement(String uri, String localName)
Add an empty element without a qname or attributes.
void
emptyElement(String uri, String localName, String qName, Attributes atts)
Write an empty element.
void
endCDATA()
void
endDTD()
void
endDocument()
Write a newline at the end of the document.
void
endElement(String localName)
End an element without a Namespace URI or qname.
void
endElement(String uri, String localName)
End an element without a qname.
void
endElement(String uri, String localName, String qName)
Write an end tag.
void
endEntity(String name)
void
flush()
Flush the output.
void
forceNSDecl(String uri)
Force a Namespace to be declared on the root element.
void
forceNSDecl(String uri, String prefix)
Force a Namespace declaration with a preferred prefix.
String
getOutputProperty(String key)
String
getPrefix(String uri)
Get the current or preferred prefix for a Namespace URI.
void
ignorableWhitespace(char[] ch, int start, int length)
Write ignorable whitespace.
void
processingInstruction(String target, String data)
Write a processing instruction.
void
reset()
Reset the writer.
void
setOutput(Writer writer)
Set a new output destination for the document.
void
setOutputProperty(String key, String value)
void
setPrefix(String uri, String prefix)
Specify a preferred prefix for a Namespace URI.
void
startCDATA()
void
startDTD(String name, String publicid, String systemid)
void
startDocument()
Write the XML declaration at the beginning of the document.
void
startElement(String localName)
Start a new element without a qname, attributes or a Namespace URI.
void
startElement(String uri, String localName)
Start a new element without a qname or attributes.
void
startElement(String uri, String localName, String qName, Attributes atts)
Write a start tag.
void
startEntity(String name)

Field Details

CDATA_SECTION_ELEMENTS

public static final String CDATA_SECTION_ELEMENTS

DOCTYPE_PUBLIC

public static final String DOCTYPE_PUBLIC

DOCTYPE_SYSTEM

public static final String DOCTYPE_SYSTEM

ENCODING

public static final String ENCODING

INDENT

public static final String INDENT

MEDIA_TYPE

public static final String MEDIA_TYPE

METHOD

public static final String METHOD

OMIT_XML_DECLARATION

public static final String OMIT_XML_DECLARATION

STANDALONE

public static final String STANDALONE

VERSION

public static final String VERSION

Constructor Details

XMLWriter

public XMLWriter()
Create a new XML writer.

Write to standard output.


XMLWriter

public XMLWriter(Writer writer)
Create a new XML writer.

Write to the writer provided.

Parameters:
writer - The output destination, or null to use standard output.

XMLWriter

public XMLWriter(XMLReader xmlreader)
Create a new XML writer.

Use the specified XML reader as the parent.

Parameters:
xmlreader - The parent in the filter chain, or null for no parent.

XMLWriter

public XMLWriter(XMLReader xmlreader,
                 Writer writer)
Create a new XML writer.

Use the specified XML reader as the parent, and write to the specified writer.

Parameters:
xmlreader - The parent in the filter chain, or null for no parent.
writer - The output destination, or null to use standard output.

Method Details

characters

public void characters(String data)
            throws SAXException
Write a string of character data, with XML escaping.

This is a convenience method that takes an XML String, converts it to a character array, then invokes characters(char[], int, int).

Parameters:
data - The character data.
See Also:
characters(char[], int, int)
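The delegation described for this convenience method can be sketched as follows. `CharactersSketch` is a hypothetical class, and the escaping shown (only `&`, `<`, and `>`) is an assumption made for illustration; the exact set of characters this writer escapes is not specified here.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of a String convenience overload that delegates to
// the standard SAX characters(char[], int, int) callback.
class CharactersSketch extends XMLFilterImpl {
    private final StringBuilder out = new StringBuilder();

    // Convenience overload: convert to a char array, then delegate.
    public void characters(String data) throws SAXException {
        char[] ch = data.toCharArray();
        characters(ch, 0, ch.length);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(ch[i]);
            }
        }
        super.characters(ch, start, length); // forward the unescaped event
    }

    // Runs one string through the sketch and returns the escaped text.
    static String escape(String data) {
        CharactersSketch sketch = new CharactersSketch();
        try {
            sketch.characters(data);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sketch.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & c"));
    }
}
```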

characters

public void characters(char[] ch,
                       int start,
                       int len)
            throws SAXException
Write character data. Pass the event on down the filter chain for further processing.
Parameters:
ch - The array of characters.
start - The starting position in the array.
len - The number of characters to write.
See Also:
org.xml.sax.ContentHandler.characters

comment

public void comment(char[] ch,
                    int start,
                    int length)
            throws SAXException

dataElement

public void dataElement(String localName,
                        String content)
            throws SAXException
Write an element with character data content but no attributes or Namespace URI.
Parameters:
localName - The element's local name.
content - The character data content.

dataElement

public void dataElement(String uri,
                        String localName,
                        String content)
            throws SAXException
Write an element with character data content but no attributes.
Parameters:
uri - The element's Namespace URI.
localName - The element's local name.
content - The character data content.

dataElement

public void dataElement(String uri,
                        String localName,
                        String qName,
                        Attributes atts,
                        String content)
            throws SAXException
Write an element with character data content.
Parameters:
uri - The element's Namespace URI.
localName - The element's local name.
qName - The element's default qualified name.
atts - The element's attributes.
content - The character data content.

emptyElement

public void emptyElement(String localName)
            throws SAXException
Add an empty element without a Namespace URI, qname or attributes.
Parameters:
localName - The element's local name.

emptyElement

public void emptyElement(String uri,
                         String localName)
            throws SAXException
Add an empty element without a qname or attributes.
Parameters:
uri - The element's Namespace URI.
localName - The element's local name.

emptyElement

public void emptyElement(String uri,
                         String localName,
                         String qName,
                         Attributes atts)
            throws SAXException
Write an empty element. This method writes an empty element tag rather than a start tag followed by an end tag. Both a startElement and an endElement event will be passed on down the filter chain.
Parameters:
uri - The element's Namespace URI, or the empty string if the element has no Namespace or if Namespace processing is not being performed.
localName - The element's local name (without prefix). This parameter must be provided.
qName - The element's qualified name (with prefix), or the empty string if none is available. This parameter is strictly advisory: the writer may or may not use the prefix attached.
atts - The element's attribute list.
See Also:
startElement, endElement
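The behaviour described for the four-argument emptyElement (one empty-element tag in the output, two events downstream) can be sketched with JDK classes. `EmptyElementSketch` is a hypothetical illustration that ignores attributes and Namespaces; it is not this class's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: writes a single <name/> tag but still forwards both
// a start-element and an end-element event down the filter chain.
class EmptyElementSketch extends XMLFilterImpl {
    private final StringBuilder out = new StringBuilder();

    public void emptyElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        out.append('<').append(localName).append("/>");
        super.startElement(uri, localName, qName, atts);
        super.endElement(uri, localName, qName);
    }

    // Returns the text written followed by the events seen downstream.
    static String demo() {
        final List<String> events = new ArrayList<>();
        EmptyElementSketch sketch = new EmptyElementSketch();
        sketch.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + localName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + localName);
            }
        });
        try {
            sketch.emptyElement("", "br", "", new AttributesImpl());
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sketch.out + " " + events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```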

endCDATA

public void endCDATA()
            throws SAXException

endDTD

public void endDTD()
            throws SAXException

endDocument

public void endDocument()
            throws SAXException
Write a newline at the end of the document. Pass the event on down the filter chain for further processing.
See Also:
org.xml.sax.ContentHandler.endDocument

endElement

public void endElement(String localName)
            throws SAXException
End an element without a Namespace URI or qname.
Parameters:
localName - The element's local name.

endElement

public void endElement(String uri,
                       String localName)
            throws SAXException
End an element without a qname.
Parameters:
uri - The element's Namespace URI.
localName - The element's local name.

endElement

public void endElement(String uri,
                       String localName,
                       String qName)
            throws SAXException
Write an end tag. Pass the event on down the filter chain for further processing.
Parameters:
uri - The Namespace URI, or the empty string if none is available.
localName - The element's local (unprefixed) name (required).
qName - The element's qualified (prefixed) name, or the empty string if none is available. This method will use the qName as a template for generating a prefix if necessary, but it is not guaranteed to use the same qName.
See Also:
org.xml.sax.ContentHandler.endElement

endEntity

public void endEntity(String name)
            throws SAXException

flush

public void flush()
            throws IOException
Flush the output.
See Also:
reset()

forceNSDecl

public void forceNSDecl(String uri)
Force a Namespace to be declared on the root element.

By default, the XMLWriter will declare only the Namespaces needed for an element; as a result, a Namespace may be declared in many places in a document if it is not used on the root element.

This method forces a Namespace to be declared on the root element even if it is not used there, and reduces the number of xmlns attributes in the document.

Parameters:
uri - The Namespace URI to declare.

forceNSDecl

public void forceNSDecl(String uri,
                        String prefix)
Force a Namespace declaration with a preferred prefix.
Parameters:
uri - The Namespace URI to declare on the root element.
prefix - The preferred prefix for the Namespace, or "" for the default Namespace.
See Also:
setPrefix(String,String), forceNSDecl(java.lang.String)

getOutputProperty

public String getOutputProperty(String key)

getPrefix

public String getPrefix(String uri)
Get the current or preferred prefix for a Namespace URI.
Parameters:
uri - The Namespace URI.
Returns:
The preferred prefix, or "" for the default Namespace.

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
            throws SAXException
Write ignorable whitespace. Pass the event on down the filter chain for further processing.
Parameters:
ch - The array of characters.
start - The starting position in the array.
length - The number of characters to write.
See Also:
org.xml.sax.ContentHandler.ignorableWhitespace

processingInstruction

public void processingInstruction(String target,
                                  String data)
            throws SAXException
Write a processing instruction. Pass the event on down the filter chain for further processing.
Parameters:
target - The PI target.
data - The PI data.
See Also:
org.xml.sax.ContentHandler.processingInstruction

reset

public void reset()
Reset the writer.

This method is especially useful if the writer throws an exception before it is finished, and you want to reuse the writer for a new document. It is usually a good idea to invoke flush before resetting the writer, to make sure that no output is lost.

This method is invoked automatically by the startDocument method before writing a new document.

Note: this method will not clear the prefix or URI information in the writer or the selected output writer.

See Also:
flush()

setOutput

public void setOutput(Writer writer)
Set a new output destination for the document.
Parameters:
writer - The output destination, or null to use standard output.

setOutputProperty

public void setOutputProperty(String key,
                              String value)

setPrefix

public void setPrefix(String uri,
                      String prefix)
Specify a preferred prefix for a Namespace URI.

Note that this method does not actually force the Namespace to be declared; to do that, use the forceNSDecl method as well.

Parameters:
uri - The Namespace URI.
prefix - The preferred prefix, or "" to select the default Namespace.
See Also:
getPrefix(String), forceNSDecl(java.lang.String), forceNSDecl(java.lang.String,java.lang.String)

startCDATA

public void startCDATA()
            throws SAXException

startDTD

public void startDTD(String name,
                     String publicid,
                     String systemid)
            throws SAXException

startDocument

public void startDocument()
            throws SAXException
Write the XML declaration at the beginning of the document. Pass the event on down the filter chain for further processing.
See Also:
org.xml.sax.ContentHandler.startDocument

startElement

public void startElement(String localName)
            throws SAXException
Start a new element without a qname, attributes or a Namespace URI.

This method will provide an empty string for the Namespace URI, an empty string for the qualified name, and a default empty attribute list. It invokes startElement(String, String, String, Attributes) directly.

Parameters:
localName - The element's local name.
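One way such a convenience overload can avoid allocating an attribute list on every call is to share a single empty instance. The sketch below assumes that approach; `SharedAttsSketch` and the `EMPTY_ATTS` field are hypothetical names, and how this class actually holds its default empty attribute list is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of a one-argument startElement that reuses a single
// empty attribute list instead of allocating a new one per call.
class SharedAttsSketch extends XMLFilterImpl {
    // Shared by every call; this sketch never modifies it.
    private static final Attributes EMPTY_ATTS = new AttributesImpl();

    // Convenience overload: "" for the Namespace URI and the qualified name.
    public void startElement(String localName) throws SAXException {
        startElement("", localName, "", EMPTY_ATTS);
    }

    // Returns true if two convenience calls delivered the same empty
    // Attributes instance to the downstream handler.
    static boolean reusesOneList() {
        final List<Attributes> seen = new ArrayList<>();
        SharedAttsSketch sketch = new SharedAttsSketch();
        sketch.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                seen.add(atts);
            }
        });
        try {
            sketch.startElement("foo");
            sketch.startElement("bar");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return seen.size() == 2
                && seen.get(0) == seen.get(1)
                && seen.get(0).getLength() == 0;
    }

    public static void main(String[] args) {
        System.out.println("same empty list reused: " + reusesOneList());
    }
}
```

Sharing one instance is safe only as long as no handler in the chain mutates the list it receives, which is the trade-off of this design.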

startElement

public void startElement(String uri,
                         String localName)
            throws SAXException
Start a new element without a qname or attributes.
Parameters:
uri - The element's Namespace URI.
localName - The element's local name.

startElement

public void startElement(String uri,
                         String localName,
                         String qName,
                         Attributes atts)
            throws SAXException
Write a start tag. Pass the event on down the filter chain for further processing.
Parameters:
uri - The Namespace URI, or the empty string if none is available.
localName - The element's local (unprefixed) name (required).
qName - The element's qualified (prefixed) name, or the empty string if none is available. This method will use the qName as a template for generating a prefix if necessary, but it is not guaranteed to use the same qName.
atts - The element's attribute list (must not be null).
See Also:
org.xml.sax.ContentHandler.startElement

startEntity

public void startEntity(String name)
            throws SAXException

Licence: Academic Free License 3.0 and/or GPL 2.0