de.zeigermann.xml.simpleImporter

Class SimpleImporter


public class SimpleImporter
extends Object

Simple and fast importer for XML configuration or import files.

It is based on SAX and can be considered an extension to it. This means it is callback oriented and does not build an internal data structure like the DOM. While SAX is simple, fast, and memory friendly, it might be a bit too rudimentary for most tasks. SimpleImporter adds higher-level means for importing XML while preserving SAX's benefits.

As with SAX you register a callback handler (SimpleImportHandler) that is called upon events. Consider the following example implementation of a SimpleImportHandler:

 import org.xml.sax.helpers.AttributesImpl;

 public class DemoHandler implements SimpleImportHandler {
   public void startDocument() { }
   public void endDocument() { }

   public void cData(SimplePath path, String cdata) { }

   // Called for every start tag; leadingCDdata carries the element's leading text.
   public void startElement(SimplePath path, String name, AttributesImpl attributes, String leadingCDdata) {
     // Print the leading text only for elements at /root/interesting-element.
     if (path.matches("/root/interesting-element")) {
       System.out.println(leadingCDdata);
     }
   }
   public void endElement(SimplePath path, String name) { }
 }
 
Registering this class with addSimpleImportHandler(SimpleImportHandler) and then calling parse(InputSource) on an input source, or parseUrlOrFile(String) on a URL or file name, will dump the leading text of every element matching the path (SimplePath) "/root/interesting-element".
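
For comparison, the sketch below reproduces roughly the same behavior with plain SAX and JDK classes only, to show the bookkeeping (path tracking and text buffering) that SimpleImporter takes over. It is not the library's implementation. The class PlainSaxLeadingText and its leadingText helper are hypothetical, the path is compared as a plain string rather than through SimplePath, and the leading text is reported once it is complete instead of inside a startElement callback.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical plain-SAX handler: tracks the element path and leading text by hand. */
class PlainSaxLeadingText extends DefaultHandler {
    private final String targetPath;
    private final Deque<String> names = new ArrayDeque<>(); // open elements, outermost first
    private final StringBuilder text = new StringBuilder();  // text since the last start tag
    private boolean collecting;                               // true until the next start/end tag
    final List<String> found = new ArrayList<>();

    PlainSaxLeadingText(String targetPath) {
        this.targetPath = targetPath;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        flush();                 // a child start tag ends the parent's leading text
        names.addLast(qName);
        text.setLength(0);
        collecting = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (collecting) {
            text.append(ch, start, length); // SAX may deliver text in several chunks
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();                 // an end tag also ends the leading text
        names.removeLast();
    }

    /** Records the buffered text if the current element is the one we are interested in. */
    private void flush() {
        if (collecting && targetPath.equals("/" + String.join("/", names))) {
            found.add(text.toString());
        }
        collecting = false;
    }

    static List<String> leadingText(String xml, String targetPath) throws Exception {
        PlainSaxLeadingText handler = new PlainSaxLeadingText(targetPath);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><interesting-element>hello<b/>tail</interesting-element></root>";
        System.out.println(leadingText(xml, "/root/interesting-element")); // prints [hello]
    }
}
```

Only the text before the first child element is collected; the trailing "tail" in the sample input is ignored.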

Note: This class is thread safe.
Author:
Olli Z.

Field Summary

protected List
callbackHandlerList
protected de.zeigermann.xml.simpleImporter.SimpleImporter.ParseElement
currentElement
protected StringBuffer
currentMixedPCData
protected String
debugBuffer
protected SAXParserFactory
factory
protected StringBuffer
firstPCData
protected boolean
foundMixedPCData
protected boolean
isFirstPCData
protected de.zeigermann.xml.simpleImporter.SimpleImporter.PathStack
parseStack

Constructor Summary

SimpleImporter()
Creates a new SimpleImporter object having default property settings.

Method Summary

void
addSimpleImportHandler(SimpleImportHandler callbackHandler)
Adds a new callback handler if it is not yet in the callback list.
boolean
getBuildComplexPath()
Determines if the simple path created will have complex additional info.
boolean
getFoundMixedPCData()
Determines if we have found any mixed content while parsing.
boolean
getFullDebugMode()
Gets the property described in setFullDebugMode(boolean).
boolean
getIncludeLeadingCDataIntoStartElementCallback()
Gets property telling importer to return any leading CDATA, i.e.
boolean
getMakeCopy()
Gets the property describing if every callback handler gets a fresh copy of the parsed data.
String
getParsedStreamForDebug()
Gets the whole stream parsed in the parse(InputSource) method.
boolean
getTrimContent()
Gets the property described in setTrimContent(boolean).
boolean
getUseQName()
Determines if the path shall be assembled of the fully qualified names.
boolean
getZeroLengthIsNull()
Gets property: When finding zero length content, should it be treated as null data? If it is treated as null data, nothing is reported to handlers when finding zero length data.
void
parse(InputSource is)
Parses the input source using the standard SAX parser and calls back the callback handlers.
void
parseUrlOrFile(String urlOrFileName)
Tries to parse the file or URL named by parameter urlOrFileName.
void
removeSimpleImportHandler(SimpleImportHandler callbackHandler)
Removes a callback handler if it is in the callback list.
void
setBuildComplexPath(boolean buildComplexPath)
Sets if the simple path created will have complex additional info.
void
setFullDebugMode(boolean fullDebug)
Sets the full debug mode, which enables us to get the parsed stream as a string via the getParsedStreamForDebug() method even if an error occurred.
void
setIncludeLeadingCDataIntoStartElementCallback(boolean includeLeadingCDataIntoStartElementCallback)
Sets the property described in getIncludeLeadingCDataIntoStartElementCallback().
void
setMakeCopy(boolean makeCopy)
Sets the property described in getMakeCopy().
void
setTrimContent(boolean trimContent)
Sets whether all content shall be trimmed.
void
setUseQName(boolean useQName)
Sets if the path shall be assembled of the fully qualified names.
void
setZeroLengthIsNull(boolean zeroLengthIsNull)
Sets the property described in getZeroLengthIsNull().

Field Details

callbackHandlerList

protected List callbackHandlerList

currentElement

protected de.zeigermann.xml.simpleImporter.SimpleImporter.ParseElement currentElement

currentMixedPCData

protected StringBuffer currentMixedPCData

debugBuffer

protected String debugBuffer

factory

protected SAXParserFactory factory

firstPCData

protected StringBuffer firstPCData

foundMixedPCData

protected boolean foundMixedPCData

isFirstPCData

protected boolean isFirstPCData

parseStack

protected de.zeigermann.xml.simpleImporter.SimpleImporter.PathStack parseStack

Constructor Details

SimpleImporter

public SimpleImporter()
Creates a new SimpleImporter object having default property settings. It is recommended to set all properties explicitly for clarity.

Method Details

addSimpleImportHandler

public void addSimpleImportHandler(SimpleImportHandler callbackHandler)
Adds a new callback handler if it is not yet in the callback list. This can be done dynamically while parsing.

getBuildComplexPath

public boolean getBuildComplexPath()
Determines if the simple path created will have complex additional info.

getFoundMixedPCData

public boolean getFoundMixedPCData()
Determines if we have found any mixed content while parsing.

getFullDebugMode

public boolean getFullDebugMode()

getIncludeLeadingCDataIntoStartElementCallback

public boolean getIncludeLeadingCDataIntoStartElementCallback()

getMakeCopy

public boolean getMakeCopy()
Gets the property describing if every callback handler gets a fresh copy of the parsed data. This is only important when there is more than one callback handler. If so and it is not set, all handlers will get identical objects. This is bad if you expect them to change any of that data.
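
The effect of sharing can be sketched with two handlers that receive attribute data. This is an illustration under assumptions, not the importer's code: MakeCopyDemo is a hypothetical class, and the copy is made with the standard copy constructor of org.xml.sax.helpers.AttributesImpl.

```java
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical demo: what a second handler sees after the first one changed the data. */
class MakeCopyDemo {
    static String secondHandlerSees(boolean makeCopy) {
        AttributesImpl parsed = new AttributesImpl();
        parsed.addAttribute("", "id", "id", "CDATA", "original");

        // With makeCopy every handler gets its own object, otherwise all share one.
        AttributesImpl forFirst = makeCopy ? new AttributesImpl(parsed) : parsed;
        AttributesImpl forSecond = makeCopy ? new AttributesImpl(parsed) : parsed;

        forFirst.setValue(0, "changed");   // the first handler modifies its data
        return forSecond.getValue("id");   // the second handler reads afterwards
    }
}
```

Here secondHandlerSees(false) returns "changed", because both handlers hold the same object, while secondHandlerSees(true) returns "original".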

getParsedStreamForDebug

public String getParsedStreamForDebug()
Gets the whole stream parsed in the parse(InputSource) method. As this requires some actions that significantly slow down the whole parse, this only works if it has been enabled by the setFullDebugMode(boolean) method.
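
A minimal sketch of the "copy first, then parse" idea, using JDK classes only. DebugCopyingParser is a hypothetical class, and decoding the copy as UTF-8 is an assumption made for this example; how the importer itself buffers the stream is not specified here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical parser wrapper that keeps a copy of the input for debugging. */
class DebugCopyingParser {
    private String parsedStreamForDebug;

    String getParsedStreamForDebug() {
        return parsedStreamForDebug;
    }

    void parse(InputStream in) throws Exception {
        // Read the whole stream into memory before parsing starts.
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n; (n = in.read(chunk)) != -1; ) {
            copy.write(chunk, 0, n);
        }
        byte[] bytes = copy.toByteArray();

        // Store the copy first, so it survives a parse error (UTF-8 is an assumption).
        parsedStreamForDebug = new String(bytes, StandardCharsets.UTF_8);

        // Parse from the in-memory copy; a no-op handler stands in for real callbacks.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new ByteArrayInputStream(bytes)), new DefaultHandler());
    }
}
```

Because the copy is stored before the parser runs, the text stays available when parsing fails on malformed input. The price is one extra pass over the data and a full in-memory copy.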

getTrimContent

public boolean getTrimContent()

getUseQName

public boolean getUseQName()
Determines if the path shall be assembled of the fully qualified names. true is the default.
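
To illustrate the difference, the sketch below builds an element path with plain SAX, once from qualified names and once from local names. It assumes that the qualified name meant here corresponds to the SAX qName (prefix plus local name); PathNameDemo and lastElementPath are hypothetical names, not part of this API.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: element paths built from qualified or from local names. */
class PathNameDemo {
    /** Returns the path of the last element started in the document. */
    static String lastElementPath(String xml, boolean useQName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // required so that localName is filled in
        final Deque<String> names = new ArrayDeque<>();
        final StringBuilder last = new StringBuilder();

        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.addLast(useQName ? qName : localName);
                last.setLength(0);
                last.append('/').append(String.join("/", names));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                names.removeLast();
            }
        });
        return last.toString();
    }
}
```

For the input `<a:root xmlns:a="urn:example"><a:child/></a:root>` this yields "/a:root/a:child" with useQName set to true and "/root/child" with it set to false.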

getZeroLengthIsNull

public boolean getZeroLengthIsNull()
Gets property: When finding zero length content, should it be treated as null data? If it is treated as null data, nothing is reported to handlers when finding zero length data.

parse

public void parse(InputSource is)
            throws ParserConfigurationException,
                   SAXException,
                   IOException
Parses the input source using the standard SAX parser and calls back the callback handlers. If enabled with setFullDebugMode(boolean) the source will be verbosely copied first.

Note: This method is synchronized, so you cannot have two concurrent parses.

parseUrlOrFile

public void parseUrlOrFile(String urlOrFileName)
            throws ParserConfigurationException,
                   SAXException,
                   IOException,
                   SimpleImporterException
Tries to parse the file or URL named by parameter urlOrFileName. First it tries to parse it as a URL; if this does not work, it tries to parse it as a file. If one option works, an input stream will be opened and parse(InputSource) will be called with it. If neither works, an exception is thrown.
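
The lookup order described above can be sketched with JDK classes only. UrlOrFileOpener is a hypothetical helper, not part of this API, and it signals failure with a FileNotFoundException, whereas this method declares SimpleImporterException among its exceptions.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper illustrating a "URL first, file second" lookup. */
class UrlOrFileOpener {
    static InputStream open(String urlOrFileName) throws IOException {
        try {
            // Attempt 1: interpret the string as an absolute URL and open it.
            return new URI(urlOrFileName).toURL().openStream();
        } catch (URISyntaxException | IllegalArgumentException | IOException notAUrl) {
            // Not a usable URL: fall through to the file attempt.
        }

        // Attempt 2: interpret the string as a file name.
        File file = new File(urlOrFileName);
        if (file.isFile()) {
            return new FileInputStream(file);
        }
        throw new FileNotFoundException("Neither a URL nor a file: " + urlOrFileName);
    }
}
```

A plain path such as /tmp/config.xml is not an absolute URL, so the first attempt fails and the file branch is used. A string such as file:/tmp/config.xml is opened through the URL branch.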

removeSimpleImportHandler

public void removeSimpleImportHandler(SimpleImportHandler callbackHandler)
Removes a callback handler if it is in the callback list. This can be dynamically done while parsing.

setBuildComplexPath

public void setBuildComplexPath(boolean buildComplexPath)
Sets if the simple path created will have complex additional info.

setFullDebugMode

public void setFullDebugMode(boolean fullDebug)

setIncludeLeadingCDataIntoStartElementCallback

public void setIncludeLeadingCDataIntoStartElementCallback(boolean includeLeadingCDataIntoStartElementCallback)

setMakeCopy

public void setMakeCopy(boolean makeCopy)

setTrimContent

public void setTrimContent(boolean trimContent)
Sets whether all content shall be trimmed. If set in conjunction with setZeroLengthIsNull(boolean), data consisting only of whitespace will not be reported to callback handlers.
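
The interaction of the two properties can be sketched as a small function. This is an illustration of the documented behavior under the assumption that trimming happens before the zero-length check; ContentFilter and normalize are hypothetical names, not the importer's code.

```java
/** Hypothetical helper showing how trimming and zero-length handling combine. */
class ContentFilter {
    /** Returns the text to report to handlers, or null if nothing should be reported. */
    static String normalize(String raw, boolean trimContent, boolean zeroLengthIsNull) {
        // Step 1: optionally strip leading and trailing whitespace.
        String text = trimContent ? raw.trim() : raw;
        // Step 2: optionally treat empty content as "no data".
        if (zeroLengthIsNull && text.isEmpty()) {
            return null;
        }
        return text;
    }
}
```

With both flags set, whitespace-only input such as "  \n " becomes empty after trimming and is then dropped. With only zeroLengthIsNull set, the same input is kept unchanged because it is not of zero length.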

setUseQName

public void setUseQName(boolean useQName)
Sets if the path shall be assembled of the fully qualified names. true is the default.

setZeroLengthIsNull

public void setZeroLengthIsNull(boolean zeroLengthIsNull)

Copyright © 2002-2004 Oliver Zeigermann. All Rights Reserved.