org.lobobrowser.html.parser

Class HtmlParser


public class HtmlParser
extends java.lang.Object

The HtmlParser class is an HTML DOM parser. It provides the parsing functionality behind the standard DOM parser implementation, DocumentBuilderImpl. It may also be used directly when a different DOM implementation is preferred.

Field Summary

static String
MODIFYING_KEY
A node UserData key used to tell nodes that their content may be about to be modified.

Constructor Summary

HtmlParser(UserAgentContext ucontext, HTMLDocument document)
Constructs a HtmlParser.
HtmlParser(UserAgentContext ucontext, HTMLDocument document, ErrorHandler errorHandler, String publicId, String systemId)
Constructs a HtmlParser.
HtmlParser(HTMLDocument document, ErrorHandler errorHandler, String publicId, String systemId)
Deprecated. A UserAgentContext should be passed to the constructor.

Method Summary

void
parse(InputStream in)
Parses HTML from an input stream, assuming the character set is ISO-8859-1.
void
parse(InputStream in, String charset)
Parses HTML from an input stream, using the given character set.
void
parse(LineNumberReader reader)
void
parse(LineNumberReader reader, Node parent)
This method may be used when the DOM should be built under a given node, for example when innerHTML is set from JavaScript.
void
parse(Reader reader)
Parses HTML given by a Reader.
void
parse(Reader reader, Node parent)
This method may be used when the DOM should be built under a given node, for example when innerHTML is set from JavaScript.

Field Details

MODIFYING_KEY

public static final String MODIFYING_KEY
A node UserData key used to tell nodes that their content may be about to be modified. Elements could use this to temporarily suspend notifications. The value set will be either Boolean.TRUE or Boolean.FALSE.
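The flag described above travels through the standard DOM Level 3 user-data methods (Node.setUserData and Node.getUserData). The following is a minimal sketch of that mechanism using only the JDK's own DOM implementation. The key string and the isModifying helper are hypothetical stand-ins; real code would pass HtmlParser.MODIFYING_KEY, whose actual value is not given here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ModifyingKeySketch {
    // Hypothetical stand-in for HtmlParser.MODIFYING_KEY (real value unknown).
    static final String MODIFYING_KEY_STAND_IN = "example.modifying";

    /** Returns true only when the node is currently flagged with Boolean.TRUE. */
    static boolean isModifying(Node node) {
        return Boolean.TRUE.equals(node.getUserData(MODIFYING_KEY_STAND_IN));
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element div = doc.createElement("div");

        // A parser would set the flag before it starts appending children...
        div.setUserData(MODIFYING_KEY_STAND_IN, Boolean.TRUE, null);
        System.out.println(isModifying(div)); // prints true

        // ...and clear it once the content is complete.
        div.setUserData(MODIFYING_KEY_STAND_IN, Boolean.FALSE, null);
        System.out.println(isModifying(div)); // prints false
    }
}
```

An element implementation could check such a flag before firing change notifications and defer them until the value returns to Boolean.FALSE.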

Constructor Details

HtmlParser

public HtmlParser(UserAgentContext ucontext,
                  HTMLDocument document)
Constructs a HtmlParser.
Parameters:
ucontext - The user agent context.
document - An instance of HTMLDocument.

HtmlParser

public HtmlParser(UserAgentContext ucontext,
                  HTMLDocument document,
                  ErrorHandler errorHandler,
                  String publicId,
                  String systemId)
Constructs a HtmlParser.
Parameters:
ucontext - The user agent context.
document - An instance of HTMLDocument.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.

HtmlParser

public HtmlParser(HTMLDocument document,
                  ErrorHandler errorHandler,
                  String publicId,
                  String systemId)

Deprecated. A UserAgentContext should be passed to the constructor.

Constructs a HtmlParser.
Parameters:
document - An instance of HTMLDocument.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.

Method Details

parse

public void parse(InputStream in)
            throws IOException,
                   SAXException,
                   UnsupportedEncodingException
Parses HTML from an input stream, assuming the character set is ISO-8859-1.
Parameters:
in - The input stream.

parse

public void parse(InputStream in,
                  String charset)
            throws IOException,
                   SAXException,
                   UnsupportedEncodingException
Parses HTML from an input stream, using the given character set.
Parameters:
in - The input stream.
charset - The character set.
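The choice between the two InputStream overloads matters whenever the document is not actually ISO-8859-1. The sketch below uses only JDK classes to show what happens when UTF-8 bytes are decoded with each character set. The decode helper is hypothetical and is not part of HtmlParser; it only mirrors the decoding step any stream-based parser has to perform.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    /**
     * Hypothetical helper: decodes the whole stream with the named charset.
     * InputStreamReader throws UnsupportedEncodingException for unknown names.
     */
    static String decode(InputStream in, String charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "café" encoded as UTF-8: the final letter occupies two bytes.
        byte[] utf8 = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);

        // Decoded with the matching charset, the text round-trips.
        System.out.println(decode(new ByteArrayInputStream(utf8), "UTF-8"));

        // Decoded as ISO-8859-1, each of the two bytes becomes its own character.
        System.out.println(decode(new ByteArrayInputStream(utf8), "ISO-8859-1"));
    }
}
```

When the character set is known (for example from an HTTP Content-Type header), passing it explicitly to parse(InputStream, String) avoids this kind of corruption.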

parse

public void parse(LineNumberReader reader)
            throws IOException,
                   SAXException

parse

public void parse(LineNumberReader reader,
                  Node parent)
            throws IOException,
                   SAXException
This method may be used when the DOM should be built under a given node, for example when innerHTML is set from JavaScript.
Parameters:
reader - A LineNumberReader for the document.
parent - The root node for the parsed DOM.
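The LineNumberReader overloads expect a reader that tracks line numbers. As a rough illustration using only JDK classes, a plain Reader can be adapted as shown below. Both helpers are hypothetical, and whether HtmlParser uses the line count for error reporting is an assumption, not something this page states.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;

class LineNumberSketch {
    /** Hypothetical helper: reuses the reader if it already tracks lines, else wraps it. */
    static LineNumberReader asLineNumberReader(Reader reader) {
        return (reader instanceof LineNumberReader)
                ? (LineNumberReader) reader
                : new LineNumberReader(reader);
    }

    /** Consumes the reader and returns the number of line terminators read. */
    static int countLines(Reader reader) throws IOException {
        LineNumberReader lines = asLineNumberReader(reader);
        while (lines.read() != -1) {
            // Discard characters; the reader updates its line counter itself.
        }
        return lines.getLineNumber();
    }

    public static void main(String[] args) throws IOException {
        Reader html = new StringReader("<p>\nhello\n</p>\n");
        System.out.println(countLines(html)); // prints 3
    }
}
```

A reader prepared this way could then be handed to parse(LineNumberReader) or parse(LineNumberReader, Node).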

parse

public void parse(Reader reader)
            throws IOException,
                   SAXException
Parses HTML given by a Reader. This method appends nodes to the document provided to the parser.
Parameters:
reader - An instance of Reader.

parse

public void parse(Reader reader,
                  Node parent)
            throws IOException,
                   SAXException
This method may be used when the DOM should be built under a given node, for example when innerHTML is set from JavaScript.
Parameters:
reader - A document reader.
parent - The root node for the parsed DOM.
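To make the "build the DOM under a given node" idea concrete, here is a JDK-only analogue of an innerHTML-style update. It deliberately does not use HtmlParser: it swaps in the JDK's XML DocumentBuilder, so it accepts only well-formed markup. The setInnerMarkup helper and the synthetic wrapper element are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InnerMarkupSketch {
    /** Hypothetical helper: replaces the children of parent with the parsed markup. */
    static void setInnerMarkup(Element parent, String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // A synthetic root lets the fragment contain several top-level nodes.
        Document fragment = builder.parse(
                new InputSource(new StringReader("<wrapper>" + markup + "</wrapper>")));

        // Drop the existing content, as an innerHTML assignment would.
        while (parent.hasChildNodes()) {
            parent.removeChild(parent.getFirstChild());
        }

        // Copy each parsed node into the parent's own document.
        Document owner = parent.getOwnerDocument();
        for (Node child = fragment.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            parent.appendChild(owner.importNode(child, true));
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element div = doc.createElement("div");
        setInnerMarkup(div, "<b>hi</b><i>there</i>");
        System.out.println(div.getChildNodes().getLength()); // prints 2
    }
}
```

With HtmlParser, the equivalent step would presumably be a single call to parse(Reader, Node) with the target element as parent, which also tolerates markup that is not well-formed XML.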