org.apache.lucene.benchmark.byTask.feeds
Class DemoHTMLParser
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.DemoHTMLParser
- All Implemented Interfaces:
- HTMLParser
public class DemoHTMLParser
- extends java.lang.Object
- implements HTMLParser
HTML Parser that is based on Lucene's demo HTML parser.
Method Summary
DocData parse(java.lang.String name, java.util.Date date, java.io.Reader reader, java.text.DateFormat dateFormat)
- Parse the input Reader and return DocData.
DocData parse(java.lang.String name, java.util.Date date, java.lang.StringBuffer inputText, java.text.DateFormat dateFormat)
- Parse the inputText and return DocData.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DemoHTMLParser
public DemoHTMLParser()
parse
public DocData parse(java.lang.String name,
java.util.Date date,
java.io.Reader reader,
java.text.DateFormat dateFormat)
throws java.io.IOException,
java.lang.InterruptedException
- Description copied from interface:
HTMLParser
- Parse the input Reader and return DocData.
If a name or date is provided, it is used for the result; otherwise an
attempt is made to set it from the parsed data.
- Specified by:
parse
in interface HTMLParser
- Parameters:
name - name of the result doc data. If null, attempt to set by parsed data.
date - date of the result doc data. If null, attempt to set by parsed data.
reader - reader of the HTML text to parse.
dateFormat - date formatter to use for extracting the date.
- Returns:
- Parsed doc data.
- Throws:
java.io.IOException
java.lang.InterruptedException
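The fallback rule for name and date described above can be illustrated with a small stand-alone sketch. It does not use or reproduce Lucene's implementation: the class NameDateFallbackSketch, its Result holder, and the parsedTitle and parsedDateText arguments are hypothetical stand-ins for whatever the parser extracts from the HTML. Dropping an unparseable date is likewise an assumption, since this page does not say how that case is handled.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical stand-in for the documented fallback rule; this is not Lucene code. */
final class NameDateFallbackSketch {

    /** Hypothetical holder for the two fields the contract talks about. */
    static final class Result {
        final String name;
        final Date date;

        Result(String name, Date date) {
            this.name = name;
            this.date = date;
        }
    }

    /**
     * Prefers the caller-supplied name and date. A null value falls back to
     * what was found in the text (parsedTitle, parsedDateText), if anything.
     */
    static Result resolve(String name, Date date,
                          String parsedTitle, String parsedDateText,
                          DateFormat dateFormat) {
        String resultName = (name != null) ? name : parsedTitle;
        Date resultDate = date;
        if (resultDate == null && parsedDateText != null && dateFormat != null) {
            try {
                resultDate = dateFormat.parse(parsedDateText);
            } catch (ParseException e) {
                // Assumption: an unparseable date simply leaves the field unset.
                resultDate = null;
            }
        }
        return new Result(resultName, resultDate);
    }

    public static void main(String[] args) {
        DateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        // Neither name nor date supplied, so both come from the "parsed" values.
        Result r = resolve(null, null, "Parsed Title", "2009-01-02", format);
        System.out.println(r.name + " / " + format.format(r.date));
    }
}
```

Passing a non-null name or date to resolve keeps that value and ignores the parsed counterpart, which mirrors the contract stated for parse.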
parse
public DocData parse(java.lang.String name,
java.util.Date date,
java.lang.StringBuffer inputText,
java.text.DateFormat dateFormat)
throws java.io.IOException,
java.lang.InterruptedException
- Description copied from interface:
HTMLParser
- Parse the inputText and return DocData.
- Specified by:
parse
in interface HTMLParser
- Parameters:
inputText - the HTML text to parse.
- Throws:
java.io.IOException
java.lang.InterruptedException
- See Also:
HTMLParser.parse(String, Date, Reader, DateFormat)
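Since this overload takes the same arguments as the Reader variant apart from the input, text held in a StringBuffer can also be handed to the Reader variant by wrapping it in a java.io.StringReader. The sketch below shows only that wrapping step, using the standard library. The class BufferToReaderSketch and its helpers are hypothetical, and whether DemoHTMLParser performs the same wrapping internally is an assumption this page does not confirm.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper showing how StringBuffer input can be exposed as a Reader. */
final class BufferToReaderSketch {

    /** Wraps a snapshot of the buffer's current contents in a Reader. */
    static Reader asReader(StringBuffer inputText) {
        return new StringReader(inputText.toString());
    }

    /** Drains a Reader into a String so the wrapped contents can be inspected. */
    static String readAll(Reader reader) {
        StringBuilder out = new StringBuilder();
        char[] chunk = new char[1024];
        try {
            int n;
            while ((n = reader.read(chunk)) != -1) {
                out.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        StringBuffer html = new StringBuffer("<html><title>Demo</title></html>");
        System.out.println(readAll(asReader(html)));
    }
}
```

With the Lucene benchmark classes on the classpath, the Reader returned by asReader could be passed as the reader argument of parse(String, Date, Reader, DateFormat).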
Copyright © 2000-2009 Apache Software Foundation. All Rights Reserved.