au.id.jericho.lib.html

Class ParseText

Implemented Interfaces:
CharSequence

public final class ParseText
extends java.lang.Object
implements CharSequence

Represents the text from the source document that is to be parsed.

This class is normally only of interest to users who wish to create custom tag types.

The parse text is defined as the entire text of the source document in lower case, with all ignored segments replaced by space characters.

The text is stored in lower case to make case-insensitive parsing as efficient as possible.
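
As an illustration of the definition above, the following sketch derives a parse-text-like character array from a source string. It is not the library's implementation: the class ParseTextSketch, the build method and the ignoredRanges parameter are hypothetical, and the comment is treated as an ignored segment purely for the sake of the example.

```java
final class ParseTextSketch {
    /**
     * Builds a parse-text-like char array: lower case, same length as the
     * source, with every ignored range [begin, end) overwritten by spaces.
     * The ignoredRanges parameter is hypothetical, not part of the library.
     */
    static char[] build(String source, int[][] ignoredRanges) {
        char[] text = new char[source.length()];
        for (int i = 0; i < text.length; i++) {
            // Lowering one char at a time never changes the length.
            text[i] = Character.toLowerCase(source.charAt(i));
        }
        for (int[] range : ignoredRanges) {
            for (int i = range[0]; i < range[1]; i++) {
                text[i] = ' ';
            }
        }
        return text;
    }

    public static void main(String[] args) {
        String source = "<P><!--X--></P>";
        // Assumption for the example: the comment at indexes 3..10 is ignored.
        char[] text = build(source, new int[][] {{3, 11}});
        System.out.println("[" + new String(text) + "]");
    }
}
```

In this sketch the result always has the same length as the source, so an index found in the array refers to the same position in the source string.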

This class mirrors many of the methods of the java.lang.String class, and adds an extra parameter called breakAtIndex to the indexOf and lastIndexOf methods. This parameter restricts a search to a specified segment of the text, which the corresponding String methods, taking only a fromIndex, cannot do.
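
The effect of breakAtIndex can be approximated on a plain String. The sketch below is a hypothetical helper, not the ParseText implementation, and it assumes that a match must start before breakAtIndex; the text above does not say whether the break index itself is included in the search.

```java
final class BreakSearchSketch {
    static final int NO_BREAK = -1; // same value as the NO_BREAK field documented below

    /**
     * Forward search over a plain String that gives up at breakAtIndex.
     * Assumption: a match must START before breakAtIndex.
     */
    static int indexOf(String text, String search, int fromIndex, int breakAtIndex) {
        // One past the last index at which a match could start.
        int endOfText = text.length() - search.length() + 1;
        int limit = (breakAtIndex == NO_BREAK || breakAtIndex > endOfText)
                ? endOfText
                : breakAtIndex;
        for (int i = Math.max(fromIndex, 0); i < limit; i++) {
            if (text.startsWith(search, i)) {
                return i;
            }
        }
        return -1;
    }
}
```

For the text "<a href=x><b>", searching for "<b" from index 0 with NO_BREAK finds the match at index 10, while the same search with a breakAtIndex of 10 stops before reaching it and returns -1.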

ParseText instances are obtained using the Source.getParseText() method.

Field Summary

static int
NO_BREAK
A value to use as the breakAtIndex argument in certain methods to indicate that the search should continue to the end of the parse text (for the indexOf methods) or to its start (for the lastIndexOf methods).

Method Summary

char
charAt(int index)
Returns the character at the specified index.
boolean
containsAt(String str, int pos)
Indicates whether this parse text contains the specified string at the specified position.
int
indexOf(String searchString, int fromIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex.
int
indexOf(String searchString, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
indexOf(char searchChar, int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex.
int
indexOf(char searchChar, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
indexOf(char[] searchCharArray, int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex.
int
indexOf(char[] searchCharArray, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
lastIndexOf(String searchString, int fromIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex.
int
lastIndexOf(String searchString, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
lastIndexOf(char searchChar, int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex.
int
lastIndexOf(char searchChar, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
lastIndexOf(char[] searchCharArray, int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex.
int
lastIndexOf(char[] searchCharArray, int fromIndex, int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.
int
length()
Returns the length of the parse text.
CharSequence
subSequence(int beginIndex, int endIndex)
Returns a new character sequence that is a subsequence of this sequence.
String
substring(int beginIndex, int endIndex)
Returns a new string that is a substring of this parse text.
String
toString()
Returns the content of the parse text as a String.

Field Details

NO_BREAK

public static final int NO_BREAK
A value to use as the breakAtIndex argument in certain methods to indicate that the search should continue to the end of the parse text (for the indexOf methods) or to its start (for the lastIndexOf methods).
Field Value:
-1

Method Details

charAt

public char charAt(int index)
Returns the character at the specified index.
Parameters:
index - the index of the character.
Returns:
the character at the specified index, which is always in lower case.

containsAt

public boolean containsAt(String str,
                          int pos)
Indicates whether this parse text contains the specified string at the specified position.

This method is analogous to the java.lang.String.startsWith(String prefix, int toffset) method.

Parameters:
str - the string to look for.
pos - the position (index) in this parse text at which to check for the specified string.
Returns:
true if this parse text contains the specified string at the specified position, otherwise false.
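
The analogy with String.startsWith(String prefix, int toffset) can be tried out on a plain String. The containsAt helper below is a hypothetical stand-in, not the library method; in particular, its handling of out-of-range positions is that of String.startsWith and is not confirmed for ParseText.

```java
final class ContainsAtSketch {
    /** Stand-in for containsAt, built on String.startsWith(prefix, toffset). */
    static boolean containsAt(String text, String str, int pos) {
        return text.startsWith(str, pos);
    }

    public static void main(String[] args) {
        String text = "<div id=main>"; // already lower case, like parse text
        System.out.println(containsAt(text, "div", 1));  // true
        System.out.println(containsAt(text, "DIV", 1));  // false: the comparison is case sensitive
        System.out.println(containsAt(text, "div", 50)); // false: startsWith rejects the position
    }
}
```

Because the characters of the parse text are always in lower case, a search string containing upper-case letters cannot match in a comparison like this one, so callers would normally pass lower-case search strings.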

indexOf

public int indexOf(String searchString,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - the string to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified string within the specified range, or -1 if the string is not found.

indexOf

public int indexOf(String searchString,
                   int fromIndex,
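                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified string, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - the string to search for.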
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified string within the specified range, or -1 if the string is not found.

indexOf

public int indexOf(char searchChar,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - the character to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified character within the specified range, or -1 if the character is not found.

indexOf

public int indexOf(char searchChar,
                   int fromIndex,
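                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - the character to search for.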
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified character within the specified range, or -1 if the character is not found.

indexOf

public int indexOf(char[] searchCharArray,
                   int fromIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - the character array to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the first occurrence of the specified character array within the specified range, or -1 if the character array is not found.

indexOf

public int indexOf(char[] searchCharArray,
                   int fromIndex,
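                   int breakAtIndex)
Returns the index within this parse text of the first occurrence of the specified character array, starting the search at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - the character array to search for.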
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the end of the text.
Returns:
the index within this parse text of the first occurrence of the specified character array within the specified range, or -1 if the character array is not found.

lastIndexOf

public int lastIndexOf(String searchString,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - the string to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified string within the specified range, or -1 if the string is not found.

lastIndexOf

public int lastIndexOf(String searchString,
                       int fromIndex,
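                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified string, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified string is not found then -1 is returned.

Parameters:
searchString - the string to search for.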
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified string within the specified range, or -1 if the string is not found.
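
A backward search with a break index can be sketched on a plain String as follows. The helper is hypothetical and is not the ParseText implementation. It assumes that a match must start at an index greater than breakAtIndex (an exclusive lower bound), which this page does not confirm.

```java
final class BackwardBreakSearchSketch {
    static final int NO_BREAK = -1; // same value as the NO_BREAK field

    /**
     * Backward search over a plain String that gives up at breakAtIndex.
     * Assumption: only matches starting AFTER breakAtIndex are reported.
     */
    static int lastIndexOf(String text, String search, int fromIndex, int breakAtIndex) {
        // A match cannot start later than this index.
        int start = Math.min(fromIndex, text.length() - search.length());
        // With NO_BREAK (-1) the exclusive bound lets the loop reach index 0.
        int lowerBound = Math.max(breakAtIndex, NO_BREAK);
        for (int i = start; i > lowerBound; i--) {
            if (text.startsWith(search, i)) {
                return i;
            }
        }
        return -1;
    }
}
```

Under this assumption, a value of -1 for NO_BREAK needs no special case in a backward search, since every valid start index is greater than -1.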

lastIndexOf

public int lastIndexOf(char searchChar,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - the character to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified character within the specified range, or -1 if the character is not found.

lastIndexOf

public int lastIndexOf(char searchChar,
                       int fromIndex,
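                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified character is not found then -1 is returned.

Parameters:
searchChar - the character to search for.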
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified character within the specified range, or -1 if the character is not found.

lastIndexOf

public int lastIndexOf(char[] searchCharArray,
                       int fromIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - the character array to search for.
fromIndex - the index to start the search from.
Returns:
the index within this parse text of the last occurrence of the specified character array within the specified range, or -1 if the character array is not found.

lastIndexOf

public int lastIndexOf(char[] searchCharArray,
                       int fromIndex,
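                       int breakAtIndex)
Returns the index within this parse text of the last occurrence of the specified character array, searching backwards starting at the position specified by fromIndex, and breaking the search at the index specified by breakAtIndex.

If the specified character array is not found then -1 is returned.

Parameters:
searchCharArray - the character array to search for.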
fromIndex - the index to start the search from.
breakAtIndex - the index at which to break off the search, or NO_BREAK if the search is to continue to the start of the text.
Returns:
the index within this parse text of the last occurrence of the specified character array within the specified range, or -1 if the character array is not found.

length

public int length()
Returns the length of the parse text.
Returns:
the length of the parse text.

subSequence

public CharSequence subSequence(int beginIndex,
                                int endIndex)
Returns a new character sequence that is a subsequence of this sequence.

Parameters:
beginIndex - the begin index, inclusive.
endIndex - the end index, exclusive.
Returns:
a new character sequence that is a subsequence of this sequence.
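
Because ParseText implements CharSequence, it can be passed to any API that accepts that interface, such as java.util.regex.Pattern.matcher(CharSequence). The sketch below shows the idea with a hypothetical stand-in class, ArrayText, rather than a real ParseText instance. Whether the library's subSequence returns a view or a copy is not stated on this page; the stand-in returns a view.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharSequenceSketch {
    /** Hypothetical stand-in for a parse text: a read-only window onto a char array. */
    static final class ArrayText implements CharSequence {
        private final char[] text;
        private final int begin; // inclusive
        private final int end;   // exclusive

        ArrayText(char[] text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }

        @Override public int length() { return end - begin; }

        @Override public char charAt(int index) { return text[begin + index]; }

        @Override public CharSequence subSequence(int beginIndex, int endIndex) {
            // Bounds checks are omitted to keep the sketch short.
            return new ArrayText(text, begin + beginIndex, begin + endIndex);
        }

        @Override public String toString() { return new String(text, begin, end - begin); }
    }

    /** Returns the index of the first '<' followed by a lower-case letter, or -1. */
    static int firstTagStart(CharSequence text) {
        Matcher matcher = Pattern.compile("<[a-z]").matcher(text);
        return matcher.find() ? matcher.start() : -1;
    }
}
```

As with the parameters documented above, beginIndex is inclusive and endIndex is exclusive, so subSequence(2, 5) on this stand-in covers the three characters at indexes 2, 3 and 4.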

substring

public String substring(int beginIndex,
                        int endIndex)
Returns a new string that is a substring of this parse text.

The substring begins at the specified beginIndex and extends to the character at index endIndex - 1. Thus the length of the substring is endIndex - beginIndex.

Parameters:
beginIndex - the begin index, inclusive.
endIndex - the end index, exclusive.
Returns:
a new string that is a substring of this parse text.
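
The index arithmetic follows the same convention as String.substring(int, int), so it can be checked on a plain String. The between helper below is hypothetical and exists only to show how an inclusive begin index and an exclusive end index combine with the result of an indexOf call.

```java
final class SubstringSketch {
    /**
     * Returns the text between the first occurrence of open and the next
     * occurrence of close, or null if either marker is missing.
     */
    static String between(String text, String open, String close) {
        int openAt = text.indexOf(open);
        if (openAt < 0) {
            return null;
        }
        int beginIndex = openAt + open.length();        // inclusive
        int endIndex = text.indexOf(close, beginIndex); // exclusive
        if (endIndex < 0) {
            return null;
        }
        return text.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        // beginIndex is 7 and endIndex is 11, so the result has length 4.
        System.out.println(between("<title>home</title>", "<title>", "</title>"));
    }
}
```

Since endIndex is exclusive, the index returned by a search for the closing marker can be passed straight through without subtracting one.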

toString

public String toString()
Returns the content of the parse text as a String.
Returns:
the content of the parse text as a String.