java.lang.Object
au.id.jericho.lib.html.Segment
au.id.jericho.lib.html.Element
A normal element is an element whose start tag has a type of StartTagType.NORMAL. This comprises all HTML elements and non-HTML elements.
Element instances are obtained using one of the following methods:
- StartTag.getElement()
- EndTag.getElement()
- Segment.findAllElements()
- Segment.findAllElements(String name)
- Segment.findAllElements(StartTagType)
See also the HTMLElements class, and the XML 1.0 specification for elements.
The following combinations of property values occur:
- getEndTag()==null, isEmpty()==true, getEnd()==getStartTag().getEnd()
- getEndTag()!=null, isEmpty()==false, getEnd()==getEndTag().getEnd()
- getEndTag()==null, isEmpty()==false, getEnd()!=getStartTag().getEnd()
Related terms and references: HTML element, end tag is optional, implicitly terminating tag, start tag, single tag element, element parsing rules for HTML elements with optional end tags, HTMLElements.getEndTagOptionalElementNames()
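The three combinations of getEndTag(), isEmpty() and getEnd() listed above can be told apart from character offsets alone. The following is a minimal sketch under stated assumptions, not the library's implementation: `ElementStructureSketch`, its `Structure` enum and the `classify` method are hypothetical names, and a missing end tag is modeled as the offset -1.

```java
// Hypothetical stand-in for illustration only; it does not use the library's Element class.
final class ElementStructureSketch {
    enum Structure { SINGLE_TAG, END_TAG_PRESENT, IMPLICITLY_TERMINATED }

    // startTagEnd: offset just past the start tag
    // endTagEnd:   offset just past the end tag, or -1 if there is no end tag (assumption)
    // elementEnd:  offset just past the whole element
    static Structure classify(int startTagEnd, int endTagEnd, int elementEnd) {
        if (endTagEnd >= 0) {
            // Corresponds to getEndTag()!=null, getEnd()==getEndTag().getEnd()
            return Structure.END_TAG_PRESENT;
        }
        // No end tag: the element either stops at its own start tag or runs on
        // until some other tag terminates it.
        return elementEnd == startTagEnd
                ? Structure.SINGLE_TAG
                : Structure.IMPLICITLY_TERMINATED;
    }

    public static void main(String[] args) {
        // "<br>": start tag ends at 4, no end tag, element ends at 4
        System.out.println(classify(4, -1, 4));
        // "<p>text</p>": start tag ends at 3, end tag and element end at 11
        System.out.println(classify(3, 11, 11));
        // "<li>one" terminated by a following tag: start tag ends at 4, element ends at 7
        System.out.println(classify(4, -1, 7));
    }
}
```

The offsets in `main` are hand-computed for the quoted snippets; with the real parser they would come from the getEnd() methods of the element and its tags.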
The StartTag.getElement() method is used to construct an element.
The detection of the start tag's matching end tag or other terminating tags always takes into account the possible nesting of elements.
For start tags of type StartTagType.NORMAL: see the isEmptyElementTag() method for more information.
See also: HTMLElements
Fields inherited from interface au.id.jericho.lib.html.HTMLElementName
A, ABBR, ACRONYM, ADDRESS, APPLET, AREA, B, BASE, BASEFONT, BDO, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CAPTION, CENTER, CITE, CODE, COL, COLGROUP, DD, DEL, DFN, DIR, DIV, DL, DT, EM, FIELDSET, FONT, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HR, HTML, I, IFRAME, IMG, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAP, MENU, META, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, P, PARAM, PRE, Q, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, STYLE, SUB, SUP, TABLE, TBODY, TD, TEXTAREA, TFOOT, TH, THEAD, TITLE, TR, TT, U, UL, VAR
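Method Summary
String | getAttributeValue(String attributeName) | Returns the decoded value of the attribute with the specified name (case insensitive).
Attributes | getAttributes() | Returns the attributes specified in this element's start tag.
List | getChildElements() | Returns a list of the immediate children of this element in the document element hierarchy.
Segment | getContent() | Returns the segment representing the content of the element.
String | getContentText() | Deprecated. Use isEmpty() ? null : getContent().toString() instead.
String | getDebugInfo() | Returns a string representation of this object useful for debugging purposes.
int | getDepth() | Returns the nesting depth of this element in the document element hierarchy.
EndTag | getEndTag() | Returns the end tag of the element.
FormControl | getFormControl() | Returns the FormControl defined by this element.
String | getName() | Returns the name of the start tag of this element, always in lower case.
Element | getParentElement() | Returns the parent of this element in the document element hierarchy.
StartTag | getStartTag() | Returns the start tag of the element.
static boolean | isBlock(String elementName) | Deprecated. Use HTMLElements.getBlockLevelElementNames().contains(elementName.toLowerCase()) instead.
boolean | isEmpty() | Indicates whether this element has zero-length content.
boolean | isEmptyElementTag() | Indicates whether this element is an empty-element tag.
static boolean | isInline(String elementName) | Deprecated. Use HTMLElements.getInlineLevelElementNames().contains(elementName.toLowerCase()) instead.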
Methods inherited from class au.id.jericho.lib.html.Segment
charAt, compareTo, encloses, encloses, equals, extractText, extractText, findAllCharacterReferences, findAllComments, findAllElements, findAllElements, findAllElements, findAllStartTags, findAllStartTags, findAllStartTags, findAllTags, findAllTags, findFormControls, findFormFields, findWords, getBegin, getChildElements, getDebugInfo, getEnd, getSourceText, getSourceTextNoWhitespace, hashCode, ignoreWhenParsing, isComment, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
public String getAttributeValue(String attributeName)
Returns the decoded value of the attribute with the specified name (case insensitive). Returns null if the start tag of this element does not have attributes, no attribute with the specified name exists or the attribute has no value. This is equivalent to getStartTag().getAttributeValue(attributeName).
- Parameters:
attributeName - the name of the attribute to get.
- Returns:
- the decoded value of the attribute with the specified name, or null if the attribute does not exist or has no value.
public Attributes getAttributes()
Returns the attributes specified in this element's start tag. This is equivalent to getStartTag().getAttributes().
- Returns:
- the attributes specified in this element's start tag.
- See Also:
StartTag.getAttributes()
public final List getChildElements()
Returns a list of the immediate children of this element in the document element hierarchy. The objects in the list are all of type Element. See the Source.getChildElements() method for more details.
- Overrides:
- getChildElements in class Segment
- Returns:
- a list of the immediate children of this element in the document element hierarchy, guaranteed not null.
- See Also:
getParentElement()
public Segment getContent()
Returns the segment representing the content of the element. This segment spans between the end of the start tag and the start of the end tag. If the end tag is not present, the content reaches to the end of the element. Note that before version 2.0 this method returned null if the element was empty, whereas now a zero-length segment is returned.
- Returns:
- the segment representing the content of the element, guaranteed not null.
public String getContentText()
Deprecated. Use isEmpty() ? null : getContent().toString() instead.
Returns the content text of the element. This method has been deprecated as of version 2.0 as the Segment returned by the getContent() method now implements CharSequence and can be used directly in many cases. Use getContent().toString() if a String is required.
- Returns:
- the content text of the element, or null if the element is empty.
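The suggested replacement for getContentText() can be sketched without the library by treating the content as a plain CharSequence. This is only an illustration: `ContentTextSketch` and `contentText` are hypothetical names, and the `content` parameter stands in for the value returned by getContent().

```java
// Hypothetical stand-in for illustration only; "content" plays the role of getContent().
final class ContentTextSketch {
    // Mirrors the expression: isEmpty() ? null : getContent().toString()
    // where isEmpty() is described as equivalent to getContent().length()==0.
    static String contentText(CharSequence content) {
        return content.length() == 0 ? null : content.toString();
    }

    public static void main(String[] args) {
        // Any CharSequence works here, not only String.
        CharSequence paragraphContent = new StringBuilder("This is a sample paragraph.");
        System.out.println(contentText(paragraphContent));
        System.out.println(contentText(""));
    }
}
```

Code that only reads characters can usually take the CharSequence directly and skip the conversion; the null result is needed only where callers relied on the old behavior for empty elements.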
public String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
- Overrides:
- getDebugInfo in class Segment
- Returns:
- a string representation of this object useful for debugging purposes.
public int getDepth()
Returns the nesting depth of this element in the document element hierarchy. The Source.fullSequentialParse() method should be called after construction of the Source object if this method is to be used. A top-level element has a nesting depth of 0. An element formed from a server tag always has a nesting depth of 0, regardless of whether it is nested inside a normal element. See the Source.getChildElements() method for more details.
- Returns:
- the nesting depth of this element in the document element hierarchy.
- See Also:
getParentElement()
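One way to read the relationship between nesting depth and parent elements is that the depth equals the number of ancestors. The sketch below illustrates that reading only; it is not how the library computes the value, and `DepthSketch` with its `parent` field and `depth()` method are hypothetical names.

```java
// Hypothetical stand-in for illustration only; it does not use the library's Element class.
final class DepthSketch {
    final DepthSketch parent; // null for a top-level element

    DepthSketch(DepthSketch parent) {
        this.parent = parent;
    }

    // Counts ancestors, so a top-level element (parent == null) has depth 0.
    int depth() {
        int depth = 0;
        for (DepthSketch ancestor = parent; ancestor != null; ancestor = ancestor.parent) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        DepthSketch html = new DepthSketch(null);   // top-level
        DepthSketch body = new DepthSketch(html);   // child of html
        DepthSketch para = new DepthSketch(body);   // grandchild of html
        System.out.println(html.depth() + " " + body.depth() + " " + para.depth());
    }
}
```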
public EndTag getEndTag()
Returns the end tag of the element. If the element has no end tag this method returns null.
- Returns:
- the end tag of the element, or null if the element has no end tag.
public FormControl getFormControl()
Returns the FormControl defined by this element.
- Returns:
- the FormControl defined by this element, or null if it is not a control.
public String getName()
Returns the name of the start tag of this element, always in lower case. This is equivalent to getStartTag().getName(). See the Tag.getName() method for more information.
- Returns:
- the name of the start tag of this element, always in lower case.
public Element getParentElement()
Returns the parent of this element in the document element hierarchy. The Source.fullSequentialParse() method should be called after construction of the Source object if this method is to be used. This method returns null for a top-level element, as well as any element formed from a server tag, regardless of whether it is nested inside a normal element. See the Source.getChildElements() method for more details.
- Returns:
- the parent of this element in the document element hierarchy, or null if this element is a top-level element.
- See Also:
getChildElements()
public StartTag getStartTag()
Returns the start tag of the element.
- Returns:
- the start tag of the element.
public static boolean isBlock(String elementName)
Deprecated. Use HTMLElements.getBlockLevelElementNames().contains(elementName.toLowerCase()) instead.
Indicates whether the specified element name is an HTML block-level element. This method has been deprecated as of version 2.0 as the HTMLElements.getBlockLevelElementNames() method now provides a complete set of the element names for which this method returns true.
- Parameters:
elementName - an element name.
- Returns:
- true if the specified element name is an HTML block-level element, otherwise false.
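The replacement idiom amounts to a lower-cased lookup in a set of element names. The following sketch shows the shape of that lookup without the library: `BlockLevelLookupSketch` and `BLOCK_LEVEL_NAMES` are hypothetical, and the set holds only a small illustrative subset rather than the names returned by HTMLElements.getBlockLevelElementNames().

```java
import java.util.Set;

// Hypothetical stand-in for illustration only.
final class BlockLevelLookupSketch {
    // Assumption: a partial, hand-picked subset of block-level element names, in lower case.
    static final Set<String> BLOCK_LEVEL_NAMES = Set.of("div", "p", "table", "ul");

    // Same shape as: HTMLElements.getBlockLevelElementNames().contains(elementName.toLowerCase())
    static boolean isBlock(String elementName) {
        return BLOCK_LEVEL_NAMES.contains(elementName.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(isBlock("DIV"));
        System.out.println(isBlock("span"));
    }
}
```

Lower-casing the argument before the lookup is what makes the check case insensitive. The same pattern applies to the inline-level check with HTMLElements.getInlineLevelElementNames().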
public boolean isEmpty()
Indicates whether this element has zero-length content. This is equivalent to getContent().length()==0. Note that this is a broader definition than that of both the HTML definition of an empty element, which is only those elements whose end tag is forbidden, and the XML definition of an empty element, which is "either a start-tag immediately followed by an end-tag, or an empty-element tag". The other possibility covered by this property is the case of an HTML element with an optional end tag that is immediately followed by another tag that implicitly terminates the element.
- Returns:
- true if this element has zero-length content, otherwise false.
- See Also:
isEmptyElementTag()
public boolean isEmptyElementTag()
Indicates whether this element is an empty-element tag. It is signified by an empty element with the characters "/>" at the end of the start tag. This is equivalent to isEmpty() && getStartTag().isEmptyElementTag(). The StartTag.isEmptyElementTag() property only checks whether the start tag is syntactically an empty-element tag, whereas this property also makes sure the element is in fact empty. A syntactical empty-element tag that is not actually empty can occur if the end tag of an HTML element is either required or optional, but the start tag is erroneously terminated with the characters "/>" in the source document. All major browsers ignore the syntactical hint of an empty element in this case, even in an XHTML document, so this parser does the same.
- Returns:
- true if this element is an empty-element tag, otherwise false.
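The difference between the syntactic check on the start tag and the element-level check can be illustrated with plain strings. This is a rough sketch under stated assumptions: `EmptyElementTagSketch` and its method are hypothetical, the start tag is passed as source text, and the content length is supplied by the caller instead of being parsed.

```java
// Hypothetical stand-in for illustration only; it does not parse HTML.
final class EmptyElementTagSketch {
    // startTagSource: the text of the start tag, for example "<br />"
    // contentLength:  the length of the element's content
    static boolean isEmptyElementTag(String startTagSource, int contentLength) {
        boolean syntacticEmptyElementTag = startTagSource.endsWith("/>");
        boolean empty = contentLength == 0;
        // Mirrors: isEmpty() && getStartTag().isEmptyElementTag()
        return empty && syntacticEmptyElementTag;
    }

    public static void main(String[] args) {
        // "<br />" with no content: both conditions hold.
        System.out.println(isEmptyElementTag("<br />", 0));
        // "<p/>hello</p>": the start tag ends with "/>" but the element has content.
        System.out.println(isEmptyElementTag("<p/>", 5));
        // "<br>": empty, but not written as an empty-element tag.
        System.out.println(isEmptyElementTag("<br>", 0));
    }
}
```

The second call corresponds to the erroneous "/>" case described above, where the syntactic hint is ignored because the element is not actually empty.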
public static boolean isInline(String elementName)
Deprecated. Use HTMLElements.getInlineLevelElementNames().contains(elementName.toLowerCase()) instead.
Indicates whether the specified element name is an HTML inline-level element. This method has been deprecated as of version 2.0 as the HTMLElements.getInlineLevelElementNames() method now provides a complete set of the element names for which this method returns true.
- Parameters:
elementName - an element name.
- Returns:
- true if the specified element name is an HTML inline-level element, otherwise false.