au.id.jericho.lib.html

Class EndTag

Implemented Interfaces:
CharSequence, Comparable, HTMLElementName

public final class EndTag
extends Tag

Represents the end tag of an element in a specific source document.

An end tag always has a type that is a subclass of EndTagType, meaning it always starts with the characters '</'.

EndTag instances are obtained using one of the following methods:

The Tag superclass defines the getName() method used to get the name of this end tag.

See also the XML 1.0 specification for end tags.

See Also:
Tag, StartTag, Element

Field Summary

Fields inherited from class au.id.jericho.lib.html.Tag

DOCTYPE_DECLARATION, PROCESSING_INSTRUCTION, SERVER_COMMON, SERVER_MASON_COMPONENT_CALL, SERVER_MASON_COMPONENT_CALLED_WITH_CONTENT, SERVER_MASON_NAMED_BLOCK, SERVER_PHP, XML_DECLARATION

Fields inherited from interface au.id.jericho.lib.html.HTMLElementName

A, ABBR, ACRONYM, ADDRESS, APPLET, AREA, B, BASE, BASEFONT, BDO, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CAPTION, CENTER, CITE, CODE, COL, COLGROUP, DD, DEL, DFN, DIR, DIV, DL, DT, EM, FIELDSET, FONT, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HR, HTML, I, IFRAME, IMG, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAP, MENU, META, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, P, PARAM, PRE, Q, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, STYLE, SUB, SUP, TABLE, TBODY, TD, TEXTAREA, TFOOT, TH, THEAD, TITLE, TR, TT, U, UL, VAR

Method Summary

static String
generateHTML(String tagName)
Generates the HTML text of a normal end tag with the specified tag name.
String
getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
Element
getElement()
Returns the element that is ended by this end tag.
EndTagType
getEndTagType()
Returns the type of this end tag.
TagType
getTagType()
Returns the type of this tag.
static boolean
isForbidden(String name)
Deprecated. Use HTMLElements.getEndTagForbiddenElementNames().contains(name.toLowerCase()) instead.
static boolean
isOptional(String name)
Deprecated. Use HTMLElements.getEndTagOptionalElementNames().contains(name.toLowerCase()) instead.
static boolean
isRequired(String name)
Deprecated. Use HTMLElements.getEndTagRequiredElementNames().contains(name.toLowerCase()) instead.
boolean
isUnregistered()
Indicates whether this tag has a syntax that does not match any of the registered tag types.
String
regenerateHTML()
Deprecated. Use tidy() instead.
String
tidy()
Returns an XML representation of this end tag.

Methods inherited from class au.id.jericho.lib.html.Tag

findNextTag, findPreviousTag, getElement, getName, getNameSegment, getTagType, getUserData, isUnregistered, isXMLName, isXMLNameChar, isXMLNameStartChar, regenerateHTML, setUserData, tidy

Methods inherited from class au.id.jericho.lib.html.Segment

charAt, compareTo, encloses, encloses, equals, extractText, extractText, findAllCharacterReferences, findAllComments, findAllElements, findAllElements, findAllElements, findAllStartTags, findAllStartTags, findAllStartTags, findAllTags, findAllTags, findFormControls, findFormFields, findWords, getBegin, getChildElements, getDebugInfo, getEnd, getSourceText, getSourceTextNoWhitespace, hashCode, ignoreWhenParsing, isComment, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString

Method Details

generateHTML

public static String generateHTML(String tagName)
Generates the HTML text of a normal end tag with the specified tag name.

Example:

The following method call:

EndTag.generateHTML("INPUT")

returns the following output:

</INPUT>

Parameters:
tagName - the name of the end tag.
Returns:
the HTML text of a normal end tag with the specified tag name.
See Also:
StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
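
The documented behaviour can be illustrated with a small self-contained sketch. The class EndTagSketch below is a hypothetical stand-in that only mimics the output shown in the example above; it is not the library's implementation, and real code would call EndTag.generateHTML(String) directly.

```java
// Hypothetical stand-in that mimics the documented output of
// EndTag.generateHTML(String); it is not the library's own code.
final class EndTagSketch {

    // Wraps the given name in the "</" and ">" delimiters of a normal end tag.
    // The name is emitted exactly as supplied, matching the example above.
    static String generateHTML(String tagName) {
        return "</" + tagName + ">";
    }

    public static void main(String[] args) {
        System.out.println(generateHTML("INPUT")); // prints </INPUT>
    }
}
```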

getDebugInfo

public String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
Overrides:
getDebugInfo in class Segment
Returns:
a string representation of this object useful for debugging purposes.

getElement

public Element getElement()
Returns the element that is ended by this end tag.

Returns null if this end tag is not properly matched to any start tag in the source document.

This method is much less efficient than the StartTag.getElement() method.

IMPLEMENTATION NOTE: This method is relatively inefficient because more than one start tag type can have the same corresponding end tag type, so it is not possible to know for certain which type of start tag this end tag is matched to (see EndTagType.getCorrespondingStartTagType() for more explanation). Because of this uncertainty, the implementation must check every start tag preceding this end tag, calling its StartTag.getElement() method to see whether it is terminated by this end tag.

Overrides:
getElement in class Tag
Returns:
the element that is ended by this end tag.
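
The cost described in the implementation note can be pictured with a simplified model. Everything in the sketch below (EndTagLookupSketch, StartTagModel, findOwner) is hypothetical and exists only for illustration; it assumes each start tag already knows where its element ends, and it does not reproduce the library's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model; none of these names exist in the library.
final class EndTagLookupSketch {

    // A start tag that knows where its element ends (-1 if never terminated).
    static final class StartTagModel {
        final String name;
        final int begin;
        final int elementEnd;

        StartTagModel(String name, int begin, int elementEnd) {
            this.name = name;
            this.begin = begin;
            this.elementEnd = elementEnd;
        }
    }

    // Scans the start tags (ordered by position) that precede the end tag and
    // returns the one whose element ends where the end tag ends, else null.
    static StartTagModel findOwner(List<StartTagModel> startTags, int endTagBegin, int endTagEnd) {
        for (StartTagModel startTag : startTags) {
            if (startTag.begin >= endTagBegin) {
                break; // later tags cannot be terminated by this end tag
            }
            if (startTag.elementEnd == endTagEnd) {
                return startTag;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<StartTagModel> tags = Arrays.asList(
            new StartTagModel("div", 0, 30),
            new StartTagModel("p", 5, 20));
        StartTagModel owner = findOwner(tags, 16, 20);
        System.out.println(owner == null ? "unmatched" : owner.name); // prints p
    }
}
```

The scan is linear in the number of preceding start tags, which is why starting from the start tag is the cheaper direction when both are available.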

getEndTagType

public EndTagType getEndTagType()
Returns the type of this end tag.

This is equivalent to (EndTagType)getTagType().

Returns:
the type of this end tag.

getTagType

public TagType getTagType()
Returns the type of this tag.
Overrides:
getTagType in class Tag
Returns:
the type of this tag.

isForbidden

public static boolean isForbidden(String name)

Deprecated. Use HTMLElements.getEndTagForbiddenElementNames().contains(name.toLowerCase()) instead.

Indicates whether the end tag of an HTML element with the specified name is forbidden.

This method has been deprecated as of version 2.0 and replaced with the HTMLElements.getEndTagForbiddenElementNames() method.

Returns:
true if the end tag of an HTML element with the specified name is forbidden, otherwise false.
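
The replacement expression is a lower-cased lookup in a set of element names. The sketch below shows that pattern in isolation; the FORBIDDEN set is a hand-written assumption based on the HTML 4.01 elements whose end tag is forbidden, standing in for whatever HTMLElements.getEndTagForbiddenElementNames() actually returns, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the set-based replacement of isForbidden.
final class EndTagForbiddenSketch {

    // Assumption: lower-case names of HTML 4.01 elements with a forbidden end tag.
    static final Set<String> FORBIDDEN = new HashSet<>(Arrays.asList(
        "area", "base", "basefont", "br", "col", "frame", "hr",
        "img", "input", "isindex", "link", "meta", "param"));

    // Lower-cases the name before the lookup, as the replacement expression does.
    // Locale.ROOT is used here to keep the result independent of the default locale.
    static boolean isForbidden(String name) {
        return FORBIDDEN.contains(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isForbidden("BR"));  // prints true
        System.out.println(isForbidden("div")); // prints false
    }
}
```

The same pattern applies to the optional and required variants, each with its own set of names.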

isOptional

public static boolean isOptional(String name)

Deprecated. Use HTMLElements.getEndTagOptionalElementNames().contains(name.toLowerCase()) instead.

Indicates whether the end tag of an HTML element with the specified name is optional.

This method has been deprecated as of version 2.0 and replaced with the HTMLElements.getEndTagOptionalElementNames() method.

Returns:
true if the end tag of an HTML element with the specified name is optional, otherwise false.

isRequired

public static boolean isRequired(String name)

Deprecated. Use HTMLElements.getEndTagRequiredElementNames().contains(name.toLowerCase()) instead.

Indicates whether the end tag of an HTML element with the specified name is required.

This method has been deprecated as of version 2.0 and replaced with the HTMLElements.getEndTagRequiredElementNames() method.

Returns:
true if the end tag of an HTML element with the specified name is required, otherwise false.

isUnregistered

public boolean isUnregistered()
Indicates whether this tag has a syntax that does not match any of the registered tag types.

The only requirement of an unregistered tag type is that it starts with '<' and there is a closing '>' character at some position after it in the source document.

The absence or presence of a '/' character after the initial '<' determines whether an unregistered tag is respectively a StartTag with a type of StartTagType.UNREGISTERED or an EndTag with a type of EndTagType.UNREGISTERED.

There are no restrictions on the characters that might appear between these delimiters, including other '<' characters. This may result in a '>' character that is identified as the closing delimiter of two separate tags, one an unregistered tag, and the other a tag of any type that begins in the middle of the unregistered tag. As explained below, unregistered tags are usually only found when specifically looking for them, so it is up to the user to detect and deal with any such nonsensical results.

Unregistered tags are returned only by the Source.getTagAt(int pos) method; by named search methods, where the specified name matches the first characters inside the tag; and by tag type search methods, where the specified tagType is either StartTagType.UNREGISTERED or EndTagType.UNREGISTERED.

Open tag searches and other searches always ignore unregistered tags, although every discovery of an unregistered tag is logged by the parser.

The logic behind this design is that unregistered tag types are usually the result of a '<' character in the text that was mistakenly left unencoded, or a less-than operator inside a script, or some other occurrence that is of no interest to the user. By returning unregistered tags in named and tag type search methods, the library allows the user to search specifically for tags with a syntax that does not match any existing TagType. This convenience avoids the need for the user to create a custom tag type to define the syntax before searching for these tags. By not returning unregistered tags in the less specific search methods, the library provides only the information that most users are interested in.

Overrides:
isUnregistered in class Tag
Returns:
true if this tag has a syntax that does not match any of the registered tag types, otherwise false.
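
The syntax rules for unregistered tags described above can be sketched as a small classifier. The class UnregisteredTagSketch and its Kind enum are hypothetical names introduced for illustration; the sketch models only the stated rules (a leading '<', a later '>', and the '/' test) and ignores the registered tag types that the real parser would try first.

```java
// Hypothetical model of the unregistered tag syntax rules; not library code.
final class UnregisteredTagSketch {

    enum Kind { START, END, NONE }

    // Classifies the text at pos using only the rules stated above:
    // it must begin with '<' and a '>' must appear somewhere after it.
    static Kind classify(String source, int pos) {
        if (pos < 0 || pos >= source.length() || source.charAt(pos) != '<') {
            return Kind.NONE;
        }
        if (source.indexOf('>', pos + 1) == -1) {
            return Kind.NONE; // no closing delimiter anywhere after the '<'
        }
        // A '/' directly after '<' makes it an end tag, otherwise a start tag.
        boolean hasSlash = pos + 1 < source.length() && source.charAt(pos + 1) == '/';
        return hasSlash ? Kind.END : Kind.START;
    }

    public static void main(String[] args) {
        System.out.println(classify("a </%x> b", 2));            // prints END
        System.out.println(classify("if (a < b) { x > 1 }", 6)); // prints START
        System.out.println(classify("a < b", 2));                // prints NONE
    }
}
```

The second call shows the kind of accidental match the text warns about: a less-than operator followed much later by an unrelated '>' still satisfies the unregistered syntax.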

regenerateHTML

public String regenerateHTML()

Deprecated. Use tidy() instead.

Regenerates the HTML text of this end tag.

This method has been deprecated as of version 2.2 and replaced with the exactly equivalent tidy() method.

Overrides:
regenerateHTML in class Tag
Returns:
the regenerated HTML text of this end tag.

tidy

public String tidy()
Returns an XML representation of this end tag.
Overrides:
tidy in class Tag
Returns:
an XML representation of this end tag.