au.id.jericho.lib.html

Class Attributes

Implemented Interfaces:
CharSequence, Comparable, List

public final class Attributes
extends au.id.jericho.lib.html.nodoc.SequentialListSegment

Represents the list of Attribute objects present within a particular StartTag.

This segment starts at the end of the start tag's name and ends at the end of the last attribute.

The attributes in this list are a representation of those found in the source document and are not modifiable. The AttributesOutputSegment class provides the means to add, delete or modify attributes and their values for inclusion in an OutputDocument.

If too many syntax errors are encountered while parsing a start tag's attributes, the parser rejects the entire start tag and generates a log entry. The threshold for the number of errors allowed can be set using the setDefaultMaxErrorCount(int) static method.

Instances are obtained using the StartTag.getAttributes() method, or explicitly using the Source.parseAttributes(int pos, int maxEnd) method.

It is common for instances of this class to contain no attributes.

See also the XML 1.0 specification for attributes.

Note that before version 2.0 the segment ended just before the tag's closing delimiter instead of at the end of the last attribute.

See Also:
StartTag, Attribute

Method Summary

static String
generateHTML(Map attributesMap)
Returns the contents of the specified attributes map as HTML attribute name/value pairs.
Attribute
get(String name)
Returns the Attribute with the specified name (case insensitive).
int
getCount()
Returns the number of attributes.
String
getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
static int
getDefaultMaxErrorCount()
Returns the default maximum error count allowed when parsing attributes.
List
getList()
Deprecated. Use the Attributes object itself instead.
String
getValue(String name)
Returns the decoded value of the attribute with the specified name (case insensitive).
Iterator
iterator()
Returns an iterator over the Attribute objects in this list in order of appearance.
ListIterator
listIterator(int index)
Returns a list iterator of the Attribute objects in this list in order of appearance, starting at the specified position in the list.
Map
populateMap(Map attributesMap, boolean convertNamesToLowerCase)
Populates the specified Map with the name/value pairs from these attributes.
static void
setDefaultMaxErrorCount(int value)
Sets the default maximum error count allowed when parsing attributes.

Methods inherited from class au.id.jericho.lib.html.nodoc.SequentialListSegment

add, add, addAll, addAll, clear, contains, containsAll, get, getCount, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray

Methods inherited from class au.id.jericho.lib.html.Segment

charAt, compareTo, encloses, encloses, equals, extractText, extractText, findAllCharacterReferences, findAllComments, findAllElements, findAllElements, findAllElements, findAllStartTags, findAllStartTags, findAllStartTags, findAllTags, findAllTags, findFormControls, findFormFields, findWords, getBegin, getChildElements, getDebugInfo, getEnd, getSourceText, getSourceTextNoWhitespace, hashCode, ignoreWhenParsing, isComment, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString

Method Details

generateHTML

public static String generateHTML(Map attributesMap)
Returns the contents of the specified attributes map as HTML attribute name/value pairs.

Each attribute (including the first) is preceded by a single space, and all values are encoded and enclosed in double quotes.

The map keys must be of type String and values must be objects that implement the CharSequence interface.

A null value represents an attribute with no value.

Parameters:
attributesMap - a map containing attribute name/value pairs.
Returns:
the contents of the specified attributes map as HTML attribute name/value pairs.
See Also:
StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
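The output format described above can be illustrated with a small stand-alone sketch. It does not call the library: the class GenerateHtmlSketch and its encode helper are hypothetical, and the encoder handles only four characters, so the library's actual encoding rules may well differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GenerateHtmlSketch {
    // Hypothetical minimal encoder (assumption): escapes only & < > and ".
    static String encode(CharSequence value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Mirrors the documented format: a single space before every attribute
    // (including the first), values encoded and double-quoted, and a null
    // value rendered as a bare attribute name.
    static String generateHTML(Map<String, ? extends CharSequence> attributesMap) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, ? extends CharSequence> e : attributesMap.entrySet()) {
            sb.append(' ').append(e.getKey());
            CharSequence value = e.getValue();
            if (value != null) {
                sb.append("=\"").append(encode(value)).append('"');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("href", "a.html?x=1&y=2");
        map.put("disabled", null);
        // Prints:  href="a.html?x=1&amp;y=2" disabled  (with a leading space)
        System.out.println(generateHTML(map));
    }
}
```

A LinkedHashMap is used here only so that the output order is predictable; the documentation above does not state how the library orders the entries.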

get

public Attribute get(String name)
Returns the Attribute with the specified name (case insensitive).

If more than one attribute exists with the specified name (which is illegal HTML), the first is returned.

Parameters:
name - the name of the attribute to get.
Returns:
the attribute with the specified name, or null if no attribute with the specified name exists.
See Also:
getValue(String name)
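The lookup rules above (case-insensitive match, first match wins, null when nothing matches) can be sketched without the library. The Attr class below is a hypothetical stand-in for Attribute, and the linear search is an assumption made for illustration, not a description of the library's internals.

```java
import java.util.Arrays;
import java.util.List;

public class AttributeLookupSketch {
    // Hypothetical stand-in for Attribute: a name and a possibly-null value.
    static final class Attr {
        final String name;
        final String value;
        Attr(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Returns the first attribute whose name matches, ignoring case,
    // or null if no attribute with that name exists.
    static Attr get(List<Attr> attributes, String name) {
        for (Attr a : attributes) {
            if (a.name.equalsIgnoreCase(name)) {
                return a;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Attr> attributes = Arrays.asList(
                new Attr("CLASS", "first"),
                new Attr("class", "second"));
        System.out.println(get(attributes, "class").value); // first
        System.out.println(get(attributes, "id"));          // null
    }
}
```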

getCount

public int getCount()
Returns the number of attributes.

This is equivalent to calling the size() method specified in the List interface.

Overrides:
getCount in class au.id.jericho.lib.html.nodoc.SequentialListSegment
Returns:
the number of attributes.

getDebugInfo

public String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
Overrides:
getDebugInfo in class Segment
Returns:
a string representation of this object useful for debugging purposes.

getDefaultMaxErrorCount

public static int getDefaultMaxErrorCount()
Returns the default maximum error count allowed when parsing attributes.

The system default value is 2.

When searching for start tags, the parser can find the end of the start tag only by parsing the attributes, as it is valid HTML for attribute values to contain '>' characters (see the HTML 4.01 specification section 5.3.2).

If the source text being parsed does not follow the syntax of an attribute list at all, the parser assumes that the text which was originally identified as the beginning of a start tag is in fact some other text, such as an invalid '<' character in the middle of some text, or part of a script element. In this case the entire start tag is rejected.

On the other hand, it is quite common for attributes to contain minor syntactical errors, such as an invalid character in an attribute name, or a couple of special characters in server tags that otherwise contain only attributes. For this reason the parser allows a certain number of minor errors to occur while parsing an attribute list before the entire start tag or attribute list is rejected. This property indicates the number of minor errors allowed.

Major syntactical errors cause the start tag or attribute list to be rejected immediately, regardless of the maximum error count setting.

Some errors are considered too minor to count at all (ignorable), such as missing whitespace between the end of a quoted attribute value and the start of the next attribute name.

The classification of particular syntax errors in attribute lists into major, minor, and ignorable is not part of the specification and may change in future versions.

To track errors as they occur, use the Source.setLogWriter(Writer writer) method to set the destination of the error log.

The value of this property is set using the setDefaultMaxErrorCount(int) method.

Returns:
the default maximum error count allowed when parsing attributes.
See Also:
Source.parseAttributes(int pos, int maxEnd, int maxErrorCount)
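The three error classes described above can be modelled with a short stand-alone sketch. Everything in it is hypothetical: the Severity enum, the accepts method, and in particular the assumption that rejection happens once the number of minor errors exceeds the maximum. It only illustrates the policy and does not reproduce the parser's actual classification.

```java
import java.util.Arrays;
import java.util.List;

public class ErrorCountSketch {
    // Hypothetical classification of attribute syntax errors.
    enum Severity { MAJOR, MINOR, IGNORABLE }

    // Returns true if an attribute list with the given errors would be accepted.
    // Assumption: a major error rejects immediately, minor errors are counted
    // against maxErrorCount, and ignorable errors are never counted.
    static boolean accepts(List<Severity> errors, int maxErrorCount) {
        int minorCount = 0;
        for (Severity s : errors) {
            if (s == Severity.MAJOR) {
                return false;
            }
            if (s == Severity.MINOR && ++minorCount > maxErrorCount) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int max = 2; // the documented system default
        System.out.println(accepts(Arrays.asList(
                Severity.MINOR, Severity.IGNORABLE, Severity.MINOR), max)); // true
        System.out.println(accepts(Arrays.asList(
                Severity.MINOR, Severity.MINOR, Severity.MINOR), max));     // false
        System.out.println(accepts(Arrays.asList(Severity.MAJOR), 100));     // false
    }
}
```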

getList

public List getList()

Deprecated. Use the Attributes object itself instead.

Returns this instance.

This method has been deprecated as of version 2.0 as the Attributes class now implements the List interface, so the instance itself can be used instead.

Returns:
this instance.

getValue

public String getValue(String name)
Returns the decoded value of the attribute with the specified name (case insensitive).

Returns null if no attribute with the specified name exists or the attribute has no value.

This is equivalent to get(name).getValue(), except that it returns null if no attribute with the specified name exists instead of throwing a NullPointerException.

Parameters:
name - the name of the attribute to get.
Returns:
the decoded value of the attribute with the specified name, or null if the attribute does not exist or has no value.
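The difference between getValue(name) and the chained call get(name).getValue() can be seen in a small stand-alone sketch. The Attr class and both static helpers are hypothetical stand-ins rather than library code, and decoding of the value is omitted.

```java
public class GetValueSketch {
    // Hypothetical stand-in for Attribute; value is null when the attribute has no value.
    static final class Attr {
        final String name;
        final String value;
        Attr(String name, String value) {
            this.name = name;
            this.value = value;
        }
        String getValue() {
            return value;
        }
    }

    // Case-insensitive lookup returning the first match, or null if none exists.
    static Attr get(Attr[] attributes, String name) {
        for (Attr a : attributes) {
            if (a.name.equalsIgnoreCase(name)) {
                return a;
            }
        }
        return null;
    }

    // Null-safe variant: a missing attribute yields null instead of an exception.
    static String getValue(Attr[] attributes, String name) {
        Attr a = get(attributes, name);
        return a == null ? null : a.getValue();
    }

    public static void main(String[] args) {
        Attr[] attributes = {
            new Attr("type", "checkbox"),
            new Attr("checked", null)
        };
        System.out.println(getValue(attributes, "TYPE"));    // checkbox
        System.out.println(getValue(attributes, "checked")); // null (no value)
        System.out.println(getValue(attributes, "name"));    // null (no such attribute)
        try {
            get(attributes, "name").getValue();
        } catch (NullPointerException e) {
            System.out.println("chained call threw NullPointerException");
        }
    }
}
```

Because null covers both "no such attribute" and "attribute without a value", a caller that needs to tell the two apart would check the result of get(name) first.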

iterator

public Iterator iterator()
Returns an iterator over the Attribute objects in this list in order of appearance.
Overrides:
iterator in class au.id.jericho.lib.html.nodoc.SequentialListSegment
Returns:
an iterator over the Attribute objects in this list in order of appearance.

listIterator

public ListIterator listIterator(int index)
Returns a list iterator of the Attribute objects in this list in order of appearance, starting at the specified position in the list.

The specified index indicates the first item that would be returned by an initial call to the next() method. An initial call to the previous() method would return the item with the specified index minus one.

IMPLEMENTATION NOTE: For efficiency reasons this method does not return an immutable list iterator. Calling any of the add(Object), remove() or set(Object) methods on the returned ListIterator does not throw an exception but could result in unexpected behaviour.

Overrides:
listIterator in class au.id.jericho.lib.html.nodoc.SequentialListSegment
Parameters:
index - the index of the first item to be returned from the list iterator (by a call to the next() method).
Returns:
a list iterator of the items in this list (in proper sequence), starting at the specified position in the list.
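The index semantics described above follow the general java.util.List contract, so they can be demonstrated with an ordinary list of strings standing in for the Attribute objects. The class and helper names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorIndexSketch {
    // Item returned by an initial call to next() on an iterator started at index.
    static String firstNext(List<String> names, int index) {
        ListIterator<String> it = names.listIterator(index);
        return it.next();
    }

    // Item returned by an initial call to previous() on an iterator started at index.
    static String firstPrevious(List<String> names, int index) {
        ListIterator<String> it = names.listIterator(index);
        return it.previous();
    }

    public static void main(String[] args) {
        // Attribute names in order of appearance (stand-in data).
        List<String> names = Arrays.asList("id", "class", "href");
        System.out.println(firstNext(names, 1));     // class
        System.out.println(firstPrevious(names, 1)); // id
    }
}
```

In line with the implementation note above, the sketch only reads through the iterator and never calls add, remove or set on it.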

populateMap

public Map populateMap(Map attributesMap,
                       boolean convertNamesToLowerCase)
Populates the specified Map with the name/value pairs from these attributes.
Parameters:
attributesMap - the map to populate, must not be null.
convertNamesToLowerCase - specifies whether all attribute names are converted to lower case in the map.
Returns:
the same map specified as the argument to the attributesMap parameter, populated with the name/value pairs from these attributes.
See Also:
generateHTML(Map attributesMap)
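The contract of populateMap (fill the supplied map, optionally lower-casing names, and return that same map) can be sketched without the library. The String[][] input and the class name are hypothetical, and storing null for an attribute without a value is an assumption based on the convention documented for generateHTML.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PopulateMapSketch {
    // attributes: stand-in data, each entry is {name, value} with value possibly null.
    // Returns the same map instance that was passed in, after populating it.
    static Map<String, String> populateMap(String[][] attributes,
                                           Map<String, String> attributesMap,
                                           boolean convertNamesToLowerCase) {
        for (String[] pair : attributes) {
            String name = convertNamesToLowerCase
                    ? pair[0].toLowerCase(Locale.ROOT)
                    : pair[0];
            attributesMap.put(name, pair[1]);
        }
        return attributesMap;
    }

    public static void main(String[] args) {
        String[][] attributes = { {"HREF", "index.html"}, {"Disabled", null} };
        Map<String, String> target = new LinkedHashMap<>();
        Map<String, String> result = populateMap(attributes, target, true);
        System.out.println(result == target); // true
        System.out.println(result);           // {href=index.html, disabled=null}
    }
}
```

Returning the argument makes it possible to create and fill a map in a single expression.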

setDefaultMaxErrorCount

public static void setDefaultMaxErrorCount(int value)
Sets the default maximum error count allowed when parsing attributes.
Parameters:
value - the default maximum error count allowed when parsing attributes.