au.id.jericho.lib.html
Class Config.CompatibilityMode
java.lang.Object
au.id.jericho.lib.html.Config.CompatibilityMode
- Config
public static final class Config.CompatibilityMode
extends java.lang.Object
Represents a set of configuration parameters that relate to
user agent compatibility issues.
The predefined compatibility modes IE, MOZILLA, OPERA and XHTML provide an easy means of
ensuring the library interprets the markup in a way consistent with some of the most commonly used browsers,
at least in relation to the behaviour described by the properties in this class.
The properties of any CompatibilityMode object can be modified individually, including those in
the predefined instances as well as newly constructed instances.
Note, however, that modifying the properties of the predefined instances has a global effect.
The currently active compatibility mode is stored in the static
Config.CurrentCompatibilityMode
property.
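The global effect of modifying a predefined instance follows from those instances being shared objects. The sketch below is a self-contained model of that behaviour, not the library's code: `SharedModeSketch` and its nested `Mode` class are hypothetical stand-ins for `Config.CompatibilityMode`, and only the form field property is modelled.

```java
final class SharedModeSketch {
    // Hypothetical stand-in for Config.CompatibilityMode; not the library class.
    static final class Mode {
        // Shared instance, analogous to one of the predefined modes.
        static final Mode IE = new Mode("IE", true);

        private final String name;
        private boolean formFieldNameCaseInsensitive;

        Mode(String name, boolean formFieldNameCaseInsensitive) {
            this.name = name;
            this.formFieldNameCaseInsensitive = formFieldNameCaseInsensitive;
        }

        String getName() { return name; }
        boolean isFormFieldNameCaseInsensitive() { return formFieldNameCaseInsensitive; }
        void setFormFieldNameCaseInsensitive(boolean value) { formFieldNameCaseInsensitive = value; }
    }

    public static void main(String[] args) {
        Mode first = Mode.IE;   // two references to the same shared object
        Mode second = Mode.IE;
        first.setFormFieldNameCaseInsensitive(false);
        // The change is visible through every reference to the shared instance.
        System.out.println(second.isFormFieldNameCaseInsensitive()); // false

        // A newly constructed mode is independent of the shared one.
        Mode custom = new Mode("custom", true);
        System.out.println(custom.isFormFieldNameCaseInsensitive()); // true
    }
}
```

If only one part of an application needs different settings, constructing a new mode and changing that instance avoids altering behaviour for every other user of the predefined one.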
CODE_POINTS_ALL
public static final int CODE_POINTS_ALL
Indicates the recognition of all unicode code points.
This value is used in properties which specify a maximum unicode code point to be recognised by the parser.
getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue),
getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue),
getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
CODE_POINTS_NONE
public static final int CODE_POINTS_NONE
Indicates the recognition of no unicode code points.
This value is used in properties which specify a maximum unicode code point to be recognised by the parser.
getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue),
getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue),
getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
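Because these properties hold a maximum code point, "all" and "none" can be expressed as ordinary limits at the two extremes. The following sketch illustrates the idea under stated assumptions: the values chosen for `ALL` and `NONE` and the inclusive comparison are guesses made for illustration, since this page does not state the actual constant values or the exact comparison used by the parser.

```java
final class CodePointLimitSketch {
    // Assumed sentinel values; the page does not state the real constants.
    static final int ALL = Character.MAX_CODE_POINT; // 0x10FFFF, the highest Unicode code point
    static final int NONE = -1;                      // lower than every valid code point

    // True if a reference to codePoint is accepted under the given maximum
    // (an inclusive comparison is assumed here).
    static boolean isRecognised(int codePoint, int maxCodePoint) {
        return codePoint <= maxCodePoint;
    }

    public static void main(String[] args) {
        int euro = 0x20AC;
        System.out.println(isRecognised(euro, ALL));  // true
        System.out.println(isRecognised(euro, NONE)); // false
        System.out.println(isRecognised(euro, 0xFF)); // false
    }
}
```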
CompatibilityMode
public CompatibilityMode(String name)
Constructs a new CompatibilityMode with the given name.
All properties in the new instance are initially assigned their default values, which are the same as the strict
rules of the XHTML compatibility mode.
name
- the name of the new compatibility mode
getDebugInfo
public String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
- a string representation of this object useful for debugging purposes.
getName
public String getName()
Returns the name of this compatibility mode.
- the name of this compatibility mode.
getUnterminatedCharacterEntityReferenceMaxCodePoint
public int getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.
For example, if getUnterminatedCharacterEntityReferenceMaxCodePoint(true) has the value 0xFF (U+00FF) in the current compatibility mode, then:
- CharacterReference.decode("&gt",true) returns ">".
  The string is recognised as the character entity reference &gt; despite the fact that it is unterminated,
  because its unicode code point U+003E is below the maximum of U+00FF set by this property.
- CharacterReference.decode("&euro",true) returns "&euro".
  The string is not recognised as the character entity reference &euro; because it is unterminated
  and its unicode code point U+20AC is above the maximum of U+00FF set by this property.
See the documentation of the Attribute.getValue() method for further discussion.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
setUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
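The threshold rule described for this property can be modelled without the library. The sketch below is an illustration only: `UnterminatedEntitySketch`, its two-entry entity table and the `decodeUnterminated` helper are hypothetical, and the inclusive comparison against the maximum is an assumption. It does not call `CharacterReference.decode`.

```java
import java.util.HashMap;
import java.util.Map;

final class UnterminatedEntitySketch {
    // Tiny illustrative table; a real parser knows the full set of HTML entities.
    private static final Map<String, Integer> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("gt", 0x3E);
        ENTITIES.put("euro", 0x20AC);
    }

    // Decodes text of the form "&name" (no terminating semicolon) if the entity's
    // code point does not exceed maxCodePoint; otherwise returns the text unchanged.
    static String decodeUnterminated(String text, int maxCodePoint) {
        if (!text.startsWith("&")) return text;
        Integer codePoint = ENTITIES.get(text.substring(1));
        if (codePoint == null || codePoint > maxCodePoint) return text;
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(decodeUnterminated("&gt", 0xFF));   // >
        System.out.println(decodeUnterminated("&euro", 0xFF)); // &euro
    }
}
```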
getUnterminatedDecimalCharacterReferenceMaxCodePoint
public int getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.
For example, if getUnterminatedDecimalCharacterReferenceMaxCodePoint(true) had the hypothetical value 0xFF (U+00FF) in the current compatibility mode, then:
- CharacterReference.decode("&#62",true) returns ">".
  The string is recognised as the numeric character reference &#62; despite the fact that it is unterminated,
  because its unicode code point U+003E is below the maximum of U+00FF set by this property.
- CharacterReference.decode("&#8364",true) returns "&#8364".
  The string is not recognised as the numeric character reference &#8364; because it is unterminated
  and its unicode code point U+20AC is above the maximum of U+00FF set by this property.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
setUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
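For decimal references the code point is read directly from the digits, so the same threshold check applies to the parsed number. A minimal sketch, assuming an inclusive maximum; `UnterminatedDecimalSketch` and `decodeUnterminated` are hypothetical names and do not reproduce the library's parsing logic.

```java
final class UnterminatedDecimalSketch {
    // Decodes text of the form "&#digits" (no terminating semicolon) if the
    // code point does not exceed maxCodePoint; otherwise returns it unchanged.
    static String decodeUnterminated(String text, int maxCodePoint) {
        if (!text.startsWith("&#")) return text;
        String digits = text.substring(2);
        // Up to seven ASCII digits always fit in an int, so parsing cannot overflow.
        if (!digits.matches("[0-9]{1,7}")) return text;
        int codePoint = Integer.parseInt(digits);
        if (codePoint > maxCodePoint || !Character.isValidCodePoint(codePoint)) return text;
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(decodeUnterminated("&#62", 0xFF));   // >
        System.out.println(decodeUnterminated("&#8364", 0xFF)); // &#8364
    }
}
```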
getUnterminatedHexadecimalCharacterReferenceMaxCodePoint
public int getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.
For example, if getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(true) had the hypothetical value 0xFF (U+00FF) in the current compatibility mode, then:
- CharacterReference.decode("&#x3e",true) returns ">".
  The string is recognised as the numeric character reference &#x3e; despite the fact that it is unterminated,
  because its unicode code point U+003E is below the maximum of U+00FF set by this property.
- CharacterReference.decode("&#x20ac",true) returns "&#x20ac".
  The string is not recognised as the numeric character reference &#x20ac; because it is unterminated
  and its unicode code point U+20AC is above the maximum of U+00FF set by this property.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
setUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
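The insideAttributeValue parameter means each of these properties effectively holds two limits, one per context. The sketch below models that for hexadecimal references. It is a hypothetical illustration: the class and method names, the "recognise nothing" default of -1 and the inclusive comparison are all assumptions rather than details taken from the library.

```java
final class UnterminatedHexSketch {
    // Assumed "recognise nothing" default; the real defaults are not stated on this page.
    private static final int NONE = -1;

    // Hypothetical per-context limits, mirroring the insideAttributeValue parameter.
    private int maxInsideAttributeValue = NONE;
    private int maxOutsideAttributeValue = NONE;

    void setMaxCodePoint(boolean insideAttributeValue, int maxCodePoint) {
        if (insideAttributeValue) {
            maxInsideAttributeValue = maxCodePoint;
        } else {
            maxOutsideAttributeValue = maxCodePoint;
        }
    }

    int getMaxCodePoint(boolean insideAttributeValue) {
        return insideAttributeValue ? maxInsideAttributeValue : maxOutsideAttributeValue;
    }

    // Decodes text of the form "&#xHEX" (no terminating semicolon) if its code
    // point does not exceed the limit for the given context.
    String decodeUnterminated(String text, boolean insideAttributeValue) {
        if (!text.regionMatches(true, 0, "&#x", 0, 3)) return text;
        String hex = text.substring(3);
        // Up to six hex digits always fit in an int, so parsing cannot overflow.
        if (!hex.matches("[0-9A-Fa-f]{1,6}")) return text;
        int codePoint = Integer.parseInt(hex, 16);
        if (codePoint > getMaxCodePoint(insideAttributeValue)
                || !Character.isValidCodePoint(codePoint)) {
            return text;
        }
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        UnterminatedHexSketch mode = new UnterminatedHexSketch();
        mode.setMaxCodePoint(true, 0xFF); // limit inside attribute values only

        System.out.println(mode.decodeUnterminated("&#x3e", true));   // >
        System.out.println(mode.decodeUnterminated("&#x20ac", true)); // &#x20ac
        System.out.println(mode.decodeUnterminated("&#x3e", false));  // &#x3e
    }
}
```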
isFormFieldNameCaseInsensitive
public boolean isFormFieldNameCaseInsensitive()
Indicates whether
form field names are treated as case insensitive.
Microsoft Internet Explorer treats field names as case insensitive,
while Mozilla treats them as case sensitive.
The value of this property in the
current compatibility mode
affects all instances of the
FormFields
class.
It should be set to the desired configuration before any instances of
FormFields
are created.
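The difference between the two browser behaviours can be illustrated with an ordinary map keyed by field name. This is a sketch only: `FieldNameLookupSketch` and `newFieldMap` are hypothetical, and nothing here reflects how FormFields is implemented. In this model the comparison rule is fixed when the map is constructed, which shows why a setting of this kind has to be in place beforehand; whether FormFields works the same way internally is not stated on this page.

```java
import java.util.Map;
import java.util.TreeMap;

final class FieldNameLookupSketch {
    // Creates a name-to-value map whose key comparison is fixed at construction time.
    static Map<String, String> newFieldMap(boolean caseInsensitive) {
        if (caseInsensitive) {
            return new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        }
        return new TreeMap<String, String>();
    }

    public static void main(String[] args) {
        Map<String, String> caseInsensitive = newFieldMap(true);
        caseInsensitive.put("UserName", "alice");
        System.out.println(caseInsensitive.get("username")); // alice

        Map<String, String> caseSensitive = newFieldMap(false);
        caseSensitive.put("UserName", "alice");
        System.out.println(caseSensitive.get("username")); // null
    }
}
```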
setFormFieldNameCaseInsensitive
public void setFormFieldNameCaseInsensitive(boolean value)
value
- the new value of the property
setUnterminatedCharacterEntityReferenceMaxCodePoint
public void setUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue,
int maxCodePoint)
Sets the maximum unicode code point of an
unterminated
character entity reference which is to be recognised in the specified context.
See
getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue)
for the documentation of this property.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
maxCodePoint
- the maximum unicode code point.
setUnterminatedDecimalCharacterReferenceMaxCodePoint
public void setUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue,
int maxCodePoint)
Sets the maximum unicode code point of an
unterminated
decimal character reference which is to be recognised in the specified context.
See
getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
for the documentation of this property.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
maxCodePoint
- the maximum unicode code point.
setUnterminatedHexadecimalCharacterReferenceMaxCodePoint
public void setUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue,
int maxCodePoint)
Sets the maximum unicode code point of an
unterminated
hexadecimal character reference which is to be recognised in the specified context.
See
getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
for the documentation of this property.
insideAttributeValue
- the context within an HTML document - true
if inside an attribute value or false
if outside an attribute value.
maxCodePoint
- the maximum unicode code point.
toString
public String toString()
Returns the
name of this compatibility mode.
- the name of this compatibility mode.