org.w3c.tidy

Class Configuration

Implemented Interfaces:
Serializable

public class Configuration
extends java.lang.Object
implements Serializable

Read configuration file and manage configuration properties. Configuration files associate a property name with a value. The format is that of a Java .properties file.
Version:
$Revision: 807 $ ($Author: fgiust $)
Authors:
Dave Raggett dsr@w3.org
Andy Quick ac.quick@sympatico.ca (translation to Java)
Fabrizio Giustina
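
The .properties format mentioned in the class description can be illustrated with the standard java.util.Properties loader. This is a minimal sketch, not JTidy's own parsing code: the class name is hypothetical, and the option names (indent, wrap, tidy-mark) are assumed for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class TidyPropertiesFormatSketch {

    // Hypothetical file content; the option names are assumptions for illustration.
    static final String SAMPLE =
            "# lines starting with '#' are comments\n"
            + "indent=yes\n"
            + "wrap: 72\n"
            + "tidy-mark = no\n";

    /** Parses .properties-formatted text into name/value pairs. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

The '=', ':' and whitespace-padded separators are all accepted by the .properties format, so each of the three entries yields one name/value pair.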

Field Summary

static int
ASCII
Deprecated.
static int
BIG5
Deprecated.
static int
DOCTYPE_AUTO
treatment of doctype: auto.
static int
DOCTYPE_LOOSE
treatment of doctype: loose.
static int
DOCTYPE_OMIT
treatment of doctype: omit.
static int
DOCTYPE_STRICT
treatment of doctype: strict.
static int
DOCTYPE_USER
treatment of doctype: user.
static int
ISO2022
Deprecated.
static int
KEEP_FIRST
Keep first duplicate attribute.
static int
KEEP_LAST
Keep last duplicate attribute.
static int
LATIN1
Deprecated.
static int
MACROMAN
Deprecated.
static int
RAW
Deprecated. use Tidy.setRawOut(true) for raw output
static int
SHIFTJIS
Deprecated.
static int
UTF16
Deprecated.
static int
UTF16BE
Deprecated.
static int
UTF16LE
Deprecated.
static int
UTF8
Deprecated.
static int
WIN1252
Deprecated.
protected String
altText
default text for alt attribute.
protected boolean
asciiChars
convert quotes and dashes to nearest ASCII char.
protected boolean
bodyOnly
output BODY content only.
protected boolean
breakBeforeBR
output a newline before br elements.
protected boolean
burstSlides
create slides on each h2 element.
protected String
cssPrefix
CSS class naming for -clean option.
protected int
definedTags
track what types of tags user has defined to eliminate unnecessary searches.
protected int
docTypeMode
see doctype property.
protected String
docTypeStr
user specified doctype.
protected boolean
dropEmptyParas
discard empty p elements.
protected boolean
dropFontTags
discard presentation tags.
protected boolean
dropProprietaryAttributes
discard proprietary attributes.
protected int
duplicateAttrs
Keep first or last duplicate attribute.
protected boolean
emacs
if true format error output for GNU Emacs.
protected boolean
encloseBlockText
if yes, text in blocks is wrapped in p elements.
protected boolean
encloseBodyText
if yes, text at body level is wrapped in p elements.
protected String
errfile
file name to write errors to.
protected boolean
escapeCdata
replace CDATA sections with escaped text.
protected boolean
fixBackslash
fix URLs by replacing \ with /.
protected boolean
fixComments
fix comments with adjacent hyphens.
protected boolean
fixUri
properly escape URLs.
protected boolean
forceOutput
output document even if errors were found.
protected boolean
hideComments
hides all (real) comments in output.
protected boolean
hideEndTags
suppress optional end tags.
protected boolean
htmlOut
output plain-old HTML, even for XHTML input.
protected boolean
indentAttributes
newline+indent before each attribute.
protected boolean
indentCdata
indent CDATA sections.
protected boolean
indentContent
indent content of appropriate tags.
protected boolean
joinClasses
join multiple class attributes.
protected boolean
joinStyles
join multiple style attributes.
protected boolean
keepFileTimes
if yes, the last modified time is preserved.
protected String
language
RJ language property.
protected boolean
literalAttribs
if true attributes may use newlines.
protected boolean
logicalEmphasis
replace i by em and b by strong.
protected boolean
lowerLiterals
folds known attribute values to lower case.
protected boolean
makeBare
Make bare HTML: remove Microsoft cruft.
protected boolean
makeClean
remove presentational clutter.
protected boolean
ncr
allow numeric character references.
protected char[]
newline
characters for the newline marker.
protected boolean
numEntities
use numeric entities.
protected boolean
onlyErrors
if true normal output is suppressed.
protected boolean
quiet
no 'Parsing X', guessed DTD or summary.
protected boolean
quoteAmpersand
output naked ampersand as &amp;.
protected boolean
quoteMarks
output " marks as &quot;.
protected boolean
quoteNbsp
output non-breaking space as entity.
protected boolean
rawOut
Avoid mapping values > 127 to entities.
protected boolean
replaceColor
replace hex color attribute values with names.
protected String
replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.
protected Report
report
Report instance.
protected int
showErrors
number of errors to put out.
protected boolean
showWarnings
show warnings (errors are always shown).
protected String
slidestyle
Deprecated. does nothing
protected boolean
smartIndent
whether text/block-level content affects indentation.
protected int
spaces
default indentation.
protected int
tabsize
default tab size (8).
protected boolean
tidyMark
add meta element indicating tidied doc.
protected boolean
trimEmpty
trim empty elements.
protected TagTable
tt
TagTable associated with this Configuration.
protected boolean
upperCaseAttrs
output attributes in upper not lower case.
protected boolean
upperCaseTags
output tags in upper not lower case.
protected boolean
word2000
draconian cleaning for Word2000.
protected boolean
wrapAsp
wrap within ASP pseudo elements.
protected boolean
wrapAttVals
wrap within attribute values.
protected boolean
wrapJste
wrap within JSTE pseudo elements.
protected boolean
wrapPhp
wrap within PHP pseudo elements.
protected boolean
wrapScriptlets
wrap within JavaScript string literals.
protected boolean
wrapSection
wrap within CDATA section tags.
protected int
wraplen
default wrap margin (68).
protected boolean
writeback
if true then output tidied markup.
protected boolean
xHTML
output extensible HTML.
protected boolean
xmlOut
create output as XML.
protected boolean
xmlPIs
If set to yes PIs must end with ?>.
protected boolean
xmlPi
add <?xml?> for XML docs.
protected boolean
xmlSpace
if set to yes adds xml:space attr as needed.
protected boolean
xmlTags
treat input as XML.

Constructor Summary

Configuration(Report report)
Instantiates a new Configuration.

Method Summary

void
addProps(Properties p)
adds configuration Properties.
void
adjust()
Ensure that the configuration is self-consistent.
protected String
convertCharEncoding(int code)
Convert a char encoding from the deprecated Tidy constant to a standard Java encoding name.
protected String
getInCharEncodingName()
Getter for inCharEncodingName.
protected String
getOutCharEncodingName()
Getter for outCharEncodingName.
static boolean
isKnownOption(String name)
Is the given String a valid configuration flag?
void
parseFile(String filename)
Parses a property file.
protected void
setInCharEncoding(int encoding)
Deprecated. use setInCharEncodingName(String)
protected void
setInCharEncodingName(String encoding)
Setter for inCharEncodingName.
protected void
setInOutEncodingName(String encoding)
Setter for inOutCharEncodingName.
protected void
setOutCharEncoding(int encoding)
Deprecated. use setOutCharEncodingName(String)
protected void
setOutCharEncodingName(String encoding)
Setter for outCharEncodingName.

Field Details

ASCII

public static final int ASCII

Deprecated.

character encoding = ASCII.
Field Value:
1

BIG5

public static final int BIG5

Deprecated.

character encoding = BIG5.
Field Value:
10

DOCTYPE_AUTO

public static final int DOCTYPE_AUTO
treatment of doctype: auto.
Field Value:
1

DOCTYPE_LOOSE

public static final int DOCTYPE_LOOSE
treatment of doctype: loose.
Field Value:
3

DOCTYPE_OMIT

public static final int DOCTYPE_OMIT
treatment of doctype: omit.
Field Value:
0

DOCTYPE_STRICT

public static final int DOCTYPE_STRICT
treatment of doctype: strict.
Field Value:
2

DOCTYPE_USER

public static final int DOCTYPE_USER
treatment of doctype: user.
Field Value:
4
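
The five doctype modes above can be pictured as the result of classifying a doctype setting. The sketch below is an assumption about how such a mapping could look, not JTidy's implementation: the class and the modeFor helper are hypothetical, and only the numeric values are taken from the field values listed here.

```java
import java.util.Locale;

final class DoctypeModeSketch {

    // Numeric values copied from the field details above.
    static final int DOCTYPE_OMIT = 0;
    static final int DOCTYPE_AUTO = 1;
    static final int DOCTYPE_STRICT = 2;
    static final int DOCTYPE_LOOSE = 3;
    static final int DOCTYPE_USER = 4;

    /** Hypothetical helper: classifies a doctype setting into one of the modes. */
    static int modeFor(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        switch (v) {
            case "omit":
                return DOCTYPE_OMIT;
            case "auto":
                return DOCTYPE_AUTO;
            case "strict":
                return DOCTYPE_STRICT;
            case "loose":
                return DOCTYPE_LOOSE;
            default:
                // Assumption: any other text is treated as a user-specified doctype.
                return DOCTYPE_USER;
        }
    }

    public static void main(String[] args) {
        System.out.println(modeFor("strict"));
        System.out.println(modeFor("-//ACME//DTD HTML 3.14159//EN"));
    }
}
```

In this sketch a user-specified doctype string would be kept separately, which matches the presence of both docTypeMode and docTypeStr among the fields.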

ISO2022

public static final int ISO2022

Deprecated.

character encoding = ISO2022.
Field Value:
4

KEEP_FIRST

public static final int KEEP_FIRST
Keep first duplicate attribute.
Field Value:
1

KEEP_LAST

public static final int KEEP_LAST
Keep last duplicate attribute.
Field Value:
0
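
To make the difference between KEEP_FIRST and KEEP_LAST concrete, here is a small self-contained sketch of resolving repeated attribute names. It is illustrative only: the class, the resolve helper, and the name/value pair representation are hypothetical and do not reflect how JTidy stores attributes; only the two constant values come from this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateAttrSketch {

    // Numeric values copied from the field details above.
    static final int KEEP_LAST = 0;
    static final int KEEP_FIRST = 1;

    /**
     * Resolves duplicates in a list of {name, value} pairs given in source order.
     * KEEP_FIRST keeps the earliest value for a name, KEEP_LAST the latest.
     */
    static Map<String, String> resolve(List<String[]> attrs, int mode) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String[] attr : attrs) {
            if (mode == KEEP_FIRST) {
                result.putIfAbsent(attr[0], attr[1]);
            } else {
                result.put(attr[0], attr[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> attrs = new ArrayList<>();
        attrs.add(new String[] {"class", "a"});
        attrs.add(new String[] {"id", "x"});
        attrs.add(new String[] {"class", "b"});
        System.out.println(resolve(attrs, KEEP_FIRST));
        System.out.println(resolve(attrs, KEEP_LAST));
    }
}
```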

LATIN1

public static final int LATIN1

Deprecated.

character encoding = LATIN1.
Field Value:
2

MACROMAN

public static final int MACROMAN

Deprecated.

character encoding = MACROMAN.
Field Value:
5

RAW

public static final int RAW

Deprecated. use Tidy.setRawOut(true) for raw output

character encoding = RAW.
Field Value:
0

SHIFTJIS

public static final int SHIFTJIS

Deprecated.

character encoding = SHIFTJIS.
Field Value:
11

UTF16

public static final int UTF16

Deprecated.

character encoding = UTF16.
Field Value:
8

UTF16BE

public static final int UTF16BE

Deprecated.

character encoding = UTF16BE.
Field Value:
7

UTF16LE

public static final int UTF16LE

Deprecated.

character encoding = UTF16LE.
Field Value:
6

UTF8

public static final int UTF8

Deprecated.

character encoding = UTF8.
Field Value:
3

WIN1252

public static final int WIN1252

Deprecated.

character encoding = WIN1252.
Field Value:
9

altText

protected String altText
default text for alt attribute.

asciiChars

protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.

bodyOnly

protected boolean bodyOnly
output BODY content only.

breakBeforeBR

protected boolean breakBeforeBR
output a newline before br elements.

burstSlides

protected boolean burstSlides
create slides on each h2 element.

cssPrefix

protected String cssPrefix
CSS class naming for -clean option.

definedTags

protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.

docTypeMode

protected int docTypeMode
see doctype property.

docTypeStr

protected String docTypeStr
user specified doctype.

dropEmptyParas

protected boolean dropEmptyParas
discard empty p elements.

dropFontTags

protected boolean dropFontTags
discard presentation tags.

dropProprietaryAttributes

protected boolean dropProprietaryAttributes
discard proprietary attributes.

duplicateAttrs

protected int duplicateAttrs
Keep first or last duplicate attribute.

emacs

protected boolean emacs
if true format error output for GNU Emacs.

encloseBlockText

protected boolean encloseBlockText
if yes, text in blocks is wrapped in p elements.

encloseBodyText

protected boolean encloseBodyText
if yes, text at body level is wrapped in p elements.

errfile

protected String errfile
file name to write errors to.

escapeCdata

protected boolean escapeCdata
replace CDATA sections with escaped text.

fixBackslash

protected boolean fixBackslash
fix URLs by replacing \ with /.

fixComments

protected boolean fixComments
fix comments with adjacent hyphens.

fixUri

protected boolean fixUri
properly escape URLs.

forceOutput

protected boolean forceOutput
output document even if errors were found.

hideComments

protected boolean hideComments
hides all (real) comments in output.

hideEndTags

protected boolean hideEndTags
suppress optional end tags.

htmlOut

protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.

indentAttributes

protected boolean indentAttributes
newline+indent before each attribute.

indentCdata

protected boolean indentCdata
indent CDATA sections.

indentContent

protected boolean indentContent
indent content of appropriate tags.

joinClasses

protected boolean joinClasses
join multiple class attributes.

joinStyles

protected boolean joinStyles
join multiple style attributes.

keepFileTimes

protected boolean keepFileTimes
if yes, the last modified time is preserved.

language

protected String language
RJ language property.

literalAttribs

protected boolean literalAttribs
if true attributes may use newlines.

logicalEmphasis

protected boolean logicalEmphasis
replace i by em and b by strong.

lowerLiterals

protected boolean lowerLiterals
folds known attribute values to lower case.

makeBare

protected boolean makeBare
Make bare HTML: remove Microsoft cruft.

makeClean

protected boolean makeClean
remove presentational clutter.

ncr

protected boolean ncr
allow numeric character references.

newline

protected char[] newline
characters for the newline marker.

numEntities

protected boolean numEntities
use numeric entities.

onlyErrors

protected boolean onlyErrors
if true normal output is suppressed.

quiet

protected boolean quiet
no 'Parsing X', guessed DTD or summary.

quoteAmpersand

protected boolean quoteAmpersand
output naked ampersand as &amp;.

quoteMarks

protected boolean quoteMarks
output " marks as &quot;.

quoteNbsp

protected boolean quoteNbsp
output non-breaking space as entity.

rawOut

protected boolean rawOut
Avoid mapping values > 127 to entities.

replaceColor

protected boolean replaceColor
replace hex color attribute values with names.

replacementCharEncoding

protected String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.

report

protected Report report
Report instance. Used for messages.

showErrors

protected int showErrors
number of errors to put out.

showWarnings

protected boolean showWarnings
show warnings (errors are always shown).

slidestyle

protected String slidestyle

Deprecated. does nothing

style sheet for slides.

smartIndent

protected boolean smartIndent
whether text/block-level content affects indentation.

spaces

protected int spaces
default indentation.

tabsize

protected int tabsize
default tab size (8).

tidyMark

protected boolean tidyMark
add meta element indicating tidied doc.

trimEmpty

protected boolean trimEmpty
trim empty elements.

tt

protected TagTable tt
TagTable associated with this Configuration.

upperCaseAttrs

protected boolean upperCaseAttrs
output attributes in upper not lower case.

upperCaseTags

protected boolean upperCaseTags
output tags in upper not lower case.

word2000

protected boolean word2000
draconian cleaning for Word2000.

wrapAsp

protected boolean wrapAsp
wrap within ASP pseudo elements.

wrapAttVals

protected boolean wrapAttVals
wrap within attribute values.

wrapJste

protected boolean wrapJste
wrap within JSTE pseudo elements.

wrapPhp

protected boolean wrapPhp
wrap within PHP pseudo elements.

wrapScriptlets

protected boolean wrapScriptlets
wrap within JavaScript string literals.

wrapSection

protected boolean wrapSection
wrap within CDATA section tags.

wraplen

protected int wraplen
default wrap margin (68).

writeback

protected boolean writeback
if true then output tidied markup.

xHTML

protected boolean xHTML
output extensible HTML.

xmlOut

protected boolean xmlOut
create output as XML.

xmlPIs

protected boolean xmlPIs
If set to yes PIs must end with ?>.

xmlPi

protected boolean xmlPi
add <?xml?> for XML docs.

xmlSpace

protected boolean xmlSpace
if set to yes adds xml:space attr as needed.

xmlTags

protected boolean xmlTags
treat input as XML.

Constructor Details

Configuration

protected Configuration(Report report)
Instantiates a new Configuration. This method should be called by Tidy only.
Parameters:
report - Report instance

Method Details

addProps

public void addProps(Properties p)
adds configuration Properties.
Parameters:
p - Properties
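
One way to picture adding Properties to an existing configuration is as layering: new entries are added on top of what is already there. The sketch below uses only java.util.Properties; the class and the merge helper are hypothetical, and the rule that later entries override earlier ones on a name clash is an assumption, since this page does not state how addProps resolves clashes.

```java
import java.util.Properties;

final class LayeredPropsSketch {

    /** Returns a new Properties holding the base entries overridden by the extra entries. */
    static Properties merge(Properties base, Properties extra) {
        Properties merged = new Properties();
        merged.putAll(base);
        merged.putAll(extra); // assumption: later entries win on name clashes
        return merged;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("indent", "no"); // option names are illustrative
        base.setProperty("wrap", "68");

        Properties extra = new Properties();
        extra.setProperty("indent", "yes");

        Properties merged = merge(base, extra);
        System.out.println(merged.getProperty("indent"));
        System.out.println(merged.getProperty("wrap"));
    }
}
```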

adjust

public void adjust()
Ensure that the configuration is self-consistent.

convertCharEncoding

protected String convertCharEncoding(int code)
Convert a char encoding from the deprecated Tidy constant to a standard Java encoding name.
Parameters:
code - encoding code
Returns:
encoding name
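
A conversion of this kind amounts to a lookup from the deprecated integer constants to encoding names. The table below is a sketch under stated assumptions: the numeric codes are the field values documented on this page, but the encoding names are common Java charset names chosen for illustration and may differ from the strings JTidy actually returns. The mapping for ISO2022 in particular is a guess, and the class and method names are hypothetical.

```java
final class CharEncodingSketch {

    // Index = deprecated constant value (RAW = 0 ... SHIFTJIS = 11), per the field details.
    // The names are assumptions, not JTidy's documented return values.
    private static final String[] NAMES = {
        null,           // RAW (0): no charset; the page points to Tidy.setRawOut(true) instead
        "US-ASCII",     // ASCII (1)
        "ISO-8859-1",   // LATIN1 (2)
        "UTF-8",        // UTF8 (3)
        "ISO-2022-JP",  // ISO2022 (4), assumed variant
        "x-MacRoman",   // MACROMAN (5)
        "UTF-16LE",     // UTF16LE (6)
        "UTF-16BE",     // UTF16BE (7)
        "UTF-16",       // UTF16 (8)
        "windows-1252", // WIN1252 (9)
        "Big5",         // BIG5 (10)
        "Shift_JIS"     // SHIFTJIS (11)
    };

    /** Returns the assumed encoding name for a code, or null if there is none. */
    static String nameFor(int code) {
        if (code < 0 || code >= NAMES.length) {
            return null;
        }
        return NAMES[code];
    }

    public static void main(String[] args) {
        System.out.println(nameFor(3));
        System.out.println(nameFor(9));
    }
}
```

New code would normally skip the integer constants altogether and pass an encoding name to setInCharEncodingName(String) or setOutCharEncodingName(String), as the deprecation notes on this page suggest.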

getInCharEncodingName

protected String getInCharEncodingName()
Getter for inCharEncodingName.
Returns:
the inCharEncodingName.

getOutCharEncodingName

protected String getOutCharEncodingName()
Getter for outCharEncodingName.
Returns:
the outCharEncodingName.

isKnownOption

public static boolean isKnownOption(String name)
Is the given String a valid configuration flag?
Parameters:
name - configuration parameter name
Returns:
true if the given String is a valid config option
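
A check like this is useful for catching misspelled option names before they are applied. The sketch below is self-contained and takes the check as a Predicate so it runs without JTidy; the class, the unknownNames helper, and the stand-in set of known names are hypothetical. With JTidy on the classpath, the method reference Configuration::isKnownOption matches the signature documented here and could be passed instead.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

final class UnknownOptionSketch {

    /** Returns the property names rejected by the given check, in sorted order. */
    static Set<String> unknownNames(Properties props, Predicate<String> isKnown) {
        Set<String> unknown = new TreeSet<>();
        for (String name : props.stringPropertyNames()) {
            if (!isKnown.test(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("indent", "yes");
        props.setProperty("indnet", "yes"); // deliberate typo

        // Stand-in for the real check; these names are illustrative assumptions.
        Set<String> known = new HashSet<>(Arrays.asList("indent", "wrap"));

        System.out.println(unknownNames(props, known::contains));
    }
}
```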

parseFile

public void parseFile(String filename)
Parses a property file.
Parameters:
filename - file name

setInCharEncoding

protected void setInCharEncoding(int encoding)

Deprecated. use setInCharEncodingName(String)

Setter for inCharEncoding.
Parameters:
encoding - The inCharEncoding to set.

setInCharEncodingName

protected void setInCharEncodingName(String encoding)
Setter for inCharEncodingName.
Parameters:
encoding - The inCharEncodingName to set.

setInOutEncodingName

protected void setInOutEncodingName(String encoding)
Setter for inOutCharEncodingName.
Parameters:
encoding - The CharEncodingName to set.

setOutCharEncoding

protected void setOutCharEncoding(int encoding)

Deprecated. use setOutCharEncodingName(String)

Setter for outCharEncoding.
Parameters:
encoding - The outCharEncoding to set.

setOutCharEncodingName

protected void setOutCharEncodingName(String encoding)
Setter for outCharEncodingName.
Parameters:
encoding - The outCharEncodingName to set.