|
Text.XML.HXT.Arrow.XmlState | Portability | portable | Stability | stable | Maintainer | Uwe Schmidt (uwe@fh-wedel.de) |
|
|
|
|
|
Description |
the interface for the basic state manipulation functions
|
|
Synopsis |
|
|
|
|
Data Types
|
|
|
state datatype consists of a system state and a user state
the user state is not fixed
|
|
|
|
predefined system state data type with all components for the
system functions, like trace, error handling, ...
|
|
|
|
The arrow type for stateful arrows
|
|
|
The arrow for stateful arrows with no user defined state
|
|
|
|
|
|
User State Manipulation
|
|
|
read the user defined part of the state
|
|
|
set the user defined part of the state
|
|
|
change the user defined part of the state
|
|
|
extend user state
Run an arrow with an extended user state component. The old component
is stored together with a new one in a pair, the arrow is executed with this
extended state, and the augmented state component is removed from the state
when the arrow has finished its execution
|
|
|
change the type of user state
This conversion is useful when running a state arrow with a differently
structured user state, e.g. with () when executing some IO arrows
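A minimal sketch of the user state arrows described above, assuming the hxt package is installed and that Text.XML.HXT.Core re-exports this module. The function name countItems is illustrative only and not part of the library. It runs an arrow with an Int user state via withOtherUserState, increments the state once per list element with changeUserState, and reads the result back with getUserState.

```haskell
import Text.XML.HXT.Core

-- Count the elements of a list in an Int user state.
-- countItems is a hypothetical helper, not part of HXT.
countItems :: [String] -> IO [Int]
countItems xs =
  runX $ withOtherUserState (0 :: Int) $
    -- perform runs the inner arrow for its state effect only
    perform (constL xs >>> changeUserState (\_ n -> n + 1))
    >>> getUserState
```

The surrounding arrow passed to runX still has the unit user state; the Int state exists only while the inner arrow is running.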
|
|
|
|
Run IO State arrows
|
|
|
apply an IOSArrow to an empty root node with initialState () as initial state
the main entry point for running a state arrow with IO
when running runX f, an empty XML root node is applied to f.
usually f will start with a constant arrow (ignoring the input), e.g. a Text.XML.HXT.Arrow.ReadDocument.readDocument arrow.
for usage see examples with Text.XML.HXT.Arrow.WriteDocument.writeDocument
if input has to be fed into the arrow, use runIOSLA as in runIOSLA f emptyX inputDoc
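A small sketch of the entry point, assuming the hxt package and its Text.XML.HXT.Core umbrella module. The helper name childNames is made up for illustration. The arrow starts with readString, which ignores the empty root node supplied by runX, and then selects the names of the elements below the document element.

```haskell
import Text.XML.HXT.Core

-- Names of the element children of the document element.
-- childNames is a hypothetical helper, not part of HXT.
childNames :: String -> IO [String]
childNames src =
  runX $ readString [withValidate no] src
         >>> getChildren   -- document root -> document element
         >>> getChildren   -- document element -> its children
         >>> isElem
         >>> getName
```

In GHCi, childNames "<a><b/><c/></a>" >>= print shows the collected names.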
|
|
Global System State Configuration and Access
|
|
|
|
|
store a string in global state under a given attribute name
|
|
|
remove an entry in global state, arrow input remains unchanged
|
|
|
read an attribute value from global state
|
|
|
read all attributes from global state
|
|
|
|
|
store an int value in global state
|
|
|
read an int value from global state, the first argument is the default value, e.g.
getSysAttrInt 0 myIntAttr
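A sketch of storing and reading an int attribute, assuming the hxt package and that Text.XML.HXT.Core re-exports setSysAttrInt and getSysAttrInt. The attribute names "maxDepth" and "noSuchAttr" and the name sysAttrDemo are invented for this example.

```haskell
import Text.XML.HXT.Core

-- Store an Int under a key, then read it back together with
-- a key that was never set (which falls back to the default 0).
-- sysAttrDemo and both attribute names are hypothetical.
sysAttrDemo :: IO [(Int, Int)]
sysAttrDemo =
  runX $ setSysAttrInt "maxDepth" 3
         >>> (getSysAttrInt 0 "maxDepth" &&& getSysAttrInt 0 "noSuchAttr")
```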
|
|
|
|
Error Handling
|
|
|
reset global error variable
|
|
|
set global error variable
|
|
|
read current global error status
|
|
|
raise the global error status level to that of the input tree
|
|
|
set the error message handler and the flag for collecting the errors
|
|
|
the default error message handler: error output to stderr
|
|
|
error message handler for collecting errors
|
|
|
error message handler for output to stderr and collecting
|
|
|
error message handler for ignoring errors
|
|
|
if error messages are collected by the error handler for
processing these messages by the calling application,
this arrow reads the stored messages and clears the error message store
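A sketch of collecting errors instead of printing them, assuming the hxt package and the Text.XML.HXT.Core umbrella module. The name collectDemo and the message text are illustrative. The arrow installs the collecting handler, issues one error, and then returns the global error status paired with the number of messages read from the store.

```haskell
import Text.XML.HXT.Core

-- Collect errors silently, issue one error, then inspect the state.
-- collectDemo is a hypothetical helper, not part of HXT.
collectDemo :: IO [(Int, Int)]
collectDemo =
  runX $ errorMsgCollect
         >>> issueErr "something went wrong"
         >>> (getErrStatus &&& (listA getErrorMessages >>> arr length))
```

The first component can be compared with c_err; the second is the count of stored messages, and the store is cleared once getErrorMessages has run.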
|
|
|
filter error messages from input trees and issue errors
|
|
|
generate a warning message
|
|
|
generate an error message
|
|
|
generate a fatal error message, e.g. document not found
|
|
|
Default exception handler: issue a fatal error message and fail.
The parameter can be used to specify where the error occurred
|
|
|
add the error level and the module where the error occurred
to the attributes of a document root node, and remove the children when the level is greater than or equal to c_err.
called by setDocumentStatusFromSystemState when the system state indicates an error
|
|
|
check whether the error level attribute in the system state
is set to error. In this case the children of the document root are
removed, and the module name where the error occurred and the error level are added as attributes with setDocumentStatus;
otherwise nothing is changed
|
|
|
check whether tree is a document root and the status attribute has a value less than c_err
|
|
Tracing
|
|
|
set the global trace level
|
|
|
read the global trace level
|
|
|
run an arrow with a given trace level, the old trace level is restored after the arrow execution
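A sketch of the scoped trace level, assuming the hxt package and Text.XML.HXT.Core. The name traceLevels and the message text are illustrative. The first component reads the level inside withTraceLevel, the second reads it again afterwards to show that the old level is back in place.

```haskell
import Text.XML.HXT.Core

-- Read the trace level inside and after a withTraceLevel block.
-- traceLevels is a hypothetical helper, not part of HXT.
traceLevels :: IO [(Int, Int)]
traceLevels =
  runX $ withTraceLevel 2 (traceMsg 1 "inside the block" >>> getTraceLevel)
         &&& getTraceLevel
```

The traceMsg call writes its message through the configured trace command, since level 1 does not exceed the level 2 set for the block.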
|
|
|
set the global trace command. This command does the trace output
|
|
|
access the command for trace output
|
|
|
apply a trace arrow and issue message to stderr
|
|
|
issue a string message as trace
|
|
|
trace the current value transferred in a sequence of arrows.
The value is formatted by a string conversion function. This is a substitute for
the old and less general traceString function
|
|
|
an old alias for traceValue
|
|
|
issue the source representation of a document if trace level >= 3
for better readability the source is formatted with indentDoc
|
|
|
issue the tree representation of a document if trace level >= 4
|
|
|
trace a main computation step
issue a message when trace level >= 1, issue document source if level >= 3, issue tree when level is >= 4
|
|
Document Base
|
|
|
set the base URI of a document, used e.g. for reading includes, e.g. external entities,
the input must be an absolute URI
|
|
|
read the base URI from the global state
|
|
|
change the base URI with a possibly relative URI, can be used for
evaluating the xml:base attribute. Returns the new absolute base URI.
Fails, if input is not parsable with parseURIReference
see also: setBaseURI, mkAbsURI
|
|
|
set the default base URI. If the parameter is null, the system base ( file:///<cwd>/ ) is used,
else the parameter itself. Must be called before any document is read
|
|
|
get the default base URI
|
|
|
remember base uri, run an arrow and restore the base URI, used with external entity substitution
|
|
URI Manipulation
|
|
|
compute the absolute URI for a given URI and a base URI
|
|
|
arrow variant of expandURIString, fails if expandURIString returns Nothing
|
|
|
arrow for expanding an input URI into an absolute URI using global base URI, fails if input is not a legal URI
|
|
|
arrow for computing the fragment component of a URI, fails if input is not a legal URI
|
|
|
arrow for computing the path component of a URI, fails if input is not a legal URI
|
|
|
arrow for selecting the port number of the URI without leading ':', fails if input is not a legal URI
|
|
|
arrow for computing the query component of a URI, fails if input is not a legal URI
|
|
|
arrow for selecting the registered name (host) of the URI, fails if input is not a legal URI
|
|
|
arrow for selecting the scheme (protocol) of the URI, fails if input is not a legal URI.
See Network.URI for URI components
|
|
|
arrow for selecting the user info of the URI without trailing '@', fails if input is not a legal URI
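A sketch of the URI helpers in this section, assuming the hxt package, that Text.XML.HXT.Core re-exports them, and that the component selectors are polymorphic list arrows usable with runLA. The names absUri and uriParts and the example.org URIs are illustrative.

```haskell
import Text.XML.HXT.Core

-- Resolve a relative URI against an absolute base URI (pure function).
absUri :: Maybe String
absUri = expandURIString "b.xml" "http://example.org/a/"

-- Select scheme and port of a URI string; the result list is
-- empty when the input is not a legal URI.
-- uriParts is a hypothetical helper, not part of HXT.
uriParts :: String -> [(String, String)]
uriParts = runLA (getSchemeFromURI &&& getPortFromURI)
```

Because the selectors fail on illegal input, uriParts returns an empty list for a string such as "not a uri" rather than raising an exception.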
|
|
Mime Type Handling
|
|
|
read the system mimetype table
|
|
|
set the table mapping of file extensions to mime types in the system state
Default table is defined in Text.XML.HXT.DOM.MimeTypeDefaults.
This table is used when reading local files (file: protocol) to determine the mime type
|
|
|
set the table mapping of file extensions to mime types by an external config file
The config file must follow the conventions of /etc/mime.types on a Debian Linux system,
that means all empty lines and all lines starting with a # are ignored. The other lines
must consist of a mime type followed by a possibly empty list of extensions.
The list of extensions and mime types overwrites the default list in the system state
of the IOStateArrow
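The file format described above can be sketched with a few lines of plain Haskell. This is not the parser HXT uses, only an illustration of the line conventions using base; the name parseMimeTypes is made up for this example.

```haskell
import Data.Char (isSpace)

-- Parse mime.types-style text into (extension, mime type) pairs.
-- Empty lines and lines starting with '#' are ignored; a mime type
-- without extensions contributes no pairs.
-- parseMimeTypes is a hypothetical helper, not HXT's own parser.
parseMimeTypes :: String -> [(String, String)]
parseMimeTypes src =
  [ (ext, mimeType)
  | line <- lines src
  , not (isIgnored line)
  , (mimeType : exts) <- [words line]
  , ext <- exts
  ]
  where
    isIgnored l = all isSpace l || take 1 l == "#"
```

Keying the result by extension matches the lookup direction needed when a local file name has to be mapped to a mime type.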
|
|
|
|
|
|
|
|
|
|
|
withSysAttr key value : store an arbitrary key value pair in system state
|
|
|
withCanonicalize yes/no : read option, canonicalize document, default is yes
|
|
|
Configure compression and decompression for binary serialization/deserialization.
First component is the compression function applied after serialization,
second the decompression applied before deserialization.
|
|
|
withCheckNamespaces yes/no: read option, check namespaces, default is no
|
|
|
withDefaultBaseURI URI , input option, set the default base URI
This option can be useful when parsing documents from stdin or contained in a string, and interpreting
relative URIs within the document
|
|
|
withEncodingErrors yes/no : input option, ignore all encoding errors, default is no
|
|
|
withErrors yes/no : system option for suppressing error messages, default is no
|
|
|
Force a given mime type for all file contents.
The mime type for file access will then not be computed by looking into a mime.types file
|
|
|
withIgnoreNoneXmlContents yes/no : input option, ignore document contents of non-XML/HTML documents.
This option can be useful for implementing crawler-like applications, e.g. a URL checker.
In those cases net traffic can be reduced.
|
|
|
withIndent yes/no : output option, indent document before output, default is no
|
|
|
withInputEncoding encodingName : input option
Set default document encoding (utf8, isoLatin1, usAscii, iso8859_2, ... , iso8859_16, ...).
Only XML, HTML and text documents are decoded,
default decoding for XML/HTML is utf8, for text iso latin1 (no decoding).
|
|
|
|
|
|
|
withMimeTypeFile filename : input option,
set the mime type table for file: documents by given file.
The format of this config file must be in the syntax of a debian linux "mime.types" config file
|
|
|
|
|
|
|
withOutputEncoding encoding , output option,
default is the default input encoding or utf8, if input encoding is not set
|
|
|
withOutputXML : output option, default writing
Default is writing XML: quote special XML chars >,<,",',& where necessary,
add XML processing instruction
and encode document with respect to withOutputEncoding
|
|
|
Write XHTML: quote all special XML chars, use HTML entity refs or char refs for non-ASCII chars
|
|
|
Write XML: quote only special XML chars, don't substitute chars by HTML entities, and don't generate empty elements for HTML elements
which may contain any contents, e.g. <script src=...></script> instead of <script src=... />
|
|
|
suppresses all char and entity substitution
|
|
|
withParseByMimeType yes/no : read option, select the parser by the mime type of the document
(pulled out of the HTTP header).
When the mime type is set to "text/html"
the configured HTML parser is taken, when it's set to
"text/xml" or "text/xhtml" the configured XML parser is taken.
If the mime type is something else, no further processing is performed,
the contents is given back to the application in form of a single text node.
If the default document encoding is set to isoLatin1, this even enables processing
of arbitrary binary data.
|
|
|
withParseHTML yes/no: read option, use HTML parser, default is no (use XML parser)
|
|
|
withPreserveComment yes/no : read option, preserve comments during canonicalization, default is no
|
|
|
withProxy "host:port" : input option, configure a proxy for HTTP access, e.g. www-cache:3128
|
|
|
withRedirect yes/no : input option, automatically follow redirected URIs, default is yes
|
|
|
withRemoveWS yes/no : read and write option, remove all whitespace, used for document indentation, default is no
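A sketch of how such options are passed as a configuration list, assuming the hxt package and Text.XML.HXT.Core. The helper name grandChildCount and the sample input are illustrative. It counts the nodes below the document element, so the effect of withRemoveWS on whitespace-only text nodes becomes visible.

```haskell
import Text.XML.HXT.Core

-- Count the child nodes of the document element for a fixed input
-- that contains whitespace around its only child element.
-- grandChildCount is a hypothetical helper, not part of HXT.
grandChildCount :: SysConfigList -> IO Int
grandChildCount cfg =
  length <$> runX (readString cfg "<a> <b/> </a>"
                   >>> getChildren
                   >>> getChildren)
```

Calling it once with [withValidate no, withRemoveWS yes] and once with [withValidate no] allows the two counts to be compared.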
|
|
|
|
|
|
|
withStrictInput yes/no : input option, input of file and HTTP contents is read eagerly, default is no
|
|
|
|
|
withTrace level : system option, set the trace level, (0..4)
|
|
|
withValidate yes/no: read option, validate document against DTD, default is yes
|
|
|
withWarnings yes/no : system option, issue warnings during reading, HTML parsing and processing,
default is yes
|
|
Produced by Haddock version 2.6.1 |