Raptor RDF Parser Toolkit - Raptor API


NAME

libraptor − Raptor RDF parser toolkit library

SYNOPSIS

#include <raptor.h>

raptor_init();
raptor_parser *p=raptor_new_parser("rdfxml");
raptor_set_statement_handler(p,NULL,print_statements);
raptor_uri *file_uri=raptor_new_uri("http://example.org/");
raptor_parse_file(p,file_uri,base_uri);
raptor_parse_uri(p,uri,NULL);
raptor_free_parser(p);
raptor_free_uri(file_uri);
raptor_finish();

cc file.c -lraptor

DESCRIPTION

The Raptor library provides a high-level interface to a set of RDF format parsers, presently RDF/XML, N-Triples, Turtle and RSS Tag Soup. The parsers turn the syntax into a sequence of RDF triples/statements. The RDF/XML parser uses either the expat or the libxml XML parser to provide the SAX event stream. The library functions are arranged in an object-oriented style with constructors, destructors and method calls. The statements and error messages are delivered via callback functions.

Raptor contains a URI-reference parsing and resolving (not retrieval) class (raptor_uri) sufficient for dealing with URI-references inside RDF. This functionality is modular and can be transparently replaced with another existing and compatible URI implementation.

It also provides a URI-retrieval class (raptor_www) that wraps an existing library such as libcurl, libxml2 or BSD libfetch to provide full or partial retrieval of data from URIs.

Raptor uses Unicode strings for RDF literals and URIs and preserves them throughout the library. It uses the UTF-8 encoding of Unicode at the API for passing in or returning Unicode strings. It is intended that the preservation of Unicode for URIs will support Internationalized Resource Identifiers (IRIs) which are still under development and standardisation.

LIBRARY INITIALISATION AND CLEANUP

raptor_init()

raptor_finish()

Initialise and clean up the library. raptor_init must be called before any raptor_parser or raptor_uri is created or used, and raptor_finish once the library is no longer needed.

PARSER CONSTRUCTORS

raptor_parser* raptor_new_parser(name)

Create a new raptor parser object for the parser with name name, currently one of "rdfxml", "ntriples", "turtle" or "rss-tag-soup" (the RSS Tag Soup parser).

raptor_parser* raptor_new_parser_for_content(raptor_uri *uri, const char *mime_type, const unsigned char *buffer, size_t len, const unsigned char *identifier)

Create a new raptor parser object for a syntax identified by URI uri, MIME type mime_type, some initial content buffer of size len or content with identifier identifier. See the raptor_guess_parser_name description for further details.

PARSER DESTRUCTORS

void raptor_free_parser(raptor_parser *parser)

Destroy a Raptor parser object.

PARSER MESSAGE CALLBACK METHODS

Several methods can be registered for the parser that return a variable-argument message in the style of printf(3). These also return a raptor_locator that can contain the URI, file, line, column and byte counts of the place the message is about. This structure can be used with the raptor_format_locator and raptor_print_locator functions below, or its fields, which are defined in raptor.h, can be used directly.

void raptor_set_fatal_error_handler(raptor_parser* parser, void *user_data, raptor_message_handler handler)

Set fatal error handler callback.

void raptor_set_error_handler(raptor_parser* parser, void *user_data, raptor_message_handler handler)

Set non-fatal error handler callback.

void raptor_set_warning_handler(raptor_parser* parser, void *user_data, raptor_message_handler handler)

Set warning message handler callback.

PARSER STATEMENT CALLBACK METHOD

The parser allows the registration of a callback function to return the statements to the application.

void raptor_set_statement_handler(raptor_parser* parser, void *user_data, raptor_statement_handler handler)

Set the statement callback function for the parser. The raptor_statement structure is defined in raptor.h and includes fields for the subject, predicate, object of the statements along with their types and for literals, language and datatype.

PARSER PARSING METHODS

These methods perform the entire parsing in one call. Statements, warnings, errors and fatal errors are delivered via the registered statement, error etc. handler functions.

In all of these methods, the base URI is required for the RDF/XML parser (name "rdfxml") and the Turtle parser (name "turtle"). The N-Triples parser (name "ntriples") and the RSS Tag Soup parser (name "rss-tag-soup") do not use it.

int raptor_parse_file(raptor_parser* parser, raptor_uri *uri, raptor_uri *base_uri)

Parse the given filename (a URI like file:filename) according to the optional base URI base_uri. If uri is NULL, read from standard input and base_uri is then required.

int raptor_parse_file_stream(raptor_parser* parser, FILE* stream, const char* filename, raptor_uri *base_uri)

Parse the given C FILE* stream according to the base URI base_uri (required). filename is optional and if given, is used for error messages via the raptor_locator structure.

int raptor_parse_uri(raptor_parser* parser, raptor_uri* uri, raptor_uri *base_uri)

Parse the URI according to the base URI base_uri, or NULL if not needed. If no base URI is given, the uri is used. This method depends on the raptor_www subsystem (see WWW Class section below) and an existing underlying URI retrieval implementation such as libcurl, libxml or BSD libfetch to retrieve the content.

PARSER CHUNKED PARSING METHODS

These methods perform the parsing in parts by working on multiple chunks of memory passed by the application. Statements, warnings, errors and fatal errors are delivered via the registered statement, error etc. handler functions.

int raptor_start_parse(raptor_parser* parser, const char *uri)

Start a parse of chunked content with the base URI uri, or NULL if not needed. The base URI is required for the RDF/XML parser (name "rdfxml") and the Turtle parser (name "turtle"). The N-Triples parser (name "ntriples") and the RSS Tag Soup parser (name "rss-tag-soup") do not use it.

int raptor_parse_chunk(raptor_parser* parser, const unsigned char *buffer, size_t len, int is_end)

Parse the memory at buffer of size len returning statements via the statement handler callback. If is_end is non-zero, it indicates the end of the parsing stream. This method can only be called after raptor_start_parse.

PARSER UTILITY METHODS

const char* raptor_get_mime_type(raptor_parser* rdf_parser)

Return the MIME type for the parser.

void raptor_set_parser_strict(raptor_parser *parser, int is_strict)

Set the parser to strict (is_strict not zero) or lax (default) mode. The detail of the strictness can be controlled by raptor_set_feature.

int raptor_set_feature(raptor_parser *parser, raptor_feature feature, int value)

Set a parser feature feature to a particular value. Returns non 0 on failure or if the feature is unknown. The currently defined features are:

  Feature                                   Values
  RAPTOR_FEATURE_SCANNING                   Boolean (non 0 true)
  RAPTOR_FEATURE_ASSUME_IS_RDF              Boolean (non 0 true)
  RAPTOR_FEATURE_ALLOW_NON_NS_ATTRIBUTES    Boolean (non 0 true)
  RAPTOR_FEATURE_ALLOW_OTHER_PARSETYPES     Boolean (non 0 true)
  RAPTOR_FEATURE_ALLOW_BAGID                Boolean (non 0 true)
  RAPTOR_FEATURE_ALLOW_RDF_TYPE_RDF_LIST    Boolean (non 0 true)
  RAPTOR_FEATURE_NORMALIZE_LANGUAGE         Boolean (non 0 true)
  RAPTOR_FEATURE_NON_NFC_FATAL              Boolean (non 0 true)

If the scanning feature is true (default false), then the RDF/XML parser will look for embedded rdf:RDF elements inside the XML content, and not require that the XML start with an rdf:RDF root element.

If the assume_is_rdf feature is true (default false), then the RDF/XML parser will assume the content is RDF/XML, not require that rdf:RDF root element, and immediately interpret the content as RDF/XML.

If the allow_non_ns_attributes feature is true (default true), then the RDF/XML parser will allow non-XML namespaced attributes to be accepted as well as rdf: namespaced ones. For example, ’about’ and ’ID’ will be interpreted as if they were rdf:about and rdf:ID respectively.

If the allow_other_parsetypes feature is true (default true) then the RDF/XML parser will allow unknown parsetypes to be present and will pass them on to the user. Unimplemented at present.

If the allow_bagid feature is true (default true) then the RDF/XML parser will support the rdf:bagID attribute that was removed from the RDF/XML language when it was revised. This support may be removed in future.

If the allow_rdf_type_rdf_list feature is true (default false) then the RDF/XML parser will generate the idList rdf:type rdf:List triple in the handling of rdf:parseType="Collection". This triple was removed during the revising of RDF/XML after collections were initially added.

If the normalize_language feature is true (default true) then XML language values such as from xml:lang will be normalized to lowercase.

If the non_nfc_fatal feature is true (default false) then illegal Unicode Normal Form C in literals will give a fatal error, otherwise it gives a warning.

int raptor_get_feature(raptor_parser* parser, raptor_feature feature)

Get the value of a parser feature. The allowed features are available via raptor_features_enumerate.

raptor_locator* raptor_get_locator(raptor_parser* rdf_parser)

Return the current raptor_locator object for the parser. This is a public structure defined in raptor.h that can be used directly, or formatted via raptor_print_locator.

const char* raptor_get_name(raptor_parser *parser)

Return the short name of the parser as a string.

const char* raptor_get_label(raptor_parser *parser)

Return a string label for the parser.

void raptor_set_default_generate_id_parameters(raptor_parser* rdf_parser, char *prefix, int base)

Control the default method for generation of IDs for blank nodes and bags. The method uses a short string prefix and an integer base to generate the identifier which is not guaranteed to be a strict concatenation. If prefix is NULL, the default is used. If base is less than 1, it is initialised to 1.
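
As an illustration of the prefix and base parameters, here is a minimal sketch of a counter-based identifier generator. The names id_generator, id_generator_init and id_generator_next and the default prefix "genid" are assumptions made for this example, not part of the raptor API, and the sketch uses plain concatenation even though the library does not guarantee that.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical generator state: a short prefix and the next integer to use. */
typedef struct {
  const char *prefix;
  int next;
} id_generator;

void id_generator_init(id_generator *g, const char *prefix, int base)
{
  g->prefix = prefix ? prefix : "genid"; /* assumed default when prefix is NULL */
  g->next = (base < 1) ? 1 : base;       /* a base below 1 starts at 1 */
}

/* Return a newly allocated identifier, or NULL on allocation failure.
 * The caller frees the result. */
char *id_generator_next(id_generator *g)
{
  int len = snprintf(NULL, 0, "%s%d", g->prefix, g->next);
  char *id = malloc((size_t)len + 1);
  if (!id)
    return NULL;
  snprintf(id, (size_t)len + 1, "%s%d", g->prefix, g->next);
  g->next++;
  return id;
}
```

An application needing a different scheme would register its own callback with raptor_set_generate_id_handler instead.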

void raptor_set_generate_id_handler(raptor_parser* parser, void *user_data, raptor_generate_id_handler handler)

Allow full customisation of the generated IDs by setting a callback handler and associated user_data that is called whenever a blank node or bag identifier is required.

PARSER UTILITY FUNCTIONS

int raptor_parsers_enumerate(const unsigned int counter, const char **name, const char **label)

Return the parser name/label for a parser with a given integer counter, returning non-zero if no such parser at that offset exists. The counter should start from 0 and be incremented by 1 until the function returns non-zero.
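
The counter loop described above can be sketched without the library as follows. demo_parsers_enumerate is a hypothetical stand-in that follows the same contract (fill in the optional outputs, return non-zero past the end); the table contents and their order are illustrative only.

```c
#include <stddef.h>

/* Hypothetical stand-in for an enumerate function: fills in name and label
 * when they are non-NULL and returns non-zero once counter is past the end. */
int demo_parsers_enumerate(unsigned int counter,
                           const char **name, const char **label)
{
  static const char *const names[] =
    { "rdfxml", "ntriples", "turtle", "rss-tag-soup" };
  static const char *const labels[] =
    { "RDF/XML", "N-Triples", "Turtle", "RSS Tag Soup" };

  if (counter >= sizeof(names) / sizeof(names[0]))
    return 1;
  if (name)
    *name = names[counter];
  if (label)
    *label = labels[counter];
  return 0;
}

/* Start at 0 and increment by 1 until the enumerator returns non-zero. */
unsigned int demo_count_parsers(void)
{
  unsigned int i = 0;
  while (!demo_parsers_enumerate(i, NULL, NULL))
    i++;
  return i;
}
```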

int raptor_syntaxes_enumerate(const unsigned int counter, const char **name, const char **label, const char **mime_type, const unsigned char **uri_string)

Return the name, label, MIME type and URI string (all optional) for a parser syntax with a given integer counter, returning non-zero if no such syntax parser at that offset exists. The counter should start from 0 and be incremented by 1 until the function returns non-zero.

int raptor_features_enumerate(const raptor_feature feature, const char **name, raptor_uri **uri, const char **label)

Return the name, URI, string label (all optional) for a parser feature, returning non-zero if no such feature exists.

int raptor_syntax_name_check(const char *name)

Check name is a known syntax name.

const char* raptor_guess_parser_name(raptor_uri *uri, const char *mime_type, const unsigned char *buffer, size_t len, const unsigned char *identifier)

Guess a parser name for a syntax identified by URI uri, MIME type mime_type, some initial content buffer of size len or with content identifier identifier. All of these parameters are optional and only used if not NULL. The parser is chosen by scoring the hints that are given.

raptor_feature raptor_feature_from_uri(raptor_uri *uri)

Turn a URI uri into a raptor feature identifier, or <0 if the feature is unknown.

STATEMENT UTILITY FUNCTIONS

void raptor_print_statement(const raptor_statement* const statement, FILE *stream)

Print a raptor statement object in a simple format for debugging only. The format of this output is not guaranteed to remain the same between releases.

void raptor_print_statement_as_ntriples(const raptor_statement* statement, FILE *stream)

Print a raptor statement object in N-Triples format, using all the escapes as defined in http://www.w3.org/TR/rdf-testcases/#ntriples

char* raptor_statement_part_as_counted_string(const void *term, raptor_identifier_type type, raptor_uri* literal_datatype, const unsigned char *literal_language, size_t* len_p)

char* raptor_statement_part_as_string(const void *term, raptor_identifier_type type, raptor_uri* literal_datatype, const unsigned char *literal_language)

Turn part of a raptor statement into N-Triples format, using all the escapes as defined in http://www.w3.org/TR/rdf-testcases/#ntriples. The part (subject, predicate, object) of the raptor_statement is passed in as term, and the part type (subject_type, predicate_type, object_type) is passed in as type. When the part is a literal, the literal_datatype and literal_language fields are set (usually from object_datatype and object_literal_language); otherwise they are NULL.

If raptor_statement_part_as_counted_string is used, the length of the returned string is stored in *len_p if not NULL.

LOCATOR UTILITY FUNCTIONS

int raptor_format_locator(char *buffer, size_t length, raptor_locator* locator)

This method takes a raptor_locator object as passed to an error, warning or other handler callback and formats it into the buffer of size length bytes. If buffer is NULL or length is insufficient for the size of the formatted locator, returns the number of additional bytes required in the buffer to write the locator.

In particular, if this form is used:

  length=raptor_format_locator(NULL, 0, locator)

it will return in length the size of a buffer that can be allocated for locator, and a second call will perform the formatting:

  raptor_format_locator(buffer, length, locator)
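
The size-then-format idiom can be sketched in isolation as below. demo_locator, demo_format_locator and demo_locator_to_string are hypothetical names, the output format is invented, and counting the terminating NUL in the returned size is an assumption of this sketch rather than documented raptor behaviour.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cut-down locator; the real raptor_locator has more fields. */
typedef struct {
  const char *file;
  int line;
  int column;
} demo_locator;

/* Return 0 when the text fits in buffer; otherwise return the number of
 * bytes (terminating NUL included) that the buffer must provide. */
int demo_format_locator(char *buffer, size_t length, const demo_locator *loc)
{
  int needed = snprintf(NULL, 0, "%s:%d column %d",
                        loc->file, loc->line, loc->column) + 1;
  if (!buffer || length < (size_t)needed)
    return needed;
  snprintf(buffer, length, "%s:%d column %d",
           loc->file, loc->line, loc->column);
  return 0;
}

/* First call asks for the size, second call does the formatting.
 * The caller frees the result. */
char *demo_locator_to_string(const demo_locator *loc)
{
  int length = demo_format_locator(NULL, 0, loc);
  char *buffer = malloc((size_t)length);
  if (buffer)
    demo_format_locator(buffer, (size_t)length, loc);
  return buffer;
}
```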

void raptor_print_locator(FILE *stream, raptor_locator* locator)

This method takes a raptor_locator object as passed to an error, warning or other handler callback, formats and prints it to the given stdio stream.

N-TRIPLES UTILITY FUNCTIONS

void raptor_print_ntriples_string(FILE* stream, const char* string, const char delim)

This is a standalone function that prints the given string according to N-Triples escaping rules, expecting to be delimited by the character delim which is usually either " or <
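
To make the escaping rules concrete, here is a rough sketch that writes into a caller-supplied buffer instead of a stdio stream. demo_ntriples_escape is a hypothetical name. The sketch covers only ASCII input (backslash, the delimiter, tab, newline, carriage return and other control characters) and rejects non-ASCII bytes, which a full implementation would decode and emit as \u or \U escapes.

```c
#include <stdio.h>
#include <string.h>

/* Return the number of bytes written (excluding the NUL), or -1 if out is
 * too small or the input has a non-ASCII byte, which this sketch skips. */
int demo_ntriples_escape(const char *in, char delim, char *out, size_t out_size)
{
  size_t used = 0;

  for (; *in; in++) {
    unsigned char c = (unsigned char)*in;
    char piece[8];
    size_t n;

    if (c >= 0x80)
      return -1;
    if (c == '\\' || (delim && c == (unsigned char)delim))
      snprintf(piece, sizeof(piece), "\\%c", c);      /* \\ or \" etc. */
    else if (c == '\t')
      strcpy(piece, "\\t");
    else if (c == '\n')
      strcpy(piece, "\\n");
    else if (c == '\r')
      strcpy(piece, "\\r");
    else if (c < 0x20 || c == 0x7F)
      snprintf(piece, sizeof(piece), "\\u%04X", (unsigned int)c);
    else
      snprintf(piece, sizeof(piece), "%c", c);        /* printable ASCII */

    n = strlen(piece);
    if (used + n + 1 > out_size)
      return -1;
    memcpy(out + used, piece, n);
    used += n;
  }
  if (used + 1 > out_size)
    return -1;
  out[used] = '\0';
  return (int)used;
}
```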

const char* raptor_ntriples_term_as_string (raptor_ntriples_term_type term)

XML UTILITY FUNCTIONS

size_t raptor_xml_escape_string(const unsigned char *string, size_t len, unsigned char *buffer, size_t length, char quote, raptor_message_handler error_handler, void *error_data)

Apply the XML escaping rules to the string given in (string, len) into the buffer of size length. If quote is given, the escaped content is for an XML attribute and the appropriate quote character is escaped; otherwise the content is for XML element content (CDATA). The error_handler method along with error_data allow errors to be reported. If buffer is NULL, returns the size of the buffer required to hold the escaped string. Otherwise the return value is the number of bytes used, or 0 on failure.
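
The query-the-size-then-escape behaviour can be sketched as follows. demo_xml_escape is a hypothetical simplification: it has no error handler, does not validate the input encoding, does not write a terminating NUL, and its exact set of escaped characters is an assumption rather than raptor's.

```c
#include <stddef.h>
#include <string.h>

/* quote is '"' or '\'' for attribute values, or 0 for element content.
 * With buffer == NULL the required size is returned; otherwise the number
 * of bytes written, or 0 if the buffer is too small. */
size_t demo_xml_escape(const char *string, size_t len,
                       char *buffer, size_t length, char quote)
{
  size_t used = 0;
  size_t i;

  for (i = 0; i < len; i++) {
    char c = string[i];
    const char *rep = NULL;
    size_t n;

    if (c == '&')
      rep = "&amp;";
    else if (c == '<')
      rep = "&lt;";
    else if (c == '>')
      rep = "&gt;";
    else if (quote && c == quote)
      rep = (quote == '"') ? "&quot;" : "&apos;";

    n = rep ? strlen(rep) : 1;
    if (buffer) {
      if (used + n > length)
        return 0;
      if (rep)
        memcpy(buffer + used, rep, n);
      else
        buffer[used] = c;
    }
    used += n;
  }
  return used;
}
```

Note that an empty input also yields 0, so a caller of this sketch has to treat zero-length strings separately.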

MEMORY UTILITY FUNCTIONS

void raptor_free_memory(void *ptr)

Free memory allocated inside raptor.

UNICODE UTILITY FUNCTIONS

int raptor_unicode_char_to_utf8(unsigned long c, unsigned char *output)

Turn the Unicode character c into UTF-8 bytes in output, which must be of sufficient size. Returns the number of bytes encoded or <0 on failure.
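
For reference, the standard UTF-8 encoding can be sketched as below. demo_unicode_char_to_utf8 is a hypothetical name that mirrors the signature above; rejecting surrogates and values above U+10FFFF is a choice of this sketch and may differ from raptor's own checks.

```c
/* Encode code point c into output (which needs room for up to 4 bytes).
 * Return the number of bytes written, or -1 for an invalid code point. */
int demo_unicode_char_to_utf8(unsigned long c, unsigned char *output)
{
  if (c < 0x80) {
    output[0] = (unsigned char)c;
    return 1;
  }
  if (c < 0x800) {
    output[0] = (unsigned char)(0xC0 | (c >> 6));
    output[1] = (unsigned char)(0x80 | (c & 0x3F));
    return 2;
  }
  if (c < 0x10000) {
    if (c >= 0xD800 && c <= 0xDFFF)
      return -1; /* surrogates are not characters */
    output[0] = (unsigned char)(0xE0 | (c >> 12));
    output[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    output[2] = (unsigned char)(0x80 | (c & 0x3F));
    return 3;
  }
  if (c < 0x110000) {
    output[0] = (unsigned char)(0xF0 | (c >> 18));
    output[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    output[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    output[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
  }
  return -1; /* beyond the Unicode range */
}
```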

int raptor_utf8_to_unicode_char(unsigned long *output, const unsigned char *input, int length)

Decode a sequence of UTF-8 bytes in input of size length into a Unicode character in output, returning the number of bytes used or <0 on failure.
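
The decoding direction can be sketched the same way. demo_utf8_to_unicode_char is a hypothetical name that mirrors the signature above; the rejection of overlong forms, surrogates and truncated input is how this sketch defines failure and is not taken from the raptor sources.

```c
/* Decode one character from input (length bytes available) into *output.
 * Return the number of bytes consumed, or -1 on malformed input. */
int demo_utf8_to_unicode_char(unsigned long *output,
                              const unsigned char *input, int length)
{
  /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
  static const unsigned long min_value[] = { 0, 0, 0x80, 0x800, 0x10000 };
  unsigned long c;
  int need;
  int i;

  if (length < 1)
    return -1;
  if (input[0] < 0x80) {
    c = input[0];
    need = 1;
  } else if ((input[0] & 0xE0) == 0xC0) {
    c = input[0] & 0x1F;
    need = 2;
  } else if ((input[0] & 0xF0) == 0xE0) {
    c = input[0] & 0x0F;
    need = 3;
  } else if ((input[0] & 0xF8) == 0xF0) {
    c = input[0] & 0x07;
    need = 4;
  } else {
    return -1; /* stray continuation byte or invalid lead byte */
  }
  if (length < need)
    return -1; /* sequence is truncated */
  for (i = 1; i < need; i++) {
    if ((input[i] & 0xC0) != 0x80)
      return -1;
    c = (c << 6) | (input[i] & 0x3F);
  }
  if (c < min_value[need] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return -1; /* overlong form, out of range, or surrogate */
  *output = c;
  return need;
}
```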

MISCELLANEOUS UTILITY FUNCTIONS

char* raptor_vsnprintf(const char *message, va_list arguments)

Compatibility wrapper around vsnprintf.

STATIC VARIABLES

There are several read-only static variables in the raptor library:

const char * const raptor_short_copyright_string

Short copyright string, suitable for one line.

const char * const raptor_copyright_string

Full copyright over several lines including URLs.

const char * const raptor_version_string

The version as a string.

const unsigned int raptor_version_major

The major version number as an integer.

const unsigned int raptor_version_minor

The minor version number as an integer.

const unsigned int raptor_version_release

The release version number as an integer.

const unsigned int raptor_version_decimal

The version number as a single decimal.

URI CLASS

Raptor has a raptor_uri class that must be used for manipulating and passing URI references. The default internal implementation uses char* strings for URIs, manipulating them and constructing them. This URI implementation can be replaced by any other that provides the equivalent functionality, using the raptor_uri_set_handler function.

URI CONSTRUCTORS

There are several constructors for raptor_uri to build them from char* strings and existing raptor_uri objects.

raptor_uri* raptor_new_uri(const unsigned char* uri_string)

Create a raptor URI from a string URI-reference uri_string.

raptor_uri* raptor_new_uri_from_uri_local_name(raptor_uri* uri, const unsigned char* local_name)

Create a raptor URI from a string URI-reference local_name relative to an existing URI-reference. This performs concatenation of the local_name to the uri and not relative URI resolution, which is done by the raptor_new_uri_relative_to_base constructor.

raptor_uri* raptor_new_uri_relative_to_base(raptor_uri* base_uri, const unsigned char* uri_string)

Create a raptor URI from a string URI-reference uri_string using relative URI resolution to the base_uri.

raptor_uri* raptor_new_uri_from_id(raptor_uri* base_uri, const unsigned char* id)

Create a raptor URI from a string RDF ID id concatenated to the base_uri base URI.

raptor_uri* raptor_new_uri_for_rdf_concept(const char* name)

Create a raptor URI for the RDF namespace concept name.

raptor_uri* raptor_new_uri_for_xmlbase(raptor_uri* old_uri)

Create a raptor URI suitable for use with xml:base (the fragment is thrown away).

URI DESTRUCTOR

void raptor_free_uri(raptor_uri* uri)

Destroy a raptor URI object.

URI METHODS

int raptor_uri_equals(raptor_uri* uri1, raptor_uri* uri2)

Return non-zero if the given URIs are equal.

raptor_uri* raptor_uri_copy(raptor_uri* uri)

Return a copy of the given raptor URI uri.

unsigned char* raptor_uri_as_counted_string(raptor_uri *uri, size_t* len_p)

unsigned char* raptor_uri_as_string(raptor_uri* uri)

Return a shared pointer to a string representation of the given raptor URI uri. This string is shared and must not be freed. If raptor_uri_as_counted_string is used, the length of the returned string is stored in *len_p if not NULL.

URI UTILITY FUNCTIONS

void raptor_uri_resolve_uri_reference (const unsigned char* base_uri, const unsigned char* reference_uri, unsigned char* buffer, size_t length)

This is a standalone function that resolves the relative URI reference_uri against the base URI base_uri according to the URI resolution rules in RFC2396. The resulting URI is stored in buffer which is of length bytes. If this is too small, no work will be done.

char *raptor_uri_filename_to_uri_string(const unsigned char* filename)

This is a standalone function that turns a local filename (Windows or Unix style as appropriate for the platform) into a URI string (file:). The returned string must be freed by the caller.

char *raptor_uri_uri_string_to_filename(const unsigned char* uri_string)

This is a standalone function that turns a URI string that represents a local filename (file:) into a filename. The returned string must be freed by the caller.

int raptor_uri_is_file_uri(const unsigned char* uri_string)

Returns non-zero if the given URI string represents a filename, that is, if it is a file: URI.

URI CLASS IMPLEMENTATION

void raptor_uri_set_handler(raptor_uri_handler *handler, void *context)

Change the URI class implementation to the functions provided by the handler URI implementation. The context user data is passed in to the handler URI implementation calls.

void raptor_uri_get_handler(raptor_uri_handler **handler, void **context)

Return the current raptor URI class implementation handler and context.

WWW CLASS

This is a small wrapper class around existing WWW libraries in order to provide HTTP GET or better URI retrieval for Raptor. It is not intended to be a general purpose WWW retrieval interface.

WWW CLASS INITIALISATION AND CLEANUP

void raptor_www_init(void)

void raptor_www_finish(void)

Initialise or terminate the raptor_www infrastructure. raptor_www_init and raptor_www_finish are called by raptor_init and raptor_finish respectively; otherwise each must be called once.

NOTE

Several of the WWW library implementations require once-only initialisation and termination functions to be called. However, raptor cannot determine whether this has already been done before the library is initialised in raptor_www_init or terminated in raptor_www_finish, so it always performs it. This can be changed by raptor_www_no_www_library_init_finish.

void raptor_www_no_www_library_init_finish(void)

If this is called before raptor_www_init, it will not call the underlying WWW library global initialise or terminate functions. The application code must perform both operations.

For example with curl, after this function is called, neither curl_global_init nor curl_global_cleanup will be called during raptor_www_init or raptor_www_finish respectively.

WWW CONSTRUCTORS

raptor_www *raptor_www_new(void)

raptor_www *raptor_www_new_with_connection(void* connection)

Create a raptor WWW object capable of URI retrieval. If connection is given, it must match the connection object of the underlying WWW implementation. At present, this is only for libcurl, and allows you to re-use an existing curl handle, or use one which has been set up with some desired qualities.

WWW DESTRUCTOR

void raptor_www_free(raptor_www *www)

Destroy a raptor WWW object.

WWW METHODS

void raptor_www_set_user_agent(raptor_www *www, const char *user_agent)

Set the user agent, for HTTP requests typically.

void raptor_www_set_proxy(raptor_www *www, const char *proxy)

Set the HTTP proxy - usually a string of the form http://server:port

void raptor_www_set_write_bytes_handler(raptor_www *www, raptor_www_write_bytes_handler handler, void *user_data)

Set the handler to receive bytes written by the raptor_www implementation.

void raptor_www_set_content_type_handler(raptor_www *www, raptor_www_content_type_handler handler, void *user_data)

Set the handler to receive the HTTP Content-Type value, when/if discovered during retrieval by the raptor_www implementation.

void raptor_www_set_http_accept(raptor_www *www, const char *value)

Set the WWW HTTP Accept: header to value. If value is NULL, an empty header is sent.

void raptor_www_set_error_handler(raptor_www *www, raptor_message_handler error_handler, void *error_data)

Set the error handler routine for the raptor_www class. This takes the same arguments as the raptor_parser error, warning handler methods.

void* raptor_www_get_connection(raptor_www *www)

Return the underlying WWW library connection object. For example, for libcurl this is the curl_handle.

WWW ACTION METHODS

int raptor_www_fetch(raptor_www *www, raptor_uri *uri)

Retrieve the given URI, returning non-zero on failure.

void raptor_www_abort(raptor_www *www, const char *reason)

Abort an ongoing raptor WWW operation. Typically used within one of the raptor WWW handlers.

QNAME CLASS

This is a class for handling XML QNames consisting of the pair of (a URI from a namespace, a local name) along with an optional value -- useful for XML attributes. This is used with the raptor_namespace_stack and raptor_namespace classes to handle a stack of raptor_namespace that build on raptor_qname.

QNAME CONSTRUCTORS

There are two constructors for raptor_qname to build qnames with optional values on a stack of names.

raptor_qname* raptor_new_qname(raptor_namespace_stack *nstack, const unsigned char *name, const unsigned char *value, raptor_simple_message_handler error_handler, void *error_data)

Create a raptor QName from name (a possibly :-separated name), which is resolved against the given nstack namespace stack. An optional value can be given, and if there is an error, the error_handler and error_data will be used to invoke the callback.

raptor_qname* raptor_new_qname_from_namespace_local_name (raptor_namespace *ns, const unsigned char *local_name, const unsigned char *value)

Create a raptor QName using the namespace name of the raptor_namespace ns and the local name local_name, along with optional value value. Errors are reported using the error handling and data of the namespace.

QNAME DESTRUCTOR

void raptor_free_qname(raptor_qname* name)

Destroy a raptor qname object.

QNAME METHODS

int raptor_qname_equal(raptor_qname *name1, raptor_qname *name2)

Return non-zero if the given QNames are equal.

QNAME UTILITY FUNCTIONS

raptor_uri* raptor_qname_string_to_uri(raptor_namespace_stack *nstack, const unsigned char *name, size_t name_len, raptor_simple_message_handler error_handler, void *error_data)

Return the URI corresponding to the QName according to the RDF method; concatenating the namespace’s name (URI) with the local name. Takes the same arguments as raptor_new_qname but does not create a raptor_qname object.
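
The concatenation rule can be illustrated with a small sketch. demo_binding and demo_qname_string_to_uri are hypothetical, a flat array stands in for the namespace stack, and treating an unprefixed name as belonging to the default namespace (empty prefix) is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical prefix-to-namespace binding; "" marks the default namespace. */
typedef struct {
  const char *prefix;
  const char *uri;
} demo_binding;

/* Split qname at the first ':', look the prefix up (later bindings win) and
 * write namespace URI + local name into out. Return 0 on success, or -1 if
 * the prefix is unknown or out is too small. */
int demo_qname_string_to_uri(const demo_binding *bindings, size_t count,
                             const char *qname, char *out, size_t out_size)
{
  const char *colon = strchr(qname, ':');
  size_t prefix_len = colon ? (size_t)(colon - qname) : 0;
  const char *local = colon ? colon + 1 : qname;
  size_t i;

  for (i = count; i-- > 0; ) {
    const char *p = bindings[i].prefix;
    if (strlen(p) == prefix_len && strncmp(p, qname, prefix_len) == 0) {
      int n = snprintf(out, out_size, "%s%s", bindings[i].uri, local);
      return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
    }
  }
  return -1;
}
```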

NAMESPACE CLASS

An XML namespace class - each entry is on a stack and consists of a name (URI) and prefix. The prefix or the name but not both may be empty. If the prefix is empty, it defines the default prefix. If the name is empty, it undefines the given prefix.

NAMESPACE CONSTRUCTOR

raptor_namespace* raptor_new_namespace(raptor_namespace_stack *nstack, const unsigned char *prefix, const unsigned char *ns_uri_string, int depth)

Create a raptor_namespace object on the given namespace stack nstack with prefix prefix and namespace name (URI string) ns_uri_string. If prefix is NULL, it defines the URI for the default namespace prefix. If the ns_uri_string is NULL, it undefines the given prefix in the current scope. Both may not be NULL. depth signifies the position of the namespace on the stack; 0 is the bottom of the stack and generally the first depth for user namespace declarations. Namespaces declared on the same depth (such as on the same XML element, typically) can be handily freed with raptor_namespaces_end_for_depth method on the namespace stack class.

NAMESPACE DESTRUCTOR

void raptor_free_namespace(raptor_namespace *ns)

Destroy a raptor namespace object.

NAMESPACE METHODS

raptor_uri* raptor_namespace_get_uri(const raptor_namespace *ns)

Return the namespace name (URI) of the namespace.

const unsigned char* raptor_namespace_get_prefix(const raptor_namespace *ns)

Return the prefix of the namespace.

unsigned char *raptor_namespaces_format(const raptor_namespace *ns, size_t *length_p)

Format the namespace as a string and return it as a new string, returning the length of the resulting string in length_p if it is not NULL. The string format is suitable for emitting in XML to declare the namespace.

NAMESPACE UTILITY FUNCTIONS

int raptor_namespace_copy(raptor_namespace_stack *nstack, raptor_namespace *ns, int new_depth)

Copy the namespace from the current stack to the new one, nstack at depth new_depth.

NAMESPACE STACK CLASS

A stack of raptor_namespace objects where the namespaces on top of the stack have wider scope and override earlier (lower) namespace declarations. Intended to match the XML namespace declaring semantics using xmlns attributes.

NAMESPACE STACK CONSTRUCTORS

raptor_namespace_stack* raptor_new_namespaces(raptor_uri_handler *uri_handler, void *uri_context, raptor_simple_message_handler error_handler, void *error_data, int defaults)

void raptor_namespaces_init(raptor_namespace_stack *nstack, raptor_uri_handler *handler, void *context, raptor_simple_message_handler error_handler, void *error_data, int defaults)

Create or initialise a new raptor_namespace_stack object with the given URI and error handlers. raptor_new_namespaces allocates new memory for the namespace stack and raptor_namespaces_init initialises an existing declared nstack, which could be statically allocated. Note that raptor_uri_get_handler can be useful to return the current raptor URI handler/context. The defaults argument describes which default namespaces are declared in the empty stack. At present, 0 is none, 1 is for just the XML namespace and 2 is for a typical set of namespaces used for RDF, RDFS, Dublin Core, OWL, ... that may vary over time.

NAMESPACE STACK DESTRUCTORS

void raptor_free_namespaces(raptor_namespace_stack *nstack)

Destroy a namespace stack object, freeing the nstack (goes with raptor_new_namespaces).

void raptor_namespaces_clear(raptor_namespace_stack *nstack)

Clear a statically allocated namespace stack; does not free the nstack. (goes with raptor_namespaces_init).

NAMESPACE STACK METHODS

void raptor_namespaces_start_namespace(raptor_namespace_stack *nstack, raptor_namespace *nspace)

Start the given nspace on the stack, at the depth already defined.

int raptor_namespaces_start_namespace_full(raptor_namespace_stack *nstack, const unsigned char *prefix, const unsigned char *nspace, int depth)

Create a new raptor_namespace and start it on the stack. See raptor_new_namespace for the meaning of the arguments.

void raptor_namespaces_end_for_depth(raptor_namespace_stack *nstack, int depth)

End (and free) all namespaces on the stack at the given depth.

raptor_namespace* raptor_namespaces_get_default_namespace (raptor_namespace_stack *nstack)

Return the current default raptor_namespace of the namespace stack or NULL if there is none.

raptor_namespace *raptor_namespaces_find_namespace (raptor_namespace_stack *nstack, const unsigned char *prefix, int prefix_length)

Find the first namespace on the stack with the given namespace prefix or NULL if there is none.

int raptor_namespaces_namespace_in_scope(raptor_namespace_stack *nstack, const raptor_namespace *nspace)

Return non-zero if the raptor_namespace nspace is declared on the stack; i.e. in scope if this is a stack of XML namespaces.
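
The scoping behaviour of the methods above can be sketched with a fixed-size array. demo_ns, demo_ns_stack and the demo_ns_* functions are hypothetical; the sketch stores borrowed string pointers and has none of the URI or error handling of the real classes.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_NS 16

/* Hypothetical namespace entry; a NULL prefix marks the default namespace. */
typedef struct {
  const char *prefix;
  const char *uri;
  int depth;
} demo_ns;

typedef struct {
  demo_ns items[DEMO_MAX_NS];
  int size;
} demo_ns_stack;

/* Push a declaration made at the given depth. Return non-zero when full. */
int demo_ns_start(demo_ns_stack *s, const char *prefix, const char *uri,
                  int depth)
{
  if (s->size >= DEMO_MAX_NS)
    return 1;
  s->items[s->size].prefix = prefix;
  s->items[s->size].uri = uri;
  s->items[s->size].depth = depth;
  s->size++;
  return 0;
}

/* Search from the top so that later declarations override earlier ones. */
const demo_ns *demo_ns_find(const demo_ns_stack *s, const char *prefix)
{
  int i;
  for (i = s->size - 1; i >= 0; i--) {
    const char *p = s->items[i].prefix;
    if ((!prefix && !p) || (prefix && p && strcmp(prefix, p) == 0))
      return &s->items[i];
  }
  return NULL;
}

/* Drop every declaration at the top of the stack made at this depth. */
void demo_ns_end_for_depth(demo_ns_stack *s, int depth)
{
  while (s->size > 0 && s->items[s->size - 1].depth == depth)
    s->size--;
}
```

A parser would typically start namespaces when an XML element opens and end them for that depth when the element closes.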

SEQUENCE CLASS

A class for ordered sequences of items, adding at either end of the sequence. The method names should be familiar to Perl users.

SEQUENCE CONSTRUCTOR

raptor_sequence* raptor_new_sequence(raptor_sequence_free_handler* free_handler, raptor_sequence_print_handler* print_handler)

Create a new empty sequence, with optional handlers for freeing elements (as used by raptor_free_sequence) and for printing out elements (as used by raptor_sequence_print).

SEQUENCE DESTRUCTOR

void raptor_free_sequence(raptor_sequence* seq)

Destroy a sequence object, freeing any items if the free handler was defined in the constructor.

SEQUENCE METHODS

int raptor_sequence_size(raptor_sequence* seq)

Return the number of items in the sequence.

int raptor_sequence_set_at(raptor_sequence* seq, int idx, void *data)

Set the sequence item at index idx to the value data, extending it if necessary.

int raptor_sequence_push(raptor_sequence* seq, void *data)

Add item data to the end of the sequence.

int raptor_sequence_shift(raptor_sequence* seq, void *data)

Add item data to the start of the sequence.

void* raptor_sequence_get_at(raptor_sequence* seq, int idx)

Get the sequence item at index idx or NULL if no such index exists.

void* raptor_sequence_pop(raptor_sequence* seq)

Remove and return an item from the end of the sequence, or NULL if it is empty.

void* raptor_sequence_unshift(raptor_sequence* seq)

Remove and return an item from the start of the sequence, or NULL if it is empty.

void raptor_sequence_sort(raptor_sequence* seq, int(*compare)(const void *, const void *))

Sort the sequence using the given comparison function compare which is passed to qsort(3) internally.

int raptor_compare_strings(const void *a, const void *b)

Helper function useful with raptor_sequence_sort.
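
Because the comparison function is handed to qsort(3), it is called with pointers to the stored elements rather than with the elements themselves. The sketch below shows that convention on a plain array of C strings. It assumes the items are kept as an array of pointers, which this page does not state, and demo_compare_strings and demo_sort_strings are hypothetical names, not Raptor functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparison function with the qsort(3) signature.
 * qsort passes pointers to the array elements, so for an array of
 * char* each argument is really a pointer to a char*. */
static int demo_compare_strings(const void *a, const void *b) {
  const char *left = *(const char *const *)a;
  const char *right = *(const char *const *)b;
  return strcmp(left, right);
}

/* Sort an array of count C strings in place into strcmp() order. */
static void demo_sort_strings(const char **strings, size_t count) {
  if (count < 2)
    return; /* nothing to sort */
  assert(strings != NULL);
  qsort(strings, count, sizeof(strings[0]), demo_compare_strings);
}
```

A function that compared its arguments directly with strcmp(a, b) would be comparing the pointer slots, not the strings, so the extra dereference matters.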

void raptor_sequence_set_print_handler(raptor_sequence *seq, raptor_sequence_print_handler *print_handler)

Set the print handler for the sequence, an alternative to setting it in the constructor.

void raptor_sequence_print_string(char *data, FILE *fh)

Helper print handler function useful for printing out sequences of strings.

void raptor_sequence_print_uri(char *data, FILE *fh)

Helper print handler function useful for printing out sequences of raptor_uri* objects.

void raptor_sequence_print(raptor_sequence* seq, FILE* fh)

Print out the sequence in a debug format to the given file handle fh. NOTE: The exact format is not guaranteed to remain the same between releases.
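
The print handler arrangement described above is a per-item callback: the printing routine walks the items and lets the handler decide how each one is written. A minimal sketch of that pattern follows. The demo_print_handler type, both functions and the bracketed output layout are all invented for illustration; they are not Raptor's declarations or its debug format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical handler type: print one item to fh. */
typedef void (*demo_print_handler)(void *data, FILE *fh);

/* Handler for items that are NUL-terminated C strings. */
static void demo_print_string(void *data, FILE *fh) {
  fputs((const char *)data, fh);
}

/* Print count items as "[a, b, c]", calling handler for each one.
 * The bracketed layout is made up for this sketch. */
static void demo_print_items(void **items, int count,
                             demo_print_handler handler, FILE *fh) {
  int i;
  assert(handler != NULL && fh != NULL);
  fputc('[', fh);
  for (i = 0; i < count; i++) {
    if (i > 0)
      fputs(", ", fh);
    handler(items[i], fh);
  }
  fputc(']', fh);
}
```

Swapping in a different handler changes how each item is rendered without touching the loop, which is why the handler can be supplied either at construction or later.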

STRINGBUFFER CLASS

A class for growing strings, small chunks at a time.

STRINGBUFFER CONSTRUCTOR

raptor_stringbuffer* raptor_new_stringbuffer(void)

Create a new stringbuffer.

STRINGBUFFER DESTRUCTOR

void raptor_free_stringbuffer(raptor_stringbuffer* stringbuffer)

Destroy a stringbuffer.

STRINGBUFFER METHODS

int raptor_stringbuffer_append_counted_string(raptor_stringbuffer* stringbuffer, const unsigned char *string, size_t length, int do_copy)

Append a string of length bytes to a stringbuffer, copying it only if do_copy is non-0.

int raptor_stringbuffer_append_string(raptor_stringbuffer* stringbuffer, const unsigned char* string, int do_copy)

Append a string to a stringbuffer, copying it only if do_copy is non-0.

int raptor_stringbuffer_append_decimal(raptor_stringbuffer* stringbuffer, int integer)

Append a formatted decimal integer to a stringbuffer.

int raptor_stringbuffer_append_stringbuffer(raptor_stringbuffer* stringbuffer, raptor_stringbuffer* append)

Append the stringbuffer append to stringbuffer. The append stringbuffer is emptied but not destroyed.

int raptor_stringbuffer_prepend_counted_string(raptor_stringbuffer* stringbuffer, const unsigned char* string, size_t length, int do_copy)

Prepend a string of length bytes to the start of a stringbuffer, copying it only if do_copy is non-0.

int raptor_stringbuffer_prepend_string(raptor_stringbuffer* stringbuffer, const unsigned char* string, int do_copy)

Prepend a string to the start of a stringbuffer, copying it only if do_copy is non-0.

unsigned char * raptor_stringbuffer_as_string(raptor_stringbuffer* stringbuffer)

Return the stringbuffer as a single string. The string is shared and should be copied if needed.

size_t raptor_stringbuffer_length(raptor_stringbuffer* stringbuffer)

Return the length of the stringbuffer.
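
The append and prepend operations above can be pictured with a small stand-in that keeps one contiguous, NUL-terminated buffer. This is only a sketch under stated assumptions: demo_sb and its functions are hypothetical, the stand-in always copies its input (corresponding to a non-zero do_copy), and Raptor's own storage strategy may be quite different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growing string; not Raptor's type. */
typedef struct {
  char *data;    /* NUL-terminated once non-NULL */
  size_t length; /* bytes stored, excluding the terminator */
} demo_sb;

/* Copy length bytes of string in at byte offset at.
 * Returns non-zero on allocation failure. */
static int demo_sb_insert(demo_sb *sb, size_t at,
                          const char *string, size_t length) {
  char *grown;
  assert(sb != NULL && string != NULL && at <= sb->length);
  grown = realloc(sb->data, sb->length + length + 1);
  if (!grown)
    return 1;
  /* Shift the tail right to open a gap, then copy the new bytes in. */
  memmove(grown + at + length, grown + at, sb->length - at);
  memcpy(grown + at, string, length);
  sb->data = grown;
  sb->length += length;
  sb->data[sb->length] = '\0';
  return 0;
}

/* Add a C string at the end. */
static int demo_sb_append(demo_sb *sb, const char *string) {
  return demo_sb_insert(sb, sb->length, string, strlen(string));
}

/* Add a C string at the start. */
static int demo_sb_prepend(demo_sb *sb, const char *string) {
  return demo_sb_insert(sb, 0, string, strlen(string));
}

/* Format integer in decimal and add it at the end. */
static int demo_sb_append_decimal(demo_sb *sb, int integer) {
  char digits[32];
  int n = snprintf(digits, sizeof(digits), "%d", integer);
  if (n < 0 || (size_t)n >= sizeof(digits))
    return 1;
  return demo_sb_insert(sb, sb->length, digits, (size_t)n);
}
```

Starting from an empty demo_sb (data NULL, length 0), appending "item", appending the decimal 42 and then prepending "rdf:" leaves the ten-byte string "rdf:item42". The caller frees data with free() when done.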

API CHANGES

1.3.1

Correct raptor_print_statement declaration argument statement to have one less ’const’, to match the code.

1.3.0

Added the following parser methods, utility methods and helper functions:
raptor_new_parser_for_content (Parser class constructor), raptor_get_mime_type, raptor_get_feature, raptor_syntax_name_check, raptor_guess_parser_name, raptor_features_enumerate, raptor_feature_from_uri, raptor_www_set_http_accept (WWW class).

Changed raptor_set_feature to now return an int success or failure.

Added the following functions:
raptor_free_memory, raptor_unicode_char_to_utf8, raptor_utf8_to_unicode_char and raptor_vsnprintf.

Added the raptor_sequence class, its constructor, destructor, methods and helper functions.

Added the raptor_stringbuffer class and constructor, destructor and methods.

Deprecated raptor_print_statement_detailed, which was always intended to be internal.

1.2.0

Added raptor_syntaxes_enumerate to get full information on syntax mime type and URIs as well as name and label.
N-Triples Plus parser renamed to Turtle (name turtle)

1.1.0

Added N-Triples Plus parser (name ntriples-plus)
Made URI class constructors, methods and factory methods as well as some other utility functions using or returning URIs or literals take unsigned char* rather than char*. The affected calls are:
raptor_new_uri_func, raptor_new_uri_from_local_name_func, raptor_new_uri_relative_to_base_func, raptor_uri_as_string_func, raptor_uri_as_counted_string_func: URI factory methods changed to all take/return unsigned char* for URI strings.
raptor_statement_part_as_counted_string, raptor_statement_part_as_string, raptor_new_uri, raptor_new_uri_from_uri_local_name, raptor_new_uri_relative_to_base, raptor_uri_as_string, raptor_uri_as_counted_string, raptor_print_ntriples_string: Constructors and methods changed to take/return unsigned char* for URI strings.
raptor_uri_resolve_uri_reference, raptor_uri_filename_to_uri_string, raptor_uri_uri_string_to_filename, raptor_uri_uri_string_to_filename_fragment, raptor_uri_is_file_uri: Changed to use unsigned char* for URI strings, char* for filenames.
raptor_ntriples_string_as_utf8_string: Changed to return unsigned char* for UTF8 string.
Added raptor_parsers_enumerate to discover supported parsers.
Added raptor_uri_uri_string_to_filename_fragment with fragment arg to return the URI fragment.
Made the raptor_namespace, raptor_namespace_stack and raptor_qname class and APIs public.
Added feature non_nfc_fatal (see raptor_set_feature documentation).

1.0.0

Removed the following deprecated methods and functions (see 0.9.6 changes for the new names):
raptor_free, raptor_new, raptor_ntriples_free, raptor_ntriples_new, raptor_ntriples_parse_file, raptor_ntriples_set_error_handler, raptor_ntriples_set_fatal_error_handler, raptor_ntriples_set_statement_handler and raptor_parser_abort.
Added raptor_parse_file_stream for reading FILE* streams without necessarily having a file.

0.9.12

Added raptor_new_uri_for_retrieval to turn URI references into URIs suitable for retrieval (no fragments).

0.9.11

Added raptor_get_name and raptor_get_label.
raptor_xml_escape_string now takes an error message handler and data pointer, and loses the parser argument.
Added raptor_set_default_generate_id_parameters and raptor_set_generate_id_handler to control the default generation of IDs and allow full customisation.

0.9.10

Added raptor_set_parser_strict and raptor_www_no_www_library_init_finish.
raptor_xml_escape_string now takes an output string length pointer.
Added raptor_statement_part_as_counted_string, raptor_statement_part_as_string and raptor_parse_abort.
Deprecated raptor_parser_abort.

0.9.9

Added raptor_www class and all its constructors, destructor, methods, calls.
Added raptor_parse_uri, raptor_parser_abort, raptor_ntriples_term_as_string and raptor_xml_escape_string.

0.9.7

The arguments of raptor_parse_chunk and raptor_new_uri_from_id are now unsigned char.
Added raptor_new_uri_for_xmlbase.

0.9.6

In this version, the raptor/ntriples parser calling APIs were modified. The following table lists the changes:

OLD API                                  NEW API (0.9.6+)
raptor_new()                             raptor_new_parser("rdfxml")
ntriples_new()                           raptor_new_parser("ntriples")
raptor_free                              raptor_free_parser
ntriples_free                            raptor_free_parser
raptor_ntriples_parse_file               raptor_parse_file
raptor_ntriples_set_error_handler        raptor_set_error_handler
raptor_ntriples_set_fatal_error_handler  raptor_set_fatal_error_handler
raptor_ntriples_set_statement_handler    raptor_set_statement_handler

CONFORMING TO

RDF/XML Syntax (Revised), Dave Beckett (ed.) W3C Recommendation, http://www.w3.org/TR/rdf-syntax-grammar/

N-Triples, in RDF Test Cases, Jan Grant and Dave Beckett (eds.) W3C Recommendation, http://www.w3.org/TR/rdf-testcases/#ntriples

Turtle - Terse RDF Triple Language, Dave Beckett, http://www.ilrt.bristol.ac.uk/discovery/2004/01/turtle/

SEE ALSO

rapper(1), raptor-config(1)

AUTHOR

Dave Beckett
Institute for Learning and Research Technology (ILRT)
University of Bristol


Copyright 2002-2004 Dave Beckett, Institute for Learning and Research Technology, University of Bristol