CodeGen customizations

Customizations are used to control how CodeGen interprets your schemas and generates both Java code and a corresponding JiBX binding. The customizations are normally supplied to CodeGen in the form of an XML document, though certain customizations can alternatively be set by means of command line parameters.

Customization document structure

The general form of a customizations document consists of a root 'schema-set' element which applies to all the schemas being used, within which there may be both other 'schema-set' elements and 'schema' elements for individual schemas. If you're only working with a single schema, you can instead skip the 'schema-set' element and use the 'schema' element directly as the root of your customizations. Within a 'schema' element, you can use elements with names corresponding to various schema definition components ('attribute', 'complexType', 'element', etc.) to customize specific schema components, using XPath-like features to relate the customization elements to particular instances of the corresponding schema element.

Sound confusing? It's really pretty simple and intuitive, once you get the basic concepts down. Here's a sample using nested 'schema-set' and 'schema' elements for a complex collection of schemas, to show how this works:

<schema-set xmlns:xs="http://www.w3.org/2001/XMLSchema"
    type-substitutions="xs:integer xs:int xs:decimal xs:float">
  <schema-set package="org.ota.air" names="OTA_Air*.xsd">
    <schema-set generate-all="false" prefer-inline="true"
        names="OTA_AirCommonTypes.xsd OTA_AirPreferences.xsd"/>
    <schema name="OTA_AirAvailRS.xsd">
      <element path="element[@name=OTA_AirAvailRS]/**/element[@name=OriginDestinationOption]"
        ignore="true"/>
    </schema>
  </schema-set>
  <schema-set package="org.ota.hotel" names="OTA_Hotel*.xsd">
    <schema-set generate-all="false" prefer-inline="true"
        names="OTA_HotelCommonTypes.xsd OTA_HotelContentDescription.xsd
        OTA_HotelEvent.xsd OTA_HotelPreferences.xsd OTA_HotelReservation.xsd
        OTA_HotelRFP.xsd"/>
  </schema-set>
  ...

In this customizations document, the root 'schema-set' element sets some customization options which apply to the full collection of schemas. The child 'schema-set' elements each specify a particular subset of the schemas (which must be distinct from those specified by sibling schema-set elements); in this example, the first child 'schema-set' applies to schemas with names matching the "OTA_Air*.xsd" pattern, the second to schemas with names matching "OTA_Hotel*.xsd". The first child 'schema-set' has yet another 'schema-set' element and a 'schema' element as its children. Within the 'schema' element, an 'element' child is used to customize the handling of one particular xs:element component within the schema definition.
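
For the single-schema case mentioned earlier, the customizations document can be much shorter. The following is a minimal sketch, assuming a hypothetical schema file order.xsd with a global element named Order; the file, package, and element names are illustrative only and are not part of the OTA schemas used above.

```xml
<!-- Hypothetical single-schema customization: 'schema' is the root element -->
<schema name="order.xsd" package="com.example.order">
  <!-- Accept but discard any nested Comment element when unmarshalling -->
  <element path="element[@name=Order]/**/element[@name=Comment]" ignore="true"/>
</schema>
```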

'schema-set' and 'schema' customizations

Customization attributes which apply at the schema level can be used with both 'schema-set' and 'schema' elements. These elements can be nested in the customizations document, and customizations are inherited through the nesting: A customization attribute on a 'schema-set' applies to all the schemas in the set, but may be overridden by a different setting of the attribute on a nested 'schema-set' or 'schema' element. Here's the alphabetical list of these schema-level customization attributes:

Schema-level customization attributes

delete-annotations

Delete annotations from schema fragments shown in Javadocs. This generally makes the schema fragments easier to understand, especially since xs:documentation elements in the schema are normally converted to Javadocs in any case. Allowed values are true (the default) and false.

enumeration-type

Control the type of classes generated for enumerations. Allowed values are java5 for Java 5 enum classes (the default) and simple for simple typesafe enumeration classes compatible with all Java compiler versions.

generate-all

If the value is false, skip any unused global schema definitions in the code generation. This is intended for use with schemas referenced by xs:include or xs:import, which often include definitions not needed by the original schema. By skipping code generation for these unnecessary definitions you can reduce the number of classes in the generated data model. Allowed values are true (the default) and false.

import-docs

Convert xs:documentation annotations in the schema definition to Javadocs in the generated code if true. Allowed values are true (the default) and false.

line-width

Specify the desired maximum line width in the generated Java code. The value can be any integer.

package

Give the name of the package to be used for generated Java code. The value can be any package name.

prefer-inline

If true, use inline definitions where possible rather than creating separate classes. Allowed values are true and false (the default).

repeated-type

Control how repeated schema components (both xs:list values, and particles with maxOccurs > 1) are represented in Java code. Allowed values are array (for arrays), list (for untyped java.util.List), and typed (for Java 5 typed list, the default).

show-schema

If true, include schema fragments corresponding to the generated code in class Javadocs. The schema fragments included are based on a post-processing view of the schema, after processing type substitutions, deletions, and schema normalizations. Allowed values are true (the default) and false.

structure-optional

Control whether references to classes with no associated element and all components optional should be made optional in the generated binding. The effect of making such class references optional is that the reference will be set null when unmarshalling if none of the components are present, and will be checked for null when marshalling. Allowed values are true (the default) and false.

use-inner

Control whether inner classes are used for secondary structures within the generated Java code. If true inner classes will be used; otherwise, separate top-level classes will be used. This only applies for the equivalent of anonymous xs:complexType definitions, or definitions which have been inlined; top-level classes are always used when definitions are used in more than one place. Allowed values are true (the default) and false.
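
To illustrate how these schema-level attributes are inherited and overridden, here is a small sketch. The schema file names and package names are assumptions made up for the example; the attribute names and values are the ones listed above.

```xml
<!-- Settings on the root apply to every schema unless overridden below -->
<schema-set enumeration-type="simple" repeated-type="array"
    line-width="100" show-schema="false" package="com.example.model">
  <!-- Inherits everything from the root, but switches to inline definitions -->
  <schema name="order.xsd" prefer-inline="true"/>
  <!-- Overrides the inherited package and list representation for one schema -->
  <schema name="invoice.xsd" package="com.example.billing" repeated-type="typed"/>
</schema-set>
```

In this sketch order.xsd would still use simple typesafe enumerations and arrays, since those settings come from the root, while invoice.xsd replaces two of the inherited values.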

Besides these schema-level customizations, any of the nesting customizations listed in the next section can also be used on 'schema-set' and 'schema' elements. There are also some attributes which are only allowed with 'schema-set', and some which are only allowed with 'schema'. Here's the list of these attributes for 'schema-set':

Attributes only allowed on 'schema-set' element

names

List of name patterns for schemas included in this set. Individual patterns are whitespace-separated, and may include '*' characters as wildcards matching any number of arbitrary characters. Multiple '*' wildcards may be used within a single pattern, but may not be contiguous (i.e., there must be one or more regular characters between any pair of wildcards).

namespaces

List of namespace URIs for schemas included in this set. Individual URIs are whitespace-separated.

These attributes are not allowed on the root 'schema-set' element of a customizations document, but at least one is required on any nested 'schema-set' element (since they determine which schemas are actually included in the set). When multiple 'schema-set' child elements are used, the sets of schemas identified by each 'schema-set' must be disjoint.
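
A sketch of how nested sets might be selected by name pattern and by namespace follows. The file name patterns and namespace URIs are hypothetical, and the example assumes that no schema matched by the first set uses one of the namespaces named in the second, since the sets must be disjoint.

```xml
<!-- The root 'schema-set' carries neither 'names' nor 'namespaces' -->
<schema-set>
  <!-- Two whitespace-separated patterns; the wildcards within a pattern are
       separated by regular characters, as required -->
  <schema-set names="Common*.xsd Base*Types*.xsd" package="com.example.common"/>
  <!-- Selection by target namespace URI rather than by file name -->
  <schema-set namespaces="http://example.com/ns/orders http://example.com/ns/billing"
      package="com.example.business"/>
</schema-set>
```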

Here's the list of attributes only allowed with 'schema' customization elements:

Attributes only allowed on 'schema' element

excludes

List of schema global definitions to be excluded from the code generation. Names are separated by whitespace characters. This overrides the normal reference checks used to determine which schema definitions are going to be generated as Java code.

includes

List of schema global definitions to be included in the code generation. Names are separated by whitespace characters. This overrides the normal reference checks used to determine which components are going to be generated as Java code.

name

The schema name, meaning the last component in the schema path (whether accessed from the file system, by using HTTP, or by any other means). No wildcard characters are allowed, so the name must be an exact match.

namespace

Schema target namespace URI. This can only be used to identify a schema if there's only one schema using that namespace.
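
The 'schema'-only attributes might be combined as in the following sketch. The schema name, namespace URI, and definition names are hypothetical, and the example assumes that simple local names are accepted in the 'includes' and 'excludes' lists, which the descriptions above do not spell out.

```xml
<schema-set generate-all="false">
  <!-- Identified by file name; force generation of two global definitions
       even if nothing references them -->
  <schema name="order.xsd" includes="Order Invoice"/>
  <!-- Identified by target namespace (only valid if a single schema uses it);
       suppress generation of one global definition -->
  <schema namespace="http://example.com/ns/legacy" excludes="ObsoleteRecord"/>
</schema-set>
```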

Nesting customizations

Nesting customization attributes can be used on any customization element, including 'schema-set' and 'schema' along with customization elements matching schema element names used inside a 'schema' customization element. Here's the alphabetical list of these attributes:

Nesting customization attributes

any-handling

Controls how xs:any particles are represented in the generated Java code and binding definition. Allowed values are discard (meaning discard when unmarshalling and don't generate when marshalling), dom (meaning use an org.w3c.dom.Element or list of elements for a repeating xs:any, the default), and mapped (meaning require any element(s) matching the xs:any to be defined as a global element in the schema).

choice-exposed

When true, the generated code directly exposes xs:choice states to the user. In this case the constants used for the choice states are made public, and there's an added stateXXX() method which returns the current state of the choice. Otherwise, the choice state is only exposed to the user via ifXXX() methods checking if a particular state has been set. Allowed values are true and false (the default).

choice-handling

Control how xs:choice is implemented in the generated code. xs:choice handling always uses a separate property for each alternative in the choice, and in most cases also uses a state variable that tracks the most-recent setting. There are several options for how the state is set and changed, though, and this customization selects the option to be used. Allowed values are stateless (meaning there is no state variable, and it's up to the user to make sure only one of the choice values is set), checkset (meaning that when the 'set' access method for one of the choice properties is called the code will throw an exception if a different choice had previously been set and the clearXXX() method has not been called), checkboth (meaning that in addition to the checkset check on setting a choice property, the choice property 'get' access methods will also check that the current state is either unset or matches that property), overset (meaning that when the 'set' access method for one of the choice properties is called it will overwrite any previous choice), and overboth (meaning 'set' methods overwrite previous choices, while 'get' access methods check that the current state is either unset or matches that property).

enforced-facets

This is included for use once xs:simpleType facet handling is implemented, but is currently ignored.

ignored-facets

This is included for use once xs:simpleType facet handling is implemented, but is currently ignored.

union-exposed

This is included for use once full xs:union handling is implemented, but is currently ignored.

union-handling

This is included for use once full xs:union handling is implemented, but is currently ignored.

type-substitutions

Defines type substitutions to be applied before generating code. The substitutions are given as pairs of type names, with the type to be replaced first and the type to be substituted second. The type names are all treated as namespace-qualified values, and are separated by one or more whitespace characters.
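
Since nesting attributes can appear at any level, a setting made on an outer element can be refined further in. The sketch below assumes a hypothetical order.xsd schema; the type substitution pair (xs:dateTime replaced by xs:string) is only an illustration of the pair syntax, not a recommendation.

```xml
<!-- The xs prefix must be declared because type names are namespace-qualified -->
<schema-set xmlns:xs="http://www.w3.org/2001/XMLSchema"
    any-handling="discard" choice-handling="overset"
    type-substitutions="xs:dateTime xs:string">
  <!-- Same settings as the root, except that choice states are made public -->
  <schema name="order.xsd" choice-exposed="true"/>
  <!-- A different xs:choice policy for one schema only -->
  <schema name="invoice.xsd" choice-handling="checkboth"/>
</schema-set>
```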

Schema component customizations

Schema component customizations each apply to a particular element within a schema definition. The element name for the customization always matches the name of the schema definition element being customized (but without namespace). All schema component customizations can use any of the nesting customization attributes defined in the last section, and also the following attributes:

Common component customization attributes

path

Path to the schema element to be customized. The path is in XPath-like form, with path steps separated by '/' characters. '*' can be used as a path step matching any arbitrary schema element, and '**' as a path step matching any nesting of arbitrary schema elements, with the restriction that these wildcard steps cannot be used as the initial path step if the customization element is a direct child of a 'schema' customization - in other words, you can only use wildcards once you've identified the global schema component involved. Steps matching named components of the schema definition (global type, group, or attribute group definitions, or element or attribute definitions whether global or not) can use a '[@name=...]' predicate to single out a particular instance of the component type (which will match either a 'name' or 'ref' attribute value in the schema definition). Steps can also use a numeric predicate '[n]' to identify which of several potential matches is being referenced, where the numbering starts at '1' for the first match (as in XPath). The last path step may be left empty, since the element name for this path step must always be the same as the customization element name.

position

Number of the instance to be matched. This is equivalent to a '[n]' predicate on the last step in a path expression, but as a convenience can be used directly for cases where no path expression is otherwise required.
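
A few path expressions are sketched below to show the different step forms. The schema and component names are hypothetical, and the attributes set on each element are nesting customization attributes from the previous section.

```xml
<schema name="order.xsd">
  <!-- Single step naming a global complexType definition -->
  <complexType path="complexType[@name=OrderType]" choice-handling="overset"/>
  <!-- '*' matches exactly one intermediate schema element (such as an
       xs:sequence); '[1]' picks the first matching xs:choice -->
  <choice path="complexType[@name=PaymentType]/*/choice[1]" choice-exposed="true"/>
  <!-- '**' matches any depth of nesting below the global Envelope element -->
  <any path="element[@name=Envelope]/**/any" any-handling="discard"/>
</schema>
```

Each path starts with a named global component, since wildcard steps are not allowed as the initial step directly under a 'schema' customization.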

Beyond these common attributes, different types of customization elements support one or more added attributes as defined below:

Specialized component customization attributes

class-name

Java class name used for the representation of the schema component (ignored if no class is required). This must be a simple class name, without package information (since the package is determined on a per-schema basis). This attribute is allowed on the following types of customization elements: 'all', 'attribute', 'attributeGroup', 'choice', 'complexType', 'element', 'group', 'sequence', 'simpleType'.

exclude

Remove component from code generation if true. This effectively deletes the target component from the schema definition before code generation. Allowed values are true and false (the default). This attribute is allowed on the following types of customization elements: 'all', 'any', 'anyAttribute', 'attribute', 'attributeGroup', 'choice', 'complexType', 'element', 'group', 'sequence', 'simpleType'.

ignore

Ignore element or attribute when unmarshalling documents. This drops the component from the generated data model, but accepts it and discards any content when unmarshalling (as opposed to the exclude behavior, which completely removes the component from the schema definition, meaning input documents containing an excluded element will cause errors in unmarshalling). Allowed values are true and false (the default). This attribute is allowed on the following types of customization elements: 'attribute' and 'element'.

name

Schema component name attribute value. This is equivalent to an '[@name=XXX]' predicate on the last step in a path expression, but as a convenience can be used directly for cases where no path expression is otherwise required. The attribute is allowed on the following types of customization elements: 'attribute', 'attributeGroup', 'complexType', 'element', 'group', 'simpleType'.

type

Substitute type to be used for component. This is used to replace the type specified in the schema for the target component with some other type before code generation. It is similar to using the 'type-substitutions' nesting customization attribute, but supports replacing anonymous type definitions in addition to global types. This attribute is allowed on the following types of customization elements: 'attribute', 'complexType', 'element', 'simpleType'. Note: this attribute is not yet supported.

value-name

Name used for the Java property value representing the component. This customization is normally only useful when applied to nested components of a schema definition, rather than global definitions (so on 'group' references, rather than group definitions, for instance). It is allowed on the following types of customization elements: 'all', 'attribute', 'attributeGroup', 'choice', 'element', 'group', 'sequence'.
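
The specialized attributes might be used together as in this sketch. All schema, component, class, and property names are hypothetical; the unsupported 'type' attribute is deliberately left out.

```xml
<schema name="order.xsd">
  <!-- 'name' used directly instead of a path; generated class renamed -->
  <complexType name="OrderType" class-name="PurchaseOrder"/>
  <!-- Rename the Java property generated for a nested element -->
  <element path="complexType[@name=OrderType]/**/element[@name=Cust]"
      value-name="customer"/>
  <!-- Dropped from the data model, but still accepted in input documents -->
  <element path="element[@name=Order]/**/element[@name=Audit]" ignore="true"/>
  <!-- Removed from the schema entirely; input using it would fail to unmarshal -->
  <attribute path="complexType[@name=OrderType]/attribute[@name=legacyId]"
      exclude="true"/>
</schema>
```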

Controlling name handling

In addition to the customization elements defined earlier, you can also use 'name-converter' elements in your customization document to control how XML names are converted to Java names. If present, a 'name-converter' element must be the first child within a 'schema-set' or 'schema' element. It can be used in two ways: To change the behavior of the default name converter class used by CodeGen (org.jibx.schema.codegen.extend.DefaultNameConverter), or to completely replace that default name converter class with your own implementation.

When used to change the behavior of the default name converter class used by CodeGen, the following attributes apply:

Default name converter customization attributes

field-prefix

Prefix string to be added at the beginning of generated normal (non-static) field names. By default, the prefix string is empty.

field-suffix

Suffix string to be added at the end of generated normal (non-static) field names. By default, the suffix string is empty.

static-prefix

Prefix string to be added at the beginning of generated static field names. By default, the prefix string is empty.

static-suffix

Suffix string to be added at the end of generated static field names. By default, the suffix string is empty.

strip-prefixes

Prefix strings to be stripped from schema names before converting to Java names. The value is a list of prefix strings, separated by whitespace characters.

strip-suffixes

Suffix strings to be stripped from schema names before converting to Java names. The value is a list of suffix strings, separated by whitespace characters.
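
As an illustration, the default converter could be adjusted as sketched below. The particular prefix and suffix strings and the name pattern are assumptions chosen for the example.

```xml
<schema-set>
  <!-- Must be the first child of the 'schema-set' or 'schema' element -->
  <name-converter field-prefix="m_" static-prefix="s_"
      strip-prefixes="OTA_" strip-suffixes="Type RQ RS"/>
  <schema-set names="OTA_Air*.xsd" package="org.ota.air"/>
</schema-set>
```

With these settings, a schema name such as OTA_AirAvailRS would presumably have the OTA_ prefix and RS suffix stripped before being converted to a Java name, and generated instance fields would start with m_.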

When used to replace the default name converter class with your own implementation, there's only one fixed attribute: The 'class' attribute gives the fully-qualified name of your name converter implementation class, which must implement the org.jibx.schema.codegen.extend.NameConverter interface. Other attributes can be used to set parameter values for your converter, using either set methods or field names. Attribute names are first converted to value names by removing any hyphen (dash) characters and converting the character following a hyphen to upper case, then your class is inspected to find a set access method matching the value name. If no set access method is found, the class is next checked for a field with the name obtained by adding a leading "m_" to the value name. It's an error if neither of these methods finds a way to set the value. This is the same approach as used for the "standard" attributes used with the default name converter class, listed above. If you want to use this approach, you probably want to see the source code for that class (org.jibx.schema.codegen.extend.DefaultNameConverter) and either use it as the base for your own implementation or just subclass and override methods selectively.
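
A sketch of the replacement form follows. The class com.example.codegen.MyNameConverter and its acronym-style parameter are hypothetical; only the 'class' attribute itself comes from the description above.

```xml
<schema name="order.xsd">
  <!-- 'class' names a hypothetical NameConverter implementation.
       'acronym-style' becomes the value name acronymStyle, so the class would
       need a matching set method or, failing that, a field named
       m_acronymStyle -->
  <name-converter class="com.example.codegen.MyNameConverter"
      acronym-style="upper"/>
</schema>
```

The set method would presumably be named setAcronymStyle, though the exact naming and parameter type expected are not spelled out here, so check the DefaultNameConverter source before relying on this.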

Extending code generation

You can also extend Java class generation to add special handling or features to the generated code. This is done using 'class-decorator' elements in your customization file. If present, 'class-decorator' elements must precede any other child elements (aside from 'name-converter', if used) within a 'schema-set' or 'schema' element. Multiple 'class-decorator' elements can be used, and by default the decorators are inherited by child 'schema-set' and 'schema' elements within the customizations. You can change this by using the inherit-decorators customization attribute.

There's one required attribute on a 'class-decorator' element, the 'class' attribute. This attribute must give the fully-qualified class name of a class implementing the org.jibx.schema.codegen.extend.ClassDecorator interface, which defines methods called by CodeGen during the process of generating a Java class. The method parameters give your implementation code ways to hook into the code generation, which is based on the Eclipse Abstract Syntax Tree (AST) model. You can then modify the AST in various ways to meet your needs.

Other attributes may also be used on a 'class-decorator' element, and if present are interpreted as values to be set on a decorator class instance. The principle used for these attributes is the same as when a custom class is used for the 'name-converter' element. Attribute names are first converted to value names by removing any hyphen (dash) characters and converting the character following a hyphen to upper case, then your class is inspected to find a set access method matching the value name. If no set access method is found, the class is next checked for a field with the name obtained by adding a leading "m_" to the value name. It's an error if neither of these methods finds a way to set the value.

Two decorators are provided with CodeGen, which can be used directly and also as the basis for developing your own decorators (by looking at the source code). The org.jibx.schema.codegen.extend.SerializableDecorator is used to add the java.io.Serializable interface to generated data model classes. If a 'serial-version' attribute is used with this decorator it will also add a serialVersionUID value to each generated class, with the specified version (which must be a long integer value). The org.jibx.schema.codegen.extend.CollectionMethodsDecorator adds helper methods for collection values represented by java.util.List instances. The helper methods are int sizeXXX() to get the number of items in the list, void addXXX(type) to add an item to the list, type getXXX(int) to get an item by position, and void clearXXX() to remove all items from the list (where 'XXX' is the value name).
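
Both supplied decorators could be enabled for every schema as sketched here. The schema name and package are hypothetical, and the example assumes the default 'typed' list representation so that the collection helper methods have java.util.List values to work with.

```xml
<schema-set package="com.example.model">
  <!-- Decorators precede all other children (no 'name-converter' is used here) -->
  <class-decorator class="org.jibx.schema.codegen.extend.SerializableDecorator"
      serial-version="1"/>
  <class-decorator
      class="org.jibx.schema.codegen.extend.CollectionMethodsDecorator"/>
  <!-- Inherits both decorators by default -->
  <schema name="order.xsd"/>
</schema-set>
```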

Command line customizations

If you just want to set some basic global customizations for CodeGen, you can do this using command-line parameters and avoid the need to create a customizations file. The special prefix "--" is used to do this. So to set delete-annotations="true" and any-handling="mapped", for instance, you'd add --delete-annotations=true and --any-handling=mapped to the CodeGen command line. No quotes are needed for the attribute value when you use this technique. This technique only allows you to set global customizations, though, so if you're doing anything at the individual schema or component level you'll still need to use a customizations file.
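
For comparison, the two command-line settings used as the example above would look roughly like this in a customizations file. This is a sketch that assumes global command-line settings behave the same way as attributes on the root element.

```xml
<!-- Assumed file equivalent of the two command-line settings above -->
<schema-set delete-annotations="true" any-handling="mapped"/>
```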