Converting the DocBook DTD to XML is much more challenging than converting the instances. It is probably not possible to construct an XML DTD that is identical in validation power to DocBook. The list below identifies most of the issues that must be addressed and describes how the DocBook XML DTD deals with them:
Most comments that appeared inside markup declarations have been moved to comment declarations preceding the markup declaration that used to contain them. A few small inline comments that would have been out of context if moved before the declaration were simply deleted.
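As a rough illustration of this change, the fragment below shows one declaration before and after conversion. The element name Sample, its content model, and the comment text are hypothetical and are not copied from the DTD.

```dtd
<!-- SGML: the comment is embedded in the markup declaration -->
<!ELEMENT Sample - - (Title, Para+) -- a hypothetical formal object -->

<!-- XML: the comment becomes its own declaration ahead of the element -->
<!-- a hypothetical formal object -->
<!ELEMENT Sample (Title, Para+)>
```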
The small number of places in which DocBook uses name groups have been expanded.
There's one downside: DocBook uses %admon.class; in a name group to define the content model and attribute lists for elements in the admonitions class. In DocBook XML, this convenience cannot be expressed. If additional admonitions are added, the element and attribute list declarations will have to be copied for them.
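The expansion can be sketched as follows. The content model (Title?, Para+) is a simplified assumption standing in for the real admonition content model, and the attribute list declarations, which are expanded the same way, are omitted.

```dtd
<!-- SGML: one declaration covers every element named in the group -->
<!ENTITY % admon.class "Caution | Important | Note | Tip | Warning">
<!ELEMENT (%admon.class;) - - (Title?, Para+)>

<!-- XML: name groups are not allowed, so each element is declared on its own -->
<!ELEMENT Caution   (Title?, Para+)>
<!ELEMENT Important (Title?, Para+)>
<!ELEMENT Note      (Title?, Para+)>
<!ELEMENT Tip       (Title?, Para+)>
<!ELEMENT Warning   (Title?, Para+)>
```

Adding a new admonition to the XML form means writing one more element declaration, and one more attribute list declaration, by hand.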
Graphic and InlineGraphic have been made EMPTY. The content model for SynopFragmentRef, the only RCDATA element in DocBook, has been changed to (arg | group)+.
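A minimal sketch of these changes is shown below. The SGML tag-minimization flags (- -) and the CDATA declared content shown for Graphic are assumptions made for illustration. The XML forms follow the text above.

```dtd
<!-- SGML: declared content, which XML does not support -->
<!ELEMENT Graphic          - - CDATA>
<!ELEMENT SynopFragmentRef - - RCDATA>

<!-- XML: Graphic becomes EMPTY; SynopFragmentRef gets an element content model -->
<!ELEMENT Graphic          EMPTY>
<!ELEMENT SynopFragmentRef (arg | group)+>
```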
In DocBook, exclusions are used to exclude the following:
Ubiquitous elements (indexterm and BeginPage) from a number of contexts in which they should not occur (such as metadata).
Formal objects from Highlights, Examples, Figures and LegalNotices.
Formal objects and InformalTables from tables.
Removing these exclusions from DocBook XML means that it is now valid, in the XML sense, to do some things that don't make a lot of sense (like put a Footnote in a Footnote). Be careful.
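The Footnote case can be sketched like this. The content model (Para+) is a simplified assumption. The point is only the exclusion clause that disappears in the XML form.

```dtd
<!-- SGML: the exclusion forbids a Footnote at any depth inside a Footnote -->
<!ELEMENT Footnote - - (Para+) -(Footnote)>

<!-- XML: exclusions do not exist, so only the content models limit nesting -->
<!ELEMENT Footnote (Para+)>
```

If Para allows Footnote in its content, the XML form accepts a Footnote containing a Para containing another Footnote, which the SGML form rejects.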
Inclusions in DocBook are used to add the ubiquitous elements (indexterm and BeginPage) unconditionally to a large number of contexts. In order to make these elements available in DocBook XML, they have been added to most of the parameter entities that include #PCDATA. If new locations are discovered where these terms are desired, DocBook XML will be updated.
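The following fragment illustrates the approach under simplified assumptions. The parameter entity name sample.char.mix and both content models are hypothetical. The real DTD spreads these elements across many parameter entities.

```dtd
<!-- SGML: the inclusion makes both elements available at any depth in Chapter -->
<!ELEMENT Chapter - O (Title, Para+) +(indexterm | BeginPage)>

<!-- XML: the elements are listed in a parameter entity that includes #PCDATA -->
<!ENTITY % sample.char.mix "#PCDATA | Emphasis | indexterm | BeginPage">
<!ELEMENT Para (%sample.char.mix;)*>
```

In the XML form the elements are available only where such a parameter entity is used, which is why some locations may still be missing.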
The content models of many elements have been updated to make them a repeatable OR group beginning with #PCDATA, which is the only form of mixed content that XML allows.
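As a sketch of this kind of rewrite, consider a hypothetical element Sample whose SGML content model places #PCDATA inside a sequence. The element and its model are assumptions chosen for illustration.

```dtd
<!-- SGML: #PCDATA may appear inside a sequence group -->
<!ELEMENT Sample - - (Title?, (#PCDATA | Emphasis)+)>

<!-- XML: mixed content must be a single starred OR group that starts with #PCDATA -->
<!ELEMENT Sample (#PCDATA | Title | Emphasis)*>
```

The XML model is looser: Title may now occur anywhere and any number of times, so some constraints enforced by the SGML model are lost.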
The #CONREF attributes on indexterm, GlossSee, and GlossSeeAlso were changed to #IMPLIED. The content model of indexterm was modified so that it can be empty.
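For GlossSee, the change might look like the fragment below. The attribute name OtherTerm and its IDREF type are assumptions made for illustration, and all other attributes are omitted.

```dtd
<!-- SGML: #CONREF means the element is empty whenever the attribute is given -->
<!ATTLIST GlossSee OtherTerm IDREF #CONREF>

<!-- XML: #CONREF does not exist, so the attribute is simply optional -->
<!ATTLIST GlossSee OtherTerm IDREF #IMPLIED>
```

With #IMPLIED, the parser no longer enforces that the element is empty when the attribute is present. That relationship becomes a matter of convention.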