DTDs (Document Type Definitions) are a basic building block of any valid XML document: in XML terms, a document is valid only if it conforms to the DTD it declares. A DTD provides a simple reference grammar that tells a validating XML parser which elements, attributes and content may appear inside each element. In this post I'm going to walk through some basic aspects of DTDs and explain why they are so useful.
Linking a DTD from an XML document
We can link a DTD from an XML document in this way:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE book SYSTEM "book.dtd">
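The standalone attribute, set to no, tells a user agent that the document may depend on declarations kept outside the document itself, in this case the external DTD. It does not switch validation on by itself: a validating parser checks the document against the DTD referenced in the DOCTYPE declaration. In that declaration, book is the name of the root element, and SYSTEM "book.dtd" says that the DTD is stored in the book.dtd file. Our XML document looks as follows: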
<book>
  <pages>...</pages>
  <price cur="USD">
    <high>...</high>
    <regular>...</regular>
    <discount>...</discount>
  </price>
  <ship>...</ship>
  <store>...</store>
  <weight>...</weight>
</book>
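To see what the prolog actually declares, here is a minimal sketch that reads it with Python's standard xml.parsers.expat module. The helper name read_prolog and the empty book element are only assumptions made for the example. expat is a non-validating parser, so it reports the declarations but never opens book.dtd.

```python
import xml.parsers.expat

# The prolog from above, followed by an empty root element (illustrative).
PROLOG_DOC = (
    '<?xml version="1.0" encoding="utf-8" standalone="no"?>\n'
    '<!DOCTYPE book SYSTEM "book.dtd">\n'
    '<book/>'
)

def read_prolog(xml_text):
    """Return what the XML declaration and the DOCTYPE say (hypothetical helper)."""
    info = {}

    def on_xml_decl(version, encoding, standalone):
        # expat reports standalone as 1 ("yes"), 0 ("no") or -1 (not given)
        info["standalone"] = {1: "yes", 0: "no"}.get(standalone, "omitted")

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name            # the name after <!DOCTYPE
        info["system_id"] = system_id  # the file given after SYSTEM

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return info

prolog = read_prolog(PROLOG_DOC)
print(prolog)
```

The result pairs book with the root element name and book.dtd with the system identifier, which is how the DOCTYPE line should be read.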
Declaring relationships between elements
The relationships between elements are declared using ELEMENT declarations:
<!ELEMENT book (pages*, price*, ship*, store+, weight?)>
The above code means: "the element book can contain only the pages, price, ship, store and weight elements, in that order". The commas fix the order of the child elements, and the character after each element name says how many times it may occur:
- +: 1 or more times
- *: 0 or more times
- ?: 0 or 1 times (the element is optional)
- no character: exactly once
The same applies to the price element:
<!ELEMENT price (high?, regular, discount?)>
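One way to get a feel for these content models is to read them as regular expressions over the list of child element names. The sketch below translates the two declarations by hand; the CONTENT_MODELS table, the children_match helper and the sample values are assumptions made for the example, and this is an illustration of the idea, not how a validating parser is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the two content models into regular expressions.
# Each child name is followed by one space, so "store " is one child.
CONTENT_MODELS = {
    "book": re.compile(r"(pages )*(price )*(ship )*(store )+(weight )?"),
    "price": re.compile(r"(high )?(regular )(discount )?"),
}

def children_match(element):
    """Check an element's direct children against its hand-written model."""
    names = "".join(child.tag + " " for child in element)
    return CONTENT_MODELS[element.tag].fullmatch(names) is not None

# Made-up sample documents, not taken from the post
book = ET.fromstring(
    "<book><pages>1</pages><price><regular>9</regular></price>"
    "<store>a</store><store>b</store></book>"
)
wrong_order = ET.fromstring("<book><store>a</store><pages>1</pages></book>")
no_store = ET.fromstring("<book><pages>1</pages></book>")

print(children_match(book))                # True
print(children_match(book.find("price")))  # True
print(children_match(wrong_order))         # False: pages must come before store
print(children_match(no_store))            # False: store+ needs at least one store
```

The last two cases show the two things a content model constrains: the order of the children and how often each one may occur.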
Declaring attribute values
Attribute values are declared using ATTLIST declarations:
<!ATTLIST price cur (USD|CAD|AUD|EUR) "USD">
First comes the element that carries the attribute (price in this case), then the name of the attribute (here cur), followed by the list of allowed values, enclosed in parentheses and separated by a vertical bar (which means "one of these alternatives"). The last part is the default value, which the parser supplies when the attribute is omitted.
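The default value is easy to observe in practice. In the following sketch the ATTLIST declaration is placed in an internal DTD subset so that the example is self-contained (the post keeps it in book.dtd), and root_attributes is a hypothetical helper. It uses Python's xml.parsers.expat, which applies declared defaults but does not validate, so a value outside the list would not be rejected here.

```python
import xml.parsers.expat

# The ATTLIST from above, moved into an internal subset for the example
DTD_SUBSET = '<!DOCTYPE price [<!ATTLIST price cur (USD|CAD|AUD|EUR) "USD">]>'

def root_attributes(xml_text):
    """Return the attributes the parser reports for the first element."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # By default expat reports declared defaults along with written attributes
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(xml_text, True)
    return seen[0]

print(root_attributes(DTD_SUBSET + "<price/>"))            # {'cur': 'USD'}
print(root_attributes(DTD_SUBSET + '<price cur="EUR"/>'))  # {'cur': 'EUR'}
```

When cur is left out the parser fills in USD, and an explicit value takes precedence over the default.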
Declaring the content of an element
An element that contains only text, and no other elements, is declared with the #PCDATA content type:
<!ELEMENT pages (#PCDATA)>
<!ELEMENT high (#PCDATA)>
<!ELEMENT regular (#PCDATA)>
<!ELEMENT discount (#PCDATA)>
<!ELEMENT ship (#PCDATA)>
<!ELEMENT store (#PCDATA)>
<!ELEMENT weight (#PCDATA)>
#PCDATA stands for parsed character data: the parser still scans the text for markup and resolves entity references such as &amp;. It is normally used for simple text content.
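The "parsed" part can be seen with a tiny sketch using Python's xml.etree.ElementTree. The store content here is made up for the example, since the post leaves the element contents elided.

```python
import xml.etree.ElementTree as ET

# Character data is parsed: entity and character references are resolved
store = ET.fromstring("<store>Books &amp; Co. &#169; 2010</store>")
print(store.text)

# The element holds text only, with no child elements
print(len(store))  # 0
```

The references written in the source come back as the plain characters they stand for.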
Emulating namespaces in the DTD
DTDs don't provide native support for XML namespaces, but we can emulate them by declaring xmlns as an ordinary attribute of type CDATA with a #FIXED default, so that the attribute always carries the same value:
<!ATTLIST book xmlns CDATA #FIXED "http://site.com/ns/book">
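As a rough check, the sketch below puts this declaration into an internal DTD subset (an assumption made to keep the example self-contained) and asks Python's xml.parsers.expat which attributes it reports for a book element written without any xmlns attribute. The helper name reported_attributes is hypothetical.

```python
import xml.parsers.expat

NS_DOC = (
    "<!DOCTYPE book ["
    '<!ATTLIST book xmlns CDATA #FIXED "http://site.com/ns/book">'
    "]>"
    "<book/>"
)

def reported_attributes(xml_text):
    """Return the attributes reported for the first element of the document."""
    seen = []
    # No namespace_separator is passed, so xmlns is treated as a plain attribute
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(xml_text, True)
    return seen[0]

print(reported_attributes(NS_DOC))  # {'xmlns': 'http://site.com/ns/book'}
```

The xmlns attribute shows up even though the document never writes it, which is what makes the emulation work. Keep in mind that a parser which skips the DTD, for example one that does not load the external book.dtd, will not see this attribute.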