XML: Frequently Asked Questions (FAQ)

In this post I'll try to answer the most frequently asked questions (FAQ) about XML. I think there is still a lot of confusion among web developers about this standard, so an explanation that is as clear and concise as possible is badly needed.

Is XML a programming language?

No. XML is a markup language derived from SGML, just like HTML. It describes the structure of data and has no instructions to execute.

Is the XML prolog mandatory?

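No, not in XML 1.0. There the XML declaration, which opens the prolog, is optional, although the specification recommends including it; XML 1.1 documents must have one. When present, it must appear at the very beginning of the document: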

<?xml version="1.0" encoding="utf-8"?>
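As a quick check, the sketch below parses the same small document with the declaration, without it, and with the declaration pushed down by a blank line. It uses Python's standard `xml.etree.ElementTree` module, chosen here only for illustration; the `<note>` document is made up for the example.

```python
import xml.etree.ElementTree as ET

# The same XML 1.0 document in three variants.
with_declaration = b'<?xml version="1.0" encoding="utf-8"?><note>hello</note>'
without_declaration = b'<note>hello</note>'
misplaced_declaration = b'\n<?xml version="1.0" encoding="utf-8"?><note>hello</note>'

# An XML 1.0 parser accepts the document with or without the declaration.
root_a = ET.fromstring(with_declaration)
root_b = ET.fromstring(without_declaration)
print(root_a.tag, root_a.text)
print(root_b.tag, root_b.text)

# A declaration that is not at the very start of the document is an error.
try:
    ET.fromstring(misplaced_declaration)
    misplaced_ok = True
except ET.ParseError:
    misplaced_ok = False
print("misplaced declaration accepted:", misplaced_ok)
```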

The W3C Validator says that my XML is well-formed but not valid. What does this mean?

A well-formed XML document correctly follows the formal grammar of XML. A valid XML document is a well-formed document that also conforms to the rules of a DTD (referenced through a DOCTYPE declaration) or of another schema it is validated against.
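To make the difference concrete, here is a minimal sketch with a made-up `<note>` document whose internal DTD it does not respect. Python's standard `xml.etree.ElementTree` is a non-validating parser, so it only checks well-formedness and accepts the document; a validating parser would be expected to report the mismatch.

```python
import xml.etree.ElementTree as ET

# The internal DTD says <note> must contain exactly one <to> element,
# but the document uses <from> instead: well-formed, yet not valid.
doc = b"""<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>Ann</from></note>"""

# ElementTree does not validate against the DTD, so parsing succeeds.
root = ET.fromstring(doc)
child_tags = [child.tag for child in root]
print(root.tag, child_tags)
```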

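Must all special characters be converted into entities?

No, only a few of them. In element content and attribute values, < and & must always be written as &lt; and &amp;. Inside an attribute value, the quote character that delimits the value must be written as &quot; or &apos;. The > character needs escaping only in the sequence ]]>, although writing &gt; everywhere is harmless. These five are XML's predefined entities; any other character can appear literally as long as the document's encoding supports it. Escaping consistently reduces the likelihood of problems during parsing and validation.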
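As an illustration, the following sketch escapes a sample string with the helpers in Python's standard `xml.sax.saxutils` module and then parses the result back. The `<show>` element and its `title` attribute are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw_text = 'Tom & Jerry <3 "cartoons"'

# escape() rewrites &, < and >; quotes are left alone in element content.
escaped = escape(raw_text)
print(escaped)

# quoteattr() escapes the value and wraps it in suitable quotes for an attribute.
markup = f"<show title={quoteattr(raw_text)}>{escaped}</show>"
root = ET.fromstring(markup)
print(root.get("title"), "|", root.text)

# The same text without escaping is rejected by the parser.
try:
    ET.fromstring(f"<show>{raw_text}</show>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
print("unescaped text accepted:", unescaped_ok)
```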

Must all elements be properly nested?

Yes. In XML, you cannot write this:

<b><a></b></a>

The above markup is not well-formed, because b is closed while a is still open. Instead, you should write:

<b><a></a></b>
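A parser can confirm this. The helper below, `is_well_formed`, is a hypothetical name for a small wrapper around Python's standard `xml.etree.ElementTree`; it simply reports whether the parser accepts a piece of markup.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<b><a></b></a>"))  # overlapping elements
print(is_well_formed("<b><a></a></b>"))  # properly nested elements
```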

Do I have to enclose attribute values in quotes?

Yes. Write <element attr="value"></element> instead of <element attr=value></element>. Single quotes are allowed as well as double quotes.
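The sketch below feeds the quoted and unquoted forms to Python's standard `xml.etree.ElementTree` parser and records what happens to each; the dictionary labels are just descriptive names for this example.

```python
import xml.etree.ElementTree as ET

samples = {
    "double quotes": '<element attr="value"></element>',
    "single quotes": "<element attr='value'></element>",
    "no quotes": "<element attr=value></element>",
}

# Store either the parsed attribute value or the parser's error message.
results = {}
for label, markup in samples.items():
    try:
        results[label] = ET.fromstring(markup).get("attr")
    except ET.ParseError as error:
        results[label] = f"rejected: {error}"

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```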

Is XML case-sensitive?

Yes. This applies to element names, attribute names and attribute values alike.
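Here is a small sketch of what case sensitivity means in practice, again using Python's standard `xml.etree.ElementTree`; the element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# <Item> and <item> are two different element names,
# and id / ID are two different attributes on the same element.
root = ET.fromstring('<list><Item id="1" ID="2"/><item/></list>')
tags = [child.tag for child in root]
attributes = dict(root[0].attrib)
print(tags, attributes)

# A closing tag must match its opening tag exactly, including case.
try:
    ET.fromstring("<Note></note>")
    mismatched_case_ok = True
except ET.ParseError:
    mismatched_case_ok = False
print("mismatched case accepted:", mismatched_case_ok)
```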

What is the recommended encoding for XML?

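UTF-8, which is also what a parser assumes when no encoding is declared and no byte order mark is present. You can also use UTF-16, the other encoding that every XML processor is required to support.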
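As a rough demonstration, the sketch below serializes the same made-up `<p>` document in UTF-8 and in UTF-16 and parses both with Python's standard `xml.etree.ElementTree`. It assumes Python's "utf-16" codec, which prepends a byte order mark to the output.

```python
import xml.etree.ElementTree as ET

text = "caffè"

# The declared encoding must match the bytes actually written.
utf8_doc = f'<?xml version="1.0" encoding="utf-8"?><p>{text}</p>'.encode("utf-8")
utf16_doc = f'<?xml version="1.0" encoding="utf-16"?><p>{text}</p>'.encode("utf-16")

# Both byte sequences decode to the same character data.
decoded_utf8 = ET.fromstring(utf8_doc).text
decoded_utf16 = ET.fromstring(utf16_doc).text
print(decoded_utf8 == text, decoded_utf16 == text)
print(len(utf8_doc), "bytes in UTF-8,", len(utf16_doc), "bytes in UTF-16")
```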

What is the default content type for XML?

You can serve XML as either text/xml or application/xml. The former was mainly used in the early days of XML, when backward compatibility was a must: because text/xml is a subtype of text, obsolete user-agents can display XML documents as if they were plain text. The current registration, RFC 7303, treats text/xml as an alias of application/xml.

Is a root element always required?

Yes. An XML document without a root element is not well-formed.

Can I use more than a single root element?

No. A document must contain exactly one root element, and every other element must be nested inside it.
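The root element rules can be checked with a parser. This sketch runs three made-up candidates through Python's standard `xml.etree.ElementTree` and records which ones are accepted.

```python
import xml.etree.ElementTree as ET

candidates = {
    "one root": "<library><book/><book/></library>",
    "two roots": "<book/><book/>",
    "no root": "<!-- only a comment -->",
}

# Record whether the parser accepts each candidate document.
accepted = {}
for label, markup in candidates.items():
    try:
        ET.fromstring(markup)
        accepted[label] = True
    except ET.ParseError:
        accepted[label] = False

for label, ok in accepted.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```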

Must namespace URIs always point to a real web resource?

No. A namespace URI is only an identifier that prevents collisions between element names. Parsers compare it as a plain string and never retrieve it.
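A short sketch of how this works in practice, using Python's standard `xml.etree.ElementTree`: the two namespace URIs below are made up for the example and do not need to resolve to anything, yet they are enough to keep the two `table` elements apart.

```python
import xml.etree.ElementTree as ET

# Both URIs are invented for this example; nothing is downloaded from them.
doc = """
<catalog xmlns:f="http://example.com/ns/furniture"
         xmlns:d="http://example.com/ns/data">
  <f:table>oak</f:table>
  <d:table>rows and columns</d:table>
</catalog>
"""

root = ET.fromstring(doc)

# ElementTree exposes each name as {namespace-uri}local-name.
tags = [child.tag for child in root]
print(tags)

# The URI, not the prefix, is what identifies the element.
furniture = root.find("{http://example.com/ns/furniture}table")
print(furniture.text)
```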

Can someone steal my namespaces?

All web developers "steal" namespaces when they use XHTML, SVG or MathML, because they are using the namespaces defined by the W3C, which are identified by W3C URIs. There's no stealing in XML, only sharing.

Posted by Gabriele Romanato.
