Parsing XHTML with the PHP DOM extension

When served as application/xhtml+xml, XHTML is treated as XML, so we can parse an XHTML document as if it were an XML file. Here's how to do it with the PHP DOM extension. Given the following XHTML fragment:

<div id="test">
  <p>Test</p>
</div>

we can use the following PHP code:

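$document = new DOMDocument();
$document->load('test.xhtml');

// getElementById() only works when the id attribute is declared as an ID
// (through a DTD or xml:id), so look the element up with XPath instead.
$xpath = new DOMXPath($document);
$test = $xpath->query('//*[@id="test"]')->item(0);
$p = $test->getElementsByTagName('p')->item(0)->firstChild->nodeValue;

echo $p; // 'Test'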
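
A caveat about getElementById(): it only finds elements whose id attribute is known to be of type ID, which normally requires a DTD. Here is a minimal sketch of marking the attribute as an ID by hand with setIdAttribute(). It assumes the fragment is inlined as a string rather than read from test.xhtml, and the variable names are only illustrative:

```php
<?php
$document = new DOMDocument();
$document->loadXML('<div id="test"><p>Test</p></div>');

// Without a DTD, libxml does not know that "id" is an ID attribute.
$before = $document->getElementById('test'); // null at this point

// Mark the attribute as an ID by hand on the element that carries it.
$document->documentElement->setIdAttribute('id', true);

$test = $document->getElementById('test');
$text = $test->getElementsByTagName('p')->item(0)->firstChild->nodeValue;

echo $text;
```

This is practical only when we already hold a reference to the element, so for larger documents an XPath lookup is usually the simpler choice.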

The same approach works for more complex XHTML documents, because the DOM extension gives us access to every node in the document.
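
As a rough sketch of what that looks like on a fuller document, the example below collects the text of every paragraph inside the div. The sample markup, the x prefix and the $paragraphs array are assumptions made for illustration. The document is inlined so the code runs without an external file:

```php
<?php
// Hypothetical sample document, inlined so the sketch is self-contained.
$xhtml = <<<XHTML
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample</title></head>
  <body>
    <div id="test">
      <p>Test</p>
      <p>Another <em>paragraph</em></p>
    </div>
  </body>
</html>
XHTML;

$document = new DOMDocument();
$document->loadXML($xhtml);

// XHTML elements live in a namespace, so XPath needs a prefix bound to it.
$xpath = new DOMXPath($document);
$xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

$paragraphs = [];
foreach ($xpath->query('//x:div[@id="test"]/x:p') as $p) {
    // textContent concatenates the text of all descendant nodes.
    $paragraphs[] = $p->textContent;
}

print_r($paragraphs);
```

Note that once the document declares the XHTML namespace, element names in XPath expressions must carry a registered prefix, while getElementsByTagName() still matches on the plain tag name.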
