PHP: walking the DOM

In this post I'm going to show you how to walk the DOM of an HTML document with PHP. First, a caveat: at the time of writing, PHP's DOM implementation doesn't recognize HTML5 documents. If you try to load such a file, some core DOM methods such as getElementById() can return null. The best thing you can do is to always provide a validated XHTML document, because invalid documents are another source of errors. Let's see how to accomplish our task:

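    $document = new DOMDocument();
    $document->loadHTMLFile('php-dom.html');

    $startElement = $document->getElementById('test');
    $output = ''; // initialize before appending with .=

    // getElementById() returns null when no element has that ID.
    if ($startElement !== null && $startElement->hasChildNodes()) {

        $index = 0;

        do {

            $element = $startElement->childNodes->item($index);

            if ($element->nodeType == XML_ELEMENT_NODE) {

                // An empty element has no firstChild, so guard against null.
                $text = $element->firstChild !== null ? $element->firstChild->nodeValue : '';
                $output .= '<p>' . $text . ' ' . $element->nodeName . '</p>' . "\n";

            } else {

                // Text (usually whitespace), comment and other non-element nodes.
                $output .= '<p>Blank node.</p>' . "\n";

            }

            // Advance after processing so that item 0 is not skipped
            // and item(length), which is null, is never read.
            $index++;

        } while ($index < $startElement->childNodes->length);

    }

    echo $output;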
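A side note on the caveat about HTML5 and invalid documents: when getElementById() comes back empty, it can help to look at what the parser complained about. The following is only a minimal sketch, not part of the original listing. The markup in $html is hypothetical and is loaded from a string instead of php-dom.html. Which messages are reported, if any, depends on the libxml version behind your PHP build.

```php
<?php
// Hypothetical markup: an HTML5 <section> wrapping an element with id="test".
$html = '<!DOCTYPE html><html><body><section><div id="test"><p>One</p></div></section></body></html>';

// Collect libxml parser messages instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);

// Each entry is a LibXMLError object with line and message properties.
foreach (libxml_get_errors() as $error) {
    echo 'Line ' . $error->line . ': ' . trim($error->message) . "\n";
}

// Empty the error buffer so later parses start clean.
libxml_clear_errors();

$startElement = $document->getElementById('test');

if ($startElement === null) {
    echo "No element with ID test was found.\n";
} else {
    echo 'Found <' . $startElement->nodeName . ">\n";
}
```

Checking for null before calling hasChildNodes() avoids a fatal error when the lookup fails.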

The walking code uses the childNodes node list (a DOMNodeList) to retrieve all the nodes within a target element (in this case, the element with ID test) and visits them one by one with a do...while loop. Nodes that are not elements, such as the whitespace between tags, are reported as blank nodes.
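The do...while loop only visits the direct children of the target element. If you need the descendants as well, one possible approach is a small recursive function. This is a sketch under a few assumptions: walkElements() is a hypothetical helper name, the markup is made up for the example and loaded from a string, and foreach replaces the index-based loop because a DOMNodeList can be iterated directly.

```php
<?php
// Hypothetical helper: returns one line per element node below $node,
// indented two spaces per level. Non-element nodes are skipped.
function walkElements(DOMNode $node, int $depth = 0): array
{
    $lines = [];

    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $lines[] = str_repeat('  ', $depth) . $child->nodeName;
            // Recurse into the child's own children.
            $lines = array_merge($lines, walkElements($child, $depth + 1));
        }
    }

    return $lines;
}

// Hypothetical markup standing in for php-dom.html.
$document = new DOMDocument();
$document->loadHTML('<div id="test"><ul><li>One</li><li>Two</li></ul><p>Three</p></div>');

$startElement = $document->getElementById('test');

// Fall back to an empty list when the element is not found.
$lines = $startElement !== null ? walkElements($startElement) : [];

echo implode("\n", $lines) . "\n";
```

For this markup the script lists ul, the two li elements indented beneath it, and then p.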

This entry was posted by Gabriele Romanato.