Parsing XML in PHP

PHP includes three XML parsers: an event-driven library based on the Expat C library, a DOM-based library, and one for parsing simple XML documents named, appropriately, SimpleXML.

The most commonly used parser is the event-based library, which lets you parse but not validate XML documents. This means you can find out which XML tags are present and what they surround, but you can’t find out whether they’re the right XML tags in the right structure for this type of document. In practice, this isn’t generally a big problem.

PHP’s event-based XML parser reads the document and calls handler functions that you provide whenever it encounters certain “events,” such as the beginning or end of an element.

Element Handlers

When the parser encounters the beginning or end of an element, it calls the start and end element handlers. You set the handlers through the xml_set_element_handler() function:

xml_set_element_handler(parser, start_element, end_element);

The start_element and end_element parameters are the names of the handler functions.

The start element handler is called when the XML parser encounters the beginning of an element:

startElementHandler(parser, element, attributes);

Example of a start element handler

function startElement($parser, $name, $attributes)
{
    // Build a highlighted key="value" pair for each attribute.
    $outputAttributes = array();
    foreach ($attributes as $key => $value) {
        $outputAttributes[] = "<font color=\"gray\">{$key}=\"{$value}\"</font>";
    }
    // Print the escaped opening tag with the element name in bold.
    echo "&lt;<b>{$name}</b> " . join(' ', $outputAttributes) . '&gt;';
}

The end element handler is called when the parser encounters the end of an element:

endElementHandler(parser, element);
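
To show how the two handlers fit together, here is a minimal sketch that registers both and parses a short string. The sample document and the $events array are made up for illustration, and the handlers are written as closures rather than named functions. Case folding is switched off so element names keep their original case.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xml = '<book id="42"><title>Example</title></book>';

// Records each event in the order the parser reports it.
$events = [];

$parser = xml_parser_create();
// Case folding is on by default and upper-cases names; turn it off here.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Start handler: gets the parser, the element name, and its attributes.
    function ($parser, $name, $attributes) use (&$events) {
        $events[] = "start {$name} " . json_encode($attributes);
    },
    // End handler: gets the parser and the element name.
    function ($parser, $name) use (&$events) {
        $events[] = "end {$name}";
    }
);

// The third argument marks this as the final (here, only) chunk of data.
if (!xml_parse($parser, $xml, true)) {
    $events[] = 'error: ' . xml_error_string(xml_get_error_code($parser));
}

echo implode("\n", $events), "\n";
```

The start handler fires before any child elements are seen and the end handler fires after them, so nesting can be tracked with a simple stack if needed.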

Parsing XML with DOM

The DOM parser provided in PHP is much simpler to use, but what you take out in complexity comes back in memory usage—in spades. Instead of firing events and allowing you to handle the document as it is being parsed, the DOM parser takes an XML document and returns an entire tree of nodes and elements:

$parser = new DOMDocument();
$parser->load("books.xml");
processNodes($parser->documentElement);

function processNodes($node) {
    foreach ($node->childNodes as $child) {
        if ($child->nodeType == XML_TEXT_NODE) {
            echo $child->nodeValue;
        } elseif ($child->nodeType == XML_ELEMENT_NODE) {
            processNodes($child);
        }
    }
}
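
The same tree walk can be tried without a books.xml file on disk. The sketch below is a variation on the idea, not the original listing: it assumes a small inline document (invented here), loads it with loadXML(), and uses a hypothetical helper named collectText() that returns the text nodes instead of echoing them.

```php
<?php
// Hypothetical inline document standing in for books.xml.
$xml = '<books><book><title>One</title></book><book><title>Two</title></book></books>';

$document = new DOMDocument();
$document->loadXML($xml);

// Collect the value of every text node below $node, depth first.
function collectText(DOMNode $node): array
{
    $texts = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $texts[] = $child->nodeValue;
        } elseif ($child->nodeType === XML_ELEMENT_NODE) {
            // Recurse into child elements and append whatever they contain.
            $texts = array_merge($texts, collectText($child));
        }
    }
    return $texts;
}

$texts = collectText($document->documentElement);
echo implode(', ', $texts), "\n";
```

Because the whole tree is in memory before collectText() runs, the function can visit nodes in any order, which is the convenience the DOM approach trades memory for.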

More Examples of XML Parsing

example.php

<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El Act&#211;r</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;
?>

Parse example.php using SimpleXML

<?php
include 'example.php';

$movies = new SimpleXMLElement($xmlstr);

echo $movies->movie[0]->plot;
?>

Output of this XML parsing

So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
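
SimpleXML can also loop over repeated elements and read attributes with array-style access. The following sketch uses a shortened, stand-alone copy of the movie document so it runs on its own; the $names and $ratings variables are introduced here for illustration only.

```php
<?php
// A shortened copy of the movie document, inlined so this runs by itself.
$xmlstr = <<<XML
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character><name>Ms. Coder</name></character>
   <character><name>Mr. Coder</name></character>
  </characters>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;

$movies = new SimpleXMLElement($xmlstr);

// Repeated child elements can be iterated directly.
$names = [];
foreach ($movies->movie[0]->characters->character as $character) {
    $names[] = (string) $character->name;
}

// Attributes are read with array syntax; cast to get plain PHP values.
$ratings = [];
foreach ($movies->movie[0]->rating as $rating) {
    $ratings[(string) $rating['type']] = (int) $rating;
}

echo implode(', ', $names), "\n";
print_r($ratings);
```

The explicit casts matter because SimpleXML returns SimpleXMLElement objects, not strings or integers, when elements and attributes are accessed.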
