CLXXXI. XMLReader functions
The XMLReader extension is an XML Pull parser. The reader acts as a
cursor going forward on the document stream and stopping at each node
on the way.
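To make the cursor model concrete, here is a minimal sketch that walks a small inline document and records every node the reader stops at. The sample XML and the variable names are assumptions made for illustration only.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xmlString = '<root><item id="1">first</item><item id="2">second</item></root>';

$reader = new XMLReader();
$reader->XML($xmlString);

$visited = array();
// read() moves the cursor to the next node and returns false at the end.
while ($reader->read()) {
    // Record the numeric node type and the node name at each stop.
    $visited[] = $reader->nodeType . ':' . $reader->name;
}
$reader->close();

print_r($visited);
```

The reader stops at start tags, text nodes, and end tags alike, so the calling code decides which node types it cares about.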
The XMLReader extension is available in PECL as of PHP 5.0.0 and is
included and enabled by default as of PHP 5.1.0. It can be enabled
by adding the argument --enable-xmlreader (or --with-xmlreader before
5.1.0) to your configure line. The libxml extension is required.
Table 1.

| Name | Type | Read-only | Description |
|---|---|---|---|
| attributeCount | int | yes | The number of attributes on the node |
| baseURI | string | yes | The base URI of the node |
| depth | int | yes | Depth of the node in the tree, starting at 0 |
| hasAttributes | bool | yes | Indicates if the node has attributes |
| hasValue | bool | yes | Indicates if the node has a text value |
| isDefault | bool | yes | Indicates if the attribute is defaulted from the DTD |
| isEmptyElement | bool | yes | Indicates if the node is an empty element tag |
| localName | string | yes | The local name of the node |
| name | string | yes | The qualified name of the node |
| namespaceURI | string | yes | The URI of the namespace associated with the node |
| nodeType | int | yes | The node type for the node |
| prefix | string | yes | The prefix of the namespace associated with the node |
| value | string | yes | The text value of the node |
| xmlLang | string | yes | The xml:lang scope within which the node resides |
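As a rough illustration of how these properties are used together, the sketch below collects the attributes of each element by checking hasAttributes and then stepping through them with moveToNextAttribute(). The sample document and the $attributes array are assumptions for the example.

```php
<?php
$reader = new XMLReader();
// Hypothetical sample document.
$reader->XML('<book id="42" lang="en"><title>Example</title></book>');

$attributes = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->hasAttributes) {
        $elementName = $reader->name;
        // While positioned on an attribute, name and value describe that attribute.
        while ($reader->moveToNextAttribute()) {
            $attributes[$elementName][$reader->name] = $reader->value;
        }
        // Return the cursor to the element that owns the attributes.
        $reader->moveToElement();
    }
}
$reader->close();

print_r($attributes);
```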
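The constants below are defined by this extension, and are only available
when the extension has either been compiled into PHP or dynamically loaded
at runtime.

Warning: XMLReader uses class constants as of PHP 5.1. Earlier releases use
global constants of the form XMLREADER_ELEMENT.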
Table 2. XMLReader Node Types

| Constant | Value | Description |
|---|---|---|
| XMLReader::NONE (integer) | 0 | No node type |
| XMLReader::ELEMENT (integer) | 1 | Start element |
| XMLReader::ATTRIBUTE (integer) | 2 | Attribute node |
| XMLReader::TEXT (integer) | 3 | Text node |
| XMLReader::CDATA (integer) | 4 | CDATA node |
| XMLReader::ENTITY_REF (integer) | 5 | Entity Reference node |
| XMLReader::ENTITY (integer) | 6 | Entity Declaration node |
| XMLReader::PI (integer) | 7 | Processing Instruction node |
| XMLReader::COMMENT (integer) | 8 | Comment node |
| XMLReader::DOC (integer) | 9 | Document node |
| XMLReader::DOC_TYPE (integer) | 10 | Document Type node |
| XMLReader::DOC_FRAGMENT (integer) | 11 | Document Fragment node |
| XMLReader::NOTATION (integer) | 12 | Notation node |
| XMLReader::WHITESPACE (integer) | 13 | Whitespace node |
| XMLReader::SIGNIFICANT_WHITESPACE (integer) | 14 | Significant Whitespace node |
| XMLReader::END_ELEMENT (integer) | 15 | End Element |
| XMLReader::END_ENTITY (integer) | 16 | End Entity |
| XMLReader::XML_DECLARATION (integer) | 17 | XML Declaration node |
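The node type constants are typically compared against the nodeType property inside the read loop. The following sketch, which assumes a made-up sample document and a $counts array, tallies a few node types by name rather than by their integer values.

```php
<?php
$reader = new XMLReader();
// Hypothetical sample containing a comment, an element with text, and a CDATA section.
$reader->XML('<doc><!-- note --><p>text</p><![CDATA[raw]]></doc>');

$counts = array('element' => 0, 'text' => 0, 'comment' => 0, 'cdata' => 0, 'end' => 0);
while ($reader->read()) {
    switch ($reader->nodeType) {
        case XMLReader::ELEMENT:
            $counts['element']++;
            break;
        case XMLReader::TEXT:
            $counts['text']++;
            break;
        case XMLReader::COMMENT:
            $counts['comment']++;
            break;
        case XMLReader::CDATA:
            $counts['cdata']++;
            break;
        case XMLReader::END_ELEMENT:
            $counts['end']++;
            break;
    }
}
$reader->close();

print_r($counts);
```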
Table 3. XMLReader Parser Options

| Constant | Value | Description |
|---|---|---|
| XMLReader::LOADDTD (integer) | 1 | Load DTD but do not validate |
| XMLReader::DEFAULTATTRS (integer) | 2 | Load DTD and default attributes but do not validate |
| XMLReader::VALIDATE (integer) | 3 | Load DTD and validate while parsing |
| XMLReader::SUBST_ENTITIES (integer) | 4 | Substitute entities and expand references |
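The parser options are passed to setParserProperty() after the source has been opened and before the first read(). As a hedged sketch, the helper below (validates_against_dtd() is a hypothetical name, and the inline DTD is invented for the example) turns on XMLReader::VALIDATE and checks isValid() while reading.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: true if the document validates against its own DTD.
function validates_against_dtd($xmlString)
{
    $reader = new XMLReader();
    $reader->XML($xmlString);
    // Set after the source is loaded and before the first read().
    $reader->setParserProperty(XMLReader::VALIDATE, true);

    $valid = true;
    while ($reader->read()) {
        // isValid() reports the validity of the document parsed so far.
        if (!$reader->isValid()) {
            $valid = false;
        }
    }
    $reader->close();
    return $valid;
}

// Made-up DTD: <root> must contain exactly one <child>.
$dtd  = '<!DOCTYPE root [<!ELEMENT root (child)><!ELEMENT child (#PCDATA)>]>';
$good = validates_against_dtd($dtd . '<root><child>ok</child></root>');
$bad  = validates_against_dtd($dtd . '<root><other/></root>');

var_dump($good, $bad);
```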
usedsk at aoeu dot zzn dot com
14-Jul-2007 11:19
I found it a little hard to parse nested elements, so I wrote a function that simplifies it (based on http://www.thescripts.com/forum/thread627281.html):
function read_mixed_xml($filename, $arrayBeginElem, $arrayEndElem)
{
    $output = "";
    $arrayBeginKeys = array_keys($arrayBeginElem);
    $lengthBegin = count($arrayBeginElem); // Length of the begin element array
    $arrayEndKeys = array_keys($arrayEndElem);
    $lengthEnd = count($arrayEndElem); // Length of the end element array
    $xmlReader = new XMLReader();
    $xmlReader->open($filename);
    $xmlReader->read(); // Skip root node
    /* Go through the nodes */
    while ($xmlReader->read())
    {
        /* Start elements and #text nodes are handled here */
        if ($xmlReader->nodeType != XMLReader::END_ELEMENT)
        {
            switch ($xmlReader->nodeType)
            {
                /* For a start element, look up the node's name among the keys of
                   $arrayBeginElem and, if found, append the mapped value to $output.
                   (Simulates: case "paragraph": $output .= "<p>"; break;) */
                case XMLReader::ELEMENT:
                    for ($i = 0; $i < $lengthBegin; $i++)
                    {
                        $key = $arrayBeginKeys[$i];
                        if ($key == $xmlReader->name)
                        {
                            $output .= $arrayBeginElem[$key];
                        }
                    }
                    break;
                /* For a #text node, append the node's value to $output */
                case XMLReader::TEXT:
                    $output .= $xmlReader->value;
                    break;
            }
        }
        /* For an end element, look up the node's name among the keys of
           $arrayEndElem and, if found, append the mapped value to $output */
        else
        {
            for ($i = 0; $i < $lengthEnd; $i++)
            {
                $key = $arrayEndKeys[$i];
                if ($key == $xmlReader->name)
                {
                    $output .= $arrayEndElem[$key];
                }
            }
        }
    }
    $xmlReader->close();
    return $output;
}
Example input:
$begin = array("title" => " <h1>", "paragraph" => " <p>", "italicized" => "<i>");
$end = array("title" => "</h1>", "paragraph" => "</p>", "italicized" => "</i>");
$content = read_mixed_xml("index.xml", $begin, $end);
echo $content;
index.xml:
<?xml version="1.0"?>
<body>
<title>Introduction</title>
<paragraph>
Lorem <italicized>ipsum dolor sit amet</italicized>, consectetuer adipiscing elit. Donec neque augue, nonummy sit amet, interdum vitae, egestas a, nulla. Aenean sed turpis eget lacus venenatis tincidunt. Integer in leo vitae est euismod congue. Curabitur quis tellus ut nulla pharetra fringilla. Phasellus id risus sagittis turpis lobortis pretium.
</paragraph>
<paragraph>
Curabitur ultrices pulvinar massa. Nullam ac massa. Morbi adipiscing pharetra est. In non neque vitae massa adipiscing vestibulum. Integer congue, lacus non sagittis consectetuer, magna nisl eleifend nisl, id fringilla justo justo et arcu.
</paragraph>
</body>
Example output:
<h1>Introduction</h1> <p>
Lorem <i>ipsum dolor sit amet</i>, consectetuer adipiscing elit. Donec neque augue, nonummy sit amet, interdum vitae, egestas a, nulla. Aenean sed turpis eget lacus venenatis tincidunt. Integer in leo vitae est euismod congue. Curabitur quis tellus ut nulla pharetra fringilla. Phasellus id risus sagittis turpis lobortis pretium.
</p> <p>
Curabitur ultrices pulvinar massa. Nullam ac massa. Morbi adipiscing pharetra est. In non neque vitae massa adipiscing vestibulum. Integer congue, lacus non sagittis consectetuer, magna nisl eleifend nisl, id fringilla justo justo et arcu.
</p>
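The same idea can be sketched more compactly by looking element names up directly in the two arrays with isset() instead of looping over their keys. This is only a variant under assumptions: convert_mixed_xml() is a hypothetical name, and it takes an XML string rather than a filename so the example runs on its own.

```php
<?php
// Hypothetical variant of read_mixed_xml() that works on an XML string.
function convert_mixed_xml($xmlString, array $beginMap, array $endMap)
{
    $output = '';
    $reader = new XMLReader();
    $reader->XML($xmlString);
    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                // Emit the replacement for a start tag, if one is mapped.
                if (isset($beginMap[$reader->name])) {
                    $output .= $beginMap[$reader->name];
                }
                break;
            case XMLReader::END_ELEMENT:
                // Emit the replacement for an end tag, if one is mapped.
                if (isset($endMap[$reader->name])) {
                    $output .= $endMap[$reader->name];
                }
                break;
            case XMLReader::TEXT:
                $output .= $reader->value;
                break;
        }
    }
    $reader->close();
    return $output;
}

$begin = array('title' => '<h1>', 'paragraph' => '<p>', 'italicized' => '<i>');
$end   = array('title' => '</h1>', 'paragraph' => '</p>', 'italicized' => '</i>');
$html  = convert_mixed_xml(
    '<body><title>Intro</title><paragraph>Lorem <italicized>ipsum</italicized>.</paragraph></body>',
    $begin,
    $end
);
echo $html, "\n";
```

Unmapped elements such as the root are simply skipped, so no separate read() is needed to step over the root node.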
jamespic at gmail dot nsopam com
15-Nov-2006 04:09
Example, as requested, with nested nodes.
<?php
ob_start();
?>
<root>
<folder>
<name>folder A</name>
<files>
<file>
<name>Afile 1</name>
</file>
<file>
<name>Afile 2</name>
</file>
</files>
</folder>
<folder>
<name>folder B</name>
<files>
<file>
<name>Bfile 1</name>
</file>
<file>
<name>Bfile 2</name>
</file>
</files>
</folder>
</root>
<?php
$xmldata = ob_get_contents();
ob_end_clean();
$xml = new XMLReader();
$xml->XML($xmldata);
$data = array();
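while ($xml->read())
{
    // Skip over start tags at depth 0 to 2 (root, folder, name, files)
    while ($xml->depth <= 2 && $xml->nodeType == XMLReader::ELEMENT)
        $xml->read();
    // A text node at depth 3 is the folder's name
    if ($xml->nodeType == XMLReader::TEXT && $xml->depth == 3)
    {
        $strFolderName = $xml->value;
        $data[$strFolderName] = array();
        // Advance until the reader is inside a <file> element
        while ($xml->depth <= 3)
            $xml->read();
        // Collect every text node until the reader leaves <files>
        while ($xml->depth >= 3)
        {
            if ($xml->nodeType == XMLReader::TEXT)
                $data[$strFolderName][] = $xml->value;
            $xml->read();
        }
    }
}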
print_r($data);
echo "\n";
?>
Output :
Array
(
[folder A] => Array
(
[0] => Afile 1
[1] => Afile 2
)
[folder B] => Array
(
[0] => Bfile 1
[1] => Bfile 2
)
)
jcatalaa at catium dot com
20-Mar-2006 07:52
DTD Validation
Parser properties can be set using:
$xml_reader->setParserProperty(XMLReader::CONSTANT_NAME, BooleanValue);
The constant used in the xmlreader_validatedtd.php example that comes
with the xmlreader package results in an error.
Here is how I got it to work...
<?php
$indent = 5; /* Number of spaces to indent per level */
$xml = new XMLReader();
$xml->open("dtdexample.xml");
// CHANGED NEXT TWO LINES TO REMOVE ERROR
// FROM: $xml->setParserProperty(XMLREADER_LOADDTD, TRUE);
$xml->setParserProperty(XMLReader::LOADDTD, TRUE);
$xml->setParserProperty(XMLReader::VALIDATE, TRUE);
while ($xml->read()) {
    /* Print node name indenting it based on depth and $indent var */
    print str_repeat(" ", $xml->depth * $indent).$xml->name."\n";
    if ($xml->hasAttributes) {
        $attCount = $xml->attributeCount;
        print str_repeat(" ", $xml->depth * $indent)." Number of Attributes: ".$attCount."\n";
    }
}
print "\n\nValid:\n";
var_dump($xml->isValid());
?>
orion at ftf-hq dot dk
15-Feb-2006 12:50
Some more documentation (i.e. examples) would be nice :-)
This is how I read some mysql parameters in an xml file:
<?php
$conf = array();
$xml = new XMLReader();
$xml->open("config.xml");
// XMLReader::DEFAULTATTRS has the value 2: load DTD and default attributes, do not validate
$xml->setParserProperty(XMLReader::DEFAULTATTRS, true);
while ($xml->read()) {
    switch ($xml->name) {
        case "mysql_host":
            $xml->read(); // Move to the text node
            $conf["mysql_host"] = $xml->value;
            $xml->read(); // Move to the end tag
            break;
        case "mysql_username":
            $xml->read();
            $conf["mysql_user"] = $xml->value;
            $xml->read();
            break;
        case "mysql_password":
            $xml->read();
            $conf["mysql_pass"] = $xml->value;
            $xml->read();
            break;
        case "mysql_database":
            $xml->read();
            $conf["mysql_db"] = $xml->value;
            $xml->read();
            break;
    }
}
$xml->close();
?>
The XML file used:
<?xml version='1.0'?>
<MySQL_INIT>
<mysql_host>localhost</mysql_host>
<mysql_database>db_database</mysql_database>
<mysql_username>root</mysql_username>
<mysql_password>password</mysql_password>
</MySQL_INIT>
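A shorter variant is possible with readString(), which returns the text content of the current node and so avoids the paired read() calls. The sketch below makes several assumptions: the XML is inlined so the example is self-contained, the $wanted list is invented, and the element names are used directly as array keys (unlike the shorter keys above).

```php
<?php
// Inline copy of the configuration file, so the sketch runs without config.xml.
$xmlString = <<<XML
<?xml version='1.0'?>
<MySQL_INIT>
<mysql_host>localhost</mysql_host>
<mysql_database>db_database</mysql_database>
<mysql_username>root</mysql_username>
<mysql_password>password</mysql_password>
</MySQL_INIT>
XML;

// Hypothetical list of the settings to extract.
$wanted = array('mysql_host', 'mysql_database', 'mysql_username', 'mysql_password');

$conf = array();
$xml = new XMLReader();
$xml->XML($xmlString);
while ($xml->read()) {
    // Only start tags of wanted elements are of interest.
    if ($xml->nodeType === XMLReader::ELEMENT && in_array($xml->name, $wanted, true)) {
        // readString() returns the text content of the current element.
        $conf[$xml->name] = $xml->readString();
    }
}
$xml->close();

print_r($conf);
```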
Ariel Gonzalez
11-Feb-2006 10:09
Simple function I used while playing around with XMLReader.
<?php
function dump_xmlreader($o) {
    $node_types = array (
        0=>"No node type",
        1=>"Start element",
        2=>"Attribute node",
        3=>"Text node",
        4=>"CDATA node",
        5=>"Entity Reference node",
        6=>"Entity Declaration node",
        7=>"Processing Instruction node",
        8=>"Comment node",
        9=>"Document node",
        10=>"Document Type node",
        11=>"Document Fragment node",
        12=>"Notation node",
        13=>"Whitespace node",
        14=>"Significant Whitespace node",
        15=>"End Element",
        16=>"End Entity",
        17=>"XML Declaration node"
    );
    echo "attributeCount = " . $o->attributeCount . "\n";
    echo "baseURI = " . $o->baseURI . "\n";
    echo "depth = " . $o->depth . "\n";
    echo "hasAttributes = " . ( $o->hasAttributes ? 'TRUE' : 'FALSE' ) . "\n";
    echo "hasValue = " . ( $o->hasValue ? 'TRUE' : 'FALSE' ) . "\n";
    echo "isDefault = " . ( $o->isDefault ? 'TRUE' : 'FALSE' ) . "\n";
    echo "isEmptyElement = " . ( @$o->isEmptyElement ? 'TRUE' : 'FALSE' ) . "\n";
    echo "localName = " . $o->localName . "\n";
    echo "name = " . $o->name . "\n";
    echo "namespaceURI = " . $o->namespaceURI . "\n";
    echo "nodeType = " . $o->nodeType . ' - ' . $node_types[$o->nodeType] . "\n";
    echo "prefix = " . $o->prefix . "\n";
    echo "value = " . $o->value . "\n";
    echo "xmlLang = " . $o->xmlLang . "\n";
}
?>