XPath Expressions in AnyDataset-Xml

This library uses XPath expressions to select nodes and attributes from XML documents. Understanding XPath is essential for effectively using the AnyDataset-Xml library.

Basic XPath Syntax

Here are some common XPath expressions used in the library:

Expression	Description	Example
`element`	Selects all elements with the given name	`"title"` selects all `<title>` elements
`@attribute`	Selects the attribute with the given name	`"@category"` selects the category attribute
`element/@attribute`	Selects an attribute of an element	`"title/@lang"` selects the lang attribute of title elements
`parent/child`	Selects all child elements of the parent	`"book/author"` selects all author elements that are children of book elements
`//element`	Selects all elements with the given name, regardless of position	`"//title"` selects all title elements anywhere in the document
`*`	Selects all elements	`"book/*"` selects all child elements of book
`element[n]`	Selects the nth element	`"author[1]"` selects the first author element
`element[condition]`	Selects elements that satisfy the condition	`"book[@category='WEB']"` selects book elements with category attribute equal to 'WEB'

Implementation Details

Key Points

All field names in the resulting data are converted to lowercase
If an XPath expression doesn't match any nodes, an empty string is returned for that field
If an XPath expression matches multiple nodes, all values are collected automatically in an array

Examples in AnyDataset-Xml

Basic Element Selection

$colNodes = [
    "title" => "title",       // Selects the <title> element
    "author" => "author",     // Selects the <author> element
    "year" => "year",         // Selects the <year> element
    "price" => "price"        // Selects the <price> element
];

Attribute Selection

$colNodes = [
    "category" => "@category",  // Selects the category attribute of the current node
    "lang" => "title/@lang"     // Selects the lang attribute of the title element
];

Using Namespaces

When working with namespaced XML, you need to register the namespaces and use them in your XPath expressions:

$namespace = [
    "atom" => "http://www.w3.org/2005/Atom",
    "gd" => "http://schemas.google.com/g/2005"
];

$colNodes = [
    "id" => "atom:id",             // Selects the id element in the atom namespace
    "email" => "gd:email/@address" // Selects the address attribute of the email element in the gd namespace
];

$dataset = new \ByJG\AnyDataset\Xml\XmlDataset(
    $xml,
    "atom:entry",  // Selects entry elements in the atom namespace
    $colNodes,
    $namespace
);

Advanced Usage

Custom Field Processing

You can use callback functions to process field values:

$colNodes = [
    "title" => "title",
    "lang" => "title/@lang",
    "shortLang" => function ($row) {
        return substr($row->get('lang'), 0, 2);
    }
];

Handling Repeated Nodes

When an XPath expression matches multiple nodes, the values are automatically collected in an array:

// For XML like:
// <book>
//   <author>Author 1</author>
//   <author>Author 2</author>
// </book>

$colNodes = [
    "authors" => "author"  // Will automatically return an array of all author values: ["Author 1", "Author 2"]
];

// Access as:
$authorArray = $row->get('authors');

Accessing Field Values

Case Sensitivity

All field names are converted to lowercase when accessed through the iterator.

$colNodes = [
    "Title" => "title",
    "AUTHOR" => "author"
];

// Access using lowercase:
$title = $row->get('title');  // Not $row->get('Title')
$author = $row->get('author'); // Not $row->get('AUTHOR')

Basic XPath Syntax​

Implementation Details​

Examples in AnyDataset-Xml​

Basic Element Selection​

Attribute Selection​

Using Namespaces​

Advanced Usage​

Custom Field Processing​

Handling Repeated Nodes​

Accessing Field Values​