XML, XPATH and XQuery – Part 1

Playing with XPATH and XQuery on XML Data

This tutorial walks you through using XPATH and XQuery to query XML data.

XML is a convenient data format because it is self-describing and is stored as a simple flat text file.

Virtually every platform supports XML files. The most significant reason is that XML is machine readable, thanks to its self-describing tags and its hierarchical parent-child structure. It has other advantages over many of the data file formats available.

These advantages made XML a popular choice, and it quickly found its place almost everywhere, from storing data and transactions to system configuration.


Why is XPATH needed?

Suppose we have a very large XML document, for instance an invoice with a lot of items and all the order information. An application that processes this invoice may need to traverse the entire document, top to bottom or bottom to top, to work out final values such as the invoice total or the tax.

XPATH comes in handy in these kinds of situations. XPATH is a W3C recommendation. This series follows XPATH 2.0; the W3C has since published versions 3.0 and 3.1.


What is XPATH?

XPATH is a query language that selects nodes from XML documents by means of expressions. An XML document holds all of its nodes in a hierarchical (parent-child) structure under a single root node, and XPATH relies on this tree-like structure to navigate the whole document using path expressions and its standard library of functions.

Another point worth noting is that XPATH is designed to be used within a host language (any programming language that supports XML processing). This is why XPATH is part of XSLT (up to version 2.0 when this post was written) and XQuery (1.0 when this post was written). XPATH has also served as a basis for ESQL, the proprietary language of IBM Integration Bus / Message Broker.
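
As a rough sketch of the idea, the snippet below runs a few path expressions against a small invoice from Python, which acts as the host language. The invoice document and all of its element and attribute names are invented for illustration. Note also that Python's standard `xml.etree.ElementTree` module supports only a limited subset of XPATH, not the full 2.0 language, so the totals are computed in Python rather than with XPATH functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice, invented for illustration only.
INVOICE_XML = """
<invoice number="INV-001">
  <customer>Acme Corp</customer>
  <items>
    <item sku="A1"><name>Widget</name><quantity>2</quantity><price>10.00</price></item>
    <item sku="B2"><name>Gadget</name><quantity>1</quantity><price>25.50</price></item>
  </items>
</invoice>
"""

root = ET.fromstring(INVOICE_XML)  # root is the <invoice> element

# Path expression: every <item> child of <items>, relative to the root.
items = root.findall("./items/item")

# Predicate: only the <item> whose sku attribute equals "B2".
gadget = root.find("./items/item[@sku='B2']")

# Descendant search: every <price> element anywhere below the root.
prices = [float(p.text) for p in root.findall(".//price")]

# The host language does the arithmetic on the selected values.
total = sum(
    int(item.findtext("quantity")) * float(item.findtext("price"))
    for item in items
)

print(len(items))               # number of line items
print(gadget.findtext("name"))  # name of the item selected by the predicate
print(prices)                   # all prices in document order
print(total)                    # invoice total
```

The expressions are plain strings handed to the host language's XML library, which is the same way XSLT and XQuery embed XPATH.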


XPATH terminology:

XPATH treats an XML document as a tree of nodes linked by parent-child relationships. A node can be, among other kinds, an element node, an attribute node or a text node.

The figure below shows nodes, atomic values and items.


Items – an item is either a node or an atomic value, irrespective of whether it has parent or child nodes.

Node – a node can be, among other kinds, an element node, an attribute node or a text node.

Atomic values – values that have no parent or child nodes, such as a string or a number.
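
The distinction can be approximated in Python. This is only a sketch: the one-line document is invented for illustration, and `xml.etree.ElementTree` does not model text and attribute nodes as separate objects the way the XPATH data model does, so the strings and numbers below stand in for atomic values.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
root = ET.fromstring(
    '<book lang="en"><title>XML Basics</title><price>29.99</price></book>'
)

# An element node: it lives in the tree and has a parent (<book>).
title_node = root.find("title")

# Plain values taken out of the tree: they have no parent and no children.
title_value = title_node.text                 # a string
lang_value = root.get("lang")                 # the value of the lang attribute
price_value = float(root.findtext("price"))   # a number

# A sequence of items may mix nodes and atomic values.
items = [title_node, title_value, price_value]
kinds = ["node" if isinstance(i, ET.Element) else "atomic value" for i in items]
print(kinds)
```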








Roles played by XML nodes:

Now let's look at the relationships in a node tree.

A node can play one or more relationship roles in an XML document. The figure below depicts the roles played by each node.

  •  Parent              :    A parent is an element that has one or more children.
  •  Child                :    Children are nodes that have a parent node.
  •  Sibling              :    Siblings are elements that have the same parent node.
  •  Ancestor          :   The ancestors of an element are its parent, its grandparent, and so on up to the root.
  •  Descendant    :    The descendants of an element are its children, its grandchildren, and so on down the tree.
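
The roles listed above can be sketched in Python as follows. The bookstore document is invented for illustration. Because `xml.etree.ElementTree` keeps no parent pointers, the sketch builds a child-to-parent map (here named `parent_of`, a helper introduced only for this example) instead of using the XPATH axes that a full XPATH 2.0 processor would offer.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
root = ET.fromstring(
    "<bookstore>"
    "<book><title>XML Basics</title><author>A. Writer</author>"
    "<price>29.99</price></book>"
    "</bookstore>"
)

# Build a child -> parent lookup once for the whole tree.
parent_of = {child: parent for parent in root.iter() for child in parent}

title = root.find("./book/title")

# Parent: the element that directly contains <title>.
parent = parent_of[title]

# Children: the nodes directly below the parent.
children = [c.tag for c in parent]

# Siblings: the other children of the same parent.
siblings = [c.tag for c in parent if c is not title]

# Ancestors: parent, grandparent, and so on up to the root.
ancestors = []
node = title
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)

# Descendants of the root: every element below it, in document order.
descendants = [d.tag for d in root.iter() if d is not root]

print(parent.tag, children, siblings, ancestors, descendants)
```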


Let's discuss the XPATH 2.0 properties in the next section, XML, XPATH & XQuery – Part 2.

Written by Ramesh Metta
