XSLT Time #3: XPathology


XPath is possibly the single most important tool you’ll ever use in your XSLT programming. Its syntax and techniques are built into nearly everything XSLT does. In a nutshell, XPath is the expression language that XSLT uses to “travel” through the node structure and select or match certain nodes. In a way, it resembles the folder paths that any web developer should already be familiar with.

Here’s an example of some XPath:

  • system-index-block/system-page/title

This is a pretty basic XPath expression. It selects the title element in exactly the following XML structure:

<system-index-block>
    <system-page>
        <title />
    </system-page>
</system-index-block>
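
To try this outside of a stylesheet, here is a minimal sketch using Python's standard-library ElementTree, which understands a subset of XPath. The title text "My page" and the non-existent summary element are made up for illustration. The path is evaluated from the system-index-block element, so it begins at the step below it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document mirroring the structure above.
XML = """
<system-index-block>
    <system-page>
        <title>My page</title>
    </system-page>
</system-index-block>
"""

root = ET.fromstring(XML)  # root is the <system-index-block> element

# Each slash is one step down: system-page, then title.
title = root.find("system-page/title")

# A step that does not exist in the XML means nothing is selected.
missing = root.find("system-page/summary")

print(title.text)
print(missing)
```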

As you can see, each slash denotes a step down in the XML structure. If that step down doesn’t exist, the XPath can’t complete its travel through the XML and selects nothing. This is why learning XPath is so important. If you can’t handle XPath, you can’t handle XSLT.

Syntax

Here is some basic selecting syntax, directly from W3Schools:

  • / - Selects from the root node
  • // - Selects nodes in the document from the current node that match the selection no matter where they are
  • . - Selects the current node
  • .. - Selects the parent of the current node
  • @ - Selects attributes
  • * - the wildcard. After a slash, this selects every element that is a child of the current node.
  • @* - this does the same as above, except it selects every attribute of the current node.
  • | - the pipe allows you to select on multiple XPath expressions at once.
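
Here is a rough sketch of a few of these in action, again with Python's standard-library ElementTree. Treat it as an approximation: ElementTree implements only part of XPath, so paths on an element must be relative (.// rather than //), @ is accepted only inside a predicate, and the pipe is not supported at all. The system-folder element, ids and titles are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the ids and titles are invented for illustration.
XML = """
<system-index-block>
    <system-page id="1"><title>Home</title></system-page>
    <system-folder>
        <system-page id="2"><title>About</title></system-page>
    </system-folder>
</system-index-block>
"""

root = ET.fromstring(XML)

# "*" matches every element child of the context node (one level only).
children = [el.tag for el in root.findall("*")]

# ".//" starts at the current node and searches all of its descendants.
all_pages = root.findall(".//system-page")

# A plain step only looks at direct children of the current node.
direct_pages = root.findall("system-page")

# "@" names an attribute; here it filters on the id attribute.
page_two = root.find(".//system-page[@id='2']")

print(children)
print(len(all_pages), len(direct_pages))
print(page_two.find("title").text)
```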

Filters

All of that is pretty simple. If you were to put //system-block, you’re selecting every system-block in the entire document, no matter how deep it sits (use .//system-block to search only below the current node). If you simply put /system-block, you’re only selecting a system-block that is the root element of the document. To select the system-blocks that are direct children of the current node, drop the slash: system-block. See the difference? Let’s take a look at some expressions with predicates, or what I call filters.

  • system-index-block/system-page[position() = 1] - In this example, we’re still selecting the system-page node, but only the first one. Whatever you see in the brackets isn’t being selected; it filters the node before it.
  • //system-page[title = 'My page'] - Here, we’re selecting every system-page node in the document that has a child title node whose value is 'My page'. If you’re dealing with strings and not numbers, make sure to enclose that value in single quotes (straight ones, not curly).
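
The two filters above can be sketched with Python's standard-library ElementTree. Two caveats: ElementTree does not accept the position() function, so the sketch uses [1], which is XPath's own shorthand for [position() = 1], and it needs .// instead of // when searching from an element. The sample titles are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; titles are invented for illustration.
XML = """
<system-index-block>
    <system-page><title>Home</title></system-page>
    <system-page><title>My page</title></system-page>
    <system-folder>
        <system-page><title>My page</title></system-page>
    </system-folder>
</system-index-block>
"""

root = ET.fromstring(XML)

# [1] keeps only the first system-page child; the predicate filters
# the step before it and is not itself selected.
first = root.find("system-page[1]")

# Keep every system-page, at any depth, whose title child reads 'My page'.
my_pages = root.findall(".//system-page[title='My page']")

print(first.find("title").text)
print(len(my_pages))
```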

Hierarchy

Remember discussing XML hierarchy in the first entry? Well, it’s important that you do because you can use it in your XPath to select nodes. These are called axes and, from W3Schools, here they are:

  • ancestor - Selects all ancestors (parent, grandparent, etc.) of the current node
  • ancestor-or-self - Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
  • attribute - Selects all attributes of the current node
  • child - Selects all children of the current node
  • descendant - Selects all descendants (children, grandchildren, etc.) of the current node
  • descendant-or-self - Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
  • following - Selects everything in the document after the closing tag of the current node
  • following-sibling - Selects all siblings after the current node
  • namespace - Selects all namespace nodes of the current node
  • parent - Selects the parent of the current node
  • preceding - Selects everything in the document that ends before the start tag of the current node (so ancestors are not included)
  • preceding-sibling - Selects all siblings before the current node
  • self - Selects the current node

Now, some of those are more usable than others; ancestor, preceding-sibling, following-sibling and descendant are used quite a bit. The others are used too, but you’ll see that axes aren’t always your best tool. Here’s a quick example:

  • title/ancestor::system-page - this expression starts at a title node and then grabs every system-page that is an ancestor of that title.
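
Python's standard-library ElementTree does not implement named axes such as ancestor::, so the sketch below only emulates what the axis does: it walks up from a title through its parent, grandparent and so on, then keeps the system-page elements. The ancestors helper, the parent_of map and the sample document are all hypothetical; a full XPath 1.0 engine would simply evaluate the expression as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the nesting is invented for illustration.
XML = """
<system-index-block>
    <system-folder>
        <system-page>
            <title>About</title>
        </system-page>
    </system-folder>
</system-index-block>
"""

root = ET.fromstring(XML)

# ElementTree elements do not know their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Yield the parent, grandparent, etc. of node, nearest first."""
    while node in parent_of:
        node = parent_of[node]
        yield node


title = root.find(".//title")

# Rough equivalent of title/ancestor::system-page.
pages = [a for a in ancestors(title) if a.tag == "system-page"]

# The unfiltered axis (ancestor::*), for comparison.
chain = [a.tag for a in ancestors(title)]

print([p.tag for p in pages])
print(chain)
```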

So that’s that. Of course, that’s actually not all of it. XPath is a wonderful art that takes time to perfect. You’ll find that there are many, many different ways to select or match the same thing. There are, of course, quicker and more succinct ways of doing things, and you’ll no doubt find approaches that differ from mine. A rule of thumb: write your expressions as small as possible, as long as you’re not creating loopholes in your code. Loopholes are single instances of XML that betray your expressions when you least expect it. These are usually taken care of with predicates (filters) or functions, but that’s for a later post.
