Containment of XPath Expressions

Description

The problem is: given two XPath expressions, check whether they are equivalent. A related problem is: check whether the answer of the first expression is a subset of the answer of the second expression.

We have considered the simplest case, when the XPath expressions consists only of tags, *, /, //, and [...]. It is easy to check containment (and, hence, equivalence) if one removes one of the following three constructs: *, //, [...]. But when all three of them are allowed, then both problems are coNP-complete.

This example illustrates why it's a hard problem. The following XPath expression:

/a[.//b[c/*//d]/b[c//d]/b[c/d]]
is contained in this XPath expression:
/a[.//b[c/*//d]/b[c/d]]
however it is not obvious how to check their containment. Please refer to the paper in the Journal of the ACM below for more details.

Given the hardness of the problem we have developed some practical approximation algorithms for checking containment and equivalence. As an application we have developed an XPath vizualization tool, XViz. The code is available here (no guarantees or support are provided).

Supported by:

NSF IIS-0140493

Members

Jim Brinkley,
Shobhit Mathur,
Chris Re,
Dan Suciu,
Bhushan Mandhani,
Todd J. Green,
Ashish Gupta,
Gerome Miklau,
Makoto Onizuka,
Ben Handy,

Publications

Jim Brinkley, Shobhit Mathur, Chris Re, Dan Suciu,
A Framework for XML-based Integration of Data, Visualization and Analysis in a Biomedical Domain
In XSYM, September, 2005
Bhushan Mandhani, Dan Suciu,
Query Caching and View Selection for XML Databases
In VLDB, pp. 469-480, 2005
Todd J. Green, Ashish Gupta, Gerome Miklau, Makoto Onizuka, Dan Suciu,
Processing XML Streams with Deterministic Automata and Stream Indexes
Published in ACM TODS, vol. 29 , no. 4 , pp. 752-788 , December , 2004
Gerome Miklau, Dan Suciu,
Containment and equivalence of a fragment of XPath
Published in Journal of the ACM, vol. 51 , no. 1 , pp. 2-45 , 2004
Gerome Miklau, Dan Suciu,
A Formal Analysis of Information Disclosure in Data Exchange
In SIGMOD, 2004
Ben Handy, Dan Suciu,
XViz: a tool for visualizing XPath expressions
In Proceedings of the XML Database Symposium (SXym), September, 2003
Gerome Miklau, Dan Suciu,
Containment and equivalence of an XPath fragment
In PODS, pp. 65-76, June, 2002