Pathfinder meets DB2

Manuel Mayr

Ph.D. Workshop of the 11th Int’l Conference on Extending Database Technology (EDBT 2008), Nantes, France, March 2008.

We are taking the next big step towards the goal of a purely relational XQuery implementation. The Pathfinder XQuery compiler has been enhanced by a code generator that emits SQL. This code generator targets off-the-shelf relational database systems (e.g., DB2®) and turns them into efficient and scalable XQuery processors. Our approach neither depends on modifications of the database kernel nor relies on built-in XML-specific functionality (SQL/XML, for instance). For that reason, we can base this work on query optimization techniques that have proven their effectiveness for pure SQL workloads.

Here, we will describe (1) how distribution statistics and statistical views can accompany the relational encoding of an XML document to provide information about its node hierarchy, (2) the use of generated columns and materialized query tables to precompute aspects of XPath step evaluation, (3) how the system’s index design wizard can automatically advise on the creation of indexes that, for example, enable index-only XPath location path processing, and (4) optimization profiles, a final fallback that enables fine-grained control over DB2’s query execution plans. Performance experiments indicate the potential of the XQuery processor that results from this synthesis of Pathfinder and DB2.
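
To make the idea of a relational XML encoding concrete, the following minimal sketch evaluates an XPath descendant step as a plain SQL range join. It rests on assumptions that the abstract does not spell out: a pre/size/level encoding of the document tree, SQLite as a stand-in for DB2, and hypothetical names throughout (the table `doc`, the index `doc_tag_pre`, and the helpers `encode`, `load`, and `descendants`). It is not the SQL that Pathfinder emits.

```python
import sqlite3

# Toy document as nested (tag, children) tuples: <a><b><c/><d/></b><e><c/></e></a>
DOC = ("a", [("b", [("c", []), ("d", [])]), ("e", [("c", [])])])


def encode(tree, rows=None, level=0):
    """Assumed encoding: pre = document-order rank, size = number of
    descendants, level = depth below the root."""
    if rows is None:
        rows = []
    tag, children = tree
    pre = len(rows)
    rows.append([pre, 0, level, tag])
    for child in children:
        encode(child, rows, level + 1)
    rows[pre][1] = len(rows) - pre - 1  # everything appended since is a descendant
    return rows


def load(tree):
    """Store the encoded document in an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc (pre INTEGER PRIMARY KEY, size INTEGER, "
        "level INTEGER, tag TEXT)"
    )
    conn.executemany("INSERT INTO doc VALUES (?, ?, ?, ?)", encode(tree))
    # Illustrative index that covers the tag test and the pre range test.
    conn.execute("CREATE INDEX doc_tag_pre ON doc (tag, pre)")
    return conn


def descendants(conn, context_pre, tag):
    """descendant::tag step: descendants of c occupy the pre range
    (c.pre, c.pre + c.size]."""
    sql = """
        SELECT d.pre
        FROM doc AS c JOIN doc AS d
          ON d.pre > c.pre AND d.pre <= c.pre + c.size
        WHERE c.pre = ? AND d.tag = ?
        ORDER BY d.pre
    """
    return [row[0] for row in conn.execute(sql, (context_pre, tag))]


if __name__ == "__main__":
    conn = load(DOC)
    print(descendants(conn, 0, "c"))  # all <c> nodes below the root <a>
```

Because the step reduces to range predicates on ordinary columns, it is the kind of query for which a relational optimizer can draw on column statistics and B-tree indexes, without any XML-specific support in the database kernel.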