Dependable Cardinality Forecasts for XQuery

Jens Teubner • Torsten Grust • Sebastian Maneth • Sherif Sakr

Proceedings of the 34th Int’l Conference on Very Large Databases (VLDB 2008)/Journal of Data Management Research (JDMR), vol. 1, Auckland, New Zealand, August 2008.

Though indispensable for effective cost-based query rewriting, the derivation of meaningful cardinality estimates has remained a notoriously hard problem in the context of XQuery. By basing the estimation on a relational representation of the XQuery syntax, we show how existing cardinality estimation techniques for XPath and proven relational estimation machinery can work together to yield dependable forecasts for arbitrary (sub)expressions. Our approach benefits from a light-weight form of data flow analysis. Abstract domain identifiers guide our query analyzer through the estimation process and allow for informed decisions even for deeply nested XQuery expressions. A variant of projection paths provides a versatile interface into which existing techniques for XPath cardinality estimation can be plugged seamlessly. We demonstrate an implementation of this interface based on data guides. Experiments show that our approach copes equally well with structure-based and value-based queries. It is robust with respect to intermediate estimation errors, from which we typically found our implementation to recover gracefully.
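
To illustrate the kind of XPath estimator that could sit behind such an interface, the following is a minimal sketch of a data guide: a summary that records, for every distinct root-to-node label path, how many nodes in the document are reached by it. It is not the implementation described in the paper. The function names `build_data_guide` and `estimate_cardinality`, the sample document, and the restriction to simple child-axis paths are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_data_guide(root):
    """Count the nodes reached by each distinct root-to-node label path.

    Hypothetical helper: the guide maps a tuple of element names,
    e.g. ('site', 'people', 'person'), to the number of matching nodes.
    """
    guide = Counter()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        guide[path] += 1
        for child in node:
            stack.append((child, path + (child.tag,)))
    return guide


def estimate_cardinality(guide, path):
    """Look up the node count for a simple child-axis path such as '/a/b'.

    Assumption: only child steps with plain name tests are supported;
    predicates, wildcards, and other axes are out of scope for this sketch.
    """
    steps = tuple(step for step in path.split("/") if step)
    return guide.get(steps, 0)


# Small made-up document used only for this example.
doc = ET.fromstring(
    "<site><people>"
    "<person><name/></person>"
    "<person><name/><email/></person>"
    "</people></site>"
)
guide = build_data_guide(doc)

print(estimate_cardinality(guide, "/site/people/person"))        # 2
print(estimate_cardinality(guide, "/site/people/person/email"))  # 1
print(estimate_cardinality(guide, "/site/items"))                # 0
```

For pure child-axis paths such a summary returns exact counts. Estimation error would only enter once value predicates or nested expressions are involved, which is where the relational estimation machinery mentioned above comes into play.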