Bridging the Gap Between Relational and Native XML Storage with Staircase Join

Torsten Grust • Maurice van Keulen • Jens Teubner

Proceedings of the 15th GI Workshop on Foundations of Database Systems, Tangermünde, Germany, June 2003.

Several mapping schemes have recently been proposed to store XML data in relational tables. Relational database systems are readily available and can handle vast amounts of data very efficiently, taking advantage of physical properties that are specific to the relational model, such as sortedness or uniqueness. Tables that originate from XML documents, however, carry further properties, inherited from the underlying tree structure, that current relational query processors cannot exploit. We propose a new join algorithm that is specifically designed to operate on XML data mapped to relational tables. The staircase join is fully aware of the underlying tree properties and allows for I/O- and cache-optimal query execution. As a local change to the database kernel, it can easily be plugged into any relational database and allows for various optimization strategies, e.g., selection pushdown. Experiments with our prototype, based on the Monet database kernel, confirm these claims.
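
To make the idea of a tree-aware join concrete, the following is a minimal sketch for the descendant axis only. It assumes a pre/post-order rank encoding of the document, in which a node d is a descendant of c exactly when pre(d) > pre(c) and post(d) < post(c); the abstract itself does not name the mapping scheme, so this choice is an assumption. The function names encode, prune and staircase_descendant are hypothetical, and the code illustrates only pruning, a single sequential scan, and skipping. It is not the authors' implementation inside the Monet kernel.

```python
def encode(tree):
    """Map a nested (tag, [children]) tree to rows (pre, post, tag), sorted by pre."""
    table = []
    post_counter = 0

    def visit(node):
        nonlocal post_counter
        tag, children = node
        pre = len(table)          # preorder rank = position in the table
        table.append(None)        # reserve the slot; post rank is known later
        for child in children:
            visit(child)
        table[pre] = (pre, post_counter, tag)
        post_counter += 1

    visit(tree)
    return table


def prune(context):
    """Drop context nodes that are descendants of another context node.

    Such nodes cannot contribute new results to a descendant step, and
    removing them leaves a context whose subtrees do not overlap.
    """
    kept, max_post = [], -1
    for row in sorted(context):   # tuples sort by pre rank first
        if row[1] > max_post:     # not inside any previously kept subtree
            kept.append(row)
            max_post = row[1]
    return kept


def staircase_descendant(doc, context):
    """Descendant step: one pass over doc, result in document order, no duplicates."""
    result = []
    ctx = prune(context)
    for i, (c_pre, c_post, _tag) in enumerate(ctx):
        # The partition of this context node ends where the next one starts.
        end = ctx[i + 1][0] if i + 1 < len(ctx) else len(doc)
        for row in doc[c_pre + 1:end]:
            if row[1] > c_post:
                break             # skipping: no further descendants in this partition
            result.append(row)
    return result


# Example document: a(b(c, d), e(f))
doc = encode(("a", [("b", [("c", []), ("d", [])]), ("e", [("f", [])])]))
context = [doc[1], doc[2], doc[4]]          # nodes b, c, e
tags = [tag for _pre, _post, tag in staircase_descendant(doc, context)]
print(tags)                                  # ['c', 'd', 'f']
```

In this sketch, context node c is pruned because it lies inside the subtree of b, and each document row is inspected at most once. A generic relational join over the same table would have no knowledge of either property, which is the kind of tree awareness the abstract refers to.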