How ‘How’ Explains What ‘What’ Computes — How-Provenance for SQL and Query Compilers

Daniel O'Grady, Tobias Müller, Torsten Grust

10th USENIX Workshop on Theory and Practice of Provenance (TaPP 2018), London, UK, July 2018.

SQL emphasizes the What, the declarative specification of complex computations over a database. How exactly the individual parts of an intricate query interact and contribute to the result often remains in the dark, however. How-provenance helps to understand queries and build trust in their results. We propose a new approach that derives how-provenance for SQL at a fine granularity: (1) every single piece of the result provides information on how exactly it got there, and (2) the contribution of any query construct to the overall output can be assessed, from entire subqueries down to the subexpression leaf level. The method applies to real-world dialects of SQL and, more generally, to the modern breed of database systems that pursue a compilation-based approach to query processing.