How ‘How’ Explains What ‘What’ Computes — How-Provenance for SQL and Query Compilers

Daniel O'Grady • Tobias Müller • Torsten Grust

10th USENIX Workshop on Theory and Practice of Provenance (TaPP 2018), London, UK, July 2018.

SQL emphasizes the What, the declarative specification of complex computations over a database. How exactly the individual parts of an intricate query interact and contribute to the result often remains in the dark, however. How-provenance helps to understand queries and build trust in their results. We propose a new approach that derives how-provenance for SQL at a fine granularity: (1) every single piece of the result provides information on how exactly it got there, and (2) the contribution of any query construct to the overall output can be assessed, from entire subqueries down to the subexpression leaf level. The method applies to real-world dialects of SQL and, more generally, to the modern breed of database systems that pursue a compilation-based approach to query processing.