How, Where, and Why Data Provenance Improves Query Debugging – A Visual Demonstration of Fine-Grained Provenance Analysis for SQL

Tobias Müller • Pascal Engel

Proceedings of the 38th IEEE Int’l Conference on Data Engineering (ICDE 2022), Kuala Lumpur, Malaysia, May 2022.

Data provenance is meta-information about the origin and processing history of data. We demonstrate provenance analysis of SQL queries and apply it to query debugging. How-provenance identifies which query expressions were relevant for evaluating selected pieces of output data. Likewise, Where- and Why-provenance identify the relevant pieces of input data. The combined provenance notions can be explored visually and interactively. We support a feature-rich SQL dialect, including correlated subqueries, and focus on bag semantics. Our fine-grained analysis derives individual data provenance for table cells and SQL expressions.
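
To make the three provenance notions more concrete, the following is a minimal Python sketch, not the system demonstrated in the paper. It hand-evaluates the toy query `SELECT name FROM emp WHERE salary > 50` and annotates each output cell. The table `emp`, its contents, the `Cell` class, and `run_query` are all hypothetical. The sketch also assumes one common reading of the notions: Where-provenance marks input cells whose values were copied into the output, Why-provenance marks input cells that were inspected to decide whether a row qualifies, and How-provenance marks the query expressions involved.

```python
from dataclasses import dataclass, field

# Hypothetical toy table; names and values are illustrative only.
EMP = [
    {"name": "Ann", "salary": 70},
    {"name": "Bob", "salary": 40},
    {"name": "Ann", "salary": 90},
]


@dataclass
class Cell:
    """One output cell together with its (assumed) provenance annotations."""
    value: object
    where: set = field(default_factory=set)  # input cells the value was copied from
    why: set = field(default_factory=set)    # input cells inspected by the predicate
    how: set = field(default_factory=set)    # query expressions that were involved


def run_query(table, threshold):
    """Toy evaluation of: SELECT name FROM emp WHERE salary > threshold."""
    out = []
    for i, row in enumerate(table):
        if row["salary"] > threshold:
            out.append(Cell(
                value=row["name"],
                where={("emp", i, "name")},
                why={("emp", i, "salary")},
                how={"SELECT name", "WHERE salary > threshold"},
            ))
    return out


result = run_query(EMP, 50)
for cell in result:
    print(cell.value, sorted(cell.where), sorted(cell.why))
```

Under bag semantics the two `Ann` rows are both kept, and each output cell carries its own provenance: the first points back to row 0 of `emp`, the second to row 2. Cell-level annotations of this kind are what allow a debugging tool to highlight exactly which inputs and expressions produced a selected output value.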