Bringing Row Pattern Matching to DuckDB: USING KEY Translation Strategy
The Project
MATCH_RECOGNIZE is a SQL clause that was first added to the SQL Standard in
2016. It provides an expressive tool for row pattern matching, i.e., finding
sequences of ordered rows that follow a user-defined pattern.
SQL:1999 introduced WITH RECURSIVE—or recursive common table expressions (CTEs)—a
true game changer which turned SQL into a Turing-complete programming language.
Thus, SQL is capable of realizing row pattern matching without the usage of
MATCH_RECOGNIZE.
In the progress of this project, we implemented a transpiler that converts
MATCH_RECOGNIZE code to WITH RECURSIVE code that can be executed by DBMSs
that support window functions and recursive CTEs.
Related theses:
- Parsing and Frontend by Marcel Knüdeler.
- NFA construction by Samuel Heid.
- Translating
MATCH_RECOGNIZEtoWITH RECURSIVEby Tim Findling.
Publication: Lambrecht, Findling, Heid, Knüdeler, Grust. 2025. Democratize MATCH_RECOGNIZE! In Proceedings of 51st Int’l Conference on Very Large Databases (VLDB 2025), September 1-5, 2025, London, UK. PVLDB, Volume 18.
Website: https://db.cs.uni-tuebingen.de/matchrecognize/
This Thesis
This thesis focusses on a new translation strategy. In previous research, our
chair proposed and
implemented a new
recursive semantic to SQL—USING KEY. Aim of this bachelor thesis is to
implement the translation strategy MATCH_RECOGNIZE to
USING KEY and fully integrate it into the project.
Tasks
- Familiarize oneself with the existing code base: compiler and test setup (docker)
- Integration of a second compiler strategy (USING KEY)
- Revision of existing code parts so that reuse is possible
- Clean separation of the strategies (if necessary, restructuring of existing code)
- Implementation of the USING KEY strategy
- Adding flags to the command line interface to select the strategy and integrate it into the web application
- Testing of the new strategy
- Unit tests
- Integration into the docker test setup that provides a full system test
Requirements
- Python knowledge.
- Interest in SQL beyond
SELECT FROM WHERE. - Willingness to familiarize yourself with two existing code bases.