Enhancing the MATCH_RECOGNIZE Transpiler: NFA
The Project
MATCH_RECOGNIZE is a SQL clause that was first added to the SQL Standard in
2016. It provides an expressive tool for row pattern matching, i.e., finding
sequences of ordered rows that follow a user-defined pattern.
SQL:1999 introduced WITH RECURSIVE—or recursive common table expressions (CTEs)—a
true game changer which turned SQL into a Turing-complete programming language.
Thus, SQL is capable of realizing row pattern matching without the usage of
MATCH_RECOGNIZE.
In the progress of this project, we implemented a transpiler that converts
MATCH_RECOGNIZE code to WITH RECURSIVE code that can be executed by DBMSs
that support window functions and recursive CTEs.
Related theses:
- Parsing and Frontend by Marcel Knüdeler.
- NFA construction by Samuel Heid.
- Translating
MATCH_RECOGNIZEtoWITH RECURSIVEby Tim Findling.
Publication: Lambrecht, Findling, Heid, Knüdeler, Grust. 2025. Democratize MATCH_RECOGNIZE! In Proceedings of 51st Int’l Conference on Very Large Databases (VLDB 2025), September 1-5, 2025, London, UK. PVLDB, Volume 18.
Website: https://db.cs.uni-tuebingen.de/matchrecognize/
This Research Project
This research project aims at enhancing the construction of the NFA. There are several small issues that will
- simplify the generated NFA — especially long chains of states
- therefore boost perfomance and
- add new pattern- and NFA-related features to provide extensive support of the
MATCH_RECOGNIZEconstruct.
Your profile should include:
- Python knowledge.
- Interest in automata, theoretic work and SQL beyond
SELECT FROM WHERE. - Willingness to familiarize yourself with the code base and transpilation process.