Bringing Flummi to umbra.trampoline
In modern database systems, iterative and recursive calculations in Python or C that use embedded Structured Query Language (SQL) queries often lead to performance issues due to the repeated entering and exiting of the database environment. This thesis addresses this problem by extending the ”Flummi” compiler developed at the Database Systems Research Group from University of Tübingen, which can translate iterative and recursive input programs into pure SQL files. The focus of this work was on adapting the compiler to generate output compatible with the trampoline feature available in the Umbra Database Management System (DBMS) and compiling it accordingly. This feature utilizes the principle of trampolining, inspired by functional languages such as Lisp or Haskell, which transforms iterative and recursive programs into loops with discrete calculations, enabling better parallelization and thus potentially significant performance gains.
A central aspect was the direct comparison of performance between the
umbra.trampoline feature and the traditional implementation of iterative/recursive
queries using Common Table Expressions (CTEs) in Umbra. The experiments showed
that the trampoline feature offers significant advantages over CTEs, particularly in
simpler and less complex recursive queries, by significantly reducing execution time
and enabling more efficient parallelization. However, in more complex queries and
larger datasets, the CTE-based implementation occasionally exhibited better performance,
suggesting that the optimization potential of the trampoline feature is not yet fully realized.
Additionally, Umbra was compared with the DuckDB database system, where Umbra
outperformed DuckDB in all test scenarios, especially in data-intensive tasks.
These results highlight the efficiency of Umbra and the specific advantages of the
trampoline approach, particularly in highly parallelized environments.