Database Systems

Bringing ORDINALITY to DuckDB

Converting general-purpose data formats such as arrays or CSV files into a format that Relational Database Management Systems can use is generally not a problem, as most systems explicitly support this conversion. However, because data in relational formats is generally unordered, information about the original order of elements within these general-purpose formats may be lost during conversion. WITH ORDINALITY is a clause defined in the SQL Standard that preserves information about the original order by adding an additional column to the result table of certain functions. In this thesis, we implement WITH ORDINALITY for the in-process SQL Database Management System DuckDB. We first examine how WITH ORDINALITY works in PostgreSQL. We then look at some examples and precisely define our goal by naming the functions for which we want to implement WITH ORDINALITY. We summarize how DuckDB operates internally and give an overview of its query execution engine. Using this knowledge, we start by implementing a rudimentary version of WITH ORDINALITY, which we then gradually improve. Finally, we compile our version of DuckDB, test it by executing queries that use the feature, and find that we have achieved our goals.
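To make the effect of the clause concrete, the following is a minimal Python sketch of its semantics only; it is not DuckDB code and does not reflect the implementation described in the thesis. The function name `unnest_with_ordinality` is hypothetical and chosen for illustration. The sketch assumes the behavior known from PostgreSQL, where the added column numbers the result rows starting at 1.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def unnest_with_ordinality(values: Iterable[T]) -> List[Tuple[T, int]]:
    """Hypothetical helper mimicking the semantics of, for example,
    SELECT * FROM unnest(ARRAY['c', 'a', 'b']) WITH ORDINALITY
    in PostgreSQL: each element becomes a row, and an extra column
    records its 1-based position in the original input."""
    return [(value, ordinal) for ordinal, value in enumerate(values, start=1)]


# Convert an ordered array into "rows" that carry their original position.
rows = unnest_with_ordinality(["c", "a", "b"])

# A relation has no inherent order; simulate this by reordering the rows
# (here: sorting by value, which differs from the input order).
reordered = sorted(rows)

# The ordinality column lets us recover the original order afterwards.
restored = [value for value, _ in sorted(reordered, key=lambda row: row[1])]
```

Without the extra column, nothing in `reordered` would indicate that `"c"` originally came first; with it, sorting by the second field restores the input order.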

Contact

Tim Fischer, Denis Hirn