Bringing ORDINALITY to DuckDB
Converting general-purpose data formats such as arrays or CSV files into a format
that relational database management systems can use is generally not a problem,
as most systems explicitly support this conversion. However, because relational data
is generally unordered, information about the original order of elements within
these general-purpose formats may be lost during conversion. WITH ORDINALITY is a
clause defined in the SQL standard that preserves information about the
original order by adding a column to the result table of certain functions.
In this thesis, we implement WITH ORDINALITY for the in-process SQL Database
Management System DuckDB. We first examine how WITH ORDINALITY behaves in PostgreSQL.
We then look at some examples and define our goal precisely by naming the
functions for which we want to implement WITH ORDINALITY. We summarize how DuckDB
operates internally and give a general overview of its query execution engine. Building
on this knowledge, we first implement a rudimentary version of WITH ORDINALITY,
which we then gradually improve. Finally, we compile our version of DuckDB and
test it by executing queries that use the feature, and find that it meets the
goals we set.
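
As an informal illustration of the idea behind WITH ORDINALITY, the following Python sketch pairs each element of an input sequence with its 1-based position, in the way PostgreSQL numbers rows for a query such as SELECT * FROM unnest(ARRAY['c','a','b']) WITH ORDINALITY. The function name with_ordinality and the sample data are hypothetical and chosen only for this example; the sketch models the semantics of the extra column and says nothing about how DuckDB implements the clause internally.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def with_ordinality(rows: Iterable[T]) -> Iterator[Tuple[T, int]]:
    """Hypothetical helper: yield (value, ordinality) pairs.

    The ordinality counts from 1, matching the numbering PostgreSQL
    uses for the column added by WITH ORDINALITY.
    """
    for ordinality, value in enumerate(rows, start=1):
        yield value, ordinality


# Illustrative input whose original order we want to keep track of.
source = ["c", "a", "b"]
table = list(with_ordinality(source))

# A relation has no inherent order; sorting by value stands in for
# any operation that rearranges the rows.
rearranged = sorted(table)

# The ordinality column lets us recover the original order afterwards.
restored = [value for value, _ in sorted(rearranged, key=lambda row: row[1])]
```

Without the ordinality column, the rearranged rows would carry no trace of the order in which the elements originally appeared.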