Dear User-Defined Functions, Inlining isn’t working out so great for us. Let’s try batching to make our relationship work. Sincerely, SQL
Kai Franz ☠ • Sam Arch ☠ • Denis Hirn • Torsten Grust • Todd C. Mowry ☠ • Andy Pavlo ☠
(☠ Carnegie Mellon University)
Proceedings of the 14th Conference on Innovative Data Systems Research (CIDR 2024), Chaminade, CA, USA, January 2024.
SQL’s user-defined functions (UDFs) allow developers to express complex computation using procedural logic. But UDFs have been the bane of database management systems (DBMSs) for decades because they inhibit optimization opportunities, potentially slowing down queries significantly. In response, batching and inlining techniques have been proposed to enable effective query optimization of UDF calls within SQL. Inlining is now available in a major commercial DBMS. But the trade-offs between the two approaches on modern DBMSs remain unclear.
We evaluate and compare UDF batching and inlining on enterprise and open-source DBMSs using a state-of-the-art UDF-centric workload. We observe the surprising result that although inlining is better on simple UDFs, batching outperforms inlining by up to a factor of 93.4 for more complex UDFs because it makes it easier for a DBMS’s query optimizer to decorrelate subqueries. We propose a hybrid approach that chooses batching or inlining to achieve the best performance.
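To make the distinction between the two techniques concrete, the following is a minimal sketch, not the authors' implementation or workload. It uses Python's built-in sqlite3 module and a hypothetical schema (`customer`, `orders`) with a hypothetical scalar UDF `total_spent`. It shows the same query evaluated three ways: calling the UDF once per row, inlining the UDF body into the calling query as a correlated subquery, and batching by computing the UDF result for all distinct arguments in one set-oriented query and joining the results back. The batched form here is a simplification that assumes the UDF body is a single aggregate query.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer(c_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders(o_id INTEGER PRIMARY KEY, c_id INTEGER, amount INTEGER);
        INSERT INTO customer VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO orders VALUES (10, 1, 50), (11, 1, 70), (12, 2, 20);
        """
    )
    return conn


def total_spent(conn, c_id):
    """Hypothetical scalar UDF: runs one embedded query per invocation."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE c_id = ?", (c_id,)
    ).fetchone()
    return row[0]


def run_iterative(conn):
    """Baseline: invoke the UDF once for every row of the calling query."""
    ids = conn.execute("SELECT c_id FROM customer ORDER BY c_id").fetchall()
    return [(c_id, total_spent(conn, c_id)) for (c_id,) in ids]


def run_inlined(conn):
    """Inlining: the UDF body replaces the call as a correlated subquery."""
    return conn.execute(
        """
        SELECT c.c_id,
               (SELECT COALESCE(SUM(o.amount), 0)
                FROM orders o
                WHERE o.c_id = c.c_id)
        FROM customer c
        ORDER BY c.c_id
        """
    ).fetchall()


def run_batched(conn):
    """Batching: evaluate the UDF for all distinct arguments at once, then join."""
    return conn.execute(
        """
        WITH args AS (SELECT DISTINCT c_id FROM customer),
             results AS (
                 SELECT a.c_id, COALESCE(SUM(o.amount), 0) AS total
                 FROM args a
                 LEFT JOIN orders o ON o.c_id = a.c_id
                 GROUP BY a.c_id
             )
        SELECT c.c_id, r.total
        FROM customer c
        JOIN results r ON r.c_id = c.c_id
        ORDER BY c.c_id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_db()
    for strategy in (run_iterative, run_inlined, run_batched):
        print(strategy.__name__, strategy(conn))
```

All three strategies return the same rows. The inlined form leaves a correlated subquery that the optimizer must decorrelate on its own, whereas the batched form is already expressed as a join over the full set of arguments. This toy example says nothing about relative performance, which depends on the DBMS and the complexity of the UDF.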