SQL is a Programming Language

Seminar INFO4663

Readers: Torsten Grust • Denis Hirn • Louisa Lambrecht

Prerequisites: Advanced SQL
Participants: 7

SQL is the query language for the relational data model, which can be used to merge, group, and filter data from different tables. But SQL can do more than that. SQL’s history dates back to the 1970s, and since then the language has steadily evolved, transformed, and extended. Today’s modern SQL is declarative, expressive, often very compact and also elegant. Since the introduction of recursion in SQL, the language is also computationally complete: there is no problem that cannot be cracked with SQL.

In this (pro-)seminar, participants each work on a self-contained programming problem and construct a solution exclusively in SQL (PL/SQL is not allowed). We advise using PostgreSQL, but this can be arranged individually. For this, we choose problems that lead to (1) an interesting algorithm, (2) solve puzzles of different types, (3) perform simulations, (4) can be visualized in an appealing way, etc. fun and eureka moments are in the foreground.

In any case, the problems will not be “typical relational” and will not lead to SELECT * FROM table;: SQL has a lot more to offer and in this seminar you will provide numerous proofs.

In a 25-minute presentation, each participant will talk about the problem he/she has worked on and present a SQL-based solution. Short demos during the presentation are ideal.

If you encounter a problem “hitch” in solving/implementing the problem, this is not a show stopper: you will be given tips on how to do this – no one has to fail this seminar because of this. Your slide material and the paper of the problem and its solution will be in English.

Preliminary Meeting

In the preliminary meeting we will talk about

the possible programming problems and
further dates in the semester.

The meeting will be held during the first week of the semester: Friday, October 21 at 11:00, room B305.1 (Sand 13).

If you have any questions, feel free to contact Denis Hirn .

Presentation Dates

There are two talks per date and the sessions usually take about 90 minutes. The talks should consist of at least:

the problem description,
an explanation of the implementation
and of a live demo.

A live demo with visualization is helpful in order to present the results clearly and visually appealing.

There is a time limit of 25 minutes per talk. Do not overrun. Afterwards there will be a short Q&A session of about 5 minutes. Prepare your code editor to be able to answer questions about the code.

Attendance to all sessions is mandatory as this is a seminar.

Notes on the Presentations

Important: You should make an appointment with your supervisor at least one to two weeks before your presentation to discuss your complete set of slides. It is the rule (not the exception) that this results in changes to the slides. So please make sure that you allow yourself a sufficiently large time window for necessary changes.
The length of the actual presentations should be between 25 and 30 minutes. The presentations are followed by a short open discussion, which can refer to the content of the presentation but also to the presentation itself (slides, language).
You will find further valuable information on the structure of a presentation, the layout of slides and holding the presentation itself on the following pages:

Simon Peyton-Jones (Microsoft Research, Cambridge): How to give a great research talk
Andreas Zeller (U Saarland): Der perfekte Seminarvortrag

The language of the slides should be English. It is up to you whether you give the presentation in German or in English.
The content of the presentation should explain the problem/algorithm in detail and perhaps demonstrate it using 1-2 examples. Emphasize what makes the problem interesting and/or tricky. You should spend the rest of the presentation (1) discussing possible solutions (there can be several conceivable solutions) and (2) presenting the realization of the solution you have chosen and implemented.
Your program to solve the problem will typically contain an algorithmic core, often a few lines. You can and should show and explain this core, possibly shortened and simplified if appropriate, during the presentation.
The actual demonstration of your executable program is also part of a successful presentation. Last but not least, it can be very motivating for your listeners. Here, too, the following applies: use small examples that show the common case and do not deal with the perhaps more esoteric edge cases of the problem and its solution.
In general: convey the principle of your solution, especially SQL-specific features — a complete presentation of the code is not appropriate. Focus on the parts that are interesting. If the path to the solution (not the solution itself) was systematic and included an eye opener, then that can greatly help build understanding. However, the rather random presentation of ‘‘I tried it, but then it went wrong’’-attempts tends to confuse.
To create your slides, we can recommend the LaTeX-Class Beamer.

Seminar Papers

The paper of your programming topic must be written with LaTeX. For this purpose the two-column SIGCONF Proceedings template must be used. You can use the original template or this reduced version.
For figures, you can use TikZ.
The paper must have a minimum of 4 and a maximum of 6 pages.

Deadline for submissions is March 31, 2023. Please send by e-mail to your supervisor. In addition, the code and slides should be handed in.

Topics

Below you will find a list of the tasks that will be realized in this seminar using SQL. You will receive the exact details (e.g. which format the inputs are) and hints for the implementation from your supervisor.

Visibility in 3D-Terrain
Which points of a hilly terrain are visible from a given viewer position? While nearby hills block the view of points behind them, higher peaks may still be visible, even if they are far away. This visibility problem can be solved with so-called max scans, which you will use in this task.
Markov Decision Process Visualization
Given a 2D grid in which a robot has been randomly placed. The robot can now decide to move in one of four directions. However, it is possible that the robot malfunctions and moves in a completely different direction despite its decision. Additionally the robot receives or loses points with each step, depending on which field it lands on. The goal of the robot is to maximize its score. The Markov Decision Process calculates a policy that allows the robot to decide exactly which direction to move next, in order to to maximize its score (reward).
Create 2D-Maps from Fragments (Merging Maps)
Satellite images of the Earth capture only a small window of the surface at a time. How can larger maps of the surface be consistently created from partial images (see, for example, Google Maps)? This task will work on a textual abstract version of such images — but the problem remains exciting and can be visualized in a classy way.
Generation of Pixel Graphics from Braille Characters
The Unicdode character set contains all Braille characters, which were originally designed to give blind people access to printed materials by tactile recognition. These characters arrange 2 × 4 pixels rectangularly (examples of braille characters are ⠝ ⡞ ⡿ ⣠ ⣿). Since Unicode contains all 2⁸ pixel combinations as characters, you can represent any pixel graphics in plain text form this way, see also the Drawille project. This seminar topic replicates this conversion of pixel graphics into Braille text form in SQL.
Distance-Vector Routing
If the connections in a network of hosts differ strongly (in terms of throughput, performance), Distance-Vector Routing can still efficiently compute routes of good quality. The hosts use the so-called Distance-Vector Routing tables. This seminar topic accomplishes the construction (and maybe even dynamic updates?) of these routing tables in SQL and based on this calculates efficient routes through a network based on them. berechnet darauf basierend effiziente Routen durch ein Netzwerk.
Levenshtein-Distance
How do we quantify the difference between any two strings s and t? The Levenshtein distance calculates a distance that allows us to determine what changes are needed to the string s to transform it into the string t.
Pathfinding with the A*-Algorithm in a 2D-Grid
To find the shortest path between two points in a timely manner on a 2D grid, there are many ways. One possible way is to calculate this path with the help of the A*-algorithm. Your task is to to implement this algorithm.
Text Pattern Recognition using Finite Automata
When analyzing character strings, it can be interesting to see whether they correspond to a certain pattern (for example: is the word “analyse” in another variant, “analyze”, in a text?). Such a pattern recognition can be determined by means of finite automata, which can be represented in SQL in different ways.
Dynamic Time Warping
DTW allows to detect similarities between two time measurements. The larger the DTW value, the more different the data. For example, if a slow runner and a fast runner were to run the same path, measuring different elevation positions at the same times, DTW would still recognize that these two measurements are very similar.
Handwriting Recognition
Letters are drawn in handwriting on a tablet. How can the corresponding letters be determined from the resulting lines? A simple procedure is to simplify them to such an degree that certain characteristics can be determined and compared with known letter patterns. This task impressively demonstrates how, with the help of simple methods, handwriting and other drawings can be recognized and used as input for computer programs.