
Recently I was scrolling (not-mindlessly) videos on Youtube, when I stumbled upon an SQL coding problem. It wasn't anything super complicated, and I've seen similar problems on coding platforms like LeetCode before. But for some reason, this was the first time it had clicked with me. I was like, "Oh, I totally get this!" So I decided to give it a shot and write a blog post about it. And guess what? I realized that SQL window functions are actually pretty darn useful!
In this blog, we will examine some of the SQL window functions available within ABAP-SQL and delve into their application through the use of examples.
So, here's the problem: We have a table called ZMOVIE_SEATS with two columns, SEAT_ID (Integer) and STATUS (Boolean). SEAT_ID is an auto increment primary key (this column will have values like 1,2,3 … so on). Our goal is to find out which consecutive seat ids are occupied (or unoccupied which is simple flip in the solution). This problem is also known as the "Consecutive Available Seat Problem" The trick is that we need to solve it using SQL alone. That's what caught my attention. It's easy enough to solve using programming constructs like loops and conditionals, but doing it with SQL alone is what makes it interesting according to me.
Definitely, there could be multiple variations of the solutions, but the one I intend to present here is to utilize window functions, namely LEAD and LAG, while concurrently introducing the concept of window functions to readers who may not yet be acquainted with it.
Sample Input Table Data:
Expected Output: 3, 4, 5
Before we get to solving it using window functions, lets learn what are the '"window functions"
Window functions in SQL are a powerful tool for performing calculations across a set of rows that are related to the current row. They provide the ability to perform aggregate-like operations (e.g., SUM, COUNT, AVG) or ranking calculations (e.g., RANK, ROW_NUMBER) without actually grouping the result set into fewer rows. This means that each row maintains its individual identity while still incorporating information from its surrounding "window" of rows.
Now, let's look at definitions of some key terms and concepts:
Types of Window Functions
Why Use Window Functions?
Sample solution for the Consecutive Available Seat Problem:
TYPES: BEGIN OF ty_movie_seat,
seat_id TYPE i,
free TYPE abap_bool,
END OF ty_movie_seat,
tty_movie_seats TYPE STANDARD TABLE OF ty_movie_seat WITH NON-UNIQUE KEY seat_id.
DATA(movie_seats) = VALUE tty_movie_seats( ( seat_id = 1 free = abap_true )
( seat_id = 2 free = abap_false )
( seat_id = 3 free = abap_true )
( seat_id = 4 free = abap_true )
( seat_id = 5 free = abap_true ) ).
SELECT seat_id AS seat_id,
free AS current_free,
LAG( free ) OVER( ORDER BY seat_id ) AS previous_free,
LEAD( free ) OVER( ORDER BY seat_id ) AS next_free
FROM movie_seats AS movie_seats
ORDER BY seat_id
INTO TABLE (lt_tab).
SELECT seat_id FROM lt_tab AS data1
WHERE current_free = @abap_true
AND ( previous_free = @abap_true OR next_free = @abap_true )
INTO TABLE (lt_output).
cl_demo_output=>display_data( value = lt_output ).
Program Output:
The window functions LEAD and LAG are designed to give you access to data from rows other than the current row you're processing within a result set. They operate within a specified window (a subset of rows) and help you analyze data in relation to its neighboring rows.
These have the following syntax:
LEAD|LAG( sql_exp1[, diff[, sql_exp2]]
diff determines the number of rows to either look forward or backward for LEAD and LAG respectively. If diff is not specified a value of 1 is assumed implicitly.If the row determined by diff is not in the current window, the result is the null value by default.
OVER Clause is essential for both LEAD and LAG. It defines the window (set of rows) within which the functions operate. The ORDER BY clause within OVER is crucial to establish the sequence of rows.
The below illustrations should be helpful in understanding the concept:
LAG:
LEAD:
Since we are on the topic of window functions, lets also explore two more window functions namely FIRST_VALUE and LAST_VALUE with an example:
Imagine we have a logging table with below entries:
Now, if we are tasked to identify the first and the last log entry for every user per day then we achieve it by using the above introduced window functions as shown below:
TYPES: BEGIN OF lty_entry_log,
entry_id TYPE char5,
time TYPE tims,
date TYPE date,
user_id TYPE char10,
END OF lty_entry_log,
ltt_entry_log TYPE STANDARD TABLE OF lty_entry_log WITH NON-UNIQUE KEY entry_id.
DATA lt_entry_log TYPE ltt_entry_log.
lt_entry_log = VALUE #( ( entry_id = '10001' time = '084534' date = '20231010' user_id = 'mark' )
( entry_id = '10002' time = '093012' date = '20231010' user_id = 'mark' )
( entry_id = '10003' time = '111547' date = '20231010' user_id = 'john' )
( entry_id = '10004' time = '123022' date = '20231010' user_id = 'john' )
( entry_id = '10005' time = '142551' date = '20231010' user_id = 'mark' )
( entry_id = '10006' time = '083609' date = '20231011' user_id = 'john' )
( entry_id = '10007' time = '092841' date = '20231011' user_id = 'mark' )
( entry_id = '10008' time = '113012' date = '20231011' user_id = 'mark' )
( entry_id = '10009' time = '121855' date = '20231011' user_id = 'mark' )
( entry_id = '10010' time = '143639' date = '20231011' user_id = 'john' ) ).
SELECT DISTINCT user_id,
date,
FIRST_VALUE( time ) OVER( PARTITION BY user_id, date ORDER BY date ) AS first_entry,
LAST_VALUE( time ) OVER( PARTITION BY user_id, date ORDER BY date ) AS last_entry
FROM lt_entry_log AS log_entry
ORDER BY date
INTO TABLE (lt_result).
cl_demo_output=>display_data( lt_result ).
Upon, execution we get the following output which is as expected:
Consider a window, which is a subset of rows from a query result. These functions offer a glimpse into values within that window:
The window is specified using an OVER clause. Similar to LEAD and LAG, FIRST_VALUE and LAST_VALUE require this clause. It's crucial to specify the ordering of the rows within the window using ORDER BY to determine the "first" or "last" row.
OVER Clause: The OVER clause defines the window. It shares similarities with LEAD and LAG. ORDER BY within the OVER clause is essential in determining which row is considered the "first" or "last".
PARTITION BY: While optional, PARTITION BY is useful for dividing data into groups.
Window functions offer a lot more flexibility to do code push down and learning and understanding them will definitely help a developer write efficient SQL!
Thanks for reading! Hope you found the topics covered in this blog interesting and worth exploring further.
To read more about the window functions available in ABAP SQL then you can explore the standard SAP documentation using the link: https://help.sap.com/doc/abapdocu_latest_index_htm/latest/en-US/index.htm?file=abapselect_over.htm
You can also look at sample code within the following ABAP classes to learn more:
And .. in case you have alternative solutions to the problems discussed in this blog, do share them in the comment sections, it would be really cool to know different techniques that can be applied!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
6 | |
5 | |
5 | |
4 | |
4 | |
3 | |
3 | |
3 | |
2 | |
2 |