And maybe not on the wall, but instead in your SQLz, eating your data.
But a bit more seriously. Ever since PostgreSQL 8.4 we have window functions, but still I see people which do not know it or are wary to use it.
That's why I decided to write a piece on window functions. How they work and what they can be used for.
Before we start even thinking about window functions, we need some test data. So I made a simple table:
CREATE TABLE people ( id serial PRIMARY KEY, username text, department text, salary int4 );
and filled it with 250 rows based on Polish, random, names, department names from very cool movie, and some random() salaries. Looks like this:
SELECT * FROM people LIMIT 10; id | username | department | salary ----+-----------------+--------------------+-------- 1 | Blanka Pawlak | Psychology | 25200 2 | Fabian Krawczyk | Chem | 47900 3 | Tomasz Górski | Zoology | 39600 4 | Oliwier Woźniak | DATA Archives | 93500 5 | Rafał Olszewski | Maintennance | 77500 6 | Maciej Mazur | Distribution | 57700 7 | Justyna Wróbel | Internal Logistics | 82800 8 | Nikodem Wójcik | Wranglers | 90500 9 | Lena Szewczyk | Electrical | 66400 10 | Filip Kamiński | Engineering | 42300 (10 ROWS)
( full table dump, for your own testing purposes, can be downloaded. )
To write more I have to assume some things:
- You know what grouping is
- You know how grouping works
Window functions, are functions which work on windows. That's a surprise. And what is window? Well, for starters we can assume that windows are more or less the same as “GROUP BY" groups, but defined in such a way that each row can have it's own set of groups.
Let's see simplest possible example, and let's work from there:
SELECT
department,
username,
salary,
COUNT(*) OVER ()
FROM
people
ORDER BY department, username
LIMIT 20;
department | username | salary | COUNT
----------------+------------------------+--------+-------
Accounting | Alan Jaworski | 102900 | 250
Accounting | Aleksander Nowicki | 78900 | 250
Accounting | Anna Nowakowska | 77800 | 250
Accounting | Hubert Zieliński | 27600 | 250
Accounting | Konrad Nowak | 85400 | 250
Accounting | Mateusz Zając | 87600 | 250
Accounting | Michał Olszewski | 86100 | 250
Accounting | Paweł Olszewski | 46700 | 250
Accounting | Rafał Dąbrowski | 46600 | 250
Accounting | Rafał Kowalczyk | 48600 | 250
Accounting | Zofia Szymańska | 95200 | 250
Administration | Antonina Zalewska | 72700 | 250
Administration | Dominik Kowalczyk | 96000 | 250
Administration | Fabian Wieczorek | 28500 | 250
Administration | Jan Walczak | 27400 | 250
Administration | Maksymilian Piotrowski | 89600 | 250
Administration | Nadia Nowak | 109800 | 250
Administration | Stanisław Mazur | 74800 | 250
Archives | Adam Wojciechowski | 51200 | 250
Archives | Aleksandra Zając | 31200 | 250
(20 ROWS)
As you can see I got count of all people in separate column. Of course – this example is not very interesting, as I could have done:
SELECT *, (SELECT COUNT(*) FROM people) FROM people;
and would get more or less the same thing. That's true. Lots of stuff that I'll show with window functions can be done also differently. The point of window functions is that they are easier to write, and they tend to avoid additional table scans.
Anyway. Let's see what exactly was happening. The interesting thing is line 5:
COUNT(*) OVER ()
This is basic window function syntax. There is function call (count(*) in this case), and then there is keyword “OVER" and so called “window definition" in parentheses. In this case window definition is empty, which basically means: treat whole resultset as your “virtual group".
You might notice that I used count(*), which is commonly used aggregate. The great thing is that while there specific window functions, you can also use any aggregate functions (like: min, max, avg, count, sum, array_agg, string_agg and so on) as window functions!
So the expression “count(*) over ()" means – get count of all rows. Simple. But we can do better. Let's say, we'd want to compare given person salary to average salary in her department. Query:
SELECT department, username, salary, avg( salary ) OVER ( partition BY department ) FROM people ORDER BY department, username LIMIT 20; department | username | salary | avg ----------------+------------------------+--------+-------------------- Accounting | Alan Jaworski | 102900 | 71218.181818181818 Accounting | Aleksander Nowicki | 78900 | 71218.181818181818 Accounting | Anna Nowakowska | 77800 | 71218.181818181818 Accounting | Hubert Zieliński | 27600 | 71218.181818181818 Accounting | Konrad Nowak | 85400 | 71218.181818181818 Accounting | Mateusz Zając | 87600 | 71218.181818181818 Accounting | Michał Olszewski | 86100 | 71218.181818181818 Accounting | Paweł Olszewski | 46700 | 71218.181818181818 Accounting | Rafał Dąbrowski | 46600 | 71218.181818181818 Accounting | Rafał Kowalczyk | 48600 | 71218.181818181818 Accounting | Zofia Szymańska | 95200 | 71218.181818181818 Administration | Antonina Zalewska | 72700 | 71257.142857142857 Administration | Dominik Kowalczyk | 96000 | 71257.142857142857 Administration | Fabian Wieczorek | 28500 | 71257.142857142857 Administration | Jan Walczak | 27400 | 71257.142857142857 Administration | Maksymilian Piotrowski | 89600 | 71257.142857142857 Administration | Nadia Nowak | 109800 | 71257.142857142857 Administration | Stanisław Mazur | 74800 | 71257.142857142857 Archives | Adam Wojciechowski | 51200 | 69228.571428571429 Archives | Aleksandra Zając | 31200 | 69228.571428571429 (20 ROWS)
The window function call:
avg( salary ) OVER ( partition BY department )
means that we want average salary for windows, where window, for each row is taken by “grouping" all rows with department being the same as department in row that is being processed.
So, to write it using group by, I could:
SELECT p.department, p.username, p.salary, d.avg FROM people p JOIN ( SELECT department, avg(salary) FROM people GROUP BY department ) AS d USING (department) ORDER BY p.department, p.username LIMIT 20;
The thing is that PostgreSQL does the “join" automatically based on values of columns in row that uses window function (not sure if that's clear).
Within window definition, you can use multiple columns (or expressions) for “PARTITION BY". For example, we could do something like this:
SELECT department, username, salary, COUNT(*) OVER (partition BY department, salary < 50000 ) FROM people ORDER BY department, username LIMIT 20; department | username | salary | COUNT ----------------+------------------------+--------+------- Accounting | Alan Jaworski | 102900 | 7 Accounting | Aleksander Nowicki | 78900 | 7 Accounting | Anna Nowakowska | 77800 | 7 Accounting | Hubert Zieliński | 27600 | 4 Accounting | Konrad Nowak | 85400 | 7 Accounting | Mateusz Zając | 87600 | 7 Accounting | Michał Olszewski | 86100 | 7 Accounting | Paweł Olszewski | 46700 | 4 Accounting | Rafał Dąbrowski | 46600 | 4 Accounting | Rafał Kowalczyk | 48600 | 4 Accounting | Zofia Szymańska | 95200 | 7 Administration | Antonina Zalewska | 72700 | 5 Administration | Dominik Kowalczyk | 96000 | 5 Administration | Fabian Wieczorek | 28500 | 2 Administration | Jan Walczak | 27400 | 2 Administration | Maksymilian Piotrowski | 89600 | 5 Administration | Nadia Nowak | 109800 | 5 Administration | Stanisław Mazur | 74800 | 5 Archives | Adam Wojciechowski | 51200 | 11 Archives | Aleksandra Zając | 31200 | 3 (20 ROWS)
In here, the count is number of people, including current person, that are in the same salary group, where group is defined as < 50000, or >= 50000, but all within the same department.
The other, very important, and commonly used, part of window definition, is sorting. This is something absolutely great.
When dealing with groups, values within the group where basically tossed together in a bag, without any order.
But within windows – rows can be sorted. And the sorting can be used for interesting things:
SELECT department, username, COUNT(*) OVER (partition BY department), COUNT(*) OVER (partition BY department ORDER BY username) FROM people ORDER BY department, username LIMIT 20; department | username | COUNT | COUNT ----------------+------------------------+-------+------- Accounting | Alan Jaworski | 11 | 1 Accounting | Aleksander Nowicki | 11 | 2 Accounting | Anna Nowakowska | 11 | 3 Accounting | Hubert Zieliński | 11 | 4 Accounting | Konrad Nowak | 11 | 5 Accounting | Mateusz Zając | 11 | 6 Accounting | Michał Olszewski | 11 | 7 Accounting | Paweł Olszewski | 11 | 8 Accounting | Rafał Dąbrowski | 11 | 9 Accounting | Rafał Kowalczyk | 11 | 10 Accounting | Zofia Szymańska | 11 | 11 Administration | Antonina Zalewska | 7 | 1 Administration | Dominik Kowalczyk | 7 | 2 Administration | Fabian Wieczorek | 7 | 3 Administration | Jan Walczak | 7 | 4 Administration | Maksymilian Piotrowski | 7 | 5 Administration | Nadia Nowak | 7 | 6 Administration | Stanisław Mazur | 7 | 7 Archives | Adam Wojciechowski | 14 | 1 Archives | Aleksandra Zając | 14 | 2 (20 ROWS)
Please note that adding “order by" to window definition changes scope of window function – it doesn't work on all rows in given partition – it works just on rows from start to given row, based on ordering.
This means that doing cumulative sum gets trivial:
SELECT department, username, salary, SUM(salary) OVER (partition BY department ORDER BY username) FROM people ORDER BY department, username LIMIT 20; department | username | salary | SUM ----------------+------------------------+--------+-------- Accounting | Alan Jaworski | 102900 | 102900 Accounting | Aleksander Nowicki | 78900 | 181800 Accounting | Anna Nowakowska | 77800 | 259600 Accounting | Hubert Zieliński | 27600 | 287200 Accounting | Konrad Nowak | 85400 | 372600 Accounting | Mateusz Zając | 87600 | 460200 Accounting | Michał Olszewski | 86100 | 546300 Accounting | Paweł Olszewski | 46700 | 593000 Accounting | Rafał Dąbrowski | 46600 | 639600 Accounting | Rafał Kowalczyk | 48600 | 688200 Accounting | Zofia Szymańska | 95200 | 783400 Administration | Antonina Zalewska | 72700 | 72700 Administration | Dominik Kowalczyk | 96000 | 168700 Administration | Fabian Wieczorek | 28500 | 197200 Administration | Jan Walczak | 27400 | 224600 Administration | Maksymilian Piotrowski | 89600 | 314200 Administration | Nadia Nowak | 109800 | 424000 Administration | Stanisław Mazur | 74800 | 498800 Archives | Adam Wojciechowski | 51200 | 51200 Archives | Aleksandra Zając | 31200 | 82400 (20 ROWS)
As you can see – adding “ORDER BY" not only makes the rows ordered, but it also, significantly changes what the function is called on – no longer on “all" rows, but just on a subset defined by order, and location of current row.
One important exception is: row_number() – when it is being run over empty window (row_number() over ()), it just works:
SELECT ROW_NUMBER() OVER (), * FROM people LIMIT 20; ROW_NUMBER | id | username | department | salary ------------+----+---------------------+--------------------+-------- 1 | 1 | Blanka Pawlak | Psychology | 25200 2 | 2 | Fabian Krawczyk | Chem | 47900 3 | 3 | Tomasz Górski | Zoology | 39600 4 | 4 | Oliwier Woźniak | DATA Archives | 93500 5 | 5 | Rafał Olszewski | Maintennance | 77500 6 | 6 | Maciej Mazur | Distribution | 57700 7 | 7 | Justyna Wróbel | Internal Logistics | 82800 8 | 8 | Nikodem Wójcik | Wranglers | 90500 9 | 9 | Lena Szewczyk | Electrical | 66400 10 | 10 | Filip Kamiński | Engineering | 42300 11 | 11 | Maciej Kowalczyk | Bio Med | 106700 12 | 12 | Mateusz Zając | Accounting | 87600 13 | 13 | Michał Kozłowski | R + D | 52300 14 | 14 | Rafał Dąbrowski | Accounting | 46600 15 | 15 | Łukasz Kowalski | Distribution | 106200 16 | 16 | Wojciech Kowalski | Operations | 66100 17 | 17 | Paweł Olszewski | Accounting | 46700 18 | 18 | Wojciech Stępień | Communications | 34200 19 | 19 | Nikola Mazur | Demolition | 23200 20 | 20 | Aleksander Zalewski | Communications | 100000 (20 ROWS)
So. Now we know that we can either have functions that run on all rows in given partition (which can be whole result set), or a subset, based on ordering.
In all previous examples ordering within window definition was more or less the same as ordering of resulting recordset. But this doesn't have to be always true. For example, let's check this query:
SELECT department, username, salary, ROW_NUMBER() OVER (partition BY department ORDER BY salary DESC) FROM people ORDER BY department, username LIMIT 20; department | username | salary | ROW_NUMBER ----------------+------------------------+--------+------------ Accounting | Alan Jaworski | 102900 | 1 Accounting | Aleksander Nowicki | 78900 | 6 Accounting | Anna Nowakowska | 77800 | 7 Accounting | Hubert Zieliński | 27600 | 11 Accounting | Konrad Nowak | 85400 | 5 Accounting | Mateusz Zając | 87600 | 3 Accounting | Michał Olszewski | 86100 | 4 Accounting | Paweł Olszewski | 46700 | 9 Accounting | Rafał Dąbrowski | 46600 | 10 Accounting | Rafał Kowalczyk | 48600 | 8 Accounting | Zofia Szymańska | 95200 | 2 Administration | Antonina Zalewska | 72700 | 5 Administration | Dominik Kowalczyk | 96000 | 2 Administration | Fabian Wieczorek | 28500 | 6 Administration | Jan Walczak | 27400 | 7 Administration | Maksymilian Piotrowski | 89600 | 3 Administration | Nadia Nowak | 109800 | 1 Administration | Stanisław Mazur | 74800 | 4 Archives | Adam Wojciechowski | 51200 | 11 Archives | Aleksandra Zając | 31200 | 14 (20 ROWS)
One last thing that is important to know, though it is rarely used, is modification of which rows are visible to the function.
As I showed you – we basically have two modes: all rows within partition (which can be – all rows), and rows within partition that, based on some order, are before given row.
But that's not all. Window definition can also contain so called “frame definition“. If you will ever need to use it – congratulations, that doesn't happen often.
Because of this (how rarely it's needed, at least in the use cases I've seen so far), I want to show you only one specific case of frame definition. In case you're interested in more details – docs will be much better choice.
Let's assume we'd want to have average that is based on values around current row. For example – 2 rows behind to 2 rows ahead.
It is possible to do it using standard window definitions and functions like lag() and lead(), but you can also:
SELECT department, username, salary, avg(salary) OVER ( partition BY department ORDER BY salary ROWS BETWEEN 2 preceding AND 2 following ), array_agg( salary ) OVER ( partition BY department ORDER BY salary ROWS BETWEEN 2 preceding AND 2 following ) FROM people ORDER BY department, salary LIMIT 20; department | username | salary | avg | array_agg ----------------+------------------------+--------+--------------------+---------------------------------- Accounting | Hubert Zieliński | 27600 | 40300.000000000000 | {27600,46600,46700} Accounting | Rafał Dąbrowski | 46600 | 42375.000000000000 | {27600,46600,46700,48600} Accounting | Paweł Olszewski | 46700 | 49460.000000000000 | {27600,46600,46700,48600,77800} Accounting | Rafał Kowalczyk | 48600 | 59720.000000000000 | {46600,46700,48600,77800,78900} Accounting | Anna Nowakowska | 77800 | 67480.000000000000 | {46700,48600,77800,78900,85400} Accounting | Aleksander Nowicki | 78900 | 75360.000000000000 | {48600,77800,78900,85400,86100} Accounting | Konrad Nowak | 85400 | 83160.000000000000 | {77800,78900,85400,86100,87600} Accounting | Michał Olszewski | 86100 | 86640.000000000000 | {78900,85400,86100,87600,95200} Accounting | Mateusz Zając | 87600 | 91440.000000000000 | {85400,86100,87600,95200,102900} Accounting | Zofia Szymańska | 95200 | 92950.000000000000 | {86100,87600,95200,102900} Accounting | Alan Jaworski | 102900 | 95233.333333333333 | {87600,95200,102900} Administration | Jan Walczak | 27400 | 42866.666666666667 | {27400,28500,72700} Administration | Fabian Wieczorek | 28500 | 50850.000000000000 | {27400,28500,72700,74800} Administration | Antonina Zalewska | 72700 | 58600.000000000000 | {27400,28500,72700,74800,89600} Administration | Stanisław Mazur | 74800 | 72320.000000000000 | {28500,72700,74800,89600,96000} Administration | Maksymilian Piotrowski | 89600 | 88580.000000000000 | {72700,74800,89600,96000,109800} Administration | Dominik Kowalczyk | 96000 | 92550.000000000000 | {74800,89600,96000,109800} Administration | Nadia Nowak | 109800 | 98466.666666666667 | {89600,96000,109800} Archives | Aleksandra Zając | 31200 | 36033.333333333333 | {31200,35300,41600} Archives | Paweł Stępień | 35300 | 39825.000000000000 | {31200,35300,41600,51200} (20 ROWS)
Last column shows what values are used for the avg() call in avg column. As you can see it can either have 3, 4, or (usually) 5 rows to work with.
This is because in some cases (at the beginning or end) there are no “preceding" or “following" rows within given partition.
When it comes to syntax, there is one last bit, very simple, to show. In case you're having queries with multiple function calls over the same windows (just like above), you can move window definition to separate location, name it, and use the name instead of whole definition. Like this:
SELECT department, username, salary, avg(salary) OVER my_test_window, array_agg( salary ) OVER my_test_window FROM people window my_test_window AS ( partition BY department ORDER BY salary ROWS BETWEEN 2 preceding AND 2 following ) ORDER BY department, salary LIMIT 20;
That's about it when it comes to general syntax. What about real-life situation, and what can be a surprise?
Let's see:
SELECT department, username, avg( salary ) / 1000 OVER (partition BY department) FROM people ORDER BY department, username LIMIT 20;
Do you see error in above example?
If you'd run it, you would get:
ERROR: syntax error at OR near "over" LINE 4: avg( salary ) / 1000 OVER (partition BY department) ^
Do you know/see/understand why?
What exactly is divided by 1000? Remember that the whole expression is: function() over (window_definition). So if you want to divide it by 1000, you have to put division at the end of expression, not in the middle. So, the proper query would have:
SELECT department, username, avg( salary ) OVER (partition BY department) / 1000 FROM people ORDER BY department, username LIMIT 20;
Sometimes, people on irc or on mailing lists ask why they can't use window functions in where. Like:
SELECT * FROM people WHERE ROW_NUMBER() OVER (partition BY department ORDER BY salary DESC) < 6;
To get the top-5 most earning people per department. Reason is very simple – window functions are calculated after where, group by and having clauses (but before order by/limit). So if you want such query, you have to use subselect:
SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (partition BY department ORDER BY salary DESC) FROM people ) AS x WHERE ROW_NUMBER < 6 ORDER BY department, salary DESC;
All this means that any developer who is using SQL and database engine that supports window functions – should at the very least play a bit with them to see what is possible, and how.
In my previous posts you can find some examples on how to use them, and what they can be used for, and if anything is not clear – please say so in comments, I'll try to update the post to cover all basic and intermediate corners.
In my experience window functions become problematic as soon as the are used in the context of views. The planner does not seem to be able to figure out the optimum query scheme and as a result extreme long running queries occur
Can never get enough of Windowing queries. Great post again Depesz. I am a heavy user of these and yet learned a few more tricks by reading your article.
Thank you for pointing out the `FROM tablename WINDOW window_name AS ()` syntax. I’d missed that in my reading of the documentation and it would’ve been handy to tidy up a couple of queries I’ve done recently.
Window functions have quickly been moving into my “how did I ever do without them” tool-kit. I suspect 9.3’s LATERAL will soon follow.
It’s quite unusual that frames are not more known. I know no better way of writing, for example, cumulative sums (which are used so often in analysis, financial software…). Here’s a good example:
http://stackoverflow.com/questions/11268398/sum-every-3-rows-of-a-table/13140979#13140979
They know window functions well (lag, lead and what not), but in my oppinion it’s the wrong tool for the job. Using frames in this case is a solution to that whole class of problems.
Best regards from me. I follow your blog religiously.
@Milos:
I think the correct term is rolling average. as cumultive sum is simply sum(x) over (order by …).
Sure – in such case frame clauses are the best solution.
Is the avg for Archives correct?
@johnf:
yes, it is – if you think it’s wrong because shown salaries don’t give this average – it’s because only 2 rows from archives are shown, but as you can see in next query, total number of rows for archives is higher, but “limit 20” is hiding them.
Please note – you can download the data that I ran those queries on, and check for yourself.
Great article I learned a lot. A couple of hints for improving it would to offer more explanation of the the window frame that the count(*) applies too in the example where you introduce the ORDER BY caluse.
@Anonymous:
I thought that this example:
select
department,
username,
count(*) over (partition by department),
count(*) over (partition by department order by username)
from people
order by department, username limit 20;
showed the difference good enough. I mean – I can’t figure out what I could say, that this example doesn’t show.