Couple of days ago I had a problem that I couldn't solve after ~ 2 hours, and decided to ask on IRC. Almost immediately after asking, I figured out the solution, but David asked me to write about the solution, even though it's now (for me) completely obvious.
The problem was like this:
I had two tables, with very simple structure: event_when timestamptz, event_count int4, and wanted to show it as a single recordset with columns: event_when, event_count_a, event_count_b, but the problem was that event_when usually didn't match. Here is an example:
$ CREATE TABLE events_a (event_when timestamptz, event_count int4); CREATE $ CREATE TABLE events_b (event_when timestamptz, event_count int4); CREATE $ INSERT INTO events_a (event_when, event_count) SELECT now() - '10 minutes'::INTERVAL * random(), random() * 10 FROM generate_series(1,20); INSERT 0 20 $ INSERT INTO events_b (event_when, event_count) SELECT now() - '10 minutes'::INTERVAL * random(), random() * 10 FROM generate_series(1,10); INSERT 0 10
Of course tables that I had problem with were much larger, but this example is good enough to show the problem.
Data might (depending on when you'll run it, and your random number generator) look like this:
$ SELECT * FROM events_a; event_when │ event_count ───────────────────────────────┼───────────── 2012-08-27 23:52:03.252017+02 │ 9 2012-08-27 23:59:16.886855+02 │ 8 2012-08-27 23:52:58.926927+02 │ 6 2012-08-27 23:53:36.57738+02 │ 6 2012-08-27 23:54:58.458541+02 │ 1 2012-08-27 23:51:30.54373+02 │ 3 2012-08-27 23:52:53.981979+02 │ 10 2012-08-27 23:51:56.947008+02 │ 9 2012-08-27 23:53:42.443812+02 │ 9 2012-08-28 00:00:13.417665+02 │ 2 2012-08-27 23:57:51.333133+02 │ 4 2012-08-27 23:56:15.77672+02 │ 9 2012-08-27 23:56:22.940718+02 │ 3 2012-08-28 00:00:50.668566+02 │ 5 2012-08-27 23:59:56.927112+02 │ 9 2012-08-27 23:58:32.724896+02 │ 1 2012-08-27 23:52:54.833586+02 │ 5 2012-08-27 23:52:29.364645+02 │ 7 2012-08-28 00:00:37.243976+02 │ 6 2012-08-27 23:58:55.074473+02 │ 7 (20 ROWS) $ SELECT * FROM events_b; event_when │ event_count ───────────────────────────────┼───────────── 2012-08-27 23:54:14.985946+02 │ 2 2012-08-28 00:01:22.990129+02 │ 6 2012-08-27 23:59:23.374828+02 │ 9 2012-08-27 23:56:24.995275+02 │ 10 2012-08-27 23:52:59.19678+02 │ 6 2012-08-27 23:59:30.517138+02 │ 2 2012-08-27 23:51:43.406024+02 │ 7 2012-08-28 00:00:06.504544+02 │ 5 2012-08-27 23:51:34.098968+02 │ 2 2012-08-28 00:01:22.487454+02 │ 1 (10 ROWS)
New records are inserted in events tables when count changes, and due to how it happens the changes are usually pretty random.
It is possible that we will have records with the same event_when in both tables, but it's not really likely, but within single table, event_when is unique.
So. Simplest possible approach:
$ SELECT COALESCE(a.event_when, b.event_when) AS event_when, a.event_count AS count_a, b.event_count AS count_b FROM events_a a FULL OUTER JOIN events_b b USING ( event_when ) ORDER BY event_when;
Produces sane, but not satisfactory, result:
event_when │ count_a │ count_b ───────────────────────────────┼─────────┼───────── 2012-08-27 23:51:30.54373+02 │ 3 │ [NULL] 2012-08-27 23:51:34.098968+02 │ [NULL] │ 2 2012-08-27 23:51:43.406024+02 │ [NULL] │ 7 2012-08-27 23:51:56.947008+02 │ 9 │ [NULL] 2012-08-27 23:52:03.252017+02 │ 9 │ [NULL] 2012-08-27 23:52:29.364645+02 │ 7 │ [NULL] 2012-08-27 23:52:53.981979+02 │ 10 │ [NULL] 2012-08-27 23:52:54.833586+02 │ 5 │ [NULL] 2012-08-27 23:52:58.926927+02 │ 6 │ [NULL] 2012-08-27 23:52:59.19678+02 │ [NULL] │ 6 2012-08-27 23:53:36.57738+02 │ 6 │ [NULL] 2012-08-27 23:53:42.443812+02 │ 9 │ [NULL] 2012-08-27 23:54:14.985946+02 │ [NULL] │ 2 2012-08-27 23:54:58.458541+02 │ 1 │ [NULL] 2012-08-27 23:56:15.77672+02 │ 9 │ [NULL] 2012-08-27 23:56:22.940718+02 │ 3 │ [NULL] 2012-08-27 23:56:24.995275+02 │ [NULL] │ 10 2012-08-27 23:57:51.333133+02 │ 4 │ [NULL] 2012-08-27 23:58:32.724896+02 │ 1 │ [NULL] 2012-08-27 23:58:55.074473+02 │ 7 │ [NULL] 2012-08-27 23:59:16.886855+02 │ 8 │ [NULL] 2012-08-27 23:59:23.374828+02 │ [NULL] │ 9 2012-08-27 23:59:30.517138+02 │ [NULL] │ 2 2012-08-27 23:59:56.927112+02 │ 9 │ [NULL] 2012-08-28 00:00:06.504544+02 │ [NULL] │ 5 2012-08-28 00:00:13.417665+02 │ 2 │ [NULL] 2012-08-28 00:00:37.243976+02 │ 6 │ [NULL] 2012-08-28 00:00:50.668566+02 │ 5 │ [NULL] 2012-08-28 00:01:22.487454+02 │ [NULL] │ 1 2012-08-28 00:01:22.990129+02 │ [NULL] │ 6 (30 ROWS)
(my psql shows NULL values as [null] string, and not empty one).
Since records happen on change, for the whole time between records value is as in last record before.
For example – let's look at count_a – in records #2 and #3 (23:51:34.098968+02 and 23:51:43.406024+02) it's value should be 3 because it is last known event_count.
Of course we don't know value for count_b in the first record, but that's OK – we just should fill whatever is posible.
So. I spent some time on this problem, and couldn't find solution. But then, after describing it on IRC, I realized that it was closely related to this problem.
With this realization, solution became obvious. First, I'll add new columns to data, that shows event_when if there are data for given count_ column. Easier to show than explain:
$ WITH base_data AS ( SELECT COALESCE(a.event_when, b.event_when) AS event_when, a.event_count AS count_a, b.event_count AS count_b FROM events_a a FULL OUTER JOIN events_b b USING ( event_when ) ) , with_group_starts AS ( SELECT *, CASE WHEN count_a IS NULL THEN NULL ELSE event_when END AS group_a, CASE WHEN count_b IS NULL THEN NULL ELSE event_when END AS group_b FROM base_data ) SELECT * FROM with_group_starts ORDER BY event_when; event_when │ count_a │ count_b │ group_a │ group_b ───────────────────────────────┼─────────┼─────────┼───────────────────────────────┼─────────────────────────────── 2012-08-27 23:51:30.54373+02 │ 3 │ [NULL] │ 2012-08-27 23:51:30.54373+02 │ [NULL] 2012-08-27 23:51:34.098968+02 │ [NULL] │ 2 │ [NULL] │ 2012-08-27 23:51:34.098968+02 2012-08-27 23:51:43.406024+02 │ [NULL] │ 7 │ [NULL] │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:51:56.947008+02 │ 9 │ [NULL] │ 2012-08-27 23:51:56.947008+02 │ [NULL] 2012-08-27 23:52:03.252017+02 │ 9 │ [NULL] │ 2012-08-27 23:52:03.252017+02 │ [NULL] 2012-08-27 23:52:29.364645+02 │ 7 │ [NULL] │ 2012-08-27 23:52:29.364645+02 │ [NULL] 2012-08-27 23:52:53.981979+02 │ 10 │ [NULL] │ 2012-08-27 23:52:53.981979+02 │ [NULL] 2012-08-27 23:52:54.833586+02 │ 5 │ [NULL] │ 2012-08-27 23:52:54.833586+02 │ [NULL] 2012-08-27 23:52:58.926927+02 │ 6 │ [NULL] │ 2012-08-27 23:52:58.926927+02 │ [NULL] 2012-08-27 23:52:59.19678+02 │ [NULL] │ 6 │ [NULL] │ 2012-08-27 23:52:59.19678+02 2012-08-27 23:53:36.57738+02 │ 6 │ [NULL] │ 2012-08-27 23:53:36.57738+02 │ [NULL] 2012-08-27 23:53:42.443812+02 │ 9 │ [NULL] │ 2012-08-27 23:53:42.443812+02 │ [NULL] 2012-08-27 23:54:14.985946+02 │ [NULL] │ 2 │ [NULL] │ 2012-08-27 23:54:14.985946+02 2012-08-27 23:54:58.458541+02 │ 1 │ [NULL] │ 2012-08-27 23:54:58.458541+02 │ [NULL] 2012-08-27 23:56:15.77672+02 │ 9 │ [NULL] │ 2012-08-27 23:56:15.77672+02 │ [NULL] 2012-08-27 23:56:22.940718+02 │ 3 │ [NULL] │ 2012-08-27 23:56:22.940718+02 │ [NULL] 2012-08-27 23:56:24.995275+02 │ [NULL] │ 10 │ [NULL] │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:57:51.333133+02 │ 4 │ [NULL] │ 2012-08-27 23:57:51.333133+02 │ [NULL] 2012-08-27 23:58:32.724896+02 │ 1 │ [NULL] │ 2012-08-27 23:58:32.724896+02 │ [NULL] 2012-08-27 23:58:55.074473+02 │ 7 │ [NULL] │ 2012-08-27 23:58:55.074473+02 │ [NULL] 2012-08-27 23:59:16.886855+02 │ 8 │ [NULL] │ 2012-08-27 23:59:16.886855+02 │ [NULL] 2012-08-27 23:59:23.374828+02 │ [NULL] │ 9 │ [NULL] │ 2012-08-27 23:59:23.374828+02 2012-08-27 23:59:30.517138+02 │ [NULL] │ 2 │ [NULL] │ 2012-08-27 23:59:30.517138+02 2012-08-27 23:59:56.927112+02 │ 9 │ [NULL] │ 2012-08-27 23:59:56.927112+02 │ [NULL] 2012-08-28 00:00:06.504544+02 │ [NULL] │ 5 │ [NULL] │ 2012-08-28 00:00:06.504544+02 2012-08-28 00:00:13.417665+02 │ 2 │ [NULL] │ 2012-08-28 00:00:13.417665+02 │ [NULL] 2012-08-28 00:00:37.243976+02 │ 6 │ [NULL] │ 2012-08-28 00:00:37.243976+02 │ [NULL] 2012-08-28 00:00:50.668566+02 │ 5 │ [NULL] │ 2012-08-28 00:00:50.668566+02 │ [NULL] 2012-08-28 00:01:22.487454+02 │ [NULL] │ 1 │ [NULL] │ 2012-08-28 00:01:22.487454+02 2012-08-28 00:01:22.990129+02 │ [NULL] │ 6 │ [NULL] │ 2012-08-28 00:01:22.990129+02 (30 ROWS)
Benefit of these columns is very simple – values in them are rising monotonically, so I can use now max() to produce actual groups:
$ WITH base_data AS ( SELECT COALESCE(a.event_when, b.event_when) AS event_when, a.event_count AS count_a, b.event_count AS count_b FROM events_a a FULL OUTER JOIN events_b b USING ( event_when ) ) , with_group_starts AS ( SELECT *, MAX(CASE WHEN count_a IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_a, MAX(CASE WHEN count_b IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_b FROM base_data ) SELECT * FROM with_group_starts ORDER BY event_when; event_when │ count_a │ count_b │ group_a │ group_b ───────────────────────────────┼─────────┼─────────┼───────────────────────────────┼─────────────────────────────── 2012-08-27 23:51:30.54373+02 │ 3 │ [NULL] │ 2012-08-27 23:51:30.54373+02 │ [NULL] 2012-08-27 23:51:34.098968+02 │ [NULL] │ 2 │ 2012-08-27 23:51:30.54373+02 │ 2012-08-27 23:51:34.098968+02 2012-08-27 23:51:43.406024+02 │ [NULL] │ 7 │ 2012-08-27 23:51:30.54373+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:51:56.947008+02 │ 9 │ [NULL] │ 2012-08-27 23:51:56.947008+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:03.252017+02 │ 9 │ [NULL] │ 2012-08-27 23:52:03.252017+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:29.364645+02 │ 7 │ [NULL] │ 2012-08-27 23:52:29.364645+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:53.981979+02 │ 10 │ [NULL] │ 2012-08-27 23:52:53.981979+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:54.833586+02 │ 5 │ [NULL] │ 2012-08-27 23:52:54.833586+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:58.926927+02 │ 6 │ [NULL] │ 2012-08-27 23:52:58.926927+02 │ 2012-08-27 23:51:43.406024+02 2012-08-27 23:52:59.19678+02 │ [NULL] │ 6 │ 2012-08-27 23:52:58.926927+02 │ 2012-08-27 23:52:59.19678+02 2012-08-27 23:53:36.57738+02 │ 6 │ [NULL] │ 2012-08-27 23:53:36.57738+02 │ 2012-08-27 23:52:59.19678+02 2012-08-27 23:53:42.443812+02 │ 9 │ [NULL] │ 2012-08-27 23:53:42.443812+02 │ 2012-08-27 23:52:59.19678+02 2012-08-27 23:54:14.985946+02 │ [NULL] │ 2 │ 2012-08-27 23:53:42.443812+02 │ 2012-08-27 23:54:14.985946+02 2012-08-27 23:54:58.458541+02 │ 1 │ [NULL] │ 2012-08-27 23:54:58.458541+02 │ 2012-08-27 23:54:14.985946+02 2012-08-27 23:56:15.77672+02 │ 9 │ [NULL] │ 2012-08-27 23:56:15.77672+02 │ 2012-08-27 23:54:14.985946+02 2012-08-27 23:56:22.940718+02 │ 3 │ [NULL] │ 2012-08-27 23:56:22.940718+02 │ 2012-08-27 23:54:14.985946+02 2012-08-27 23:56:24.995275+02 │ [NULL] │ 10 │ 2012-08-27 23:56:22.940718+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:57:51.333133+02 │ 4 │ [NULL] │ 2012-08-27 23:57:51.333133+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:58:32.724896+02 │ 1 │ [NULL] │ 2012-08-27 23:58:32.724896+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:58:55.074473+02 │ 7 │ [NULL] │ 2012-08-27 23:58:55.074473+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:59:16.886855+02 │ 8 │ [NULL] │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:59:23.374828+02 │ [NULL] │ 9 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:23.374828+02 2012-08-27 23:59:30.517138+02 │ [NULL] │ 2 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:30.517138+02 2012-08-27 23:59:56.927112+02 │ 9 │ [NULL] │ 2012-08-27 23:59:56.927112+02 │ 2012-08-27 23:59:30.517138+02 2012-08-28 00:00:06.504544+02 │ [NULL] │ 5 │ 2012-08-27 23:59:56.927112+02 │ 2012-08-28 00:00:06.504544+02 2012-08-28 00:00:13.417665+02 │ 2 │ [NULL] │ 2012-08-28 00:00:13.417665+02 │ 2012-08-28 00:00:06.504544+02 2012-08-28 00:00:37.243976+02 │ 6 │ [NULL] │ 2012-08-28 00:00:37.243976+02 │ 2012-08-28 00:00:06.504544+02 2012-08-28 00:00:50.668566+02 │ 5 │ [NULL] │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:00:06.504544+02 2012-08-28 00:01:22.487454+02 │ [NULL] │ 1 │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:01:22.487454+02 2012-08-28 00:01:22.990129+02 │ [NULL] │ 6 │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:01:22.990129+02 (30 ROWS)
Please note that values in group_a and group_b partition data in the resultset into parts which always have only 1 row with value, and possibly multiple rows with null.
For example – all rows with group_a = ‘2012-08-27 23:59:16.886855+02':
2012-08-27 23:59:16.886855+02 │ 8 │ [NULL] │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:56:24.995275+02 2012-08-27 23:59:23.374828+02 │ [NULL] │ 9 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:23.374828+02 2012-08-27 23:59:30.517138+02 │ [NULL] │ 2 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:30.517138+02
Only first of these rows has value in count_a.
So, now we can, using this group/partition info, fill-in the blanks:
$ WITH base_data AS ( SELECT COALESCE(a.event_when, b.event_when) AS event_when, a.event_count AS count_a, b.event_count AS count_b FROM events_a a FULL OUTER JOIN events_b b USING ( event_when ) ) , with_group_starts AS ( SELECT *, MAX(CASE WHEN count_a IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_a, MAX(CASE WHEN count_b IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_b FROM base_data ), filled_in AS ( SELECT *, MAX(count_a) OVER (partition BY group_a) AS real_count_a, MAX(count_b) OVER (partition BY group_b) AS real_count_b FROM with_group_starts ) SELECT * FROM filled_in ORDER BY event_when; event_when │ count_a │ count_b │ group_a │ group_b │ real_count_a │ real_count_b ───────────────────────────────┼─────────┼─────────┼───────────────────────────────┼───────────────────────────────┼──────────────┼────────────── 2012-08-27 23:51:30.54373+02 │ 3 │ [NULL] │ 2012-08-27 23:51:30.54373+02 │ [NULL] │ 3 │ [NULL] 2012-08-27 23:51:34.098968+02 │ [NULL] │ 2 │ 2012-08-27 23:51:30.54373+02 │ 2012-08-27 23:51:34.098968+02 │ 3 │ 2 2012-08-27 23:51:43.406024+02 │ [NULL] │ 7 │ 2012-08-27 23:51:30.54373+02 │ 2012-08-27 23:51:43.406024+02 │ 3 │ 7 2012-08-27 23:51:56.947008+02 │ 9 │ [NULL] │ 2012-08-27 23:51:56.947008+02 │ 2012-08-27 23:51:43.406024+02 │ 9 │ 7 2012-08-27 23:52:03.252017+02 │ 9 │ [NULL] │ 2012-08-27 23:52:03.252017+02 │ 2012-08-27 23:51:43.406024+02 │ 9 │ 7 2012-08-27 23:52:29.364645+02 │ 7 │ [NULL] │ 2012-08-27 23:52:29.364645+02 │ 2012-08-27 23:51:43.406024+02 │ 7 │ 7 2012-08-27 23:52:53.981979+02 │ 10 │ [NULL] │ 2012-08-27 23:52:53.981979+02 │ 2012-08-27 23:51:43.406024+02 │ 10 │ 7 2012-08-27 23:52:54.833586+02 │ 5 │ [NULL] │ 2012-08-27 23:52:54.833586+02 │ 2012-08-27 23:51:43.406024+02 │ 5 │ 7 2012-08-27 23:52:58.926927+02 │ 6 │ [NULL] │ 2012-08-27 23:52:58.926927+02 │ 2012-08-27 23:51:43.406024+02 │ 6 │ 7 2012-08-27 23:52:59.19678+02 │ [NULL] │ 6 │ 2012-08-27 23:52:58.926927+02 │ 2012-08-27 23:52:59.19678+02 │ 6 │ 6 2012-08-27 23:53:36.57738+02 │ 6 │ [NULL] │ 2012-08-27 23:53:36.57738+02 │ 2012-08-27 23:52:59.19678+02 │ 6 │ 6 2012-08-27 23:53:42.443812+02 │ 9 │ [NULL] │ 2012-08-27 23:53:42.443812+02 │ 2012-08-27 23:52:59.19678+02 │ 9 │ 6 2012-08-27 23:54:14.985946+02 │ [NULL] │ 2 │ 2012-08-27 23:53:42.443812+02 │ 2012-08-27 23:54:14.985946+02 │ 9 │ 2 2012-08-27 23:54:58.458541+02 │ 1 │ [NULL] │ 2012-08-27 23:54:58.458541+02 │ 2012-08-27 23:54:14.985946+02 │ 1 │ 2 2012-08-27 23:56:15.77672+02 │ 9 │ [NULL] │ 2012-08-27 23:56:15.77672+02 │ 2012-08-27 23:54:14.985946+02 │ 9 │ 2 2012-08-27 23:56:22.940718+02 │ 3 │ [NULL] │ 2012-08-27 23:56:22.940718+02 │ 2012-08-27 23:54:14.985946+02 │ 3 │ 2 2012-08-27 23:56:24.995275+02 │ [NULL] │ 10 │ 2012-08-27 23:56:22.940718+02 │ 2012-08-27 23:56:24.995275+02 │ 3 │ 10 2012-08-27 23:57:51.333133+02 │ 4 │ [NULL] │ 2012-08-27 23:57:51.333133+02 │ 2012-08-27 23:56:24.995275+02 │ 4 │ 10 2012-08-27 23:58:32.724896+02 │ 1 │ [NULL] │ 2012-08-27 23:58:32.724896+02 │ 2012-08-27 23:56:24.995275+02 │ 1 │ 10 2012-08-27 23:58:55.074473+02 │ 7 │ [NULL] │ 2012-08-27 23:58:55.074473+02 │ 2012-08-27 23:56:24.995275+02 │ 7 │ 10 2012-08-27 23:59:16.886855+02 │ 8 │ [NULL] │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:56:24.995275+02 │ 8 │ 10 2012-08-27 23:59:23.374828+02 │ [NULL] │ 9 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:23.374828+02 │ 8 │ 9 2012-08-27 23:59:30.517138+02 │ [NULL] │ 2 │ 2012-08-27 23:59:16.886855+02 │ 2012-08-27 23:59:30.517138+02 │ 8 │ 2 2012-08-27 23:59:56.927112+02 │ 9 │ [NULL] │ 2012-08-27 23:59:56.927112+02 │ 2012-08-27 23:59:30.517138+02 │ 9 │ 2 2012-08-28 00:00:06.504544+02 │ [NULL] │ 5 │ 2012-08-27 23:59:56.927112+02 │ 2012-08-28 00:00:06.504544+02 │ 9 │ 5 2012-08-28 00:00:13.417665+02 │ 2 │ [NULL] │ 2012-08-28 00:00:13.417665+02 │ 2012-08-28 00:00:06.504544+02 │ 2 │ 5 2012-08-28 00:00:37.243976+02 │ 6 │ [NULL] │ 2012-08-28 00:00:37.243976+02 │ 2012-08-28 00:00:06.504544+02 │ 6 │ 5 2012-08-28 00:00:50.668566+02 │ 5 │ [NULL] │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:00:06.504544+02 │ 5 │ 5 2012-08-28 00:01:22.487454+02 │ [NULL] │ 1 │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:01:22.487454+02 │ 5 │ 1 2012-08-28 00:01:22.990129+02 │ [NULL] │ 6 │ 2012-08-28 00:00:50.668566+02 │ 2012-08-28 00:01:22.990129+02 │ 5 │ 6 (30 ROWS)
Of course this is too much information, so I can simplify the last bits of query to:
$ WITH base_data AS ( SELECT COALESCE(a.event_when, b.event_when) AS event_when, a.event_count AS count_a, b.event_count AS count_b FROM events_a a FULL OUTER JOIN events_b b USING ( event_when ) ) , with_group_starts AS ( SELECT *, MAX(CASE WHEN count_a IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_a, MAX(CASE WHEN count_b IS NULL THEN NULL ELSE event_when END) OVER (ORDER BY event_when) AS group_b FROM base_data ) SELECT event_when, MAX(count_a) OVER (partition BY group_a) AS real_count_a, MAX(count_b) OVER (partition BY group_b) AS real_count_b FROM with_group_starts ORDER BY event_when; event_when │ real_count_a │ real_count_b ───────────────────────────────┼──────────────┼────────────── 2012-08-27 23:51:30.54373+02 │ 3 │ [NULL] 2012-08-27 23:51:34.098968+02 │ 3 │ 2 2012-08-27 23:51:43.406024+02 │ 3 │ 7 2012-08-27 23:51:56.947008+02 │ 9 │ 7 2012-08-27 23:52:03.252017+02 │ 9 │ 7 2012-08-27 23:52:29.364645+02 │ 7 │ 7 2012-08-27 23:52:53.981979+02 │ 10 │ 7 2012-08-27 23:52:54.833586+02 │ 5 │ 7 2012-08-27 23:52:58.926927+02 │ 6 │ 7 2012-08-27 23:52:59.19678+02 │ 6 │ 6 2012-08-27 23:53:36.57738+02 │ 6 │ 6 2012-08-27 23:53:42.443812+02 │ 9 │ 6 2012-08-27 23:54:14.985946+02 │ 9 │ 2 2012-08-27 23:54:58.458541+02 │ 1 │ 2 2012-08-27 23:56:15.77672+02 │ 9 │ 2 2012-08-27 23:56:22.940718+02 │ 3 │ 2 2012-08-27 23:56:24.995275+02 │ 3 │ 10 2012-08-27 23:57:51.333133+02 │ 4 │ 10 2012-08-27 23:58:32.724896+02 │ 1 │ 10 2012-08-27 23:58:55.074473+02 │ 7 │ 10 2012-08-27 23:59:16.886855+02 │ 8 │ 10 2012-08-27 23:59:23.374828+02 │ 8 │ 9 2012-08-27 23:59:30.517138+02 │ 8 │ 2 2012-08-27 23:59:56.927112+02 │ 9 │ 2 2012-08-28 00:00:06.504544+02 │ 9 │ 5 2012-08-28 00:00:13.417665+02 │ 2 │ 5 2012-08-28 00:00:37.243976+02 │ 6 │ 5 2012-08-28 00:00:50.668566+02 │ 5 │ 5 2012-08-28 00:01:22.487454+02 │ 5 │ 1 2012-08-28 00:01:22.990129+02 │ 5 │ 6 (30 ROWS)
And that's it. It would be simpler if we had last() window function (returning last, not null, value for given column in given window), but as you can see, it's doable with what we have available.
It would be nice if some aggregations could ignore nulls .. something like last_value(field IGNORE NULLS) OVER w , for example.
You can simplify a little bit: both case expression try to reconstruct the “event_when” before the “full outer join”. Simply add both “event_when” in “base_data” and refer to that instead of the “cases” expressions:
with base_data as (
select
coalesce(a.event_when, b.event_when) as event_when,
a.event_count as count_a,
b.event_count as count_b,
a.event_when as event_a,
b.event_when as event_b
from
events_a a full outer join events_b b using ( event_when )
),
group_starts as (
select
*,
max(event_a) over (ORDER BY event_when) as group_a,
max(event_b) over (ORDER BY event_when) as group_b
from base_data
)
select
event_when,
max(count_a) over (partition by group_a) as real_count_a,
max(count_b) over (partition by group_b) as real_count_b
from group_starts
I usually try to avoid subselects, but in this case I think the speed might not be bad especially if event_when is unique in each table. So I’d just do:
WITH e AS (SELECT event_when
FROM events_a UNION SELECT event_when FROM events_b)
SELECT
e.event_when
,(SELECT c.event_count FROM events_a As c
WHERE c.event_when <= e.event_when
ORDER BY c.event_when DESC LIMIT 1) As count_a
,(SELECT c.event_count FROM events_b As c
WHERE c.event_when <= e.event_when
ORDER BY c.event_when DESC LIMIT 1) As count_b
FROM
e
ORDER BY
e.event_when;