Waiting for PostgreSQL 14 – Add date_bin function

On 24th of March 2021, Peter Eisentraut committed patch:

Add date_bin function
 
Similar to date_trunc, but allows binning by an arbitrary interval
rather than just full units.
 
Author: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Artur Zakirov <zaartur@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACPNZCt4buQFRgy6DyjuZS-2aPDpccRkrJBmgUfwYc1KiaXYxg@mail.gmail.com

This is pretty interesting.

First, some background. We have the date_trunc function, which does:

=$ select 'untruncated' as spec, now()
union all
select spec, date_trunc(spec, now())
from
    unnest('{microseconds,milliseconds,second,minute,hour,day,week,month,quarter,year,decade,century,millennium}'::text[]) as u(spec);
     spec     │              now
──────────────┼───────────────────────────────
 untruncated  │ 2021-03-31 20:27:25.923338+02
 microseconds │ 2021-03-31 20:27:25.923338+02
 milliseconds │ 2021-03-31 20:27:25.923+02
 second       │ 2021-03-31 20:27:25+02
 minute       │ 2021-03-31 20:27:00+02
 hour         │ 2021-03-31 20:00:00+02
 day          │ 2021-03-31 00:00:00+02
 week         │ 2021-03-29 00:00:00+02
 month        │ 2021-03-01 00:00:00+01
 quarter      │ 2021-01-01 00:00:00+01
 year         │ 2021-01-01 00:00:00+01
 decade       │ 2020-01-01 00:00:00+01
 century      │ 2001-01-01 00:00:00+01
 millennium   │ 2001-01-01 00:00:00+01
(14 rows)

Nice. So, this new function, date_bin, does a similar thing, but it takes (almost) any interval as the base for truncation.

Almost, because it can't take an interval with units of months or more (due to their varying duration).
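A small sketch of that restriction, assuming a PostgreSQL 14 server; the timestamps are made-up sample values. The first statement is expected to fail, while date_trunc still handles calendar units:

```sql
-- Expected to raise an error: the stride contains a month component.
select date_bin('1 month', timestamp '2021-03-31 20:27:25', timestamp '2001-01-01');

-- Calendar units such as month still go through date_trunc.
select date_trunc('month', timestamp '2021-03-31 20:27:25');

-- Strides made of days and smaller units are accepted.
select date_bin('7 days', timestamp '2021-03-31 20:27:25', timestamp '2001-01-01');
```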

Let's see:

=$ select date_bin('5 minutes', now());
ERROR:  function date_bin(unknown, timestamp with time zone) does not exist
LINE 1: select date_bin('5 minutes', now());
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

hmm … careful reading suggests that there is a third argument, which is basically the base for the bin calculation. So, to get results similar to date_trunc, I can use the epoch start as base:

=$ select date_bin('5 minutes', now(), '1970-01-01');
        date_bin
------------------------
 2021-03-31 20:35:00+02
(1 row)
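For full units, date_bin and date_trunc should agree, which makes for an easy sanity check of the third argument. A minimal sketch, assuming PostgreSQL 14 and a made-up timestamp; the origin 2001-01-01 is just an arbitrary midnight:

```sql
-- Both columns should show 2021-03-31 20:00:00 for this sample value.
select date_trunc('hour', ts)               as truncated,
       date_bin('1 hour', ts, '2001-01-01') as binned
from (values (timestamp '2021-03-31 20:27:25')) as v(ts);
```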

Nice. Of course, I can use any length that can be expressed as interval, and any starting point:

=$ select date_bin('17 minutes 31 seconds'::interval, now(), '2000-01-01');
        date_bin
------------------------
 2021-03-31 20:23:16+02
(1 row)

This is pretty cool.
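To see what the starting point does, here is a small sketch that bins the same made-up timestamp twice: once from midnight, and once from an origin shifted by 2 minutes 30 seconds. Plain timestamp is used to keep time zones out of the picture:

```sql
select date_bin('15 minutes', timestamp '2020-02-11 15:44:17', timestamp '2001-01-01');
-- 2020-02-11 15:30:00

select date_bin('15 minutes', timestamp '2020-02-11 15:44:17', timestamp '2001-01-01 00:02:30');
-- 2020-02-11 15:32:30
```

With the shifted origin, bin edges land on :02:30, :17:30, :32:30 and :47:30 of every hour.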

Of course I don't expect many people to need to group their data by weird durations like 17 minutes and 31 seconds, but it's easy to imagine 5 or 15 minute bins being helpful.
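A sketch of how such grouping could look, assuming PostgreSQL 14. The events CTE is hypothetical sample data (one row every 90 seconds for an hour), standing in for a real table:

```sql
-- Hypothetical sample data: 40 timestamps, 90 seconds apart, starting at 20:00:00.
with events as (
    select ts
    from generate_series(
        timestamp '2021-03-31 20:00:00',
        timestamp '2021-03-31 20:59:59',
        interval '90 seconds'
    ) as g(ts)
)
-- Count rows per 15-minute bin; the origin is an arbitrary midnight.
select date_bin('15 minutes', ts, '2000-01-01') as bin_start,
       count(*)                                 as event_count
from events
group by bin_start
order by bin_start;
```

With this sample data the query should return four bins (20:00, 20:15, 20:30, 20:45) with 10 rows each.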

Thanks a lot, everyone 🙂

8 thoughts on “Waiting for PostgreSQL 14 – Add date_bin function”

Hi Depesz,

very cool! I just read about the time_bucket function in TimescaleDB (https://docs.timescale.com/latest/api#time_bucket) two days ago and thought that this is a very cool feature. Nice to see this in PostgreSQL itself as well.

Best regards
Salek

Hi Depesz,

can you just point out what u(spec) in your first query is doing?
I have never seen that…

thanks karsten

Hi karsten,

I wondered as well, never saw it before either.
I did not find an explanation in the SELECT docs for this, but it is in the docs as well here: https://www.postgresql.org/docs/current/functions-srf.html (search for “AS s(a)”).
It does not seem to matter if you put s(spec), xx(spec) or just spec, though. So it’s just a different way to specify an ALIAS name.

Best regards
Salek

Thanks Salek, with your hint and a second look it's clear to me! thx

Hi @karsten

I think that ‘u’ is the alias name and ‘spec’ is the column name in the resultset. Greetings

this solves this problem, but better, right? https://www.depesz.com/2010/10/22/grouping-data-into-time-ranges/

@Mark:
yes. Built-in way to do the thing 🙂

I needed this some time ago and ended up with more or less the same function in SQL. It may still be used in pre-14 PostgreSQL versions.

create or replace function date_trunc
  (trunc_period interval, ts timestamptz, base_ts timestamptz default '1970-01-01Z')
  returns timestamptz language sql immutable as $function$
-- count whole trunc_period steps between base_ts and ts, then add them back to base_ts
select
  base_ts
  + floor(extract(epoch from ts - base_ts) / extract(epoch from trunc_period))::bigint
  * trunc_period;
$function$;
