Waiting for PostgreSQL 14 – Add date_bin function

On 24th of March 2021, Peter Eisentraut committed patch:

Add date_bin function
 
Similar to date_trunc, but allows binning by an arbitrary interval
rather than just full units.
 
Author: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Artur Zakirov <zaartur@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACPNZCt4buQFRgy6DyjuZS-2aPDpccRkrJBmgUfwYc1KiaXYxg@mail.gmail.com

This is pretty interesting.

First some background. We have date_trunc function which does:

=$ select 'untruncated' as spec, now()
union all
select spec, date_trunc(spec, now())
from
    unnest('{microseconds,milliseconds,second,minute,hour,day,week,month,quarter,year,decade,century,millennium}'::text[]) as u(spec);
     spec     │              now
──────────────┼───────────────────────────────
 untruncated  │ 2021-03-31 20:27:25.923338+02
 microseconds │ 2021-03-31 20:27:25.923338+02
 milliseconds │ 2021-03-31 20:27:25.923+02
 second2021-03-31 20:27:25+02
 minute2021-03-31 20:27:00+02
 hour2021-03-31 20:00:00+02
 day2021-03-31 00:00:00+02
 week         │ 2021-03-29 00:00:00+02
 month2021-03-01 00:00:00+01
 quarter      │ 2021-01-01 00:00:00+01
 year2021-01-01 00:00:00+01
 decade       │ 2020-01-01 00:00:00+01
 century      │ 2001-01-01 00:00:00+01
 millennium   │ 2001-01-01 00:00:00+01
(14 rows)

Nice. So, this new function, date_bin, but it takes (almost) any interval as base for truncation.

Almost as it can't take any interval with units of months or more (due to varying duration).

Let's see:

=$ select date_bin('5 minutes', now());
ERROR:  function date_bin(unknown, timestamp with time zone) does not exist
LINE 1: select date_bin('5 minutes', now());
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

hmm … careful reason suggests that there is second argument – basically what is base for the bin calculation. So, to get the same results, I'd need to use epoch start as base:

=$ select date_bin('5 minutes', now(), '1970-01-01');
        date_bin
------------------------
 2021-03-31 20:35:00+02
(1 row)

Nice. Of course, I can use any length that can be expressed as interval, and any starting point:

=$ select date_bin('17 minutes 31 seconds'::interval, now(), '2000-01-01');
        date_bin
------------------------
 2021-03-31 20:23:16+02
(1 row)

This is pretty cool.

Of course I don't expect many people needing to group their data in weird durations, like 17 minutes and 31 seconds, but 5/15 minutes can be pretty easy to imagine to be helpful.

Thanks a lot, everyone 🙂

8 thoughts on “Waiting for PostgreSQL 14 – Add date_bin function”

  1. Hi Depesz,

    can you just point out what u(spec) in your first query is doing?
    never has seen that…

    thanks karsten

  2. Hi karsten,

    I wondered as well, never saw it before either.
    I did not find an explanation in the SELECT docs for this, but it is in the docs as well here: https://www.postgresql.org/docs/current/functions-srf.html (search for “AS s(a)”).
    It does not seem to matter if you put s(spec), xx(spec) or just spec, though. So it’s just a different way to specify an ALIAS name.

    Best regards
    Salek

  3. Thanks Salek, with you hint and a second look its clear to me! thx

  4. Hi @karsten

    I think that ‘u’ is the alias name and ‘spec’ is the column name in the resultset. Greetings

  5. I needed this some time ago and ended up with more or less the same function in SQL. It still may be used in pre-14 Postgresql versions.

    create or replace function date_trunc
      (trunc_period interval, ts timestamptz, base_ts timestamptz default '1970-01-01Z')
      returns timestamptz language sql immutable as  $function$
    select
      base_ts
      + floor(extract(epoch from ts - base_ts) / extract(epoch from trunc_period))::bigint
      * trunc_period;
    $function$;

Comments are closed.