Grouping data into array of sums – fun with custom aggregates

Was asked recently about optimization of interesting case. There was table like:

=$ CREATE TABLE input_data (
    category_id        INT8,
    object_id          INT8,
    interaction_ts     timestamptz,
    interaction_type    TEXT,
    interaction_count  INT4
);

And there was a code that was grouping it all by sum()ing interaction_count per category, object, interaction_type, and timestamp truncated to hour.

Basically, storing somewhere result of:

=$ select
    category_id,
    object_id,
    date_trunc( 'hour', interaction_ts ) as ts,
    sum(interaction_count) filter (where interaction_type = 'a') as a_count,
    sum(interaction_count) filter (where interaction_type = 'b') as b_count
from
    input_data
group by 1, 2, 3;

While talking about optimizations, one idea that came was to store whole day of counts in single row, as array. So the resulting count table would be:

=$ create table results (
    category_id      int8,
    object_id        int8
    interaction_day  date,
    a_counts         int4[],
    b_counts         int4[]
);

Where a_counts, and b_counts would always have 24 elements, one for each hour.

Now, how to roll it up like this?

Continue reading Grouping data into array of sums – fun with custom aggregates