Waiting for PostgreSQL 18 – Add temporal PRIMARY KEY and UNIQUE constraints

On 17th of September 2024, Peter Eisentraut committed patch:

Add temporal PRIMARY KEY and UNIQUE constraints
 
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.
 
(previously committed as 46a0cd4cefb, reverted by 46a0cd4cefb; the new
part is this:)
 
Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations.  For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows.  So we need to forbid empties.
 
This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added.  But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.
 
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com

This took me longer than I wanted to understand the point.

Generally, thanks to this patch we can now track changes in rows, in time, and thanks to another patch that I will cover shortly, keep foreign keys pointed to appropriate row version.

Let's assume you have a table with addresses, something like:

create table addresses (
    id int8 generated always as identity primary key,
    user_id int8 references users (id),
    recipient text not null,
    line1 text not null,
    line2 text,
    zipcode text not null,
    city text not null,
    country text not null,
    phone text not null
);

And then you have orders, where one of order properties is delivery address.

You can, of course, have orders.address_id int8 references addresses (id), but what will happen if someone that already made an order, would then change their address, and make another order.

Generally the change should be active only for 2nd order, and delivery data for first order should be kept as it was.

Of course, there are many solutions to this problem. One could for example, always make new rows, and just make previous version inactive.

But now, thanks to work by developers we have much more interesting solution.

Specifically, we can now have schema like:

create table addresses (
    id int8 generated by default as identity,
    valid_range tstzrange not null default tstzrange(now(), 'infinity', '[)'),
    user_id int8 references users (id),
    recipient text not null,
    line1 text not null,
    line2 text,
    zipcode text not null,
    city text not null,
    country text not null,
    phone text not null,
    primary key (id, valid_range WITHOUT OVERLAPS)
);

Side note: you need to install btree_gist extension first.

This allows me to do stuff like:

=$ insert into addresses (user_id, recipient, line1, zipcode, city, country, phone)
    values (1, 'aa', 'aa', 'aa', 'aa', 'aa', 'aa');
INSERT 0 1
 
=$ insert into addresses (user_id, recipient, line1, zipcode, city, country, phone)
    values (1, 'bb', 'bb', 'bb', 'bb', 'bb', 'bb');
INSERT 0 1
 
=$ select * from addresses;
 id |                valid_range                 | user_id | recipient | line1 | line2 | zipcode | city | country | phone 
----+--------------------------------------------+---------+-----------+-------+-------+---------+------+---------+-------
  3 | ["2024-09-30 12:08:52.703595+02",infinity) |       1 | aa        | aa    |       | aa      | aa   | aa      | aa
  4 | ["2024-09-30 12:08:52.712062+02",infinity) |       1 | bb        | bb    |       | bb      | bb   | bb      | bb
(2 rows)

If I'd need to update the address, I insert new version, and change range for the previous one. Well, the other way around

=$ begin;
BEGIN
 
=$ update addresses
SET valid_range = tstzrange( lower( valid_range ), now(), '[)')
WHERE id = 3;
UPDATE 1
 
=$ insert into addresses (id, user_id, recipient, line1, zipcode, city, country, phone)
    VALUES (3, 1, 'dd', 'dd', 'dd', 'dd', 'dd', 'dd');
INSERT 0 1
 
=$ commit;
COMMIT

After which I get:

=$ select * from addresses;
 id |                            valid_range                            | user_id | recipient | line1 | line2 | zipcode | city | country | phone 
----+-------------------------------------------------------------------+---------+-----------+-------+-------+---------+------+---------+-------
  4 | ["2024-09-30 12:08:52.712062+02",infinity)                        |       1 | bb        | bb    |       | bb      | bb   | bb      | bb
  3 | ["2024-09-30 12:08:52.703595+02","2024-09-30 12:13:32.466304+02") |       1 | aa        | aa    |       | aa      | aa   | aa      | aa
  3 | ["2024-09-30 12:13:32.466304+02",infinity)                        |       1 | dd        | dd    |       | dd      | dd   | dd      | dd
(3 rows)

And since I can't have two rows, with the same id, that overlapr (in terms of valid_range) – given any timestamp, I can tell which version of the row was valid at the time.

This is great. Maybe not for every application there is, but it's definitely very helpful for whenever you have to track changes, and be able to show how the row looked like at any point in time – this will be godsend.

Thanks a lot to everyone involved.

4 thoughts on “Waiting for PostgreSQL 18 – Add temporal PRIMARY KEY and UNIQUE constraints”

Anonymous says:

2024-09-30 at 16:34

Cool. Now, add “automatics”, to do the necessary change on the “old” record, to maximize convenience
mkurz says:

2024-10-01 at 09:29

Thanks! Would it be possible to extend the example with the orders table you mentioned at the beginning and how you then reference and query the orders and get the correct address for an order ?
Thanks!
depesz says:

2024-10-01 at 19:58
@mkurz:

well, it works, kinda:
=$ alter table addresses add column valid_range tstzrange NOT NULL DEFAULT tstzrange(now(), 'infinity', '[)'); =$ alter table addresses drop constraint addresses_pkey ; =$ alter table addresses add primary key (id, valid_range WITHOUT OVERLAPS );
ain’t pretty, and requires dropping/recreating pkey, which can take a while. so I’d much rather suggest to instead migrate table.

Apparently you can’t make unique index first, and then add pkey on it. Maybe it will be added in some future version/patch, or maybe I’m missing a way to do it
Martin says:

2024-10-03 at 16:14

@mkurz Given a orders table with an address_id and order_date you could do something like SELECT * FROM orders o JOIN adresses a ON a.id = o.address_id AND o.valid_range @> o.order_date;