How much speed you’re leaving at the table if you use default locale?

I've been to PGConf.dev recently, and one of the talks was about collations.

The whole talk was interesting (to put it mildly), but the thing that stuck with me is that we really shouldn't be using default collation provider (libc with locale based collation), unless it's really needed, because we're wasting performance. But how much of a hit is it?

Continue reading How much speed you're leaving at the table if you use default locale?

Waiting for PostgreSQL 17 – MERGE / SPLIT partitions

I thought about it for quite some time, whether I should write about it, and how. That's why there is delay since:

On 6th of April 2024, Alexander Korotkov committed patch:

Implement ALTER TABLE ... MERGE PARTITIONS ... command
 
This new DDL command merges several partitions into the one partition of the
target table.  The target partition is created using new
createPartitionTable() function with parent partition as the template.
 
This commit comprises quite naive implementation which works in single process
and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the
operations including the tuple routing.  This is why this new DDL command
can't be recommended for large partitioned tables under a high load.  However,
this implementation come in handy in certain cases even as is.
Also, it could be used as a foundation for future implementations with lesser
locking and possibly parallel.
 
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval
Reviewed-by: Matthias van de Meent, Laurenz Albe, Zhihong Yu, Justin Pryzby
Reviewed-by: Alvaro Herrera, Robert Haas, Stephane Tachoires

and, also on 6th of April 2024, Alexander Korotkov committed patch:

Implement ALTER TABLE ... SPLIT PARTITION ... command
 
This new DDL command splits a single partition into several parititions.
Just like ALTER TABLE ... MERGE PARTITIONS ... command, new patitions are
created using createPartitionTable() function with parent partition as the
template.
 
This commit comprises quite naive implementation which works in single process
and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the
operations including the tuple routing.  This is why this new DDL command
can't be recommended for large partitioned tables under a high load.  However,
this implementation come in handy in certain cases even as is.
Also, it could be used as a foundation for future implementations with lesser
locking and possibly parallel.
 
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval
Reviewed-by: Matthias van de Meent, Laurenz Albe, Zhihong Yu, Justin Pryzby
Reviewed-by: Alvaro Herrera, Robert Haas, Stephane Tachoires

Continue reading Waiting for PostgreSQL 17 – MERGE / SPLIT partitions

Waiting for …: SQL/JSON is coming back. Hopefully.

This is not the usual Waiting for post, but something should be said.

Back in March/April of 2022 Andrew Dunstan committed a series of patches that added support for lots of really interesting features from SQL/JSON standard.

While I'm not avid user of json in database, I was very, very happy. Wrote couple of blogposts about it.

Then, around six month later they got reverted.

Lately, since last year, actually, these re-appeared again:

  1. Commit by Alvaro Herrera, from March 29th, 2023: SQL/JSON: add standard JSON constructor functions
  2. Commit by Alvaro Herrera, from March 31st, 2023: SQL/JSON: support the IS JSON predicate
  3. Commit by Amit Langote, from July 20th, 2023: Add more SQL/JSON constructor functions
  4. Commit by Amit Langote, from March 21st, 2024: Add SQL/JSON query functions
  5. Commit by Amit Langote, from April 4th, 2024: Add basic JSON_TABLE() functionality

Since they re-appeared I was asked (twice) to write about them in the Waiting for series.

So, I just want to say that while I did notice these changes, and am very happy that they are there, I don't plan on writing Waiting for about them.

The reason is simple: I kinda have the feeling that I already wrote about waiting for them.

What I can say though, is that as soon as PostgreSQL version (be it 17, or any other) will get released with these in there, I will reblog about SQL/JSON, with updated examples, so that this huge functionality, and astounding amount of work by developers and testers, will get as much publicity as it can.

For now: I hope it will make it to Pg 17 release, and even before that I would like to thank everyone involved. By my quick count we have at least nine separate authors, and fifteen reviewers, and this is just across these five commits I mentioned.

THANK YOU – can't wait till I will be able to write about it properly 🙂

Waiting for PostgreSQL 17 – Invent SERIALIZE option for EXPLAIN.

On 3rd of April 2024, Tom Lane committed patch:

Invent SERIALIZE option for EXPLAIN.
 
EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about
the volume of data emitted by a query, as well as the time taken
to convert the data to the on-the-wire format.  Previously there
was no way to investigate this without actually sending the data
to the client, in which case network transmission costs might
swamp what you wanted to see.  In particular this feature allows
investigating the costs of de-TOASTing compressed or out-of-line
data during formatting.
 
Stepan Rutz and Matthias van de Meent,
reviewed by Tomas Vondra and myself
 
Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de

Continue reading Waiting for PostgreSQL 17 – Invent SERIALIZE option for EXPLAIN.

Waiting for PostgreSQL 17 – Add new COPY option LOG_VERBOSITY.

On 1st of April 2024, Masahiko Sawada committed patch:

Add new COPY option LOG_VERBOSITY.
 
This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.
 
This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, a NOTICE message is emitted for
each discarded row, providing additional information such as line
number, column name, and the malformed value. This helps users to
identify problematic rows that failed to load.
 
Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com

Continue reading Waiting for PostgreSQL 17 – Add new COPY option LOG_VERBOSITY.

Waiting for PostgreSQL 17 – Add support for MERGE … WHEN NOT MATCHED BY SOURCE.

On 30th of March 2024, Dean Rasheed committed patch:

Add support for MERGE ... WHEN NOT MATCHED BY SOURCE.
 
This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE
actions, which operate on rows that exist in the target relation, but
not in the data source. These actions can execute UPDATE, DELETE, or
DO NOTHING sub-commands.
 
This is in contrast to already-supported WHEN NOT MATCHED actions,
which operate on rows that exist in the data source, but not in the
target relation. To make this distinction clearer, such actions may
now be written as WHEN NOT MATCHED BY TARGET.
 
Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is
equivalent to writing WHEN NOT MATCHED BY TARGET.
 
Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing.
 
Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com

Continue reading Waiting for PostgreSQL 17 – Add support for MERGE … WHEN NOT MATCHED BY SOURCE.

Waiting for PostgreSQL 17 – Add RETURNING support to MERGE.

On 17th of March 2024, Dean Rasheed committed patch:

Add RETURNING support to MERGE.
 
This allows a RETURNING clause to be appended to a MERGE query, to
return values based on each row inserted, updated, or deleted. As with
plain INSERT, UPDATE, and DELETE commands, the returned values are
based on the new contents of the target table for INSERT and UPDATE
actions, and on its old contents for DELETE actions. Values from the
source relation may also be returned.
 
As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be
used as the source relation for other operations such as WITH queries
and COPY commands.
 
Additionally, a special function merge_action() is provided, which
returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action
executed for each row. The merge_action() function can be used
anywhere in the RETURNING list, including in arbitrary expressions and
subqueries, but it is an error to use it anywhere outside of a MERGE
query's RETURNING list.
 
Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera,
Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut,
and Wolfgang Walther.
 
Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com

Continue reading Waiting for PostgreSQL 17 – Add RETURNING support to MERGE.

Waiting for PostgreSQL 17 – Add new COPY option SAVE_ERROR_TO / Rename COPY option from SAVE_ERROR_TO to ON_ERROR

On 16th of January 2024, Alexander Korotkov committed patch:

Add new COPY option SAVE_ERROR_TO
 
Currently, when source data contains unexpected data regarding data type or
range, the entire COPY fails. However, in some cases, such data can be ignored
and just copying normal data is preferable.
 
This commit adds a new option SAVE_ERROR_TO, which specifies where to save the
error information. When this option is specified, COPY skips soft errors and
continues copying.
 
Currently, SAVE_ERROR_TO only supports "none". This indicates error information
is not saved and COPY just skips the unexpected data and continues running.
 
Later works are expected to add more choices, such as 'log' and 'table'.
 
Author: Damir Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He
Discussion: https://postgr.es/m/87k31ftoe0.fsf_-_%40commandprompt.com
Reviewed-by: Pavel Stehule, Andres Freund, Tom Lane, Daniel Gustafsson,
Reviewed-by: Alena Rybakina, Andy Fan, Andrei Lepikhov, Masahiko Sawada
Reviewed-by: Vignesh C, Atsushi Torikoshi

and then, three days later, he changed the syntax in next patch:

Rename COPY option from SAVE_ERROR_TO to ON_ERROR
 
The option names now are "stop" (default) and "ignore".  The future options
could be "file 'filename.log'" and "table 'tablename'".
 
Discussion: https://postgr.es/m/20240117.164859.2242646601795501168.horikyota.ntt%40gmail.com
Author: Jian He
Reviewed-by: Atsushi Torikoshi

Continue reading Waiting for PostgreSQL 17 – Add new COPY option SAVE_ERROR_TO / Rename COPY option from SAVE_ERROR_TO to ON_ERROR

Waiting for PostgreSQL 17 – Support identity columns in partitioned tables

On 16th of January 2024, Peter Eisentraut committed patch:

Support identity columns in partitioned tables
 
Previously, identity columns were disallowed on partitioned tables.
(The reason was mainly that no one had gotten around to working
through all the details to make it work.)  This makes it work now.
 
Some details on the behavior:
 
* A newly created partition inherits identity property
 
  The partitions of a partitioned table are integral part of the
  partitioned table.  A partition inherits identity columns from the
  partitioned table.  An identity column of a partition shares the
  identity space with the corresponding column of the partitioned
  table.  In other words, the same identity column across all
  partitions of a partitioned table share the same identity space.
  This is effected by sharing the same underlying sequence.
 
  When INSERTing directly into a partition, the sequence associated
  with the topmost partitioned table is used to calculate the value of
  the corresponding identity column.
 
  In regular inheritance, identity columns and their properties in a
  child table are independent of those in its parent tables.  A child
  table does not inherit identity columns or their properties
  automatically from the parent.  (This is unchanged.)
 
* Attached partition inherits identity column
 
  A table being attached as a partition inherits the identity property
  from the partitioned table.  This should be fine since we expect
  that the partition table's column has the same type as the
  partitioned table's corresponding column.  If the table being
  attached is a partitioned table, the identity properties are
  propagated down its partition hierarchy.
 
  An identity column in the partitioned table is also marked as NOT
  NULL.  The corresponding column in the partition needs to be marked
  as NOT NULL for the attach to succeed.
 
* Drop identity property when detaching partition
 
  A partition's identity column shares the identity space
  (i.e. underlying sequence) as the corresponding column of the
  partitioned table.  If a partition is detached it can longer share
  the identity space as before.  Hence the identity columns of the
  partition being detached loose their identity property.
 
  When identity of a column of a regular table is dropped it retains
  the NOT NULL constraint that came with the identity property.
  Similarly the columns of the partition being detached retain the NOT
  NULL constraints that came with identity property, even though the
  identity property itself is lost.
 
  The sequence associated with the identity property is linked to the
  partitioned table (and not the partition being detached).  That
  sequence is not dropped as part of detach operation.
 
* Partitions with their own identity columns are not allowed.
 
* The usual ALTER operations (add identity column, add identity
  property to existing column, alter properties of an indentity
  column, drop identity property) are supported for partitioned
  tables.  Changing a column only in a partitioned table or a
  partition is not allowed; the change needs to be applied to the
  whole partition hierarchy.
 
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAExHW5uOykuTC+C6R1yDSp=o8Q83jr8xJdZxgPkxfZ1Ue5RRGg@mail.gmail.com

Continue reading Waiting for PostgreSQL 17 – Support identity columns in partitioned tables