pg_terminator released

I just released first version of pg_terminator.

It's a tool that is supposed to be run on PostgreSQL db server, monitor a database, and cancel or terminate offending queries/connections.

It can be used, for example to:

  • kill leftover psql sessions that are not doing anything
  • cancel too long queries on production servers
  • kill connections that have long idle-in-transaction state
  • make sure that some application will not use transactions longer than “x"
  • and more …

Basically – if you can write a where condition, that operates on pg_stat_activity view, that can list backends to cancel/terminate – you can make pg_terminate do it for you.

To work, it requires Ruby, and its Pg library.

Usage is free, and it's fully open-source (BSD license).

Waiting for PostgreSQL 12 – Allow multi-inserts during COPY into a partitioned table

On 1st of August 2018, Peter Eisentraut committed patch:

Allow multi-inserts during COPY into a partitioned table
 
 
CopyFrom allows multi-inserts to be used for non-partitioned tables, but
this was disabled for partitioned tables.  The reason for this appeared
to be that the tuple may not belong to the same partition as the
previous tuple did.  Not allowing multi-inserts here greatly slowed down
imports into partitioned tables.  These could take twice as long as a
copy to an equivalent non-partitioned table.  It seems wise to do
something about this, so this change allows the multi-inserts by
flushing the so-far inserted tuples to the partition when the next tuple
does not belong to the same partition, or when the buffer fills.  This
improves performance when the next tuple in the stream commonly belongs
to the same partition as the previous tuple.
 
In cases where the target partition changes on every tuple, using
multi-inserts slightly slows the performance.  To get around this we
track the average size of the batches that have been inserted and
adaptively enable or disable multi-inserts based on the size of the
batch.  Some testing was done and the regression only seems to exist
when the average size of the insert batch is close to 1, so let's just
enable multi-inserts when the average size is at least 1.3.  More
performance testing might reveal a better number for, this, but since
the slowdown was only 1-2% it does not seem critical enough to spend too
much time calculating it.  In any case it may depend on other factors
rather than just the size of the batch.
 
Allowing multi-inserts for partitions required a bit of work around the
per-tuple memory contexts as we must flush the tuples when the next
tuple does not belong the same partition.  In which case there is no
good time to reset the per-tuple context, as we've already built the new
tuple by this time.  In order to work around this we maintain two
per-tuple contexts and just switch between them every time the partition
changes and reset the old one.  This does mean that the first of each
batch of tuples is not allocated in the same memory context as the
others, but that does not matter since we only reset the context once
the previous batch has been inserted.
 
Author: David Rowley <david.rowley@2ndquadrant.com>

Continue reading Waiting for PostgreSQL 12 – Allow multi-inserts during COPY into a partitioned table

I’m starting my wait for PostgreSQL 12

On Saturday, 30th of June, Andrew Dunstan stamped HEAD in git as 12devel.

This means that there will be no new features in Pg11. And now, my test Pg reports:

=$ psql -c 'select version()'
                                                  version                                                   
════════════════════════════════════════════════════════════════════════════════════════════════════════════
 PostgreSQL 12devel ON x86_64-pc-linux-gnu, compiled BY gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 ROW)

Nice 🙂

Waiting for PostgreSQL 11 – Improve run-time partition pruning to handle any stable expression.

On 10th of June 2018, Tom Lane committed patch:

Improve run-time partition pruning to handle any stable expression.
 
 
The initial coding of the run-time-pruning feature only coped with cases
where the partition key(s) are compared to Params.  That is a bit silly;
we can allow it to work with any non-Var-containing stable expression, as
long as we take special care with expressions containing PARAM_EXEC Params.
The code is hardly any longer this way, and it's considerably clearer
(IMO at least).  Per gripe from Pavel Stehule.
 
David Rowley, whacked around a bit by me
 
Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com

Continue reading Waiting for PostgreSQL 11 – Improve run-time partition pruning to handle any stable expression.

Waiting for PostgreSQL 11 – Support partition pruning at execution time

On 7th of April 2018, Alvaro Herrera committed patch:

Support partition pruning at execution time
 
 
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query.  This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.
 
This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:
 
1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.
 
2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.
 
Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.
 
For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait.  Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output.  In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)".  Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times.  This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.
 
This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.
 
Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com

Continue reading Waiting for PostgreSQL 11 – Support partition pruning at execution time

Waiting for PostgreSQL 11 – Indexes with INCLUDE columns and their support in B-tree

On 7th of April 2018, Teodor Sigaev committed patch:

Indexes with INCLUDE columns and their support in B-tree
 
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.
 
Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.
 
In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.
 
Bump catalog version
 
Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
             David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/.4010101@postgrespro.ru

Continue reading Waiting for PostgreSQL 11 – Indexes with INCLUDE columns and their support in B-tree

Waiting for PostgreSQL 11 – Add json(b)_to_tsvector function

On 7th of April 2018, Teodor Sigaev committed patch:

Add json(b)_to_tsvector function
 
Jsonb has a complex nature so there isn't best-for-everything way to convert it
to tsvector for full text search. Current to_tsvector(json(b)) suggests to
convert only string values, but it's possible to index keys, numerics and even
booleans value. To solve that json(b)_to_tsvector has a second required
argument contained a list of desired types of json fields. Second argument is
a jsonb scalar or array right now with possibility to add new options in a
future.
 
Bump catalog version
 
Author: Dmitry Dolgov with some editorization by me
Reviewed by: Teodor Sigaev
Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com

Continue reading Waiting for PostgreSQL 11 – Add json(b)_to_tsvector function

Waiting for PostgreSQL 11 – Transforms for jsonb to PL/Python and to PL/Perl

On 28th of March 2018, Peter Eisentraut committed patch:

Transforms for jsonb to PL/Python
 
Add a new contrib module jsonb_plpython that provide a transform between
jsonb and PL/Python.  jsonb values are converted to appropriate Python
types such as dicts and lists, and vice versa.
 
Author: Anthony Bykov <a.bykov@postgrespro.ru>

and then, on 3rd of April 2018, he also committed patch:

Transforms for jsonb to PL/Perl
 
Add a new contrib module jsonb_plperl that provides a transform between
jsonb and PL/Perl.  jsonb values are converted to appropriate Perl types
such as arrays and hashes, and vice versa.
 
Author: Anthony Bykov <a.bykov@postgrespro.ru>

Continue reading Waiting for PostgreSQL 11 – Transforms for jsonb to PL/Python and to PL/Perl