Waiting for 9.5 – Add pg_rewind, for re-synchronizing a master server after failback.

On 23rd of March, Heikki Linnakangas committed patch:

Add pg_rewind, for re-synchronizing a master server after failback.
 
Earlier versions of this tool were available (and still are) on github.
 
Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila,
and Satoshi Nagayasu for review.


Waiting for 9.5 – Allow foreign tables to participate in inheritance. – A.K.A. PostgreSQL got sharding.

On 22nd of March, Tom Lane committed patch:

Allow foreign tables to participate in inheritance.
 
Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.
 
As side effects of this, allow foreign tables to have NOT VALID CHECK
constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to
accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS.  Continuing to
disallow these things would've required bizarre and inconsistent special
cases in inheritance behavior.  Since foreign tables don't enforce CHECK
constraints anyway, a NOT VALID one is a complete no-op, but that doesn't
mean we shouldn't allow it.  And it's possible that some FDWs might have
use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops
for most.
 
An additional change in support of this is that when a ModifyTable node
has multiple target tables, they will all now be explicitly identified
in EXPLAIN output, for example:
 
 Update on pt1  (cost=0.00..321.05 rows=3541 width=46)
   Update on pt1
   Foreign Update on ft1
   Foreign Update on ft2
   Update on child3
   ->  Seq Scan on pt1  (cost=0.00..0.00 rows=1 width=46)
   ->  Foreign Scan on ft1  (cost=100.00..148.03 rows=1170 width=46)
   ->  Foreign Scan on ft2  (cost=100.00..148.03 rows=1170 width=46)
   ->  Seq Scan on child3  (cost=0.00..25.00 rows=1200 width=46)
 
This was done mainly to provide an unambiguous place to attach "Remote SQL"
fields, but it is useful for inherited updates even when no foreign tables
are involved.
 
Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me


Waiting for 9.5 – Use 128-bit math to accelerate some aggregation functions.

On 20th of March, Andres Freund committed patch:

Use 128-bit math to accelerate some aggregation functions.
 
On platforms where we support 128bit integers, use them to implement
faster transition functions for sum(int8), avg(int8),
var_*(int2/int4),stdev_*(int2/int4). Where not supported continue to use
numeric as a transition type.
 
In some synthetic benchmarks this has been shown to provide significant
speedups.
 
Bumps catversion.
 
Discussion: 544BB5F1.50709@proxel.se
Author: Andreas Karlsson
Reviewed-By: Peter Geoghegan, Petr Jelinek, Andres Freund, Oskari Saarenmaa, David Rowley
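
To get a feeling for why a 128-bit accumulator is enough where a 64-bit one is not, here is a small back-of-the-envelope sketch. It is only arithmetic, not the server code; the helper name rows_until_overflow is mine, and the worst case assumes every row contributes the largest possible term.

```python
# Largest values that fit in signed 64-bit and 128-bit integers.
INT64_MAX = 2**63 - 1
INT128_MAX = 2**127 - 1


def rows_until_overflow(max_term, accumulator_max):
    """Smallest row count whose worst-case sum exceeds accumulator_max.

    Hypothetical helper: assumes every row contributes max_term.
    """
    return accumulator_max // max_term + 1


# sum(int8): worst case, every row holds the largest int8 value.
int8_term = INT64_MAX
# var_*/stddev_* over int4 need a sum of squares; the largest square is (-2^31)^2.
int4_square = (2**31) ** 2

# A 64-bit accumulator can overflow after just two worst-case rows.
print(rows_until_overflow(int8_term, INT64_MAX))
print(rows_until_overflow(int4_square, INT64_MAX))

# A 128-bit accumulator needs more than 2^64 worst-case rows to overflow.
print(rows_until_overflow(int8_term, INT128_MAX) > 2**64)
print(rows_until_overflow(int4_square, INT128_MAX) == 2**65)
```

So a transition state kept in a 128-bit integer is safe for any realistic row count, which is what lets these aggregates avoid numeric on platforms that have such a type.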


Waiting for 9.5 – array_offset() and array_offsets()

On 18th of March, Alvaro Herrera committed patch:

array_offset() and array_offsets()
 
These functions return the offset position or positions of a value in an
array.
 
Author: Pavel Stěhule
Reviewed by: Jim Nasby
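
The commit message is short, so here is a rough Python model of what I understand the two functions to do. It is a sketch of the semantics only, assuming the default PostgreSQL array lower bound of 1 and ignoring NULL handling; the real functions are SQL-callable and live in the server.

```python
def array_offset(arr, value):
    """Return the 1-based position of the first match, or None if absent."""
    for position, item in enumerate(arr, start=1):
        if item == value:
            return position
    return None


def array_offsets(arr, value):
    """Return the 1-based positions of all matches (empty list if none)."""
    return [position for position, item in enumerate(arr, start=1) if item == value]


days = ["sun", "mon", "tue", "wed"]
print(array_offset(days, "mon"))
print(array_offsets(["A", "A", "B", "A"], "A"))
```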


Waiting for 9.5 – Replace checkpoint_segments with min_wal_size and max_wal_size.

On 23rd of February, Heikki Linnakangas committed patch:

Replace checkpoint_segments with min_wal_size and max_wal_size.
 
Instead of having a single knob (checkpoint_segments) that both triggers
checkpoints, and determines how many segments to recycle, they are now
separate concerns. There is still an internal variable called
CheckpointSegments, which triggers checkpoints. But it no longer determines
how many segments to recycle at a checkpoint. That is now auto-tuned by
keeping a moving average of the distance between checkpoints (in bytes),
and trying to keep that many segments in reserve. The advantage of this is
that you can set max_wal_size very high, but the system won't actually
consume that much space if there isn't any need for it. The min_wal_size
sets a floor for that; you can effectively disable the auto-tuning behavior
by setting min_wal_size equal to max_wal_size.
 
The max_wal_size setting is now the actual target size of WAL at which a
new checkpoint is triggered, instead of the distance between checkpoints.
Previously, you could calculate the actual WAL usage with the formula
"(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this
patch, you set the desired WAL usage with max_wal_size, and the system
calculates the appropriate CheckpointSegments with the reverse of that
formula. That's a lot more intuitive for administrators to set.
 
Reviewed by Amit Kapila and Venkata Balaji N.
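
The formula from the commit message is easy to play with. Below is a small sketch of it and of its plain algebraic reverse. It assumes the default 16 MB WAL segment size, the function names are mine, and the server's own calculation may round or clamp differently.

```python
# Assumption: default WAL segment size of 16 MB.
SEGMENT_MB = 16


def old_wal_usage_segments(checkpoint_segments, completion_target):
    """WAL usage, in segments, per the formula quoted in the commit message."""
    return (2 + completion_target) * checkpoint_segments + 1


def segments_for_max_wal_size(max_wal_size_mb, completion_target):
    """Algebraic reverse of the formula above (a sketch, not the server code)."""
    max_segments = max_wal_size_mb / SEGMENT_MB
    return max(1, int((max_segments - 1) / (2 + completion_target)))


# Old-style settings: checkpoint_segments = 3, checkpoint_completion_target = 0.5
usage = old_wal_usage_segments(3, 0.5)
print(usage, "segments =", usage * SEGMENT_MB, "MB")

# New-style setting: max_wal_size = 1024 MB, same completion target
print(segments_for_max_wal_size(1024, 0.5))
```

With these numbers the old settings work out to 8.5 segments (136 MB), and a 1024 MB max_wal_size maps back to 25 segments between checkpoints.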


I have PostgreSQL, loaded some data, and have app using it. Now what?

I have had to deal with this question, or some version of it, quite a few times. So I decided to write a summary of what one could (or should) do after the data is in the database and the application is running. Namely: set up some kind of replication and backups.

What to use, how, and why? This is what this post is all about.


Returning data in multiple columns

I was working today on some updates to a client's database. While doing it, I figured it would be simpler if I could see all “codenames” and ids of rows from a dictionary table, which is not that big. But it was bigger than my screen: I have only 90 lines of text on screen, and there were ~200 rows of data in the table. So I started thinking about how to show these (codename, id) pairs in more than one column in psql.


Fixed a bug in OmniPITR

Just thought I'd share a “fun” story. A friend reported a weird bug: OmniPITR reported that xlogs were being sent to the archive, but they actually weren't.

After some checking we found out that he was passing a custom rsync path (--rsync-path, the path to the rsync program), and the path was broken.

In this case OmniPITR was not reporting an error, and quite happily carried on under the assumption that everything worked OK.


Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.

On 19th of January, Robert Haas committed patch:

Use abbreviated keys for faster sorting of text datums.
 
This commit extends the SortSupport infrastructure to allow operator
classes the option to provide abbreviated representations of Datums;
in the case of text, we abbreviate by taking the first few characters
of the strxfrm() blob.  If the abbreviated comparison is insufficient
to resolve the comparison, we fall back on the normal comparator.
This can be much faster than the old way of doing sorting if the
first few bytes of the string are usually sufficient to resolve the
comparison.
 
There is the potential for a performance regression if all of the
strings to be sorted are identical for the first 8+ characters and
differ only in later positions; therefore, the SortSupport machinery
now provides an infrastructure to abort the use of abbreviation if
it appears that abbreviation is producing comparatively few distinct
keys.  HyperLogLog, a streaming cardinality estimator, is included in
this commit and used to make that determination for text.
 
Peter Geoghegan, reviewed by me.
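
The idea is simple enough to model in a few lines of Python. This is only a sketch of the technique: I use the UTF-8 bytes of each string as a stand-in for the strxfrm() blob, an 8-byte abbreviation, and a counter to show how often the full comparison is needed. None of the names come from the PostgreSQL source, and the abort logic based on HyperLogLog is left out.

```python
from functools import cmp_to_key

# Assumption: abbreviate to the first 8 bytes of the comparison blob.
ABBREV_LEN = 8


def sort_with_abbreviation(strings):
    """Sort strings, comparing short prefixes first.

    Returns (sorted_strings, fallbacks), where fallbacks counts comparisons
    that the abbreviated keys could not resolve.
    """
    fallbacks = 0

    # UTF-8 bytes stand in for the strxfrm() blob (an assumption of this sketch).
    items = []
    for s in strings:
        full = s.encode("utf-8")
        items.append((full[:ABBREV_LEN], full, s))

    def compare(a, b):
        nonlocal fallbacks
        # Cheap path: the abbreviated keys already differ.
        if a[0] != b[0]:
            return -1 if a[0] < b[0] else 1
        # Slow path: fall back on the full comparison.
        fallbacks += 1
        if a[1] == b[1]:
            return 0
        return -1 if a[1] < b[1] else 1

    ordered = sorted(items, key=cmp_to_key(compare))
    return [item[2] for item in ordered], fallbacks


print(sort_with_abbreviation(["pear", "apple", "fig"]))
print(sort_with_abbreviation(["postgresql-9.5", "postgresql-9.4"]))
```

In the first call every comparison is settled by the prefix alone. In the second, both strings start with the same 8 bytes, so the abbreviation buys nothing and the full comparison runs anyway; that is the regression case the commit message describes.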
