Waiting for 9.5 – Replace checkpoint_segments with min_wal_size and max_wal_size.

On 23rd of February, Heikki Linnakangas committed patch:

Replace checkpoint_segments with min_wal_size and max_wal_size.
 
Instead of having a single knob (checkpoint_segments) that both triggers
checkpoints, and determines how many checkpoints to recycle, they are now
separate concerns. There is still an internal variable called
CheckpointSegments, which triggers checkpoints. But it no longer determines
how many segments to recycle at a checkpoint. That is now auto-tuned by
keeping a moving average of the distance between checkpoints (in bytes),
and trying to keep that many segments in reserve. The advantage of this is
that you can set max_wal_size very high, but the system won't actually
consume that much space if there isn't any need for it. The min_wal_size
sets a floor for that; you can effectively disable the auto-tuning behavior
by setting min_wal_size equal to max_wal_size.
 
The max_wal_size setting is now the actual target size of WAL at which a
new checkpoint is triggered, instead of the distance between checkpoints.
Previously, you could calculate the actual WAL usage with the formula
"(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this
patch, you set the desired WAL usage with max_wal_size, and the system
calculates the appropriate CheckpointSegments with the reverse of that
formula. That's a lot more intuitive for administrators to set.
 
Reviewed by Amit Kapila and Venkata Balaji N.

Continue reading Waiting for 9.5 – Replace checkpoint_segments with min_wal_size and max_wal_size.

I have PostgreSQL, loaded some data, and have app using it. Now what?

I had to deal with this question, or some version of it, quite a few times. So, decided to write a summary on what one could (or should) do, after data is in database, and application is running. Namely – setup some kind of replication and backups.

What to use, how, and why? This is what this post is all about.

Continue reading I have PostgreSQL, loaded some data, and have app using it. Now what?

Returning data in multiple columns

I was working today on some updates to client database. While doing it, I figured it would be simpler if I saw all “codenames" and ids of rows from dictionary table – not so big. But it was bigger than my screen – I have only 90 lines of text on screen, and there were ~ 200 rows of data in the table. So I started thinking – how to show this (codename, id) into more than one column, in psql.

Continue reading Returning data in multiple columns

Fixed a bug in OmniPITR

Just thought I'll share a “fun" story. Friend reported weird bug – OmniPITR reported that xlogs are sent to archive, but they actually weren't.

After some checking we found out that he was giving custom rsync-path (–rsync-path – path to rsync program) – and the path was broken.

In this case – OmniPITR was not reporting error, and quite happily was working under assumption that it works OK.

Continue reading Fixed a bug in OmniPITR

Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.

On 19th of January, Robert Haas committed patch:

Use abbreviated keys for faster sorting of text datums.
 
This commit extends the SortSupport infrastructure to allow operator
classes the option to provide abbreviated representations of Datums;
in the case of text, we abbreviate by taking the first few characters
of the strxfrm() blob.  If the abbreviated comparison is insufficent
to resolve the comparison, we fall back on the normal comparator.
This can be much faster than the old way of doing sorting if the
first few bytes of the string are usually sufficient to resolve the
comparison.
 
There is the potential for a performance regression if all of the
strings to be sorted are identical for the first 8+ characters and
differ only in later positions; therefore, the SortSupport machinery
now provides an infrastructure to abort the use of abbreviation if
it appears that abbreviation is producing comparatively few distinct
keys.  HyperLogLog, a streaming cardinality estimator, is included in
this commit and used to make that determination for text.
 
Peter Geoghegan, reviewed by me.

Continue reading Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.

Waiting for 9.5 – vacuumdb: enable parallel mode

On 23rd of January, Alvaro Herrera committed patch:

vacuumdb: enable parallel mode
 
This mode allows vacuumdb to open several server connections to vacuum
or analyze several tables simultaneously.
 
Author: Dilip Kumar.  Some reworking by Álvaro Herrera
Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund

Continue reading Waiting for 9.5 – vacuumdb: enable parallel mode

Changes in explain.depesz.com

This time the changes do not modify how the website works. But they make it simpler to deploy clone of explain.depesz.com on your own server.

Thanks to Petr A. Korobeinikov I got full automation for setting test/development/internal clones of the site. What was done previously manually, now can be done very fast using Vagrant or Puppet.

Great stuff, thanks Petr.
Continue reading Changes in explain.depesz.com