One of my clients is upgrading some servers. It took some time to get our procedure to its current state, and we found some potential problems along the way, so I decided to write more about it.
Author: depesz
Waiting for 9.5 – Replace checkpoint_segments with min_wal_size and max_wal_size.
On 23rd of February, Heikki Linnakangas committed patch:
Replace checkpoint_segments with min_wal_size and max_wal_size. Instead of having a single knob (checkpoint_segments) that both triggers checkpoints, and determines how many checkpoints to recycle, they are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints. But it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes), and trying to keep that many segments in reserve. The advantage of this is that you can set max_wal_size very high, but the system won't actually consume that much space if there isn't any need for it. The min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size. The max_wal_size setting is now the actual target size of WAL at which a new checkpoint is triggered, instead of the distance between checkpoints. Previously, you could calculate the actual WAL usage with the formula "(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this patch, you set the desired WAL usage with max_wal_size, and the system calculates the appropriate CheckpointSegments with the reverse of that formula. That's a lot more intuitive for administrators to set. Reviewed by Amit Kapila and Venkata Balaji N.
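To make the arithmetic in the commit message concrete, here is a small sketch of the quoted formula and its literal inverse. The function names are mine, and the 16 MB segment size is the default build setting, used here as an assumption. The rounding in the inverse is also a guess, so the value the server actually calculates may differ slightly.

```python
# Sketch of the formula quoted in the commit message, and its inverse.
# Assumption: 16 MB WAL segments (the default build setting).
SEGMENT_MB = 16


def wal_segments_old(checkpoint_segments, completion_target):
    """WAL usage in segments, pre-9.5: (2 + target) * checkpoint_segments + 1."""
    return (2 + completion_target) * checkpoint_segments + 1


def checkpoint_segments_for(max_wal_segments, completion_target):
    """Literal inverse of the formula above, rounded down, never below 1.

    This rounding is an illustrative choice, not taken from PostgreSQL sources.
    """
    return max(1, int((max_wal_segments - 1) / (2 + completion_target)))


if __name__ == "__main__":
    segments = wal_segments_old(32, 0.5)
    print(f"old style: {segments} segments, {segments * SEGMENT_MB} MB")
    print("new style:", checkpoint_segments_for(1024 // SEGMENT_MB, 0.5))
```

So checkpoint_segments = 32 with checkpoint_completion_target = 0.5 meant up to 81 segments, about 1296 MB of WAL, while a 1 GB max_wal_size (64 segments) works out to roughly 25 segments between checkpoints under this simplified inverse.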
I have PostgreSQL, loaded some data, and have app using it. Now what?
I have had to deal with this question, or some version of it, quite a few times. So I decided to write a summary of what one could (or should) do after the data is in the database and the application is running. Namely – set up some kind of replication and backups.
What to use, how, and why? This is what this post is all about.
Returning data in multiple columns
I was working today on some updates to a client's database. While doing it, I figured it would be simpler if I could see all the “codenames” and ids of rows from a dictionary table, which is not that big. But it was bigger than my screen – I have only 90 lines of text on screen, and there were ~ 200 rows of data in the table. So I started thinking – how to show these (codename, id) pairs in more than one column, in psql.
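The layout problem itself can be sketched outside of the database. The snippet below is only an illustration in Python, not the psql approach this post is about. The function name and the sample rows are made up. It fills each column top to bottom, so ~200 rows fit in 3 columns of at most 67 lines.

```python
def columnize(rows, columns):
    """Lay out (codename, id) rows top-to-bottom, then left-to-right.

    Returns a list of text lines; `columns` must be at least 1.
    """
    height = -(-len(rows) // columns)  # ceiling division: lines needed
    cells = [f"{name} = {id_}" for name, id_ in rows]
    width = max(map(len, cells), default=0)
    lines = []
    for r in range(height):
        # Line r holds cells r, r + height, r + 2 * height, ...
        parts = [cells[i].ljust(width) for i in range(r, len(cells), height)]
        lines.append(" | ".join(parts).rstrip())
    return lines


if __name__ == "__main__":
    sample = [(f"code_{i}", i) for i in range(1, 201)]  # hypothetical data
    for line in columnize(sample, 3):
        print(line)
```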
Fixed a bug in OmniPITR
Just thought I'd share a “fun” story. A friend reported a weird bug – OmniPITR reported that xlogs were sent to the archive, but they actually weren't.
After some checking we found out that he was passing a custom rsync path (--rsync-path, the path to the rsync program), and that path was broken.
In this case OmniPITR did not report an error, and quite happily carried on under the assumption that everything worked OK.
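The general lesson (always check whether the external program actually ran, and how it exited) can be shown with a short sketch. This is not OmniPITR's code, which is written in Perl. The function name and the paths in the usage part are hypothetical.

```python
import subprocess


def send_segment(rsync_path, source, destination):
    """Run rsync and report failure instead of assuming success.

    Returns (ok, message). Hypothetical helper, for illustration only.
    """
    try:
        result = subprocess.run(
            [rsync_path, source, destination],
            capture_output=True,
            text=True,
        )
    except OSError as exc:
        # The binary is missing or not executable: the "broken path" case.
        return False, f"cannot run {rsync_path}: {exc}"
    if result.returncode != 0:
        # rsync started, but the transfer itself failed.
        return False, f"rsync exited with {result.returncode}: {result.stderr.strip()}"
    return True, ""


if __name__ == "__main__":
    ok, message = send_segment("/no/such/rsync", "segment", "backup:/archive/")
    print(ok, message)
```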
Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.
On 19th of January, Robert Haas committed patch:
Use abbreviated keys for faster sorting of text datums. This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficent to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison. There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text. Peter Geoghegan, reviewed by me.
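The idea described in the commit message can be sketched in a few lines. This is a toy illustration, not the PostgreSQL implementation. It uses the first 8 bytes of the UTF-8 encoding instead of a strxfrm() blob, has no HyperLogLog-based abort logic, and all names are mine.

```python
from functools import cmp_to_key

ABBREV_BYTES = 8  # assumed abbreviation length, for illustration


def abbreviated_sort(strings):
    """Sort by a short prefix first, falling back to the full key on ties.

    Returns (sorted_strings, fallbacks), where fallbacks counts how often
    the abbreviated comparison could not resolve the order.
    """
    fallbacks = 0
    items = []
    for s in strings:
        full = s.encode("utf-8")
        items.append((full[:ABBREV_BYTES], full, s))

    def compare(a, b):
        nonlocal fallbacks
        if a[0] != b[0]:
            # Cheap path: the abbreviated keys already differ.
            return -1 if a[0] < b[0] else 1
        # Abbreviated keys are equal: use the full comparison.
        fallbacks += 1
        return (a[1] > b[1]) - (a[1] < b[1])

    items.sort(key=cmp_to_key(compare))
    return [s for _, _, s in items], fallbacks


if __name__ == "__main__":
    words = ["postgresql_b", "apple", "postgresql_a", "banana"]
    print(abbreviated_sort(words))
```

When many strings share their first 8 bytes, the fallback count grows and the abbreviation stops paying off, which is the regression case the commit message mentions.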
Waiting for 9.5 – vacuumdb: enable parallel mode
On 23rd of January, Alvaro Herrera committed patch:
vacuumdb: enable parallel mode This mode allows vacuumdb to open several server connections to vacuum or analyze several tables simultaneously. Author: Dilip Kumar. Some reworking by Álvaro Herrera Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund
Reloading of Perl script while it’s running
Recently I saw this discussion on LinkedIn.
In it, someone asks whether modifying a script while it's executing will change the way it executes.
Changes in explain.depesz.com
This time the changes do not modify how the website works. But they make it simpler to deploy a clone of explain.depesz.com on your own server.
Thanks to Petr A. Korobeinikov I got full automation for setting up test/development/internal clones of the site. What previously had to be done manually can now be done very quickly using Vagrant or Puppet.
Great stuff, thanks Petr.
Waiting for 9.5 – Add point <-> polygon distance operator.
On 15th of December, Heikki Linnakangas committed patch:
Add point <-> polygon distance operator. Alexander Korotkov, reviewed by Emre Hasegeli.