Waiting for 9.4 – Add new wal_level, logical, sufficient for logical decoding.

On 11th of December, Robert Haas committed patch:

Add new wal_level, logical, sufficient for logical decoding.
 
When wal_level=logical, we'll log columns from the old tuple as
configured by the REPLICA IDENTITY facility added in commit
<a class="text" href="/gitweb/?p=postgresql.git;a=object;h=07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65">07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65</a>.  This makes it possible
a properly-configured logical replication solution to correctly
follow table updates even if they change the chosen key columns,
or, with REPLICA IDENTITY FULL, even if the table has no key at
all.  Note that updates which do not modify the replica identity
column won't log anything extra, making the choice of a good key
(i.e. one that will rarely be changed) important to performance
when wal_level=logical is configured.
 
Each insert, update, or delete to a catalog table will also log
the CMIN and/or CMAX values stamped by the current transaction.
This is necessary because logical decoding will require access to
historical snapshots of the catalog in order to decode some data
types, and the CMIN/CMAX values that we may need in order to judge
row visibility may have been overwritten by the time we need them.
 
Andres Freund, reviewed in various versions by myself, Heikki
Linnakangas, KONDO Mitsumasa, and many others.
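
In practice, enabling this means setting wal_level = logical in postgresql.conf (restart required) and then choosing a replica identity per table. A minimal sketch, with hypothetical table and index names:

    -- in postgresql.conf, then restart:
    --   wal_level = logical

    -- log the old values of the chosen key columns on UPDATE/DELETE:
    ALTER TABLE users REPLICA IDENTITY USING INDEX users_pkey;

    -- or log the entire old row, for tables without a usable key:
    ALTER TABLE users REPLICA IDENTITY FULL;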

Continue reading Waiting for 9.4 – Add new wal_level, logical, sufficient for logical decoding.

What does “Fix VACUUM’s tests to see whether it can update relfrozenxid” really mean?

In release notes to latest release you can find:

Fix VACUUM's tests to see whether it can update relfrozenxid (Andres Freund)
 
In some cases VACUUM (either manual or autovacuum) could incorrectly advance
a table's relfrozenxid value, allowing tuples to escape freezing, causing
those rows to become invisible once 2^31 transactions have elapsed. The
probability of data loss is fairly low since multiple incorrect advancements
would need to happen before actual loss occurs, but it's not zero. In 9.2.0
and later, the probability of loss is higher, and it's also possible to get
"could not access status of transaction" errors as a consequence of this
bug. Users upgrading from releases 9.0.4 or 8.4.8 or earlier are not
affected, but all later versions contain the bug.
 
The issue can be ameliorated by, after upgrading, vacuuming all tables in
all databases while having vacuum_freeze_table_age set to zero. This will
fix any latent corruption but will not be able to fix all pre-existing data
errors. However, an installation can be presumed safe after performing this
vacuuming if it has executed fewer than 2^31 update transactions in its
lifetime (check this with SELECT txid_current() < 2^31).
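
In other words, the cleanup the release notes suggest looks roughly like this (a sketch; the VACUUM has to be repeated in every database, e.g. with vacuumdb --all while vacuum_freeze_table_age = 0 is set in postgresql.conf):

    SET vacuum_freeze_table_age = 0;  -- force full-table scans during VACUUM
    VACUUM;                           -- recomputes relfrozenxid for every table in this database

    -- afterwards, check whether the installation can be presumed safe:
    SELECT txid_current() < 2^31;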

What does it really mean?

Continue reading What does “Fix VACUUM’s tests to see whether it can update relfrozenxid” really mean?

Waiting for 9.4 – Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.

On 22nd of November, Tom Lane committed patch:

Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
 
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry.  The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others.  This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
 
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
 
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
 
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does.  There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST().  After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
 
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
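
To make the padding behavior concrete, here is what the multi-argument UNNEST() looks like (a small sketch; per the commit message itself, the TABLE() spelling may still change before release):

    SELECT * FROM unnest(ARRAY[1,2,3], ARRAY['a','b']) AS t(num, letter);
    --  num | letter
    -- -----+--------
    --    1 | a
    --    2 | b
    --    3 | NULL     <- the shorter array is padded with NULLs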

Continue reading Waiting for 9.4 – Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.

Bloat removal without table swapping

Some time ago I wrote about my favorite method of bloat removal. Around one year earlier, I wrote about another idea for bloat removal. This older idea was great – it didn't involve triggers, overhead on all writes, or table swapping. It had just one small, tiny, minuscule little issue: it was unbearably slow.

My idea was explored by Nathan Thom, but his blogpost disappeared.

Recently, Sergey Konoplev wrote to me about a tool he built around the same idea – updating rows to move them to other pages. So I decided I had to check it out.
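
The underlying trick can be sketched in plain SQL (hypothetical table and column names; the real tool batches this far more carefully, and also has to deal with HOT updates, which can leave the new row version on the same page):

    -- rewrite the rows sitting on the highest-numbered pages; their new
    -- versions should land in free space earlier in the table:
    UPDATE bloated_table
        SET some_column = some_column
        WHERE ctid IN (SELECT ctid FROM bloated_table ORDER BY ctid DESC LIMIT 100);

    -- once the trailing pages are empty, a plain VACUUM can truncate them:
    VACUUM bloated_table;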

Continue reading Bloat removal without table swapping

What is the overhead of logging?

Update at the end of the post!

There obviously is one – after all, logging information has to be more expensive than not logging it. But how big is it? And more importantly, what is the difference between logging to stderr/file, csvlog, and syslog? And what about syslog to a remote machine?

Let's see.
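
For reference, the knobs being compared look like this in postgresql.conf (a sketch; pointing syslog at a remote machine is configured in the syslog daemon, not in PostgreSQL):

    log_destination = 'stderr'   # or 'csvlog', or 'syslog'
    logging_collector = on       # needed to capture stderr/csvlog output into files
    # for syslog:
    # log_destination = 'syslog'
    # syslog_facility = 'LOCAL0'
    # syslog_ident = 'postgres'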

Continue reading What is the overhead of logging?