On 28th of January, Fujii Masao committed patch:
Add pg_stat_archiver statistics view. This view shows the statistics about the WAL archiver process's activity. Gabriele Bartolini, reviewed by Michael Paquier, refactored a bit by me.
I'm a sucker for monitoring, so anything new in pg_stat_* area will get my interest.
pg_stat_archiver is supposed to show current state of archiving. How does that work?
Let's see first normal case – there were no errors, all works just fine:
$ SELECT * FROM pg_stat_archiver ; -[ RECORD 1 ]------+------------------------------ archived_count | 14 last_archived_wal | 00000001000000000000000E last_archived_time | 2014-01-29 12:27:45.316713+01 failed_count | 0 last_failed_wal | last_failed_time | [NULL] stats_reset | 2014-01-29 12:27:16.974815+01
Nice. We see last archived information, information about fails. All good. What I would add there is archive_queue size. That is – if there are WAL segments to be archived, but that weren't archived yet – give us the number. That should be relatively simple to do.
But anyway. What happens when there is problem? Let's emulate one:
=$ perl -pi -e "s#^archive_command.*#archive_command = '/bin/false'#" DATA/postgresql.conf =$ pg_ctl -D DATA/ reload server signaled
Now, let's generate some WAL:
$ CREATE TABLE z AS SELECT i, repeat('payload', 100) FROM generate_series(1,100000) i; SELECT 100000
And let's see what the view shows now:
$ SELECT * FROM pg_stat_archiver ; -[ RECORD 1 ]------+------------------------------ archived_count | 14 last_archived_wal | 00000001000000000000000E last_archived_time | 2014-01-29 12:27:45.316713+01 failed_count | 9 last_failed_wal | 00000001000000000000000F last_failed_time | 2014-01-29 12:33:24.622965+01 stats_reset | 2014-01-29 12:27:16.974815+01
Nice. Now, let's bring back successful “archiving":
=$ perl -pi -e "s#^archive_command.*#archive_command = '/bin/true'#" DATA/postgresql.conf =$ pg_ctl -D DATA/ reload server signaled
and after a short while:
$ SELECT * FROM pg_stat_archiver ; -[ RECORD 1 ]------+------------------------------ archived_count | 19 last_archived_wal | 000000010000000000000013 last_archived_time | 2014-01-29 12:35:26.976321+01 failed_count | 12 last_failed_wal | 00000001000000000000000F last_failed_time | 2014-01-29 12:34:26.690591+01 stats_reset | 2014-01-29 12:27:16.974815+01
(switch from failed_count = 9 to 12 happened before my “/bin/true" change in postgresql.conf – while I was writing this blogpost).
It's great addon, and I would just like one small additions: size of archive queue (so we can know if archiving is up to date).
Thanks Gabriele 🙂
Oh, this is most excellent! Thanks Gabriele!
Wonderful. Thanks Gabriele.. and Thanks to Depesz as well for illustrating insight.
Is there any chances of this back patching ?
@Raghav: if you’ll back patch it – yes. Otherwise – no. PostgreSQL doesn’t backpatch new features to older versions.
My pleasure! And thanks to all the reviewers (Vik, Fujii and Michael).
Just stumbled on this post. Nice write-up!
Here’s a vary “hacky” way to see archive queue size on 9.4 via SQL:
select (‘x’||right(pg_xlogfile_name(pg_current_xlog_location()),16))::bit(64)::bigint – (‘x’||right(last_archived_wal,16))::bit(64)::bigint – 1 as xlog_archive_pending from pg_stat_archiver;
But there’s always this method too from any *nix OS (works for most versions of PG too):
ls $PGDATA/pg_xlog/archive_status/*.ready 2>/dev/null|wc -l
–Rob