On 13th of December, Heikki Linnakangas committed patch:
Allow a streaming replication standby to follow a timeline switch. Before this patch, streaming replication would refuse to start replicating if the timeline in the primary doesn't exactly match the standby. The situation where it doesn't match is when you have a master, and two standbys, and you promote one of the standbys to become new master. Promoting bumps up the timeline ID, and after that bump, the other standby would refuse to continue. There's significantly more timeline related logic in streaming replication now. First of all, when a standby connects to primary, it will ask the primary for any timeline history files that are missing from the standby. The missing files are sent using a new replication command TIMELINE_HISTORY, and stored in standby's pg_xlog directory. Using the timeline history files, the standby can follow the latest timeline present in the primary (recovery_target_timeline='latest'), just as it can follow new timelines appearing in an archive directory. START_REPLICATION now takes a TIMELINE parameter, to specify exactly which timeline to stream WAL from. This allows the standby to request the primary to send over WAL that precedes the promotion. The replication protocol is changed slightly (in a backwards-compatible way although there's little hope of streaming replication working across major versions anyway), to allow replication to stop when the end of timeline reached, putting the walsender back into accepting a replication command. Many thanks to Amit Kapila for testing and reviewing various versions of this patch.
I actually had a problem understanding what this patch is about. After all – following timeline change is possible for quite some time now. I did it in 9.1 or 9.2. Phil did it of course too. So what is the idea about this newest patch?
Well. Apparently – the idea is that now the timeline switch can happen when working with only streaming replication – without WAL archive at all.
This is arguably not so good idea (that is: to run without WAL archive), but it's definitely possible, so let's test is.
Here goes a little digression – just when this patch was committed, I had actually quite a lot of problems to get it working. Luckily Heikki was very helpful, and pushed some fixes. Finally, I tested the state as of this commit.
To test it, I did setup three virtual machines:
- ubuntu1 – ip: 172.28.173.140 – master
- ubuntu2 – ip: 172.28.173.141 – slave
- ubuntu3 – ip: 172.28.173.142 – slave
Master has also floating IP used for DB communication – 172.28.173.253. This IP will be taken over by slave (ubuntu2) in my test.
On all machines I have PostgreSQL 9.3 setup, and replication is done from ubuntu1 to ubuntu2, and from ubuntu1 to ubuntu3, but ubuntu2/3, to connect to master, use not .140 IP, but rather the floating IP .253.
On ubuntu2 (slave):
=$ pid=$(head -n1 data/postmaster.pid); ps uwf -p $pid --ppid $pid USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND pgdba 1300 0.0 0.6 170112 12812 pts/0 S 12:28 0:00 /home/pgdba/work/bin/postgres pgdba 1304 0.0 0.0 23604 788 ? Ss 12:28 0:00 \_ postgres: logger process pgdba 1305 0.0 0.0 170220 1536 ? Ss 12:28 0:00 \_ postgres: startup process recovering 000000010000000000000010 pgdba 1306 0.4 0.1 184160 2656 ? Ss 12:28 0:00 \_ postgres: wal receiver process streaming 0/11000000 pgdba 1307 0.0 0.0 170112 952 ? Ss 12:28 0:00 \_ postgres: checkpointer process pgdba 1308 0.0 0.0 170112 1484 ? Ss 12:28 0:00 \_ postgres: writer process pgdba 1309 0.0 0.0 25700 864 ? Ss 12:28 0:00 \_ postgres: stats collector process =$ cat data/recovery.conf standby_mode = 'on' primary_conninfo = 'port=5920 user=replication host=172.28.173.253' trigger_file = '/tmp/finish.replication' recovery_target_timeline='latest'
On ubuntu3 (slave):
=$ pid=$(head -n1 data/postmaster.pid); ps uwf -p $pid --ppid $pid USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND pgdba 1322 0.0 0.6 170108 12824 pts/0 S 12:28 0:00 /home/pgdba/work/bin/postgres pgdba 1326 0.0 0.0 23600 800 ? Ss 12:28 0:00 \_ postgres: logger process pgdba 1327 0.0 0.0 170216 1544 ? Ss 12:28 0:00 \_ postgres: startup process recovering 000000010000000000000010 pgdba 1328 0.3 0.1 184156 2716 ? Ss 12:28 0:00 \_ postgres: wal receiver process streaming 0/11000000 pgdba 1329 0.0 0.0 170108 960 ? Ss 12:28 0:00 \_ postgres: checkpointer process pgdba 1330 0.0 0.0 170108 1492 ? Ss 12:28 0:00 \_ postgres: writer process pgdba 1331 0.0 0.0 25696 872 ? Ss 12:28 0:00 \_ postgres: stats collector process =$ cat data/recovery.conf standby_mode = 'on' primary_conninfo = 'port=5920 user=replication host=172.28.173.253' trigger_file = '/tmp/finish.replication' recovery_target_timeline='latest'
As you can see – both are configured the same way, are connecting to special, floating IP.
On ubuntu1 (master) I can see:
=$ pid=$(head -n1 DATA/postmaster.pid); ps uwf -p $pid --ppid $pid USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND pgdba 1115 0.0 0.6 170112 12884 pts/0 S 12:25 0:00 /home/pgdba/WORK/bin/postgres pgdba 1119 0.0 0.0 23604 804 ? Ss 12:25 0:00 \_ postgres: logger process pgdba 1121 0.0 0.1 170220 2380 ? Ss 12:25 0:00 \_ postgres: checkpointer process pgdba 1122 0.0 0.0 170112 1760 ? Ss 12:25 0:00 \_ postgres: writer process pgdba 1123 0.0 0.0 170112 968 ? Ss 12:25 0:00 \_ postgres: wal writer process pgdba 1124 0.0 0.1 170976 2268 ? Ss 12:25 0:00 \_ postgres: autovacuum launcher process pgdba 1125 0.0 0.0 25700 852 ? Ss 12:25 0:00 \_ postgres: archiver process LAST was 000000010000000000000010 pgdba 1126 0.0 0.0 25832 1044 ? Ss 12:25 0:00 \_ postgres: stats collector process pgdba 1662 0.0 0.1 171328 2824 ? Ss 12:28 0:00 \_ postgres: wal sender process replication 172.28.173.141(45099) streaming 0/11000000 pgdba 1663 0.0 0.1 171196 2824 ? Ss 12:28 0:00 \_ postgres: wal sender process replication 172.28.173.142(45750) streaming 0/11000000 15:45:59 pgdba@ubuntu1 ~ =$ ifconfig | grep -A1 ^eth eth0 Link encap:Ethernet HWaddr 08:00:27:07:98:05 inet addr:172.28.173.140 Bcast:172.28.173.255 Mask:255.255.255.0 -- eth0:0 Link encap:Ethernet HWaddr 08:00:27:07:98:05 inet addr:172.28.173.253 Bcast:172.28.173.255 Mask:255.255.255.0
Now. I want to failover to ubuntu2. Ubuntu1 will be basically discarded. So I do:
- on Ubuntu1:
- pg_ctl -D data/ -m fast stop
- sudo ifdown eth0:0
- on Ubuntu2:
- touch /tmp/finish.replication
- sudo ifup eth0:0
And – after a while – ubuntu3 does reconnect to new master (ubuntu2) and continues with replication. All works.
I also did test it in case where Ubuntu3 was lagging behind replication – after switch it was able to catchup and replicate just fine. Of course – to make this work you have to have sensibly high wal_keep_segments GUC.
Great work, thanks Heikki, Amit and Thom.
Hi there! How are you doing?
Sorry by sending this email to you, but I really need some help/tips to solve my problem.
I’ve got some errors by setting up a new slave.
I’ve described here: http://dba.stackexchange.com/questions/125709/recovery-from-live-to-a-new-slave-server-postgresql-error
I’d appreciate any help. Thank you!