There is this idea that from 9.5 we will be able to upgrade pg clusters to newer, major, version, without downtime (well, almost), thanks to magic of logical replication.
Sounds very promising, so I figured – I'll test it. To some extent at least.
First, I need some 9.5 database. This is relatively simple, luckily. Got myself an ubuntu box with these packages:
# dpkg -l | grep postgresql ii pgdg-keyring 2014.1 all keyring for apt.postgresql.org ii postgresql-9.5 9.5.4-1.pgdg14.04+1 amd64 object-relational SQL database, version 9.5 server ii postgresql-9.5-dbg 9.5.4-1.pgdg14.04+1 amd64 debug symbols for postgresql-9.5 ii postgresql-9.5-pg-collkey 0.5.1-1insops1 amd64 ICU collation function wrapper for PostgreSQL 9.5 ii postgresql-9.5-postgis-2.2 2.2.2+dfsg-4.pgdg14.04+1 amd64 Geographic objects support for PostgreSQL 9.5 ii postgresql-9.5-postgis-2.2-scripts 2.2.2+dfsg-4.pgdg14.04+1 all Geographic objects support for PostgreSQL 9.5 -- scripts ii postgresql-9.5-postgis-scripts 2.2.2+dfsg-4.pgdg14.04+1 all transitional dummy package ii postgresql-client-9.5 9.5.4-1.pgdg14.04+1 amd64 front-end programs for PostgreSQL 9.5 ii postgresql-client-common 175.pgdg14.04+1 all manager for multiple PostgreSQL client versions ii postgresql-common 175.pgdg14.04+1 all PostgreSQL database-cluster manager ii postgresql-contrib-9.5 9.5.4-1.pgdg14.04+1 amd64 additional facilities for PostgreSQL ii postgresql-plperl-9.5 9.5.4-1.pgdg14.04+1 amd64 PL/Perl procedural language for PostgreSQL 9.5
Since this is a test installation, I don't have anything there, so let's quickly add some test database and table and perhaps even some data 🙂
postgres@h3po4:~$ createuser test postgres@h3po4:~$ createdb -O test test
To make sure that there is something happening in the database, I wrote couple of simple scripts:
-- create.sql CREATE TABLE base_data ( id serial PRIMARY KEY, VALUE INT4 NOT NULL DEFAULT 0 ); CREATE TABLE data_changes ( id serial PRIMARY KEY, data_id INT8 NOT NULL REFERENCES base_data (id), changed_when timestamptz, change_from INT4, change_to INT4 ); INSERT INTO base_data (id, VALUE) SELECT i, 0 FROM generate_series(1,1000) i; SELECT SETVAL('base_data_id_seq', 1001);
-- load-single.sql WITH random_data AS ( SELECT 1 + FLOOR(random() * 1000) AS random_id, FLOOR(random() * 201 - 100) AS random_change ), data_update AS ( UPDATE base_data AS bd SET VALUE = bd.value + r.random_change FROM random_data r WHERE bd.id = r.random_id returning bd.id, r.random_change, bd.value ), insert_log AS ( INSERT INTO data_changes (data_id, changed_when, change_from, change_to) SELECT du.id, now(), du.value - du.random_change, du.value FROM data_update du returning id ) DELETE FROM data_changes WHERE id < ( SELECT MIN(id - 10000) FROM insert_log );
What the first file does is pretty self-explanatory.
The other file (load-single.sql) is a bit more complex, so let me explain step by step what it does:
- random_data cte – generates two random integers – id of row in base_data to change, and how much to change the value in base_data – randomized in range -100..100
- data_update cte – updates the row in base_data using random values from random_data
- insert_log cte – inserts row into data_changes log that specifies which row was update, from which value, to which value, and when.
- delete at the end – removes old rows from data_changes – where old is defined as older than latest 10,000 rows
With this in place, I create one more file, but it's content will be boring – single line repeated 1000 times:
postgres@h3po4:~$ seq 1 1000 | sed 's/.*/\\i load-single.sql/' > load-multiple.sql
With this in place, I start 2 concurrent copies of:
postgres@h3po4:~$ while true; do echo "$( date ) : $( /usr/bin/time -f %E psql -U test -qAtX -f load-multiple.sql 2>&1 )"; done
This will generate some load on the database, and show me simple stats, like:
Tue Nov 8 15:18:01 CET 2016 : 0:03.63 Tue Nov 8 15:18:05 CET 2016 : 0:03.70 Tue Nov 8 15:18:09 CET 2016 : 0:03.57 Tue Nov 8 15:18:12 CET 2016 : 0:03.64
Now, let's setup pglogical. The first step, is obviously installing it.
Now, it's time to install the magic tool: pglogical. It can be downloaded by apt, so let's try. On both servers:
=$ echo "deb [arch=amd64] http://packages.2ndquadrant.com/pglogical/apt/ $( lsb_release -cs )-2ndquadrant main" | sudo tee /etc/apt/sources.list.d/pglogical.list deb [arch=amd64] http://packages.2ndquadrant.com/pglogical/apt/ trusty-2ndquadrant main =$ wget --quiet -O - http://packages.2ndquadrant.com/pglogical/apt/AA7A6805.asc | sudo apt-key add - OK =$ sudo apt-get update
Now, depending on which box I'm running it on, I do either:
root@h3po4:~# apt-get install postgresql-9.5-pglogical
or
root@krowka:~# apt-get install postgresql-9.6-pglogical
(to remind: h3po4 is 9.5 host, and krowka has pg 9.6.
Now, I need some setup on 9.5 instance:
postgres@h3po4:~$ psql -d test -c "select name, setting from pg_settings where name in ('wal_level', 'max_worker_processes', 'max_replication_slots', 'max_wal_senders', 'shared_preload_libraries');" name | setting --------------------------+----------- max_replication_slots | 10 max_wal_senders | 10 max_worker_processes | 8 shared_preload_libraries | pglogical wal_level | logical (5 ROWS)
on 9.6 I wouldn't need it all – max_replication_slots and max_wal_senders are not required, but since I can set them up already, there is no reason to skip it already:
postgres@krowka:~$ psql -c "select name, setting from pg_settings where name in ('wal_level', 'max_worker_processes', 'max_replication_slots', 'max_wal_senders', 'shared_preload_libraries');" name | setting --------------------------+----------- max_replication_slots | 10 max_wal_senders | 10 max_worker_processes | 8 shared_preload_libraries | pglogical wal_level | logical (5 ROWS)
Now, I need to make sure all global objects (users and tablespaces generally) are copied over, so I could go with pg_dumpall -g on 9.5, transfer it to 9.6, and load there, but since I have (for now) just one user, and one database, I can simply:
postgres@krowka:~$ createuser test postgres@krowka:~$ createdb -O test test
With this in place, I have both user and database created in both places, but tables, and data are only in 9.5.
Logical replication will require special user, and this user also has to be superuser. So let's make it on both servers:
postgres@h3po4:~$ createuser -s --replication logical_replication postgres@krowka:~$ createuser -s --replication logical_replication
Since this is just test, and I don't really care about security, I'll just make it possible to log using logical_replication with trust:
local replication logical_replication trust host replication logical_replication 127.0.0.1/32 trust host replication logical_replication 0.0.0.0/0 trust local test logical_replication trust host test logical_replication 127.0.0.1/32 trust host test logical_replication 0.0.0.0/0 trust
Both of my hosts are in 172.28.173.x network, but I'm lazy 🙂
To make next examples more clear, let me show the ips, as I will be using host ip numbers, and not hotnames:
- h3po4 (9.5) – 172.28.173.18
- krowka (9.6) – 172.28.173.3
And now, finally we get to play with pglogical itself:
postgres@h3po4:~$ psql -d test -c 'create extension pglogical' CREATE EXTENSION postgres@krowka:~$ psql -d test -c 'create extension pglogical' CREATE EXTENSION postgres@h3po4:~$ psql -d test -c "SELECT pglogical.create_node( node_name := 'provider', dsn := 'host=172.28.173.18 port=5432 dbname=test user=logical_replication' );" create_node ------------- 3171898924 (1 ROW)
Now, I need to add all tables to the replication. Since my test schema has only two tables, and the point is to replicate them, I can simply:
postgres@h3po4:~$ psql -d test -c "SELECT pglogical.replication_set_add_all_tables('default', '{public}'::text[]);" replication_set_add_all_tables -------------------------------- t (1 ROW)
and now, on slave:
postgres@krowka:~$ psql -d test -c "SELECT pglogical.create_node( node_name := 'subscriber', dsn := 'host=172.28.173.3 port=5432 dbname=test user=logical_replication' );" create_node ------------- 2941155235 (1 ROW) postgres@krowka:~$ psql -d test -c "SELECT pglogical.create_subscription( subscription_name := 'subscription', provider_dsn := 'host=172.28.173.18 port=5432 dbname=test user=logical_replication');" create_subscription --------------------- 2875150205 (1 ROW)
aaand .. nothing.
Well, not really nothing. I see pglogical processes:
postgres@h3po4:~$ ps uxf | grep [l]ogical postgres 6079 0.0 0.0 300632 5640 ? Ss 15:57 0:00 \_ postgres: bgworker: pglogical supervisor postgres 6134 0.0 0.0 301168 9300 ? Ss 15:57 0:00 \_ postgres: bgworker: pglogical manager 16385 postgres 6386 4.6 0.1 304840 15784 ? Ss 15:57 0:19 \_ postgres: wal sender process logical_replication 172.28.173.3(51816) idle
and on 9.6 too:
postgres@krowka:~$ ps uxf | grep [l]ogical postgres 20795 0.0 0.0 311200 5640 ? Ss 15:57 0:00 \_ postgres: 9.6/main: bgworker: pglogical supervisor postgres 20799 0.0 0.0 311720 11124 ? Ss 15:57 0:00 \_ postgres: 9.6/main: bgworker: pglogical manager 16451 postgres 21014 0.0 0.0 318188 16332 ? Ss 15:57 0:00 \_ postgres: 9.6/main: bgworker: pglogical apply 16451:2875150205
but the tables are not there on 9.6. And since there is no tables, there is no data.
Then I remembered a thing that is in docs
The event trigger facility can be used for describing rules which define replication sets for newly created tables.
This suggests that if I'll just do, what I did, it will not work, because DDL (creation of tables) is by default not replicated.
So, I dropped replication, dropped the test database on 9.6, recreated it, but then did in there, from test account:
CREATE TABLE base_data ( id serial PRIMARY KEY, VALUE INT4 NOT NULL DEFAULT 0 ); CREATE TABLE data_changes ( id serial PRIMARY KEY, data_id INT8 NOT NULL REFERENCES base_data (id), changed_when timestamptz, change_from INT4, change_to INT4 );
That is create table commands from create.sql.
Then, I repeated:
postgres@krowka:~$ psql -d test -c 'create extension pglogical' postgres@krowka:~$ psql -d test -c "SELECT pglogical.create_node( node_name := 'subscriber', dsn := 'host=172.28.173.3 port=5432 dbname=test user=logical_replication' );" postgres@krowka:~$ psql -d test -c "SELECT pglogical.create_subscription( subscription_name := 'subscription', provider_dsn := 'host=172.28.173.18 port=5432 dbname=test user=logical_replication');"
And almost immediately I saw the data:
postgres@krowka:~$ psql -U test -c 'select count(*) from base_data;' COUNT ------- 1000 (1 ROW)
So, let's see if the data is replicated correctly. I did stop load generator scripts, and waited for replication lag to drop to 0:
postgres@h3po4:~$ psql -d test -c "select pg_current_xlog_location() - replay_location from pg_stat_replication where usename = 'logical_replication' and client_addr = '172.28.173.3'" ?COLUMN? ---------- 0 (1 ROW)
Now, let's see some basic stats, and perhaps some simple data comparison:
postgres@h3po4:~$ printf "select count(*), sum(value) from base_data; select count(*), sum(data_id), sum(change_from), sum(change_to) from data_changes;" | psql -d test count | sum -------+------- 1000 | -6704 (1 row) count | sum | sum | sum -------+---------+--------+-------- 10001 | 4999425 | -24505 | -31308 (1 row)
and on 9.6:
postgres@krowka:~$ printf "select count(*), sum(value) from base_data; select count(*), sum(data_id), sum(change_from), sum(change_to) from data_changes;" | psql -d test COUNT | SUM -------+------- 1000 | -6704 (1 ROW) COUNT | SUM | SUM | SUM -------+----------+--------+-------- 20002 | 10022125 | -10975 | -17040 (1 ROW)
Whoa. That doesn't look good. Data in base_data looks OK, which I can also check with checksums:
postgres@h3po4:~$ psql -d test -qAtX -c "select * from base_data order by id" | md5sum - 958651d1b69a1aff7d7cd6b71622765d - postgres@krowka:~$ psql -d test -qAtX -c "select * from base_data order by id" | md5sum - 958651d1b69a1aff7d7cd6b71622765d -
But why is data_changes wrong?
Quick sanity check:
postgres@h3po4:~$ psql -d test -c "select min(id), max(id) from data_changes" MIN | MAX ---------+--------- 1397920 | 1407920 (1 ROW) postgres@krowka:~$ psql -d test -c "select min(id), max(id) from data_changes" MIN | MAX ---------+--------- 1105628 | 1407920 (1 ROW) postgres@krowka:~$ printf "select count(*), sum(data_id), sum(change_from), sum(change_to) from data_changes where id >= 1397920;" | psql -d test COUNT | SUM | SUM | SUM -------+---------+--------+-------- 10001 | 4999425 | -24505 | -31308 (1 ROW)
OK. So, it looks that last 10001 records are OK, but apparently older rows were not removed.
So, I checked pglogical data:
postgres@h3po4:~$ psql -d test -c "select * from pglogical.replication_set" set_id | set_nodeid | set_name | replicate_insert | replicate_update | replicate_delete | replicate_truncate ------------+------------+---------------------+------------------+------------------+------------------+-------------------- 495597167 | 3171898924 | DEFAULT | t | t | t | t 1269193154 | 3171898924 | default_insert_only | t | f | f | t 3925901991 | 3171898924 | ddl_sql | t | f | f | f (3 ROWS)
and on 9.6:
postgres@krowka:~$ psql -d test -x -c "select * from pglogical.subscription" -[ RECORD 1 ]--------+-------------------------------------- sub_id | 2875150205 sub_name | subscription sub_origin | 3171898924 sub_target | 2941155235 sub_origin_if | 202885847 sub_target_if | 2467722369 sub_enabled | t sub_slot_name | pgl_test_provider_subscription sub_replication_sets | {DEFAULT,default_insert_only,ddl_sql} sub_forward_origins | {ALL}
It kinda worried me that default_insert is included, since it doesn't replicate deletes (it shouldn't also replicate updates, so I'm not sure what's going on there).
So, let's clean, and redo:
postgres@krowka:~$ psql -d test -c "select pglogical.drop_subscription('subscription');" drop_subscription ------------------- 1 (1 ROW) postgres@krowka:~$ psql -d test -c 'truncate base_data cascade' NOTICE: TRUNCATE cascades TO TABLE "data_changes" TRUNCATE TABLE
and now, let's re-add the subscription, but this time make sure that only “default" is replicated:
postgres@krowka:~$ psql -d test -c "SELECT pglogical.create_subscription( subscription_name := 'subscription', provider_dsn := 'host=172.28.173.18 port=5432 dbname=test user=logical_replication', replication_sets := '{default}'::text[] );" create_subscription --------------------- 2875150205 (1 row)
Start load-generators, wait couple of minutes, stop them, wait for replication lag to drop to zero, and recheck data:
postgres@h3po4:~$ printf "select count(*), sum(value) from base_data; select count(*), sum(data_id), sum(change_from), sum(change_to) from data_changes;" | psql -d test COUNT | SUM -------+------ 1000 | -313 (1 ROW) COUNT | SUM | SUM | SUM -------+---------+--------+-------- 10001 | 4988452 | 170618 | 171594 (1 ROW) postgres@krowka:~$ printf "select count(*), sum(value) from base_data; select count(*), sum(data_id), sum(change_from), sum(change_to) from data_changes;" | psql -d test COUNT | SUM -------+------ 1000 | -313 (1 ROW) COUNT | SUM | SUM | SUM -------+---------+--------+-------- 10001 | 4988452 | 170618 | 171594 (1 ROW)
Better. Now. Next step – what about sequences? Sequence on base_data doesn't change, but on data_changes is used quite a lot, so:
postgres@h3po4:~$ psql -d test -c "select * from data_changes_id_seq" sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- data_changes_id_seq | 1471710 | 1 | 1 | 9223372036854775807 | 1 | 1 | 19 | f | t (1 row) postgres@krowka:~$ psql -d test -c "select * from data_changes_id_seq" sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- data_changes_id_seq | 1 | 1 | 1 | 9223372036854775807 | 1 | 1 | 0 | f | f (1 row)
OK. So it looks that sequences are not replicated, at least not by default.
Luckily docs mention interesting function: pglogical.replication_set_add_all_sequences(), so let's try to use it:
postgres@h3po4:~$ psql -d test -c "select pglogical.replication_set_add_all_sequences( set_name := 'default', schema_names := '{public}'::text[], synchronize_data := true )" replication_set_add_all_sequences ----------------------------------- t (1 ROW)
The synchronize_data option, from my understanding, makes sure that sequences will be synchronized immediately – if it was set to false, it would happen at some random point in time, as explained in docs:
The state of sequences added to replication sets is replicated periodically and not in real-time.
After I added sequences to set:
postgres@krowka:~$ psql -d test -c "select * from data_changes_id_seq" sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- data_changes_id_seq | 1471710 | 1 | 1 | 9223372036854775807 | 1 | 1 | 0 | f | t (1 ROW)
Given the periodical approach to synchronize sequences, if you want to switch to current subscriber, it might be good idea to run, after there are no app connections to old primary:
postgres@h3po4:~$ psql -d test -c "select pglogical.synchronize_sequence( seqoid ) from pglogical.sequence_state" synchronize_sequence ---------------------- t t (2 ROWS)
So that pglogical will get current values of all subscribed sequences, and push them to all subscribers.
Based on my tests, it looks that pglogical replicates sequence state every 60-70 seconds. Not bad, but obviously, when switching over to replica you have to keep that in mind.
So, let's see how does replicating ddl work. First, let's try something simple:
postgres@h3po4:~$ psql -d test -c "create table test as select i from generate_series(1,10) i" SELECT 10
After a while – new table is not replicated. But that was to be expected. Let's drop it, and try to use event triggers to replicate new table creation:
postgres@h3po4:~$ psql -d test -c "drop table test"
Then, I use this SQL script:
-- replicate-new-tables.sql CREATE OR REPLACE FUNCTION pglogical_assign_repset() RETURNS event_trigger AS $$ DECLARE obj record; BEGIN FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP IF obj.object_type = 'table' THEN IF obj.schema_name = 'public' AND NOT obj.in_extension THEN PERFORM pglogical.replication_set_add_table('default', obj.objid); END IF; END IF; END LOOP; END; $$ LANGUAGE plpgsql; CREATE EVENT TRIGGER pglogical_assign_repset_trg ON ddl_command_end WHEN TAG IN ('CREATE TABLE', 'CREATE TABLE AS') EXECUTE PROCEDURE pglogical_assign_repset();
and now, let's load it:
postgres@h3po4:~$ psql -d test -f replicate-new-tables.sql CREATE FUNCTION CREATE EVENT TRIGGER
It's pretty cool that you can write your own logic in the trigger function, so that you could, for example, have separate replication_sets for various schema, and add new table to proper set.
Now, that this is done, let's see if adding the table will replicate it:
postgres@h3po4:~$ psql -d test -c "create table test as select i from generate_series(1,10) i" ERROR: TABLE test cannot be added TO replication SET DEFAULT DETAIL: TABLE does NOT have PRIMARY KEY AND given replication SET IS configured TO replicate UPDATEs AND/OR DELETEs HINT: ADD a PRIMARY KEY TO the TABLE CONTEXT: SQL statement "SELECT pglogical.replication_set_add_table('default', obj.objid)" PL/pgSQL FUNCTION pglogical_assign_repset() line 8 at PERFORM
Oh, interesting. I can't add tables without primary key. Makes sense, I guess. So, let's try different way:
postgres@h3po4:~$ psql -d test -c "create table test ( id serial primary key, payload text )" CREATE TABLE
This worked, but the table was not created on 9.6 subscriber. Let's see if if it will replicate data if I'll create it manually, and then insert some rows:
postgres@krowka:~$ psql -d test -c "create table test ( id serial primary key, payload text )" CRETE TABLE postgres@h3po4:~$ psql -d test -c "insert into test (payload) select i from generate_series(1,5) i" INSERT 0 5 postgres@h3po4:~$ psql -d test -c "select * from test" id | payload ----+--------- 1 | 1 2 | 2 3 | 3 4 | 4 5 | 5 (5 ROWS) postgres@krowka:~$ psql -d test -c "select * from test" id | payload ----+--------- 1 | 1 2 | 2 3 | 3 4 | 4 5 | 5 (5 ROWS)
It worked! But manually adding all the tables to subscriber side can be problematic.
Docs say:
Automatic DDL replication is not supported. Managing DDL so that the provider and subscriber database(s) remain compatible is the responsibility of the user. pglogical provides the pglogical.replicate_ddl_command function to allow DDL to be run on the provider and subscriber at a consistent point.
So, let's try it. I'll make new table, add some data, and will try to replicate it to subscribers:
postgres@h3po4:~$ psql -d test -c "select pglogical.replicate_ddl_command('create table public.test2 ( id serial primary key, payload text )', '{default}'::text[])" replicate_ddl_command ----------------------- t (1 ROW) postgres@h3po4:~$ psql -d test -c "insert into test2 (payload) select repeat('test', i) from generate_series(1,3) i" INSERT 0 3
after a while, on 9.6 instance:
postgres@krowka:~$ psql -d test -c '\d test2' TABLE "public.test2" COLUMN | TYPE | Modifiers ---------+---------+---------------------------------------------------- id | INTEGER | NOT NULL DEFAULT NEXTVAL('test2_id_seq'::regclass) payload | text | Indexes: "test2_pkey" PRIMARY KEY, btree (id) postgres@krowka:~$ psql -d test -c 'select * from test2' id | payload ----+-------------- 1 | test 2 | testtest 3 | testtesttest (3 ROWS)
While checking how the replication is working, I noticed something:
postgres@h3po4:~$ psql -d test -c "select seqoid::regclass from pglogical.sequence_state" seqoid --------------------- base_data_id_seq data_changes_id_seq (2 ROWS)
Only the first two sequences are added to replication_set, but we now have more of them:
postgres@h3po4:~$ psql -d test -c '\ds' List OF relations Schema | Name | TYPE | Owner --------+---------------------+----------+---------- public | base_data_id_seq | SEQUENCE | test public | data_changes_id_seq | SEQUENCE | test public | test2_id_seq | SEQUENCE | postgres public | test_id_seq | SEQUENCE | postgres (4 ROWS)
This is simply because the trigger that I copy/paste from documentation handled only create table events.
Let's fix it. First, let's add the sequences that are already there:
postgres@h3po4:~$ psql -d test -c "select pglogical.replication_set_add_all_sequences( set_name := 'default', schema_names := '{public}'::text[], synchronize_data := true )" replication_set_add_all_sequences ----------------------------------- t (1 ROW) postgres@h3po4:~$ psql -d test -c "select seqoid::regclass from pglogical.sequence_state" seqoid --------------------- base_data_id_seq data_changes_id_seq test2_id_seq test_id_seq (4 ROWS)
OK, Now, let's see about this trigger:
-- replicate-new-tables-and-sequences.sql BEGIN; DROP event TRIGGER IF EXISTS pglogical_assign_repset_trg; CREATE OR REPLACE FUNCTION pglogical_assign_repset() RETURNS event_trigger AS $$ DECLARE obj record; BEGIN FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() WHERE command_tag IN ('CREATE TABLE', 'CREATE TABLE AS', 'CREATE SEQUENCE') LOOP IF obj.schema_name = 'public' AND NOT obj.in_extension THEN IF obj.object_type = 'table' THEN PERFORM pglogical.replication_set_add_table('default', obj.objid); ELSIF obj.object_type = 'sequence' THEN PERFORM pglogical.replication_set_add_sequence('default', obj.objid); END IF; END IF; END LOOP; END; $$ LANGUAGE plpgsql; CREATE EVENT TRIGGER pglogical_assign_repset_trg ON ddl_command_end WHEN TAG IN ('CREATE TABLE', 'CREATE TABLE AS', 'CREATE SEQUENCE') EXECUTE PROCEDURE pglogical_assign_repset(); commit;
And of course, it needs to be loaded, on master (9.5):
postgres@h3po4:~$ psql -d test -f replicate-new-tables-and-sequences.sql BEGIN DROP EVENT TRIGGER CREATE FUNCTION CREATE EVENT TRIGGER COMMIT
With this in place, when I'll add new table:
postgres@h3po4:~$ psql -d test -c "select pglogical.replicate_ddl_command('create table public.test3 ( id serial primary key, payload text )', '{default}'::text[]);" CREATE TABLE
Both the table, and its sequence are immediately added to replication set:
postgres@h3po4:~$ psql -d test -c "select * from pglogical.replication_set_relation where set_reloid::text ~ 'test3'" set_id | set_reloid -----------+-------------- 495597167 | test3_id_seq 495597167 | test3 (2 ROWS)
What's left? Well, not much. In case we'd like to have this replication going for longer time, we would probably want to add handling of dropping of objects and truncation of tables.
But as I see it now – for upgrade, you generally need to:
- setup provider
- subscriber
- replicate data
- wait for replication lag to drop to 0
- stop application
- update sequences state
- remove pglogical setup from both nodes
- stop old master
- make changes so that app will connect to new master
- start app
Downtime is only from step #5 to step #10, but it shouldn't really take significant time.
The “remove pglogical setup" is pretty simple on its own – you need to stop replicating, then remove nodes, and remove extension itself:
postgres@krowka:~$ psql -d test -c "select pglogical.drop_subscription('subscription', true)" postgres@krowka:~$ psql -d test -c "select pglogical.drop_node('subscriber', true)" postgres@krowka:~$ psql -d test -c "drop extension pglogical cascade"
The last command will provide information about objects that were dropped by cascade, but these should be only some triggers that are internal to pglogical (though, on normal tables).
Of course, the whole process should be tested, extensively, with your schema and data. But it does look very promising.
Thank you A LOT for this post. Recently I’ve tried to configure pglogical but didn’t succeeded since unfortunately official documentation is very confusing and unclear whether some command should be executed on provider or subscriber, etc. I opened a corresponding issue https://github.com/2ndQuadrant/pglogical/issues/56 Sadly it doesn’t look like 2nd Quadrant guys care much about writing a good documentation.
s/useruser/superuser/
@Douglas:
thanks, fixed.
If you set instant replication of sequences and replication lag is 0, in theory you should be able to just switch the app to the new DB, then after that remove pglogical? Or am I missing something?
@Kustodian:
I wouldn’t really risk it. If your app is busy enough there is no such thing as immediate switch.
Why not pause the database pool(s) in the bouncer for the ~ 10 seconds to do the switch?
Hi,
Thanks for the article, its really helpful.
Our production have database->schema(more than 300)->table-> rows architecture, when I am implementing pglogical, do i need to crate database->schema->table structure in 9.6 environment? I am planning to upgrade from 9.4 to 9.6.
Thanks
-Ashish
@Ashish:
yes – you need to copy schema yourself. This can be done using pg_dump -s, and then loading generated sql to 9.6 with psql.
hi when try to apply last step I can not success. but I put pg_hba entry with all case 🙁
psql -U postgres -d test -c “SELECT pglogical.create_subscription( subscription_name := ‘subscription’, provider_dsn := ‘host=172.16.150.149 port=5432 dbname=test user=logical_replication password=postgres’);”
ERROR: could not connect to the postgresql server: FATAL: no pg_hba.conf entry for host “172.16.150.151”, user “logical_replication”, database “test”, SSL off
DETAIL: dsn was: host=172.16.150.151 port=5432 dbname=test user=logical_replication
@mehmet:
you need to add appropriate line in pg_hba.conf on master, and then reload config there.
Most likely the line will look like:
Possibly also
thank you it is solved this extension is very nice..
hi I am failing to execute the subscription step .
query :
select pglogical.create_subscription(subscription_name:=’subscription’,provider_dsn:=’host=172.17.27.150 port=5432 dbname=db’);
error :
could fetch remote node info: ERROR: schema “pglogical” does not exist
LINE 1: …, node_name, sysid, dbname, replication_sets FROM pglogical….
but when i checked i have schema named pglogical.
few of the changes i followed while installing :
1)I did not create any separate user. I used postgres
user only.
2)I installed pglogical using rpm files.
provider IP : 172.17.27.150
subscriber IP : 172.17.27.151
@Daksha:
clealry pglogical schema is not there. why? no idea. you said that it is, but the error clearly contradicts you. so either postgresql suddently became magical, or you’re doing something wrong, but I can’t tell you what or how, since I don’t know *any* details.
If you want help, then it’s usually much better to use some form of real-time talk – like irc channel, and not blog post comments section.
Hi
How do pglogical works on partition tables?
I have parent table without pk column in it and it’s child tables have pk column on them.
I have added an extra column to the parent table and created pk by updating the column with unique values now the challenge is how to replicate the data from provider to subscriber? as there is pk on different column there might be null values in child table’s new added column and pglogical will do “copy to select * from “. now, parent tables not allow null values and breaking the replication.
Please suggest me if you have any other solution to add partition tables in pglogical
Excellent article and trigger. Thank you.
Having trouble with the last replicate_ddl_command, it ran successfully. I see the table and data in the provider. I see the table in replication_set_table. However i do not see the table in the subscriber.
@Vik:
if you need help I suggest that you contact some support channel for pglogical. I just wrote about one particular usecase, am not developer or (at the moment) even user of pglogical, so can’t really help.