The OmniPITR project that I wrote about some time ago is moving along.
Just today I finished tests for omnipitr-backup-slave – the part of OmniPITR that lets you make hot backups of a WAL-slave machine – without any additional load on the master.
As before – please download (svn co) and test. In case of problems, please mail me or contact me on irc.freenode.net – I'm usually on #postgresql.
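If you want a starting point, the rough workflow looks like this (the repository URL below is a placeholder – substitute the project's actual location, and the layout of the checkout may differ):

    # placeholder URL – use the real OmniPITR repository location
    svn co $OMNIPITR_SVN_URL omnipitr
    cd omnipitr
    # the backup scripts usually sit under bin/ in the checkout
    ls bin/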
Hello,
I have run into an issue with hot-slave backups when using your approach to backing up a slave.
http://groups.google.com/group/pgsql.bugs/browse_thread/thread/411a27d0afa06963?fwc=1
Do you have any comments on the issue? I suppose this issue is also related to omnipitr-backup-slave, as I basically copied its behavior when making a grandchild database.
With my best regards,
— Valentine
@Valentine
Not sure what could have gone wrong – please check the source of omnipitr-backup-slave – perhaps you will notice some step that it takes and you do not. Sorry I'm not more helpful, but I just don't know what could have failed there.
Hm… actually I was looking at your script while building my backup procedure, and practically copied the wait-for-checkpoint-location step in bash.
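In rough outline, that wait boils down to polling pg_controldata until the checkpoint location recorded in the control file changes – a simplified sketch, not the exact OmniPITR code ($PGDATA and the 5-second poll interval are just placeholders):

    #!/usr/bin/env bash
    # record the current checkpoint location from the control file
    start_loc=$(pg_controldata "$PGDATA" | awk -F': *' '/^Latest checkpoint location/ {print $2}')
    # wait until the slave performs a newer checkpoint
    while true; do
        cur_loc=$(pg_controldata "$PGDATA" | awk -F': *' '/^Latest checkpoint location/ {print $2}')
        if [ "$cur_loc" != "$start_loc" ]; then
            break
        fi
        sleep 5
    done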
Did you have a chance to check the consistency of a database restored from a backup made by omnipitr-backup-slave? We found the problem with indexes by accident, actually, when some records could not be found in a table…
— Valentine
@Valentine:
Of course.
We have made a lot of slave backups, and we test them a lot. This included a 400GB database with really heavy traffic.
Hi again, I hope I am not bothering you too much 🙂
Can it be that all those databases are getting full-page-written WAL files? Our main database is configured with full_page_writes set to off. Maybe that is the difference that leads to the problems with indexes?
@Valentine:
Of course we use full page writes. Turning it off is dangerous, and can lead to data loss.
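If you want to double-check your own server, psql shows the current value (the default is on):

    # show the current value of full_page_writes
    psql -c "SHOW full_page_writes;"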
Probably that is our problem…
The issue with the full-page-written WAL files for us was that, in normal operation, we have such a big full-page-written WAL stream that the standby database was not able to replay it on a machine with identical parameters 🙁 And I could not find a way to speed up replay or to reduce the WAL stream other than turning full_page_writes off… 🙁