So, we’re at the Pub and doing “create a billion tables” time trials with Jan Urbanski using Python and Josh Berkus using Perl.
We’re also hacking on a test framework the Slony developers have, specifically hacking with Steve Singer. What we discovered is that sync rep doesn’t wait for WAL to be *replayed* on the standby before a commit returns. In the pg_stat_replication view, we see sent_location, write_location and flush_location synchronized, but not replay_location.
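To make that observation concrete, here is a minimal Python sketch of how one might compare the locations in a single pg_stat_replication row. The column names are the 9.1 ones mentioned above; the helper names (`parse_location`, `replay_caught_up`) and the sample values are made up for illustration, and the row is a plain dict rather than a live query result.

```python
def parse_location(text):
    """Turn a WAL location such as '0/3000158' into a comparable tuple.

    The location is printed as two hexadecimal numbers separated by a slash.
    Comparing the (high, low) tuples orders locations correctly.
    """
    high, low = text.split("/")
    return (int(high, 16), int(low, 16))


def replay_caught_up(row):
    """Return True if the standby has replayed everything it has flushed.

    `row` is a hypothetical dict holding one pg_stat_replication row.
    """
    replayed = parse_location(row["replay_location"])
    flushed = parse_location(row["flush_location"])
    return replayed >= flushed


# Made-up sample values: flushed on the standby, but not yet replayed.
sample_row = {
    "sent_location": "0/3000158",
    "write_location": "0/3000158",
    "flush_location": "0/3000158",
    "replay_location": "0/30000F0",
}

print(replay_caught_up(sample_row))
```

A row like this is what we were seeing: the commit has already returned, yet replay on the standby is still behind.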
This makes sense from a database perspective, but may be surprising behavior for application developers. There are patches out there (according to what I just heard from Bernd) to make synchronous replication wait for replay on the slave, but it’s not certain when that will be committed. It definitely won’t be part of version 9.1.
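Until such a patch lands, an application that needs to read its own writes from the standby has to wait for replay itself. Here is a rough sketch of that idea, assuming the application captures the master's WAL location right after commit (for example with `pg_current_xlog_location()`) and can ask the standby for `pg_last_xlog_replay_location()`. The names `wait_for_replay` and `fetch_replay_location` are hypothetical, and the database calls are left to the caller so the sketch runs without a server.

```python
import time


def _as_tuple(location):
    # WAL locations print as 'hex/hex'; tuples of ints compare in WAL order.
    high, low = location.split("/")
    return (int(high, 16), int(low, 16))


def wait_for_replay(target, fetch_replay_location, timeout=5.0, interval=0.05):
    """Poll until the standby's replay location reaches `target`.

    target: WAL location captured on the master after commit.
    fetch_replay_location: hypothetical callable returning the standby's
        current replay location as text, or None if nothing is replayed yet.
    Returns True once replay has reached target, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = fetch_replay_location()
        if current is not None and _as_tuple(current) >= _as_tuple(target):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Polling is the bluntest possible approach, and it pushes the cost onto the reader rather than the writer, which is roughly the tradeoff a wait-for-replay option in sync rep would reverse.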
I just wrote up configuration details from a database administrator’s perspective, and am planning on doing some additional work to make a highly condensed configuration tutorial for our main docs. We definitely need to explain this more clearly for users, who might be thinking of it more from an application perspective.
Replay_location *is* part of pg_stat_replication. The information is fully available, and there is no technical reason why it is not used; it was always intended to be, and the patch submitted to the last commit fest contained that.
The “missing feature” was part of the original patch, but the commit was blocked on a technicality: it was more than one month since the start of the commit fest, or some other excuse. It was a useful feature, not a bug. I have had a patch ready for about 2 months now, which is minor and can be added any time we choose. If you want it in 9.1, make some noise now. If not, I’ll apply it as soon as the dev window opens for 9.2. I’ve been waiting for beta feedback to see whether people thought this feature should be added.
Hey Simon,
Thanks for reading. 🙂
What I said was “we see sent_location, write_location and flush_location *synchronized*, but not replay_location” (emphasis added).
I see what was intended, and I am not sure that I *want* that patch applied for 9.1. But I will think more about it.
I think the performance tradeoff is a good one, but we need to make an effort to communicate to application developers what we mean by synchronous.
-selena