Comment Re:How does it deal with replication latency? (Score 1) 137
You're right that this increases cost for round trips, but it's not nearly as bad for _throughput_ as you imply. Applications pipeline, and TCP is windowed, so you don't have a round trip per packet or message. Above, where you imply a round trip for each row returned from a table, the actual behaviour is that the client sends a request, and the server immediately starts spitting out a large number of rows. There's a delay of on average your checkpoint length/2 for the initial response to arrive (so 25ms if you're checkpointing at 50ms), then the rest of the query arrives at link speed.
Remus is meant for applications that are separated by a WAN/internet link from their clients (internet applications generally have to tolerate this degree of latency anyway). For multi-server cooperative applications (like a web server in front of a database), you could get rid of the network delay by checkpointing both servers and failing them over as a unit. We have some experimental cluster checkpoint code here, but it's not part of this release.