PLEASE NOTE: This project is dead. The website is kept online for historical reasons.

Terms and Definitions for Database Replication (2)

Synchronous vs Asynchronous Replication

According to Wikipedia's definition of synchronization, database replication is considered data synchronization, as opposed to process synchronization. Internally, however, a database system has to synchronize locks or at least transactions, which would clearly be considered process synchronization.

Within the context of database replication, the most common definition of synchronous replication is that as soon as a transaction is confirmed to be committed, all nodes of a cluster must have committed the transaction as well. This is very expensive in terms of latency and the number of messages to be sent, but it prevents divergence. In asynchronous replication systems, other nodes can apply the transactional data at any later point, so the nodes may serve different, possibly even conflicting, snapshots of the database.
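The difference can be sketched in a few lines of Python. This is only a toy model, not how any real replication system is implemented: the `Node` class and the two commit functions are hypothetical names introduced for illustration, and network messages are reduced to plain method calls.

```python
class Node:
    """A toy replica holding a key-value store and a queue of pending writes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []  # writes received but not yet applied

    def apply(self, key, value):
        self.data[key] = value

    def drain(self):
        """Apply every queued write in arrival order."""
        for key, value in self.pending:
            self.apply(key, value)
        self.pending.clear()


def commit_synchronous(origin, others, key, value):
    """Confirm the commit only after every node has applied the write."""
    for node in [origin, *others]:
        node.apply(key, value)
    return "committed"


def commit_asynchronous(origin, others, key, value):
    """Confirm after the local apply; the other nodes only queue the write."""
    origin.apply(key, value)
    for node in others:
        node.pending.append((key, value))
    return "committed"


a, b = Node("a"), Node("b")

commit_synchronous(a, [b], "x", 1)
sync_views = (a.data.get("x"), b.data.get("x"))   # both nodes agree

commit_asynchronous(a, [b], "x", 2)
async_views = (a.data.get("x"), b.data.get("x"))  # b still serves the old value

b.drain()
caught_up = (a.data.get("x"), b.data.get("x"))    # b has applied the write
```

After the asynchronous commit, node `b` keeps answering with the old value until it drains its queue. This is exactly the window in which the nodes serve different snapshots.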

Eager vs Lazy Replication

The terms eager and lazy are more common in the literature on database replication. Sometimes they are used synonymously with synchronous and asynchronous replication. At other times there is a subtle difference: instead of requiring that all nodes process transactions synchronously, it is often sufficient to require that the nodes or replicas are eventually kept coherent. This means that transactions must be applied in the very same order on every node, but not necessarily at the same time. According to that definition, eager replication lies somewhere between synchronous and asynchronous replication: it allows nodes to lag behind others while still preventing divergence.
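A small sketch may help to show why a common order matters more than simultaneity. It assumes a hypothetical ordered transaction log of simple key-value writes, and `apply_log` is an illustrative helper, not part of any real system.

```python
def apply_log(log, upto):
    """Replay the first `upto` entries of an ordered transaction log."""
    state = {}
    for key, value in log[:upto]:
        state[key] = value
    return state


log = [("x", 1), ("y", 2), ("x", 3)]

fast = apply_log(log, 3)            # node that has applied everything
slow = apply_log(log, 2)            # node lagging one transaction behind
slow_caught_up = apply_log(log, 3)  # the same node after catching up

# A node applying the same transactions in a different order ends up elsewhere.
reordered = apply_log([log[2], log[1], log[0]], 3)
```

The lagging node only ever holds a state the fast node passed through earlier, and it reaches the same final state once it catches up. The node that applies the writes in another order ends with a different value for `x`, which is the divergence a common order rules out.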

Another way to think about this distinction is that eager systems have to replicate the transactional data before confirming the commit, while lazy systems replicate that data at some time after committing, so conflicts can arise.

Lazy replication is always asynchronous and does not prevent divergence. Because of this small but important difference, I prefer the term lazy to asynchronous. In any case, the process of exchanging data to detect and resolve conflicts is called reconciliation. How often to reconcile is a tradeoff between the allowed lag time and the load that reconciliation causes. It is also important to know that lazy systems may reduce the probability of divergence by reconciling more often, but the possibility of divergence cannot be eliminated completely (otherwise the system would be eager).
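As a rough illustration of reconciliation, the following sketch merges the writes two nodes accepted independently since their last common snapshot. The function `reconcile` and its conflict rule (both nodes changed the same key to different values) are assumptions made for this example; real systems use more elaborate detection and resolution strategies.

```python
def reconcile(base, writes_a, writes_b):
    """Merge two nodes' local writes made since the shared `base` snapshot.

    Returns (merged, conflicts). A key is in conflict when both nodes
    changed it to different values; this sketch leaves such keys at their
    base value and reports them rather than picking a winner.
    """
    merged = dict(base)
    conflicts = {}
    for key in sorted(set(writes_a) | set(writes_b)):
        in_a, in_b = key in writes_a, key in writes_b
        if in_a and in_b and writes_a[key] != writes_b[key]:
            conflicts[key] = (writes_a[key], writes_b[key])
        elif in_a:
            merged[key] = writes_a[key]
        else:
            merged[key] = writes_b[key]
    return merged, conflicts


base = {"stock": 10, "price": 5}
merged, conflicts = reconcile(
    base,
    {"stock": 9, "price": 6},  # committed locally on node A
    {"stock": 8},              # committed locally on node B
)
```

Here the `price` change merges cleanly, while `stock` is reported as a conflict because both nodes had already confirmed different commits. Reconciling more often shortens the interval in which such conflicting commits can accumulate, but it cannot remove that interval entirely.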

Distributed Querying

Distributed querying allows a single query to use multiple servers. This can improve performance for long-running read-only transactions. Short and simple queries are better answered from a single node. The Postgres documentation speaks of "Multi-Server Parallel Query Execution".
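The idea can be sketched as a scatter-gather aggregation. In this toy example each in-memory list stands in for the rows held by one server, and threads stand in for remote calls; `partial_sum` and `distributed_sum` are hypothetical names, and no actual Postgres feature is being used.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """Work done on one server: aggregate only the local rows."""
    return sum(amount for _, amount in rows)


def distributed_sum(partitions):
    """Send the same aggregate to every partition and combine the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_sum, partitions))
    return sum(partials)


# Each inner list stands in for the rows stored on one server.
partitions = [
    [("a", 10), ("b", 20)],
    [("c", 5)],
    [("d", 7), ("e", 8)],
]

total = distributed_sum(partitions)
```

For a query this small the coordination overhead outweighs any gain, which is why short and simple queries are better served by a single node.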

Project hosted by bluegap | Designed by Ronja Wanner and tinysofa
Copyright © 1996 – 2010 Postgres Global Development Group