Data Replication
Replicating and synchronizing data is a common challenge when deploying applications across multiple datacenters, including cloud and on-premises locations. We'll explore the various strategies and considerations for data replication and synchronization to maximize availability and performance, ensure consistency, and minimize data transfer costs.
Master-Master Replication and Master-Subordinate Replication are two common data replication topologies. In Master-Master Replication, the data in each replica is dynamic and can be updated; this topology requires a two-way synchronization mechanism to keep the replicas up to date and to resolve any conflicts that might occur. In Master-Subordinate Replication, by contrast, the data in only one replica (the master) is dynamic, and the remaining replicas are read-only. The synchronization requirements for this topology are simpler because conflicts are unlikely to occur.
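As a concrete illustration of the two-way case, the following Python sketch merges the contents of two master replicas using a "last update wins" policy. The Record shape and the per-record timestamp are assumptions made for the example, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Record:
    value: str
    updated_at: datetime  # last-modified timestamp, assumed to be set by the writer

def merge_masters(replica_a: dict[str, Record], replica_b: dict[str, Record]) -> dict[str, Record]:
    """Two-way merge of two master replicas using a 'last update wins' policy."""
    merged: dict[str, Record] = dict(replica_a)
    for key, record in replica_b.items():
        existing = merged.get(key)
        # Keep whichever side was written most recently; ties favor replica A.
        if existing is None or record.updated_at > existing.updated_at:
            merged[key] = record
    return merged

# Both masters accepted writes independently; the merge resolves the conflict on "cust-1".
a = {"cust-1": Record("Alice", datetime(2024, 5, 1, tzinfo=timezone.utc))}
b = {"cust-1": Record("Alicia", datetime(2024, 5, 2, tzinfo=timezone.utc)),
     "cust-2": Record("Bob", datetime(2024, 5, 2, tzinfo=timezone.utc))}
print(merge_masters(a, b))  # "cust-1" takes the later value, "Alicia"
```

In a Master-Subordinate topology this merge step disappears entirely, which is precisely why its synchronization requirements are simpler.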
To improve the performance and scalability of queries, use Master-Subordinate replication with read-only replicas. Locate the replicas close to the applications that access them, and use simple one-way synchronization to push updates to them from the master database. Master-Master replication, in contrast, is better suited to improving the scalability of write operations: applications can write more quickly to a local copy of the data, but at the cost of the additional complexity of two-way synchronization (and possible conflict resolution) with the other data stores. To avoid crossing the network to another datacenter, include in each replica any reference data that is relatively static and required for queries executed against that replica. For example, postal code lookup tables (for customer addresses) or product catalog information (for an ecommerce application) can be included in each replica.
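One way to apply this guidance is to route all writes to the master and serve reads from the replica co-located with the caller. The sketch below assumes a generic connection object with an execute method and uses a fake connection purely to keep the example runnable; it is not tied to any particular database driver.

```python
class FakeConnection:
    """Stand-in for a real database connection, used only to make the sketch runnable."""
    def __init__(self, name):
        self.name = name
    def execute(self, statement, params):
        return f"{self.name} ran: {statement} {params}"

class ReplicatedStore:
    """Routes writes to the master and reads to the replica closest to the caller."""
    def __init__(self, master, replicas_by_region):
        self.master = master                          # single writable master
        self.replicas_by_region = replicas_by_region  # read-only replicas, keyed by region

    def write(self, statement, params):
        # All updates go to the master; one-way synchronization pushes them to the replicas.
        return self.master.execute(statement, params)

    def read(self, statement, params, caller_region):
        # Prefer the replica co-located with the application; fall back to the master.
        replica = self.replicas_by_region.get(caller_region, self.master)
        return replica.execute(statement, params)

store = ReplicatedStore(
    master=FakeConnection("master-us-east"),
    replicas_by_region={"eu-west": FakeConnection("replica-eu-west")},
)
store.write("UPDATE product SET price = ? WHERE id = ?", (9.99, 42))
print(store.read("SELECT * FROM product WHERE id = ?", (42,), caller_region="eu-west"))
```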
To improve reliability and security, version the data so that nothing is overwritten. When data is changed, a new version is added to the data store alongside the existing versions. Applications can access all versions of the data and the update history, and can choose the appropriate version. Reliability can also be improved with a quorum-based approach, in which a conflicting update is applied only if a majority of the data stores vote to commit it. If the majority votes to abort the update, then all the data stores must abort it. Quorum-based mechanisms are not easy to implement, but they may provide a workable solution if the final value of conflicting data items should be based on a consensus rather than on the more usual conflict resolution techniques such as “last update wins” or “master database wins.”
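Both ideas can be sketched briefly: an append-only versioned store, and a quorum-style commit helper. The vote, apply, and abort methods on the stores are hypothetical hooks assumed for the example, not a specific framework's API.

```python
import time

class VersionedStore:
    """Append-only store: updates add a new version instead of overwriting."""
    def __init__(self):
        self._versions: dict[str, list[tuple[float, object]]] = {}

    def put(self, key, value):
        # Each write appends a (timestamp, value) pair; nothing is overwritten.
        self._versions.setdefault(key, []).append((time.time(), value))

    def latest(self, key):
        return self._versions[key][-1][1]

    def history(self, key):
        return list(self._versions[key])

store = VersionedStore()
store.put("price:42", 9.99)
store.put("price:42", 10.49)   # the older version is kept alongside the new one
print(store.latest("price:42"), len(store.history("price:42")))  # 10.49 2

def quorum_commit(stores, update) -> bool:
    """Apply 'update' only if a majority of the data stores vote to commit it."""
    votes = sum(1 for store in stores if store.vote(update))
    if votes > len(stores) // 2:
        for store in stores:
            store.apply(update)   # majority agreed: every store applies the update
        return True
    for store in stores:
        store.abort(update)       # no majority: all the stores must abort the update
    return False
```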
When deciding the frequency of synchronization, strike a balance among the competing concerns: synchronizing infrequently increases data staleness and the likelihood of update conflicts, while synchronizing frequently places heavier loads on the network and increases data transfer costs. Most synchronization frameworks and services perform the synchronization operation on a fixed schedule, but it may also be possible to propagate changes across replicas as they occur by using background tasks that synchronize the data.
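A background task of this kind might look like the following sketch, which batches pending changes and pushes them on a fixed interval. The change queue, the push_to_replicas callable, and the interval are illustrative assumptions, a stand-in for whatever change tracking a real synchronization framework provides.

```python
import queue
import threading
import time

def start_background_sync(change_queue: "queue.Queue", push_to_replicas, interval_seconds: float = 30.0):
    """Periodically drain pending changes and push them to the replicas in one batch."""
    def worker():
        while True:
            time.sleep(interval_seconds)
            batch = []
            while not change_queue.empty():
                batch.append(change_queue.get_nowait())
            if batch:
                push_to_replicas(batch)  # one network round trip per batch, not per change
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Batching on an interval trades a bounded amount of staleness for fewer, larger transfers; shortening the interval (or pushing on every change) reverses that trade-off.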
Where relevant, decide which data store holds the master copy of the data and, when there are more than two replicas, the order in which updates are synchronized. Also consider how to handle the situation in which the master database is unavailable; it may be necessary to promote one of the replicas to the master role in this case.
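Promotion logic might be sketched along these lines, assuming a health probe supplied by the caller and hypothetical replication_lag and promote members on the replica objects. A production failover would also need to fence the old master and redirect clients.

```python
def elect_new_master(master, replicas, is_healthy):
    """Return the current master if it is healthy; otherwise promote a replica."""
    if is_healthy(master):
        return master
    # Prefer the replica that has fallen least far behind the master.
    for replica in sorted(replicas, key=lambda r: r.replication_lag):
        if is_healthy(replica):
            replica.promote()   # the replica becomes writable and takes the master role
            return replica
    raise RuntimeError("no healthy replica available for promotion")
```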
Finally, beware of creating a synchronization loop in a system that implements the Master-Master replication topology. Synchronization loops can arise if one synchronization action updates a data store and this update prompts another synchronization that tries to apply the update back to the original data store. Synchronization loops can also occur when there are more than two data stores, where a synchronization update travels from one data store to another and then back to the original one.
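One common way to avoid such loops is to tag each propagated update with its origin and with the stores that have already applied it, and never forward it back to them. The sketch below assumes hypothetical apply and send hooks and replica names chosen purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ReplicationUpdate:
    key: str
    value: object
    origin: str                                 # replica where the change was first made
    seen_by: set = field(default_factory=set)   # replicas that have already applied it

def forward_update(update: ReplicationUpdate, current_replica: str, neighbours, apply):
    """Apply an update locally and forward it without sending it back where it came from."""
    if current_replica == update.origin or current_replica in update.seen_by:
        return                       # already applied here: stop, otherwise the update loops forever
    apply(update.key, update.value)
    update.seen_by.add(current_replica)
    for name, send in neighbours.items():
        if name != update.origin and name not in update.seen_by:
            send(update)             # never forward the update back to a store that already has it
```

Tracking the set of stores an update has visited breaks loops even when there are more than two data stores in the topology.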