XDR operations

Cross Datacenter Replication (XDR) enables data synchronization across Aerospike clusters. While powerful, it introduces additional resource considerations that vary based on traffic, topology, and configuration.

How does XDR impact the network?

XDR introduces network traffic between clusters, resulting in the following:

  • When XDR is recovering after a back-off, its traffic can be much higher than the incoming network load from clients.
  • Every write, replicated only in strong consistency mode, is sent across the network to the destination DCs.
  • Bandwidth usage increases with write volume and the number of destinations.
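
As a rough illustration of how write volume and destination count drive bandwidth, the following back-of-the-envelope sketch multiplies the two together. The function name, its parameters, and the sample numbers are hypothetical and not part of Aerospike. The sketch ignores protocol overhead and recovery bursts, and it assumes every write is shipped to every destination DC.

```python
def xdr_outbound_bytes_per_sec(writes_per_sec, avg_record_bytes,
                               destination_dcs, compression_ratio=1.0):
    """Rough steady-state estimate of XDR outbound traffic in bytes/second.

    Assumptions (not from Aerospike): every write ships once to every
    destination DC, protocol overhead is ignored, and compression_ratio
    is uncompressed size divided by compressed size (1.0 = no compression).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return writes_per_sec * avg_record_bytes * destination_dcs / compression_ratio


if __name__ == "__main__":
    # Illustrative numbers only: 10,000 writes/s of 1 KiB records to 2 DCs.
    bytes_per_sec = xdr_outbound_bytes_per_sec(10_000, 1024, 2)
    print(f"{bytes_per_sec * 8 / 1e6:.2f} Mbit/s")
```

Doubling either the write rate or the number of destinations doubles the estimate, which is why both should be included in capacity planning.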

Does XDR increase storage load?

XDR typically does not increase storage I/O load under normal conditions. It leverages the post-write-cache to read records from memory without hitting storage. However, if XDR falls behind or is recovering, it may do the following:

  • Read records from storage as it scans partitions as part of the recovery process
  • Cause unexpected I/O load during backlog catch-up

How does XDR consume memory?

XDR temporarily stores each record’s digest and LUT (last-update-time) in in-memory queues, using 25 bytes per entry. Queues are maintained along the following dimensions, so memory use multiplies across them:

  • Per partition (cluster size and replication factor determine the number of partitions that each node owns)
  • Per namespace
  • Per destination DC

These queues are capped by transaction-queue-limit.
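
The following sketch turns the figures above into a worst-case estimate of queue memory per node. It is only an approximation under stated assumptions: partitions are spread evenly across nodes, each namespace has 4096 partitions, every namespace ships to every destination DC, and the transaction-queue-limit cap applies to each per-partition, per-namespace, per-DC queue. The function name and sample values are hypothetical.

```python
PARTITIONS_PER_NAMESPACE = 4096  # partition count of an Aerospike namespace
BYTES_PER_QUEUE_ENTRY = 25       # digest + LUT, per the figure quoted above


def xdr_queue_memory_upper_bound(cluster_size, replication_factor, namespaces,
                                 destination_dcs, transaction_queue_limit):
    """Worst-case bytes of XDR queue memory on one node (all queues full).

    Assumptions (not confirmed by the text): even partition distribution,
    every namespace ships to every DC, and the limit applies per queue.
    """
    partitions_per_node = (PARTITIONS_PER_NAMESPACE * replication_factor
                           / cluster_size)
    queues = partitions_per_node * namespaces * destination_dcs
    return int(queues * transaction_queue_limit * BYTES_PER_QUEUE_ENTRY)


if __name__ == "__main__":
    # Illustrative values: 8 nodes, replication factor 2, 1 namespace,
    # 2 destination DCs, and a queue limit of 16384 entries.
    worst_case = xdr_queue_memory_upper_bound(8, 2, 1, 2, 16384)
    print(f"{worst_case / 2**20:.0f} MiB")
```

Because the dimensions multiply, adding a destination DC or raising transaction-queue-limit scales the worst case linearly, while queues that stay mostly empty use far less than this bound.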

Does XDR affect CPU usage?

  • XDR typically compresses data to optimize network usage, increasing CPU load.
  • When you have multiple DCs for a single namespace, each destination DC receives its own independently compressed stream. Adding another DC to the same namespace increases CPU usage.
  • CPU pressure may be noticeable when:
    • High write throughput is combined with multiple DCs
    • A new DC is added

Best practices summary

  • Monitor network throughput and ensure sufficient capacity for inter-DC traffic.
  • Watch for CPU utilization spikes when adding DCs or increasing replication volume.
  • Use post-write-cache to avoid storage read amplification, but expect storage reads when XDR falls behind or is recovering.
  • Tune transaction-queue-limit if you observe queue pressure, keeping in mind that larger queues consume more memory per partition, namespace, and destination DC.