Datacenter replication management
Overview
The Cross-Datacenter Replication (XDR) service is built into the main Aerospike service (asd). It always starts and stops when the Aerospike server is started and stopped.
Resume XDR from a previous execution [Default]
Starts XDR and resumes shipping from the start of the log. In addition to shipping the current node's data, XDR takes responsibility for shipping data on behalf of nodes that are not currently present in the cluster. It can ship data for an absent node only if the digest log on the current node holds that node's shadow logs. Use this mode when multiple nodes have gone down and not all of them could be brought back online. If both the master and the replica nodes for some data cannot be brought back online, that data cannot be shipped, because neither copy of the corresponding log can be processed.
/etc/init.d/aerospike_xdr resume
Resume XDR but don't ship for others
Starts XDR and resumes shipping from the point of the last failure. However, XDR does not take responsibility for shipping on behalf of nodes that did not come up. Use this mode when one or more nodes go down and are brought back up, and it is known that shipping for the replica is not needed. If all nodes are restarted, each one takes care of its own log, so all log records are covered. This is useful for a rolling upgrade or a planned shutdown of the entire cluster, when it is certain that every node taken down can be brought up again.
XDR needs to be fault tolerant: when a machine is shut down or suffers a system failure, data that is supposed to be shipped to the remote cluster must not be lost. To handle this case, XDR also writes the corresponding log records to the digest log at the replica site (the shadow log), along with the data. That is, the information about what has to be shipped to the remote cluster is recorded at the replica as well. XDR does not ship from the shadow log unless the master node has failed.
After a node fails, its replica nodes take over responsibility for shipping the data that the failed node was previously responsible for. This way XDR ensures that no data is lost during a single node failure.
/etc/init.d/aerospike_xdr resume-nofailover
Start XDR with a clear backlog
Starts XDR afresh. After XDR starts with noresume, all old log records in the digest log are lost. Use this mode for a fresh start. It can also be used after a single node failure, when the failed node is being restarted: responsibility for shipping that node's records has already been taken over by XDR on a different node in the cluster, so XDR on the restarted node does not need to resume from the point at which it left off. It can ignore the old records and ship only freshly written data.
/etc/init.d/aerospike_xdr noresume
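The choice among the three start modes above can be summarized in a small helper. This is only a sketch: the function name and the scenario labels (multi-node-failure, planned-restart, single-node-restart, fresh-start) are invented here for illustration and are not part of Aerospike. Only the subcommands come from the text above.

```shell
#!/bin/sh
# Hypothetical helper: maps a restart scenario to an aerospike_xdr subcommand.
# The scenario names are assumptions made for this sketch; the subcommands
# (resume, resume-nofailover, noresume) are the ones described above.
xdr_start_mode() {
    case "$1" in
        # Several nodes went down and not all came back: also ship for absent nodes.
        multi-node-failure)
            echo "resume" ;;
        # Rolling upgrade or planned shutdown: every node ships its own log.
        planned-restart)
            echo "resume-nofailover" ;;
        # A single failed node restarts (replicas already took over), or a fresh start.
        single-node-restart|fresh-start)
            echo "noresume" ;;
        *)
            echo "unknown scenario: $1" >&2
            return 1 ;;
    esac
}

# Print the command instead of running it, so the choice can be reviewed first.
mode=$(xdr_start_mode "planned-restart") && echo "/etc/init.d/aerospike_xdr $mode"
```

Printing the command rather than executing it keeps the sketch safe to run on any machine. Note that noresume discards the existing backlog, so that branch deserves a second look before it is used.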
Stop XDR
To shut down the XDR service, use the stop command:
/etc/init.d/aerospike_xdr stop
Get Running Status of XDR
To determine whether the XDR service is currently running, use the status command:
/etc/init.d/aerospike_xdr status
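The status and stop commands can be combined so that XDR is stopped only when it is actually running. The sketch below assumes that the status command exits with 0 when XDR is running and with a non-zero code otherwise. That is the usual init-script convention, but it is not stated in this document, so check it on your installation. The XDR_INIT variable and the function name are hypothetical.

```shell
#!/bin/sh
# Path to the init script used by the commands above; overridable for dry runs.
XDR_INIT="${XDR_INIT:-/etc/init.d/aerospike_xdr}"

# Stop XDR only if it is currently running.
# ASSUMPTION: "status" exits 0 when XDR is running and non-zero otherwise.
stop_xdr_if_running() {
    if "$XDR_INIT" status >/dev/null 2>&1; then
        "$XDR_INIT" stop
    else
        echo "XDR is not running; nothing to stop"
    fi
}
```

Calling stop_xdr_if_running before maintenance avoids issuing a stop against a service that is already down.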