Cluster consistency
This page describes how to manage the cluster nodes of a strong consistency (SC) namespace in the Aerospike Database.
Add nodes to the cluster and roster
Use the asadm `manage roster` and `show roster` commands. Alternatively, use the equivalent asinfo `roster` and `roster-set` commands.
1. Install and configure Aerospike on the new nodes as described in Configure strong consistency.

2. When the nodes have joined the cluster, use the following command to verify that the result in `cluster_size` is greater than the result in `ns_cluster_size`. In the following output, all 4 nodes report `cluster_size: 4`, but only the 3 existing roster members report `ns_cluster_size: 3`. The newly added node (node2) reports `ns_cluster_size: 0` because it is not yet on the roster.

   ```
   Admin> show stat -flip like cluster_size
   ~Service Statistics (2026-04-14 00:32:38 UTC)~
         Node|cluster_size
   node1:3000|           4
   node2:3000|           4
   node3:3000|           4
   node4:3000|           4
   Number of rows: 4

   ~test Namespace Statistics (2026-04-14 00:32:38 UTC)~
         Node|ns_cluster_size
   node1:3000|              3
   node2:3000|              0
   node3:3000|              3
   node4:3000|              3
   Number of rows: 4
   ```

3. Use `show roster` to see the newly observed nodes in its Observed Nodes section.

4. Use the following command to copy the `Observed Nodes` list into the `Pending Roster`.

   ```
   Admin> enable
   Admin+> manage roster stage observed ns test
   Pending roster now contains observed nodes.
   Run "manage recluster" for your changes to take affect.
   ```

5. Activate the new roster with the `manage recluster` command.

   ```
   Admin+> manage recluster
   Successfully started recluster
   ```

6. Run `show roster` to confirm that the roster has been updated on all nodes. Verify that the service's `cluster_size` matches the namespace's `ns_cluster_size`.
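When scripting this check without asadm, the same statistics can be read with asinfo and filtered. A minimal sketch (the `get_stat` helper name is this example's own; the statistic names come from the output above):

```shell
# Illustrative helper: pull one "name=value" statistic out of line-per-stat
# asinfo output, e.g. `asinfo -v 'statistics' -l` (service context) or
# `asinfo -v 'namespace/test' -l` (namespace context).
get_stat() {
  awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Example with canned input; normally you would pipe live asinfo output.
printf 'cluster_size=4\nns_cluster_size=0\n' | get_stat ns_cluster_size   # → 0
```

A node whose `ns_cluster_size` differs from the service `cluster_size` has not yet joined the namespace roster.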
Remove nodes and update the roster
This section describes how to remove a node from an existing namespace configured with SC.
Remove node from the cluster
1. Run `show roster`. Verify that all roster nodes are present in the cluster.

   ```
   Admin+> show roster
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Roster (2026-04-14 00:32:55 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|Namespace|Current Roster|Pending Roster|Observed Nodes
   node1:3000|*BB978F2CCF9F18A|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA
   node2:3000| BB9547249D8CE66|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA
   node3:3000| BB919C6ABA304FA|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA
   node4:3000| BB957376A7ADD8E|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA
   Number of rows: 4
   ```

2. Safely shut down the nodes to be removed with the following command (in this example, on node2:3000 with node ID BB9547249D8CE66). Verify that you are removing fewer nodes than your configured replication factor (RF).

   ```
   systemctl stop aerospike
   ```

3. Wait for migrations to complete. Run the following command periodically until the `migrate_partitions_remaining` statistic is zero on all remaining nodes.

   ```
   Admin> show stat service like partitions_remaining -flip
   ~~~~~Service Statistics (2026-04-14 00:33:10 UTC)~~~~
         Node|migrate_partitions_remaining
   node1:3000|                           0
   node3:3000|                           0
   node4:3000|                           0
   Number of rows: 3
   ```
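To script the wait, a small filter over the per-node values can gate a polling loop. This is a sketch: the `migrations_done` name is made up, and the input format (node, value, one per line) mirrors the flipped table above.

```shell
# Illustrative: succeed only if every input line's second field is 0,
# i.e. no node still reports migrate_partitions_remaining > 0.
migrations_done() {
  awk '$2 + 0 != 0 { pending = 1 } END { exit pending }'
}

# A polling loop might then look like this, where collect_migration_stats
# is a placeholder for whatever command produces "node value" lines:
# while ! collect_migration_stats | migrations_done; do sleep 10; done
```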
Remove node from the roster
1. Run `show roster` to verify that BB9547249D8CE66 is no longer in the `Observed Nodes`. In the following example there is one fewer node in the `Observed Nodes` column than in `Current Roster` and `Pending Roster`.

   ```
   Admin+> show roster
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Roster (2026-04-14 00:34:56 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|Namespace|Current Roster|Pending Roster|Observed Nodes
   node1:3000|*BB978F2CCF9F18A|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB919C6ABA304FA
   node3:3000| BB919C6ABA304FA|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB919C6ABA304FA
   node4:3000| BB957376A7ADD8E|test     |BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB9547249D8CE66,BB919C6ABA304FA|BB978F2CCF9F18A,BB957376A7ADD8E,BB919C6ABA304FA
   Number of rows: 3
   ```

2. Copy the `Observed Nodes` list into the `Pending Roster`.

   ```
   Admin+> manage roster stage observed ns test
   Pending roster now contains observed nodes.
   Run "manage recluster" for your changes to take affect.
   ```

3. Run `manage recluster` to apply the change.

   ```
   Admin+> manage recluster
   Successfully started recluster
   ```

4. Check whether startup is complete. Run the following command periodically until it returns `ok`.

   ```
   asinfo -h [ip of host] -v 'status'
   ```
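The periodic check is easy to wrap in a loop. A sketch (the `wait_for_ok` name and the two-second interval are this example's own choices):

```shell
# Illustrative: run the given status command repeatedly until it prints "ok".
wait_for_ok() {
  until [ "$("$@")" = "ok" ]; do
    sleep 2
  done
}

# Intended use, with the host placeholder from the original command:
# wait_for_ok asinfo -h [ip of host] -v 'status'
```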
Planned maintenance
For rolling upgrades and planned maintenance, such as OS patching or host reboots on SC namespaces, process one node at a time. In multi-AZ rack-aware deployments, you can process one rack at a time.
During planned maintenance, the same nodes return to the cluster after the upgrade or reboot, so the roster does not need to be updated. The roster only changes when you permanently add nodes or remove nodes. If a node fails to rejoin after maintenance and must be replaced, see Remove nodes and update the roster for how to update the roster, and Revive dead partitions if dead partitions result. If a procedure is stuck on a verification step or you encounter unexpected behavior, see Troubleshooting, search the Support Knowledge Base, or open a support case.
Before you begin
Configure migrate-fill-delay on every node to a value that exceeds the expected time for a single node to complete maintenance and rejoin. This suppresses unnecessary “fill” migrations to stand-in (non-roster) replicas while a node is temporarily out of the cluster. In SC namespaces, migrate-fill-delay only affects non-roster replicas; roster replica migrations proceed immediately. See Delay migrations for details.
If you set migrate-fill-delay dynamically, the value reverts to the static configuration on node restart. Since planned maintenance involves restarting nodes, set this value in the configuration file (aerospike.conf) so it persists across restarts.
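As a sketch, a static setting in aerospike.conf might look like the following, assuming `migrate-fill-delay` in the service context; the 3600-second value is illustrative and should be sized to exceed your own per-node maintenance window:

```
service {
    # Suppress fill migrations for one hour (illustrative value) so that a
    # node completing maintenance rejoins before fills begin.
    migrate-fill-delay 3600
}
```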
Upgrading Aerospike (asd restart only, no host reboot)
When only the Aerospike daemon is restarted, for example during a rolling software upgrade, the node can warm restart because the shared memory segments holding the primary and secondary indexes survive. Process one node at a time:
1. Quiesce the node, then trigger a recluster.

   ```
   Admin+> manage quiesce with <node-ip>
   Admin+> manage recluster
   ```

   Verify: `show statistics like pending_quiesce` shows `true` on the target:

   ```
   Admin+> show statistics like pending_quiesce
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2026-04-14 00:28:10 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Node           |node1:3000|node2:3000|node3:3000|node4:3000
   pending_quiesce|false     |true      |false     |false
   Number of rows: 2
   ```

   After recluster, `show statistics like quiesce` shows `effective_is_quiesced: true` on the target and `nodes_quiesced: 1` on all nodes:

   ```
   Admin+> show statistics like quiesce
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2026-04-14 00:28:20 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Node                 |node1:3000|node2:3000|node3:3000|node4:3000
   effective_is_quiesced|false     |true      |false     |false
   nodes_quiesced       |1         |1         |1         |1
   pending_quiesce      |false     |true      |false     |false
   Number of rows: 4
   ```

2. Wait for the quiesce handoff to complete. The quiesced node hands off master status. Wait until no active traffic or proxies are hitting the quiesced node.

   ```
   Admin+> show latencies
   ```

   Verify: ops/sec drops to zero on the quiesced node, then confirm `client_proxy_*` and `batch_sub_proxy_*` counters stop incrementing on the quiesced node, and `from_proxy_*` counters stop on the remaining nodes. See the quiesce verification reference for full output examples.

3. Shut down `asd`, perform the upgrade, and restart `asd`. The node warm restarts.

   ```
   $ sudo systemctl stop aerospike
   # ... perform upgrade ...
   $ sudo systemctl start aerospike
   ```

   Verify: `info network` shows the node has rejoined at the expected cluster size.

   ```
   Admin> info network
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2026-04-14 00:29:18 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|           IP|    Build|Migrations|~~~~~~~~~~~~~~~~~~Cluster~~~~~~~~~~~~~~~~~~|Client|  Uptime
             |                |             |         |          |Size|         Key|Integrity|      Principal| Conns|
   node1:3000| BB94B7AEB45DB52|10.0.3.1:3000|E-8.1.1.2|     0.000|   4|AA5AF50552AF|True     |BB9D787F6BAF3D6|     5|00:20:34
   node2:3000|*BB9D787F6BAF3D6|10.0.3.2:3000|E-8.1.1.2|     0.000|   4|AA5AF50552AF|True     |BB9D787F6BAF3D6|     5|00:00:14
   node3:3000| BB989A1BF1D8116|10.0.3.3:3000|E-8.1.1.2|     0.000|   4|AA5AF50552AF|True     |BB9D787F6BAF3D6|     5|00:20:34
   node4:3000| BB9D48DF5A70CEE|10.0.3.4:3000|E-8.1.1.2|     0.000|   4|AA5AF50552AF|True     |BB9D787F6BAF3D6|     5|00:20:34
   Number of rows: 4
   ```

4. Validate that the cluster has no unavailable or dead partitions.

5. Repeat from step 1 for the next node.
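The per-node sequence above lends itself to scripting. A sketch using asadm's `-e` option to run commands non-interactively (the `run` wrapper and `DRY_RUN` switch are this sketch's own conventions, not Aerospike tooling, and exact asadm flags may vary by tools version):

```shell
# Illustrative rolling-upgrade iteration for one node. Set DRY_RUN=1 to
# print the commands instead of executing them against a live cluster.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

upgrade_node() {
  node_ip=$1
  run asadm -e "enable; manage quiesce with $node_ip"
  run asadm -e "enable; manage recluster"
  # ... wait for the quiesce handoff here, as described above ...
  run sudo systemctl stop aerospike
  # ... install the new package here ...
  run sudo systemctl start aerospike
}
```

The validation steps (no unavailable or dead partitions, node rejoined at the expected cluster size) still need to pass before moving to the next node.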
Planned maintenance with host reboot
When the host itself is rebooted, shared memory is wiped and the node cold restarts unless the primary index is persisted beforehand. Use ASMT to avoid a cold restart and its side effects, such as potential unreplicated records and zombie records. Process one node at a time:
1. Quiesce the node, then trigger a recluster.

   ```
   Admin+> manage quiesce with <node-ip>
   Admin+> manage recluster
   ```

   Verify: same as rolling upgrade step 1.

2. Wait for the quiesce handoff to complete.

   ```
   Admin+> show latencies
   ```

   Verify: same as rolling upgrade step 2.

3. Shut down `asd`.

   ```
   $ sudo systemctl stop aerospike
   ```

4. Back up the indexes of each namespace with `asmt`. The `-z` option enables compression, which is recommended for planned maintenance.

   ```
   $ sudo asmt -b -v -z -p <path-to-backup-directory> -n <ns1, ns2, ...>
   ```

   See Backing up indexes with ASMT for full output details.

5. Reboot the host and perform OS or hardware maintenance.

6. After the host is back, restore the indexes of each namespace with `asmt`. The `-z` option is not needed; ASMT auto-detects compressed files.

   ```
   $ sudo asmt -r -v -p <path-to-backup-directory> -n <ns1, ns2, ...>
   ```

   See Restoring indexes with ASMT for full output details.

7. Start `asd`. The node warm restarts from the restored index instead of cold restarting. The Aerospike log confirms this with `beginning warm restart` for each namespace (instead of `beginning cold start`).

   ```
   $ sudo systemctl start aerospike
   ```

   Verify: `info network` shows the node has rejoined at the expected cluster size.

   ```
   Admin> info network
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2026-04-14 00:35:42 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|           IP|    Build|Migrations|~~~~~~~~~~~~~~~~~~Cluster~~~~~~~~~~~~~~~~~~|Client|  Uptime
             |                |             |         |          |Size|         Key|Integrity|      Principal| Conns|
   node1:3000| BB94B7AEB45DB52|10.0.3.1:3000|E-8.1.1.2|     0.000|   4|3866DA39491B|True     |BB9D48DF5A70CEE|     5|00:23:37
   node2:3000| BB90B0CA8BD688A|10.0.3.2:3000|E-8.1.1.2|     0.000|   4|3866DA39491B|True     |BB9D48DF5A70CEE|     5|00:00:14
   node3:3000| BB989A1BF1D8116|10.0.3.3:3000|E-8.1.1.2|     0.000|   4|3866DA39491B|True     |BB9D48DF5A70CEE|     5|00:23:37
   node4:3000|*BB9D48DF5A70CEE|10.0.3.4:3000|E-8.1.1.2|     0.000|   4|3866DA39491B|True     |BB9D48DF5A70CEE|     5|00:23:37
   Number of rows: 4
   ```

8. Validate that the cluster has no unavailable or dead partitions.

9. Repeat from step 1 for the next node.
Rack at a time (multi-AZ rack-aware deployments)
In a rack-aware cluster deployed across multiple availability zones, you can take down a full rack at a time instead of one node at a time, provided the remaining racks can maintain partition availability.
Availability requirements
The remaining racks must hold enough roster replicas to keep all partitions available while the target rack is down. In SC, a partition is available when the surviving nodes form a majority of the roster and at least one roster replica for that partition is among them. This depends on the number of racks and the replication-factor (RF):
- 3+ racks, RF >= number of racks (for example, 3 AZs with RF=3): Every partition has a roster replica on every rack. One rack down leaves a clear majority of roster nodes, each holding a replica. All partitions remain available. No special configuration is needed.
- 3+ racks, RF < number of racks (for example, 3 AZs with RF=2): Each partition has replicas on RF distinct racks. One rack down still leaves a majority of roster nodes, and because rack-aware placement spreads each partition's replicas across different racks, at least one roster replica for every partition survives. All partitions remain available. Do not take down more than one rack at a time. Losing two of three racks removes the majority and causes unavailable partitions.
- 2 racks (any RF): Taking down one rack leaves exactly half the roster, so there is no majority. Partitions whose roster-master is on the downed rack become unavailable (~50%). Use the `active-rack` optimization below to maintain full availability during planned maintenance.
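The majority arithmetic behind these cases can be sketched as a quick check. This is illustrative only and assumes equally sized racks; the function name is made up:

```shell
# Illustrative: with `racks` equally sized racks of `per_rack` roster nodes,
# does taking `down` racks offline leave a strict majority of the roster?
majority_survives() {
  racks=$1; per_rack=$2; down=$3
  total=$(( racks * per_rack ))
  remaining=$(( (racks - down) * per_rack ))
  [ $(( remaining * 2 )) -gt "$total" ]
}

# Three racks, one down: 2/3 of the roster remains, a strict majority.
# Two racks, one down: exactly half remains, so no majority.
```

Note that a surviving majority is necessary but not sufficient: at least one roster replica of every partition must also be among the survivors, which is what the rack-aware placement rules above guarantee.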
Procedure
1. Quiesce all nodes in the target rack and trigger a recluster.

   ```
   Admin+> manage quiesce with <node-ip-1>
   Admin+> manage quiesce with <node-ip-2>
   ...
   Admin+> manage recluster
   ```

   Verify: `show statistics like quiesce` shows `effective_is_quiesced: true` on the quiesced nodes and `nodes_quiesced` equals the number of quiesced nodes on all nodes.

   ```
   Admin+> show statistics like quiesce
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2026-04-14 00:41:02 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Node                 |node1:3000|node2:3000|node3:3000|node4:3000|node5:3000|node6:3000
   effective_is_quiesced|false     |false     |true      |true      |false     |false
   nodes_quiesced       |2         |2         |2         |2         |2         |2
   pending_quiesce      |false     |false     |true      |true      |false     |false
   Number of rows: 4
   ```

2. Wait for the quiesce handoff. Verify no traffic or proxies reach the quiesced nodes.

   ```
   Admin+> show latencies
   ```

   Verify: ops/sec drops to zero on all quiesced nodes, `client_proxy_*` and `batch_sub_proxy_*` counters stop incrementing, and `from_proxy_*` counters stop on the remaining nodes. See the quiesce verification reference for full output examples.

3. Recommended: Dynamically change `cluster-name` on the nodes in that rack to a different value. This ejects them from the cluster cleanly. The nodes depart on their own rather than being detected as failed, which avoids the evade flag that would otherwise exclude them from super-majority calculations on rejoin. The static configuration file (aerospike.conf) retains the original `cluster-name`, so no file edits are needed.

   ```
   $ asinfo -v 'set-config:context=service;cluster-name=maintenance-temp' -h <node-ip>
   ```

4. Shut down `asd` on each node. If hosts will be rebooted, use ASMT to back up the indexes of each namespace before rebooting and restore them afterward.

   ```
   $ sudo systemctl stop aerospike
   # If rebooting:
   $ sudo asmt -b -v -z -p <path-to-backup-directory> -n <ns1, ns2, ...>
   # ... reboot and perform maintenance ...
   $ sudo asmt -r -v -p <path-to-backup-directory> -n <ns1, ns2, ...>
   ```

   Verify: Validate that the cluster reports zero unavailable and zero dead partitions with the rack down. This confirms the remaining racks hold a majority of the roster with at least one replica for every partition.

5. Perform maintenance on the rack's hosts.

6. Start `asd`. On startup, the node reads the original `cluster-name` from the static configuration and automatically rejoins the main cluster.

   ```
   $ sudo systemctl start aerospike
   ```

   Verify: `info network` shows all nodes have rejoined at the expected cluster size.

   ```
   Admin> info network
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2026-04-14 00:48:17 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|           IP|    Build|Migrations|~~~~~~~~~~~~~~~~~~Cluster~~~~~~~~~~~~~~~~~~|Client|  Uptime
             |                |             |         |          |Size|         Key|Integrity|      Principal| Conns|
   node1:3000| BB94B7AEB45DB52|10.0.3.1:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|01:11:25
   node2:3000|*BB9D787F6BAF3D6|10.0.3.2:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|01:11:25
   node3:3000| BB989A1BF1D8116|10.0.3.3:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|00:00:14
   node4:3000| BB9D48DF5A70CEE|10.0.3.4:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|00:00:14
   node5:3000| BB9A63D2E17F490|10.0.3.5:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|01:11:25
   node6:3000| BB9E312FA6C9B28|10.0.3.6:3000|E-8.1.1.2|     0.000|   6|A3F7C912DE4B|True     |BB9E312FA6C9B28|     5|01:11:25
   Number of rows: 6
   ```

7. Validate that the cluster has no unavailable or dead partitions and that all nodes have rejoined.

8. Wait for migrations to complete. Use `cluster-stable` to check. It returns `ERROR` while migrations are in progress and the cluster key when they are done. Run it periodically until all nodes return the same key:

   ```
   Admin+> asinfo -v 'cluster-stable:size=6;ignore-migrations=false'
   node1:3000 (10.0.3.1) returned:
   A3F7C912DE4B
   node2:3000 (10.0.3.2) returned:
   A3F7C912DE4B
   node3:3000 (10.0.3.3) returned:
   A3F7C912DE4B
   node4:3000 (10.0.3.4) returned:
   A3F7C912DE4B
   node5:3000 (10.0.3.5) returned:
   A3F7C912DE4B
   node6:3000 (10.0.3.6) returned:
   A3F7C912DE4B
   ```

9. Repeat for the next rack.
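To script the cluster-stable check, compare the key each node returns. A sketch that parses asadm-style output, where each node's key appears on the line after its `returned:` header (the `one_key` name and the input-format assumption are this example's own):

```shell
# Illustrative: given asadm asinfo output with alternating "... returned:"
# header lines and value lines, succeed only when all returned values
# (cluster keys) are identical.
one_key() {
  [ "$(grep -v 'returned:' | sort -u | wc -l)" -eq 1 ]
}

# A polling loop might then look like (command placeholder):
# until run_cluster_stable | one_key; do sleep 10; done
```

This also fails, as desired, while some nodes still return `ERROR`, since `ERROR` then differs from the keys returned by stable nodes.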
Optimization with active-rack
Starting with Database 7.2.0, the active-rack feature can both ensure full availability for two equally sized racks and shorten the procedure. When active-rack is configured, the designated active rack holds all master partitions. The passive rack holds only secondaries. This means:
- The quiesce step can be skipped for the passive rack. Since the passive rack has no masters, there is no master handoff needed, so taking it down does not cause a master gap.
- The active rack remains fully available while the passive rack is down.
With RF=2 and two racks, the active rack has no redundancy while the passive rack is down: every partition has exactly one roster replica (the master on the active rack). If any node in the active rack fails during this window, partitions mastered on that node become unavailable until it returns. Minimize the maintenance window and monitor the active rack closely.
Procedure with active-rack:
1. Designate the rack that will stay up as `active-rack`. In SC mode, this requires setting the config, reclustering, re-rostering (with `manage roster stage observed`), and reclustering again. See Change active rack for SC dynamically. Wait for migrations to complete (masters move to the active rack).

   ```
   Admin+> manage config namespace <ns> param active-rack to <rack-id>
   Admin+> manage recluster
   Admin+> manage roster stage observed ns <ns>
   Admin+> manage recluster
   ```

   The `manage roster stage observed` command prompts for interactive confirmation because the active-rack change modifies the roster prefix (for example, from no marker to `M1`). For scripted or non-interactive use, set the roster directly with `roster-set`, including the `M<rack-id>` prefix that appears in the Observed Nodes list:

   ```
   $ asinfo -v 'roster-set:namespace=<ns>;nodes=M<rack-id>|<node1>@<rack>,<node2>@<rack>,...'
   $ asadm ... -e "manage recluster"
   ```

   Verify: `asinfo -v 'cluster-stable:size=N;ignore-migrations=false'` returns the same cluster key on all nodes (migrations complete). Use `show pmap` to confirm all Primary partitions are on the active rack's nodes.

2. Recommended: Dynamically change `cluster-name` on the passive rack's nodes to eject them from the cluster. The static config retains the original `cluster-name`. See the tip above about when this step is essential.

   ```
   $ asinfo -v 'set-config:context=service;cluster-name=maintenance-temp' -h <node-ip>
   ```

3. Shut down `asd`. If hosts will be rebooted, use ASMT to back up and later restore the indexes of each namespace.

   ```
   $ sudo systemctl stop aerospike
   # If rebooting:
   $ sudo asmt -b -v -z -p <path-to-backup-directory> -n <ns1, ns2, ...>
   # ... reboot and perform maintenance ...
   $ sudo asmt -r -v -p <path-to-backup-directory> -n <ns1, ns2, ...>
   ```

4. Perform maintenance on the passive rack's hosts.

5. Start `asd`. The node reads the original `cluster-name` from static config and rejoins automatically.

   ```
   $ sudo systemctl start aerospike
   ```

   Verify: `info network` shows all nodes have rejoined at the expected cluster size.

   ```
   Admin> info network
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2026-04-14 01:12:45 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         Node|         Node ID|           IP|    Build|Migrations|~~~~~~~~~~~~~~~~~~Cluster~~~~~~~~~~~~~~~~~~|Client|  Uptime
             |                |             |         |          |Size|         Key|Integrity|      Principal| Conns|
   node1:3000| BB94B7AEB45DB52|10.0.3.1:3000|E-8.1.1.2|     0.000|   4|D74E19A3B82C|True     |BB9D48DF5A70CEE|     5|01:35:53
   node2:3000|*BB9D787F6BAF3D6|10.0.3.2:3000|E-8.1.1.2|     0.000|   4|D74E19A3B82C|True     |BB9D48DF5A70CEE|     5|01:35:53
   node3:3000| BB989A1BF1D8116|10.0.3.3:3000|E-8.1.1.2|     0.000|   4|D74E19A3B82C|True     |BB9D48DF5A70CEE|     5|00:00:14
   node4:3000| BB9D48DF5A70CEE|10.0.3.4:3000|E-8.1.1.2|     0.000|   4|D74E19A3B82C|True     |BB9D48DF5A70CEE|     5|00:00:14
   Number of rows: 4
   ```

6. Validate that the cluster has no unavailable or dead partitions and that all nodes have rejoined.

7. Wait for migrations to complete before switching `active-rack`. Use `cluster-stable` to check. It returns `ERROR` while migrations are in progress and the cluster key when they are done. Run it periodically until all nodes return the same key:

   ```
   Admin+> asinfo -v 'cluster-stable:size=4;ignore-migrations=false'
   node1:3000 (10.0.3.1) returned:
   D74E19A3B82C
   node2:3000 (10.0.3.2) returned:
   D74E19A3B82C
   node3:3000 (10.0.3.3) returned:
   D74E19A3B82C
   node4:3000 (10.0.3.4) returned:
   D74E19A3B82C
   ```

8. Switch `active-rack` to point to the now-maintained rack (repeat the set config / recluster / re-roster / recluster sequence). Wait for migrations to complete.

9. Repeat steps 2-8 for the other rack.

10. After both racks are maintained, disable `active-rack` to restore normal balanced partition distribution.

    ```
    Admin+> manage config namespace <ns> param active-rack to 0
    Admin+> manage recluster
    Admin+> manage roster stage observed ns <ns>
    Admin+> manage recluster
    ```

    Verify: `show pmap` shows Primary and Secondary partitions evenly distributed across all nodes.
Validate partitions
When you validate partitions, each node reports the global number of dead or unavailable partitions. For example, if the entire cluster has determined that 100 partitions are unavailable, all of the current nodes report 100 unavailable partitions.
Use `show pmap` to display the partition map. The Unavailable and Dead columns should be 0 for each node.

```
Admin> show pmap
~~~~~~~~~~~~~~~~~~~Partition Map Analysis~~~~~~~~~~~~~~~~~~
Namespace|      Node|Cluster Key |~~~~~~~~~~~~Partitions~~~~~~~~~~~~
         |          |            |Primary|Secondary|Unavailable|Dead
test     |node1:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
test     |node2:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
test     |node3:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
test     |node4:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
test     |          |            |   4096|     4096|          0|   0
Number of rows: 4
```

Revive dead partitions
You may wish to use your namespace in spite of potentially missing data. For example, you may have entered a maintenance state where you have disabled application use, and are preparing to reapply data from a reliable message queue or other source.
1. Identify dead partitions with `show pmap`.

   ```
   Admin> show pmap
   ~~~~~~~~~~~~~~~~~~~Partition Map Analysis~~~~~~~~~~~~~~~~~~
   Namespace|      Node|Cluster Key |~~~~~~~~~~~~Partitions~~~~~~~~~~~~
            |          |            |Primary|Secondary|Unavailable|Dead
   test     |node1:3000|A1D7BBA0D9EF|    915|      915|          0| 264
   test     |node2:3000|A1D7BBA0D9EF|    915|      915|          0| 264
   test     |node3:3000|A1D7BBA0D9EF|    915|      915|          0| 264
   test     |node4:3000|A1D7BBA0D9EF|    915|      915|          0| 264
   test     |          |            |   3660|     3660|          0| 264
   Number of rows: 4
   ```

2. Run `revive` to acknowledge the potential data loss on each server.

   ```
   Admin+> manage revive ns test
   ~~~Revive Namespace Partitions~~~
   Node      |Response
   node1:3000|ok
   node2:3000|ok
   node3:3000|ok
   node4:3000|ok
   Number of rows: 4
   ```

3. Run `recluster` to revive the dead partitions.

   ```
   Admin+> manage recluster
   Successfully started recluster
   ```

4. Verify that there are no longer any dead partitions with `show pmap`.

   ```
   Admin> show pmap
   ~~~~~~~~~~~~~~~~~~~Partition Map Analysis~~~~~~~~~~~~~~~~~~
   Namespace|      Node|Cluster Key |~~~~~~~~~~~~Partitions~~~~~~~~~~~~
            |          |            |Primary|Secondary|Unavailable|Dead
   test     |node1:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
   test     |node2:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
   test     |node3:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
   test     |node4:3000|A1D7BBA0D9EF|   1024|     1024|          0|   0
   test     |          |            |   4096|     4096|          0|   0
   Number of rows: 4
   ```
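The pmap verification can also be scripted. A sketch that scans pipe-delimited rows like the ones shown above (the `pmap_clean` name and the column positions, Unavailable as field 6 and Dead as field 7, are assumptions based on that layout):

```shell
# Illustrative: fail if any 7-field pipe-delimited pmap row has a nonzero
# Unavailable (field 6) or Dead (field 7) column.
pmap_clean() {
  awk -F'|' 'NF == 7 && ($6 + 0 != 0 || $7 + 0 != 0) { bad = 1 } END { exit bad }'
}
```

Header and separator lines pass through harmlessly, since their non-numeric fields evaluate to zero.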