Cluster problems
Migration
When recovering from a cluster integrity fault because of a cluster size change, data migration occurs on the affected nodes. Migration threads take lower priority than the others, and can take a long time to complete depending on the environment and configuration.
See Tuning migration for more information.
Cluster Size Change
If you suspect that a node may be down, you can check in the following ways:
-
Use the asadm tool with the
info
command. -
Check the server log and look for “CLUSTER SIZE = N”. This will tell you how many nodes that particular node thinks is in the cluster. At times there may not be agreement. This can be true if a set of nodes has split from the main cluster. This is sometimes referred to as a split brain. Generally, it is easiest to restart the “lost” node(s) to get it/them to rejoin the cluster.
Iptable Configuration
Customers may need to disable or re-configure IP table to ensure inter-node communications is not blocked.
IPTables interfere with the normal operation of Aerospike. We recommend that you turn it off.
To Turn OFF IPTables run one of the following commands:
sudo /usr/sbin/ntsysvuncheck iptables so that service won't be automatically startedclick Ok
or use chkconfig
sudo /sbin/chkconfig iptables off
After you disable IPTables, turn it off by issuing the command:
sudo /etc/init.d/iptables stop
The Cluster Appears Healthy but the Applications/Clients are Reporting Errors
If you stop seeing writes go to the cluster (as shown by the server logs) and the client is reporting errors/not fully happy but there are no errors in the server log, you can check for these possible causes:
- Check service iptables status – to see if any port is blocked. Even in cases where the service had been up, there have been cases where the port was reset to be blocked.
- Check clients/servers’ CPU load average. One of the causes may be a runaway process on the client or server. If one of the CPUs on the server is showing 100%, but the others are not, this may be normal behavior defrag or another process is running. This can be most commonly done using the Linux
top
command. - Check to make sure heartbeat_received (foreign counter) is increasing/changing (and not stuck) using the
asadm
tool and issue theshow statistics like heartbeat
command. The node expects heartbeats from the other nodes and the foreign counter indicates how many heartbeats have being received from other nodes – the count should be constantly increasing.
If you are using mesh, heartbeats received for self are always 0 (this is expected). In multicast, heartbeats received for self should be increasing/changing.
- Grep for the trans_in_progress keyword on the server log with this command:
tail -200f /var/log/aerospike/aerospike.log |grep 'in-progress'
Sample output:
Oct 03 2022 18:31:49 GMT: INFO (info): (ticker.c:303) in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0Oct 03 2022 18:31:59 GMT: INFO (info): (ticker.c:303) in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0Oct 03 2022 18:32:09 GMT: INFO (info): (ticker.c:303) in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0Oct 03 2022 18:32:19 GMT: INFO (info): (ticker.c:303) in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0
- If the read-write counter (“rw-hash”) is not going down, it means requests are getting backed up. Check for growth pattern.
- If proxy counter (“prox-hash”) is not zero, it means port is blocked for incoming requests. Check the fabric port 3001 (service iptables status)
- If any of the queues are constantly increasing, it may hint of some potential issues on the node unable to keep up.
Validate Multicast
If you want to use multicast, you can validate that multicast is properly enabled on your network, or a particular multicast address is being correctly routed between particular machines.
These tests will only test for the initial multicast connectivity. There are many reasons that your network equipment may initially allow multicast, but may halt it at a later time.
The best ways to ensure multicast runs smoothly are:
-
Turn off storm control to the nodes. Often the connections on the cluster will look like a denial of service attack, so storm control may stop these connections. Check
dmesg
or/var/log/messages
to identify if something specific is showing up as the cause of the error. -
To ensure group membership, do one of the following:
- Turn off IGMP snooping on the cluster vlan.
- Turn on IGMP snooping AND turn on the querier (create a multicast router).
-
Identify if any hops occur in the packets sent between the nodes in the cluster. Use the
traceroute
command to verify. If you identify multiple hops between nodes, you can attempt to increase the multicast-ttl configuration to 2 or 3 on Aerospike configuration as a workaround. -
We recommend using a very simple multicast test tool called
open-mtools
, made available by Google Code without restriction. Download and run the mtools (Note: mtools will not work properly in VM environments).
Install mtools
on both the sending and receiving machines:
$ wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/open-mtools/mtools.1.0.zip$ unzip mtools.1.0.zip$ cd mtools$ gcc mdump.c -o mdump; gcc msend.c -o msend
Run the following mdump
on the receiving machines first:
$./mdump -Q2 -v 239.1.1.1 2345 <interface ip>
Run the msend
command on a machine that will be broadcasting:
$ ./msend -m 100 -b 300 -n 600 -s 10 239.1.1.1 2345 <ttl> <interface IP>Equiv cmd line: msend -b300 -m100 -n600 -p1000 -s10 -S65536 239.1.1.1 2345Sending 600 bursts of 300 100-byte messagesSending burst of 300 msgsSending burst of 300 msgsSending burst of 300 msgsSending burst of 300 msgsSending burst of 300 msgs...Pausing before sending 'statSending stat180000 messages sent (not including 'stat')
This will send 600 bursts of 300 100-byte messages on address 239.1.1.1 on port 2345. Aerospike is currently configured for a different address, but this test succeeding will show that multicast functionality is working.
See if a machine is receiving:
$ ./mdump -Q2 -s -v 239.1.1.1 2345Equiv cmd line: mdump -p0 -Q2 -r4194304 -v 239.1.1.1 2345WARNING: tried to set SO_RCVBUF to 4194304, only got 262142echo sender equiv cmd: msend -b300 -m100 -n5 -p1000 -s10 -S65536 239.1.1.1 2345180000 msgs sent, 180000 received (not including 'stat')0.000000% loss
The normal test scenario would be to run msend from one machine and mdump on the rest of machines. You can change the parameter of -b
to send more packets in one burst. mdump
will print summary as follows which tells what is the %loss.
Verify multicast by testing sending and receiving on all nodes.
A reasonable validation scenario for four servers might be as follows:
- Run with node 01 as sender and nodes 02, 03, 04 as receivers and verified that the bytes are received fine.
- Repeating the test thrice each time using a different node as sender and the rest of the nodes as receivers and verified success.
We recommend performing these tests when a new cluster is set up, when a new machine is added, when a server moves within the datacenter, or when the network configuration changes.