Add a node to an Aerospike cluster

Overview

This section describes how to add capacity to an Aerospike cluster by adding nodes. You can add one or more nodes to a running cluster. The number of nodes you can add does not depend on the number of existing nodes. Clusters configured with rack awareness typically add the same number of nodes to each group.

Method

If configured correctly, the new node joins the cluster automatically when the Aerospike daemon starts. This document therefore mostly covers how to configure the new node.

Important points to consider when adding a new node to a cluster

The following must be enforced to ensure the node properly joins the cluster:

- For both in-memory and persistent namespaces, the total memory / disk space allocated for each namespace should be the same on every node (other than during the transient state when changing capacity).
- If the namespace is persistent, the individual devices allocated to the namespace should have the same capacity.

Refer to the Namespace storage configuration page for more information on configuring a namespace.
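
The capacity rule above can be checked mechanically before the new node is started. The following is a minimal sketch, assuming the per-node figures have already been collected by hand into a plain dictionary. The data layout, the device sizes, and the `find_capacity_mismatches` helper are hypothetical and not part of any Aerospike tool.

```python
from collections import defaultdict

def find_capacity_mismatches(nodes):
    """Return {namespace: {node: capacity}} for namespaces whose capacity
    differs between nodes.

    `nodes` maps a node address to {namespace: (memory_bytes, [device_bytes, ...])}.
    This layout is a hypothetical one chosen for the example.
    """
    by_namespace = defaultdict(dict)
    for node, namespaces in nodes.items():
        for ns, (memory, devices) in namespaces.items():
            # Sort device sizes so that device ordering does not matter.
            by_namespace[ns][node] = (memory, tuple(sorted(devices)))
    return {
        ns: per_node
        for ns, per_node in by_namespace.items()
        if len(set(per_node.values())) > 1
    }

GIB = 1024 ** 3
cluster = {
    "10.0.0.101": {"test": (4 * GIB, []), "bar": (4 * GIB, [100 * GIB] * 4)},
    "10.0.0.103": {"test": (4 * GIB, []), "bar": (4 * GIB, [100 * GIB] * 4)},
    # Planned new node: one device is smaller than on the other nodes.
    "10.0.0.100": {"test": (4 * GIB, []), "bar": (4 * GIB, [100 * GIB] * 3 + [50 * GIB])},
}
mismatches = find_capacity_mismatches(cluster)
print(sorted(mismatches))
```

In this made-up data set only the `bar` namespace is reported, because one device on the planned node is smaller than its counterparts.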

Important when adding empty nodes: For server versions using the older paxos and heartbeat protocol (prior to 3.13), it may be preferable to wait for migrations to complete between adding empty nodes. The older cluster protocol dropped redundant partition copies before the migration for the partition completed, leaving some partitions (those that would be owned by the newly added node) with one copy fewer than dictated by the replication factor. This creates a risk of data unavailability in the unlikely event of a node leaving the cluster before migrations complete. The more nodes added at the same time, the more partitions lose a copy, and the more data is at risk if a node leaves the cluster. With the newer cluster protocol (3.13 and later), replica copies of a partition are kept until the migration of that partition completes.
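
One way to act on this advice is to start the next empty node only after every node reports no remaining migrations. The following is a minimal sketch, assuming the statistics have already been fetched as semicolon-separated `name=value` text. The statistic names `migrate_tx_partitions_remaining` and `migrate_rx_partitions_remaining`, the sample values, and both helper functions are assumptions made for illustration; check the metrics reference for your server version before relying on them.

```python
def parse_stats(raw):
    """Parse 'a=1;b=2' style statistics text into a dict of strings."""
    pairs = (item.split("=", 1) for item in raw.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}

def migrations_complete(raw_stats_per_node, stat_names):
    """True when every listed statistic is 0 on every node.

    A statistic that is missing from a node's output counts as "not complete",
    so a wrong statistic name fails safe instead of reporting success.
    """
    for raw in raw_stats_per_node.values():
        stats = parse_stats(raw)
        for name in stat_names:
            if stats.get(name) != "0":
                return False
    return True

# Statistic names are assumptions; verify them for your server version.
STAT_NAMES = ("migrate_tx_partitions_remaining", "migrate_rx_partitions_remaining")

# Hypothetical sample data: the second node is still sending partitions.
sample = {
    "10.0.0.101": "objects=1200;migrate_tx_partitions_remaining=0;migrate_rx_partitions_remaining=0",
    "10.0.0.103": "objects=1180;migrate_tx_partitions_remaining=37;migrate_rx_partitions_remaining=0",
}
print(migrations_complete(sample, STAT_NAMES))
```

Treating a missing statistic as "not complete" is a deliberate choice here: it keeps a mistyped name from being read as a green light to add the next node.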

Configuration setup

Mesh mode

See configure Aerospike database for a detailed description of various sections of the configuration file and their relevant configuration parameters.
The steps below assume the following:
- A cluster (at least 1 node) is already set up in mesh mode.
- The Aerospike daemon is installed on the new node. See Install Aerospike for the Aerospike installation procedure.
The example used to illustrate this procedure is a 2-node cluster to which a 3rd node is added.

Example: 2 node cluster

Admin> info
                         Node               Node                Ip        Build   Cluster            Cluster     Cluster         Principal   Client     Uptime
                            .                 Id                 .            .      Size                Key   Integrity                 .    Conns          .
10.0.0.101:3000                 BB90A09E3270008    10.0.0.101:3000   E-3.11.0.2         2   F0322B636B7B0FE9   True        BB9AF1F8D270008        5   09:26:10
10.0.0.103:3000                 BB9235677270008    10.0.0.103:3000   E-3.11.0.2         2   F0322B636B7B0FE9   True        BB9AF1F8D270008       10   09:26:14
Number of rows: 2

Configuration setup for node 10.0.0.101 which is already part of the cluster

# Aerospike database configuration file for deployments using mesh heartbeats.
service {
        user root
        group root
        pidfile /var/run/aerospike/asd.pid
        service-threads 20 # Should be 5 times number of vCPUs for 4.7+ and
                           # at least one SSD namespace, otherwise number of vCPUs
        proto-fd-max 15000
        node-id-interface eth1
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
  service {
    address any
    access-address 10.0.0.101
    port 3000
  }

  heartbeat {
    mode mesh
    address 10.0.0.101
    port 3002 # Heartbeat port for this node.

    # List one or more other nodes, one ip-address & port per line:
    # Please note that we do not have the address of the incoming node in this list
    mesh-seed-address-port 10.0.0.103  3002
    mesh-seed-address-port 10.0.0.101  3002
    # Having the node itself as a mesh seed node is allowed
    # and helps with consistent configuration files across the cluster

    interval 250
    timeout 10
  }

  fabric {
    port 3001
  }

  info {
    port 3003
  }
}

namespace test {
                                # Data in memory without persistence namespace
  replication-factor 2
  memory-size 4G
  storage-engine memory
}

namespace bar {
    memory-size 4G            # Maximum memory allocation for primary and secondary indexes.

    # Warning - legacy data in defined raw partition devices will be erased.
    # These partitions must not be mounted by the file system.

    storage-engine device {       # Configure the storage-engine to use persistence
                                  # Add raw device(s). Maximum size is 2 TiB.
        device /dev/sdb1
        device /dev/sdc1
        device /dev/sdd1
        device /dev/sde1

        # The 2 lines below optimize for SSD.
        scheduler-mode noop
        write-block-size 128K # adjust block size to make it efficient for SSDs.
    }
}

Configuration setup for node 10.0.0.103 which is already part of the cluster

# Aerospike database configuration file for deployments using mesh heartbeats.
service {
        user root
        group root
        pidfile /var/run/aerospike/asd.pid
        service-threads 20 # Should be 5 times number of vCPUs for 4.7+ and
                           # at least one SSD namespace, otherwise number of vCPUs
        proto-fd-max 15000
        node-id-interface eth1
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
    service {
        address any
        access-address 10.0.0.103
        port 3000
    }

    heartbeat {
        mode mesh
        address 10.0.0.103
        port 3002 # Heartbeat port for this node.

        # List one or more other nodes, one ip-address & port per line:
        # Please note that we do not have the address of the incoming node in this list
        mesh-seed-address-port 10.0.0.101  3002
        mesh-seed-address-port 10.0.0.103  3002

        interval 250
        timeout 10
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace test {
                                  # Data in memory without persistence namespace
    replication-factor 2
    memory-size 4G
    storage-engine memory
}

namespace bar {
    memory-size 4G                # Maximum memory allocation for primary and secondary indexes.

    # Warning - legacy data in defined raw partition devices will be erased.
    # These partitions must not be mounted by the file system.

    storage-engine device {       # Configure the storage-engine to use persistence
                                  # Add raw device(s). Maximum size is 2 TiB
        device /dev/sdb3
        device /dev/sdc3
        device /dev/sdd3
        device /dev/sde3
        # The 2 lines below optimize for SSD.
        scheduler-mode noop
        write-block-size 128K # adjust block size to make it efficient for SSDs.
    }
}

Configuration for the new incoming node with IP 10.0.0.100

# Aerospike database configuration file for deployments using mesh heartbeats.
service {
        user root
        group root
        pidfile /var/run/aerospike/asd.pid
        service-threads 20 # Should be 5 times number of vCPUs for 4.7+ and
                           # at least one SSD namespace, otherwise number of vCPUs
        proto-fd-max 15000
        node-id-interface eth1
}

logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}

network {
    service {
        address any
        access-address 10.0.0.100
        port 3000
    }

    heartbeat {
        mode mesh
        address 10.0.0.100
        port 3002 # Heartbeat port for this node.

        # List one or more other nodes, one ip-address & port per line:
        mesh-seed-address-port 10.0.0.100  3002
        mesh-seed-address-port 10.0.0.101  3002
        mesh-seed-address-port 10.0.0.103  3002

        interval 250
        timeout 10
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace test {
                                  # Data in memory without persistence namespace
    replication-factor 2
    memory-size 4G
    storage-engine memory
}

namespace bar {
    memory-size 4G                # Maximum memory allocation for primary and secondary indexes.
    # Warning - legacy data in defined raw partition devices will be erased.
    # These partitions must not be mounted by the file system.

    storage-engine device {       # Configure the storage-engine to use persistence
                                  # Add raw device(s). Maximum size is 2 TiB
        device /dev/sdb
        device /dev/sdc
        device /dev/sdd
        device /dev/sde
        # The 2 lines below optimize for SSD.
        scheduler-mode noop
        write-block-size 128K # adjust block size to make it efficient for SSDs.
    }
}
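
Across the three files above, the heartbeat stanza differs only in its `address` line and its list of seeds, so it can be generated from a node address and a seed list. The sketch below is illustrative only: `render_mesh_heartbeat` is a hypothetical helper, not an Aerospike tool, and its defaults are simply the values used in this example.

```python
def render_mesh_heartbeat(node_ip, seed_ips, port=3002, interval=250, timeout=10):
    """Render a mesh heartbeat stanza for one node as configuration text."""
    lines = [
        "heartbeat {",
        "    mode mesh",
        f"    address {node_ip}",
        f"    port {port}",
        "",
    ]
    # One mesh-seed-address-port line per seed node.
    lines += [f"    mesh-seed-address-port {ip} {port}" for ip in seed_ips]
    lines += ["", f"    interval {interval}", f"    timeout {timeout}", "}"]
    return "\n".join(lines)

# Stanza for the incoming node, seeded with itself and both existing nodes.
stanza = render_mesh_heartbeat("10.0.0.100", ["10.0.0.100", "10.0.0.101", "10.0.0.103"])
print(stanza)
```

Generating the stanza from one shared seed list makes it harder for the configuration files to drift apart between nodes.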

Multicast mode

Multicast is another communication mode for the heartbeat protocol, so every section of the configuration file remains the same except for the heartbeat section. See Network Heartbeat Configuration for more information on the heartbeat section.

In our example, the heartbeat section of the incoming node should be:

...
  heartbeat {
    mode multicast                  # Send heartbeats using Multicast
    multicast-group 239.1.99.2      # multicast address
    port 9918                       # multicast port
    address 10.0.0.100              # (Optional) (Default any) IP of the NIC to
                                    # use to send out heartbeat and bind
                                    # fabric ports
    interval 150                    # Number of milliseconds between heartbeats
    timeout 10                      # Number of heartbeat intervals to wait
                                    # before timing out a node
  }
...
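
As the comments in the stanza indicate, `interval` and `timeout` together determine how long a silent node is tolerated before it is timed out: roughly the interval multiplied by the number of intervals. A quick calculation with the values from the two heartbeat examples in this document (the helper name is hypothetical):

```python
def node_timeout_ms(interval_ms, timeout_intervals):
    """Approximate time without heartbeats before a node is timed out."""
    return interval_ms * timeout_intervals

print(node_timeout_ms(150, 10))  # multicast example: 1500 ms
print(node_timeout_ms(250, 10))  # mesh example: 2500 ms
```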

Verify node joined the cluster

To verify that the node is now part of the cluster, run the info command in the asadm tool on any of the nodes. For our example, the info command outputs the following:

Admin> info
                         Node               Node                Ip        Build   Cluster            Cluster     Cluster         Principal   Client     Uptime
                            .                 Id                 .            .      Size                Key   Integrity                 .    Conns          .
10.0.0.101:3000                 BB90A09E3270008    10.0.0.101:3000   E-3.11.0.2         3   F0322B636B7B0FE9   True        BB9AF1F8D270008        5   51:49:48
10.0.0.103:3000                 BB9235677270008    10.0.0.103:3000   E-3.11.0.2         3   F0322B636B7B0FE9   True        BB9AF1F8D270008       10   51:49:36
10.0.0.100:3000                *BB9AF1F8D270008    10.0.0.100:3000   E-3.11.0.2         3   F0322B636B7B0FE9   True        BB9AF1F8D270008       10   03:05:39
Number of rows: 3
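
In this output, every node should report the same cluster size and cluster key, and the size should match the number of nodes you expect. The following sketch applies that check to values copied by hand from the table. The `NodeInfo` tuple and the `cluster_is_consistent` function are hypothetical names used only for illustration; the sketch does not call asadm.

```python
from typing import NamedTuple

class NodeInfo(NamedTuple):
    node: str
    cluster_size: int
    cluster_key: str

def cluster_is_consistent(rows, expected_size):
    """True when all nodes agree on size and key and the size is as expected."""
    sizes = {row.cluster_size for row in rows}
    keys = {row.cluster_key for row in rows}
    return len(rows) == expected_size and sizes == {expected_size} and len(keys) == 1

# Values copied from the example output above.
rows = [
    NodeInfo("10.0.0.101:3000", 3, "F0322B636B7B0FE9"),
    NodeInfo("10.0.0.103:3000", 3, "F0322B636B7B0FE9"),
    NodeInfo("10.0.0.100:3000", 3, "F0322B636B7B0FE9"),
]
print(cluster_is_consistent(rows, expected_size=3))
```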

The same can be verified by searching the server log at /var/log/aerospike/aerospike.log:

grep 'CLUSTER-SIZE' /var/log/aerospike/aerospike.log

You should see (from our example):

Jan 28 2017 01:00:03 GMT: INFO (info): (ticker.c:169) NODE-ID bb9af1f8d270008 CLUSTER-SIZE 3 CLUSTER-NAME myCluster
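
If the check needs to be scripted, the cluster size can be pulled out of such a line with a regular expression. This is a minimal sketch that assumes the ticker line keeps the `NODE-ID ... CLUSTER-SIZE ...` layout shown above; the pattern and the `cluster_size_from_log_line` helper are hypothetical.

```python
import re

# Assumes the "NODE-ID <id> CLUSTER-SIZE <n>" layout of the example line.
TICKER_RE = re.compile(r"NODE-ID (\S+) CLUSTER-SIZE (\d+)")

def cluster_size_from_log_line(line):
    """Return (node_id, cluster_size) from a ticker line, or None if absent."""
    match = TICKER_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

line = ("Jan 28 2017 01:00:03 GMT: INFO (info): (ticker.c:169) "
        "NODE-ID bb9af1f8d270008 CLUSTER-SIZE 3 CLUSTER-NAME myCluster")
print(cluster_size_from_log_line(line))
```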