# Best practices for Aerospike and Linux

This page describes stability and performance best practices for Aerospike and the Linux operating system.

## Overview

When Aerospike Database starts, it verifies certain best practices and logs a warning for each violation it finds.

## Best practices checks

-   For production environments, set [`enforce-best-practices`](https://aerospike.com/docs/database/reference/config#service__enforce-best-practices) to `true` so that the server shuts down if any best practices are violated during startup.

::: note
This parameter is `false` by default. However, not following the best practices on this page can lead to degraded performance and crashes on some systems.
:::

-   When `enforce-best-practices` is set to `false`, you can still monitor violations with the [`failed_best_practices`](https://aerospike.com/docs/database/reference/metrics#node_stats__failed_best_practices) Boolean statistic, or the [`best-practices`](https://aerospike.com/docs/database/reference/info#best-practices) info command.
    
-   The `failed_best_practices` statistic reports `true` if any best practices are violated during startup. The `best-practices` info command returns the list of best practices that failed.
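
As a minimal sketch of how these checks might be scripted, the following snippet tests whether `enforce-best-practices` is set to `true` in a configuration file and, if the `asinfo` tool happens to be installed, runs the `best-practices` info command. The helper name `bp_enforced` and the default path `/etc/aerospike/aerospike.conf` are assumptions for illustration; adjust them to your installation.

```bash
#!/usr/bin/env bash
# Sketch: report whether enforce-best-practices is enabled in a config file.
# The default path is an assumption; pass your own path as the first argument.
bp_enforced() {
    local conf="${1:-/etc/aerospike/aerospike.conf}"
    # Match an uncommented line of the form "enforce-best-practices true".
    grep -Eq '^[[:space:]]*enforce-best-practices[[:space:]]+true([[:space:]]|$)' "$conf" 2>/dev/null
}

if bp_enforced; then
    echo "enforce-best-practices is enabled"
else
    echo "enforce-best-practices is not enabled (or the config file was not found)"
fi

# If asinfo is installed and a node is running, ask it which checks failed.
if command -v asinfo >/dev/null 2>&1; then
    asinfo -v 'best-practices' || true
fi
```

The function only inspects the file; it does not confirm what a running server actually loaded.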
    

### Best practices checked at startup

-   [Aerospike Database best practices](#aerospike-database-best-practices)
    -   [service-threads](#service-threads)
    -   [indexes-memory-budget](#indexes-memory-budget)
    -   [Namespace device size](#namespace-device-size)
-   [Linux best practices](#linux-best-practices)
    -   [All-Flash deployment](#all-flash-deployment)
    -   [RAM reserved for Linux operating system resources](#ram-reserved-for-linux-operating-system-resources)
    -   [min\_free\_kbytes](#min_free_kbytes)
    -   [swappiness](#swappiness)
    -   [THP - Transparent Huge Pages](#thp---transparent-huge-pages)
    -   [Zone reclaim mode](#zone-reclaim-mode)
    -   [NVMe partitioning](#nvme-partitioning)
    -   [vm.max\_map\_count](#vmmax_map_count)
    -   [Containers - networks](#containers---networks)
    -   [Maximum open file limits](#maximum-open-file-limits)
        -   [Non-systemd](#non-systemd)
        -   [systemd](#systemd)
    -   [somaxconn](#somaxconn)
    -   [rmem-max](#rmem-max)
    -   [wmem-max](#wmem-max)
    -   [shmall](#shmall)
    -   [shmmax](#shmmax)

## Aerospike Database best practices

The best practices listed in this section are specific to the Aerospike Database.

### service-threads

The recommended value depends on the configuration of the namespaces in the `aerospike.conf` file.

The [`service-threads`](https://aerospike.com/docs/database/reference/config#service__service-threads) best practice is checked at server startup.

-   If any namespace has either of the following configurations, the recommended value for `service-threads` is at least 5 per CPU/vCPU, which is also the default for such configurations:
    -   [`storage-engine`](https://aerospike.com/docs/database/reference/config#namespace__storage-engine) set to `device` and [`data-in-memory`](https://aerospike.com/docs/database/reference/config#namespace__data-in-memory) set to `false`, or
    -   `data-in-memory` set to `false` and [`commit-to-device`](https://aerospike.com/docs/database/reference/config#namespace__commit-to-device) set to `true`.
-   Otherwise, the recommended value for `service-threads` is at least 1 per CPU/vCPU, which is also the default for such configurations. This applies when:
    -   `storage-engine` is set to `pmem` or `memory`, or
    -   `storage-engine` is set to `device` with `data-in-memory` set to `true` and `commit-to-device` set to `false`.
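
The rule above can be sketched as a small shell function. This is a simplified illustration, not the server's own logic: the function name `recommended_service_threads` is hypothetical, it uses the 5-per-CPU default stated above, and it models only the two clear-cut cases (it ignores `commit-to-device`).

```bash
#!/usr/bin/env bash
# Sketch: suggest a service-threads value for one namespace configuration.
# Usage: recommended_service_threads CPUS STORAGE_ENGINE DATA_IN_MEMORY
# Assumption: commit-to-device combinations are not modeled here.
recommended_service_threads() {
    local cpus="$1" engine="$2" data_in_memory="$3"
    if [ "$engine" = "device" ] && [ "$data_in_memory" = "false" ]; then
        echo $(( cpus * 5 ))   # data is read from the device: 5 per CPU/vCPU
    else
        echo "$cpus"           # pmem, memory, or data-in-memory: 1 per CPU/vCPU
    fi
}

# nproc reports the number of available CPUs; fall back to 1 if it is missing.
cpus=$(nproc 2>/dev/null || echo 1)
echo "device, data-in-memory false: $(recommended_service_threads "$cpus" device false)"
echo "memory engine:                $(recommended_service_threads "$cpus" memory true)"
```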

### indexes-memory-budget

We recommend that the cumulative sum of the `memory-size` configuration across all namespaces not exceed the total memory on the machine.

The [`indexes-memory-budget`](https://aerospike.com/docs/database/reference/config#namespace__indexes-memory-budget) best practice is checked at server startup.

::: note
[`memory-size`](https://aerospike.com/docs/database/reference/config#namespace__memory-size) was deprecated in Database 7.0.0. For more information, see [Aerospike Database 7.0.0 Release Notes](https://aerospike.com/docs/database/release/7-0).
:::

### Namespace device size

All storage devices in a namespace should be the same size, within a tolerance of 8 MiB.

The namespace device size best practice is checked at server startup.
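
A hedged sketch of a pre-flight check for this rule follows. The function name `sizes_within_tolerance` and the sample sizes are illustrative, and comparing the largest and smallest device is an assumption about how the tolerance is applied. On a real host, `blockdev --getsize64` can supply each device size in bytes.

```bash
#!/usr/bin/env bash
# Sketch: succeed if all given sizes (in bytes) differ by at most 8 MiB.
# Usage: sizes_within_tolerance SIZE_BYTES [SIZE_BYTES...]
sizes_within_tolerance() {
    local tolerance=$(( 8 * 1024 * 1024 ))   # 8 MiB in bytes
    local min="$1" max="$1" size
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then min="$size"; fi
        if [ "$size" -gt "$max" ]; then max="$size"; fi
    done
    [ $(( max - min )) -le "$tolerance" ]
}

# Hypothetical sizes; real values could come from:
#   blockdev --getsize64 /dev/<device>
if sizes_within_tolerance 1000204886016 1000204886016 1000210000000; then
    echo "device sizes are within 8 MiB of each other"
else
    echo "device sizes differ by more than 8 MiB"
fi
```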

### Initialize new cluster nodes with SMD

When adding a new node to an existing cluster, it is a best practice to initialize it by copying the cluster’s shared metadata (SMD) files from an active cluster node.

See [Directory structure - Run time directories](https://aerospike.com/docs/database/8.0.0/manage/database/directory-structure/#run-time-directories) for more information about the SMD directory.

## Linux best practices

The best practices listed in this section are specific to the Linux operating system.

### All-Flash deployment

In an All-Flash deployment, the following kernel parameters are required. `enforce-best-practices` verifies that these kernel parameters are set to at least the expected values. The All-Flash deployment best practice is checked at server startup.

```plaintext
/proc/sys/vm/dirty_bytes = 16777216
/proc/sys/vm/dirty_background_bytes = 1
/proc/sys/vm/dirty_expire_centisecs = 1
/proc/sys/vm/dirty_writeback_centisecs = 10
```

-   When running as non-root, you must set these values before running the Aerospike server.
-   When running as root, the server configures them automatically.

Either way, the node will not start if these parameters can’t be correctly set manually or automatically by the server.
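
When running as non-root, a read-only check such as the following sketch can confirm the values before the server starts. The helper `at_least` is a hypothetical name, and the script only reads the files; it does not change any kernel setting.

```bash
#!/usr/bin/env bash
# Sketch: compare the All-Flash kernel parameters with their expected minimums.
at_least() {
    # Succeeds when ACTUAL ($1) is greater than or equal to EXPECTED ($2).
    [ "$1" -ge "$2" ]
}

while read -r name expected; do
    file="/proc/sys/vm/$name"
    [ -r "$file" ] || { echo "$name: not readable, skipped"; continue; }
    actual=$(cat "$file")
    if at_least "$actual" "$expected"; then
        echo "$name: ok ($actual)"
    else
        echo "$name: too low ($actual, expected at least $expected)"
    fi
done <<'EOF'
dirty_bytes 16777216
dirty_background_bytes 1
dirty_expire_centisecs 1
dirty_writeback_centisecs 10
EOF
```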

### RAM reserved for Linux operating system resources

To help prevent out-of-memory issues with host hardware, keep 10-15% of total physical memory reserved for Linux system resources. The RAM reserved for Linux operating system resources best practice is checked at server startup.

The following may influence memory usage:

-   Overhead from the Linux OS and services.
-   Overhead caused by memory fragmentation.
-   Overhead from Aerospike [primary index](https://aerospike.com/docs/database/8.0.0/manage/namespace/primary-index/) and [secondary indexes](https://aerospike.com/docs/database/8.0.0/manage/namespace/primary-index/).
-   Namespace data for [in-memory namespaces](https://aerospike.com/docs/database/8.0.0/manage/namespace). For more information, see [Capacity planning](https://aerospike.com/docs/database/8.0.0/manage/planning/capacity).
-   Overhead from cache and queue-related configurations, including `max-write-cache` (per device) and `post-write-cache` (per device). See [Block size and cache size](https://aerospike.com/docs/database/8.0.0/learn/architecture/data-storage/resilience) for more information.
-   Overhead from the Aerospike process.
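
To put the 10-15% guideline into numbers for a given host, a sketch like the following reads `MemTotal` from `/proc/meminfo` and prints both ends of the range. The function name `reserve_kb` is an assumption for illustration.

```bash
#!/usr/bin/env bash
# Sketch: compute how much RAM (in kB) to leave for the operating system.
# Usage: reserve_kb TOTAL_KB PERCENT
reserve_kb() {
    echo $(( $1 * $2 / 100 ))   # integer arithmetic, rounds down
}

# MemTotal is reported in kB; default to 0 if /proc/meminfo is unavailable.
total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null || true)
total_kb=${total_kb:-0}

echo "MemTotal:    ${total_kb} kB"
echo "Reserve 10%: $(reserve_kb "$total_kb" 10) kB"
echo "Reserve 15%: $(reserve_kb "$total_kb" 15) kB"
```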

### min\_free\_kbytes

The [`min_free_kbytes`](https://www.kernel.org/doc/Documentation/sysctl/vm.txt) kernel parameter controls how much memory to keep free from filesystem caches. Normally, the kernel occupies almost all free RAM with filesystem caches and frees up memory for allocation by processes as required. Because Aerospike performs large allocations in shared memory (1GB chunks), the default kernel value may result in an unexpected out-of-memory (OOM) kill.

The `min_free_kbytes` best practice is checked at server startup.

We recommend that you configure the parameter to a minimum of 1.1GB, preferably 1.25GB if using cloud vendor drivers as these can make large allocations. This ensures that Linux always keeps enough memory available and free for large allocations.

::: tip
If `min_free_kbytes` is set too high, it will likely cause an out-of-memory error in Aerospike.
:::

1.  Check the parameter value.
    
    ```plaintext
    cat /proc/sys/vm/min_free_kbytes
    ```
    
2.  If the value is lower than recommended, apply the new value to the running kernel and persist it across reboots.
    
    ```plaintext
    echo 3 > /proc/sys/vm/drop_caches
    
    echo 1310720 > /proc/sys/vm/min_free_kbytes
    
    echo "vm.min_free_kbytes=1310720" >> /etc/sysctl.conf
    ```
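
Appending with `>>` adds a duplicate line each time the command is re-run. As a hedged alternative, the following sketch replaces any existing assignment for a key before writing the new one. The function name `persist_sysctl` is hypothetical, and the demonstration writes to a temporary file rather than `/etc/sysctl.conf`.

```bash
#!/usr/bin/env bash
# Sketch: set KEY=VALUE in a sysctl-style file without creating duplicates.
# Usage: persist_sysctl KEY VALUE [FILE]
persist_sysctl() {
    local key="$1" value="$2" file="${3:-/etc/sysctl.conf}"
    local tmp
    tmp=$(mktemp)
    touch "$file"
    # Keep every line whose key (text before "=", whitespace removed) differs.
    awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
    echo "${key}=${value}" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration against a temporary file that already holds an old value.
demo=$(mktemp)
echo "vm.min_free_kbytes=67584" > "$demo"
persist_sysctl vm.min_free_kbytes 1310720 "$demo"
persist_sysctl vm.min_free_kbytes 1310720 "$demo"   # second run adds no duplicate
cat "$demo"   # prints: vm.min_free_kbytes=1310720
```

The same function could be reused for the other `sysctl.conf` entries on this page, such as `vm.swappiness` and `vm.max_map_count`.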
    

### swappiness

For low-latency operations, using swap to any extent drastically slows down performance. We recommend that you disable swap with `swapoff -a` and remove the swap partition from `/etc/fstab`.

If that’s not possible for operational reasons, set the swappiness to 0:

```plaintext
echo 0 > /proc/sys/vm/swappiness

echo "vm.swappiness=0" >> /etc/sysctl.conf
```

The swappiness best practice is checked at server startup.
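
To confirm the current state, a sketch such as the following counts active swap areas from `/proc/swaps` and prints the swappiness value. The function name `active_swap_count` is an assumption for illustration.

```bash
#!/usr/bin/env bash
# Sketch: count active swap areas listed in a /proc/swaps-style file.
# Usage: active_swap_count [FILE]
active_swap_count() {
    local file="${1:-/proc/swaps}"
    # The first line is a header; every other non-empty line is a swap area.
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$file"
}

if [ -r /proc/swaps ]; then
    echo "Active swap areas: $(active_swap_count)"
fi
if [ -r /proc/sys/vm/swappiness ]; then
    echo "vm.swappiness: $(cat /proc/sys/vm/swappiness)"
fi
```

A count of 0 means no swap is in use, which matches the `swapoff -a` recommendation.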

### THP - Transparent Huge Pages

The best practices startup check permits `thp-enabled` and `thp-defrag` to be set to either `madvise` or `never`.

Aerospike _strongly_ recommends disabling Transparent Huge Pages (THP) before the Aerospike service starts. The Linux kernel uses THP to improve overall system responsiveness and allocation speed. However, it can be counterproductive for high-throughput and low-latency databases when they perform multiple small allocations. THP can cause the system to run out of RAM, with symptoms similar to a memory leak. THP can also add latency when defragmentation locks pages.
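
The kernel exposes the THP settings under `/sys/kernel/mm/transparent_hugepage/`, where the active choice is shown in square brackets (for example, `always [madvise] never`). The following sketch reads both files and reports whether each is set to `madvise` or `never`. The function names `thp_selected` and `thp_ok` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: extract the bracketed (active) value from a THP setting string.
thp_selected() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeeds when the active value is one the startup check permits.
thp_ok() {
    case "$(thp_selected "$1")" in
        madvise|never) return 0 ;;
        *) return 1 ;;
    esac
}

for name in enabled defrag; do
    file="/sys/kernel/mm/transparent_hugepage/$name"
    if [ -r "$file" ]; then
        setting=$(cat "$file")
        if thp_ok "$setting"; then
            echo "$name: ok ($(thp_selected "$setting"))"
        else
            echo "$name: change to madvise or never (currently: $setting)"
        fi
    fi
done

# To disable THP on the running kernel (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/defrag
```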

### Zone reclaim mode

For NUMA architectures, `zone_reclaim_mode` causes aggressive reclaims and memory scans when enabled.

We recommend that you disable `zone_reclaim_mode` by setting `/proc/sys/vm/zone_reclaim_mode` to `0`.

The [`zone_reclaim_mode`](https://www.kernel.org/doc/Documentation/sysctl/vm.txt) best practice is checked at server startup.

### NVMe partitioning

NVMe devices are normally capable of 4 simultaneous I/O operations.

Due to their connection design, these operations occupy 4 PCIe I/O lanes. On raw devices, Aerospike suggests that you partition each NVMe device into at least 4 partitions. This allows 4 write threads to operate in Aerospike, which greatly improves disk speed.

If you use a single partition as a raw device with Aerospike, `iostat` may show 100% disk utilization (`%util`) while the `await` queuing statistic shows no queueing (`await` below 1 means no queueing is happening). This indicates that the disk itself can do more, but the PCIe lanes in use are already saturated.

The NVMe partitioning best practice is checked at server startup.

See [Partition your flash devices](https://aerospike.com/docs/database/8.0.0/manage/planning/ssd/setup/#partition-your-flash-devices) for details on device partitioning.
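
As an illustration only, the following sketch prints the `parted` commands that would split a device into equal partitions. It does not run them, because partitioning destroys existing data. The function name `partition_plan` and the device name `/dev/nvme0n1` are assumptions; review the printed commands against the linked guide before running anything.

```bash
#!/usr/bin/env bash
# Sketch: print (not execute) parted commands for COUNT equal partitions.
# Usage: partition_plan DEVICE [COUNT]
partition_plan() {
    local device="$1" count="${2:-4}" i start end
    echo "parted -s $device mklabel gpt"
    for (( i = 0; i < count; i++ )); do
        start=$(( i * 100 / count ))
        end=$(( (i + 1) * 100 / count ))
        echo "parted -s -a optimal $device mkpart primary ${start}% ${end}%"
    done
}

# Hypothetical device name; prints one mklabel line and four mkpart lines.
partition_plan /dev/nvme0n1 4
```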

### vm.max\_map\_count

If you use Kubernetes or Docker, we recommend that you raise the `max_map_count` parameter, which controls the maximum number of memory map areas that a process can have. If `max_map_count` is too low, it may result in memory allocation issues during normal operation.

To change this parameter:

```plaintext
echo "vm.max_map_count=262144" >> /etc/sysctl.conf

echo 262144 > /proc/sys/vm/max_map_count
```

The `vm.max_map_count` best practice is checked at server startup.

::: note
After modifying `max_map_count`, you may need to restart the Docker daemon and all its containers for the changes to take effect.
:::

### Containers - networks

When using Kubernetes or Docker, the default behavior is to use the `EXPOSE` and `PUBLISH` features to publish ports from a container through the host to the outside world. This causes the Docker process to monitor a given port on the host and forward all packets to the container itself. This is highly inefficient and may cause latency, packet drops, and crashes within the containers under heavy loads.

If using containers, we recommend that you configure those containers to either:

-   Use bridged networking, rather than Docker-only NAT, or
-   Use iptables to forward packets to the Aerospike containers on the NAT network, rather than the Docker `EXPOSE` port feature.

If using a Docker container, run it with the `--net=host` flag to inherit the `/proc/sys/net/core/*mem_max` files. Without that flag, the maximums cannot be modified from within that environment.

The containers - networks best practice is checked at server startup.

See the [Docker configuration manuals](https://docs.docker.com/network/bridge/) for details.

### Maximum open file limits

Aerospike clients perform dynamic connections to the database nodes as required. This may result in many active connections. These connections, on a Linux system, hold a file descriptor and are treated as open files.

The Aerospike configuration parameter [`proto-fd-max`](https://aerospike.com/docs/database/reference/config#service__proto-fd-max) specifies the maximum number of allowed client connections. The Aerospike server does not start if `proto-fd-max` is higher than the Linux system’s maximum open files configuration for the process.

After installing Aerospike, verify that the maximum number of open files for the `asd` process is higher than `proto-fd-max`, to allow for fabric and heartbeat connections as well as any open files.

-   Mesh heartbeat and fabric should run on the same NIC.

The maximum open file limits best practice is checked at server startup.
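
A hedged sketch of this comparison follows. It reads the soft "Max open files" limit of a running `asd` process from `/proc/<pid>/limits`, falling back to the current shell's limit when no such process is found. The function names and the `proto_fd_max` value are placeholders; substitute the value from your own `aerospike.conf`.

```bash
#!/usr/bin/env bash
# Sketch: print the soft "Max open files" limit from /proc/<pid>/limits (stdin).
soft_nofile() {
    awk '/^Max open files/ { print $4 }'
}

# Succeeds when LIMIT ($1) is higher than PROTO_FD_MAX ($2).
fd_limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -gt "$2" ]
}

proto_fd_max=15000   # hypothetical value; use the one from your aerospike.conf

pid=$(pgrep -x asd 2>/dev/null | head -n 1 || true)
if [ -n "$pid" ] && [ -r "/proc/$pid/limits" ]; then
    limit=$(soft_nofile < "/proc/$pid/limits")
else
    limit=$(ulimit -n)   # no asd process found: use the current shell's limit
fi

if fd_limit_ok "$limit" "$proto_fd_max"; then
    echo "open file limit $limit is higher than proto-fd-max $proto_fd_max"
else
    echo "open file limit $limit is not higher than proto-fd-max $proto_fd_max"
fi
```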

#### Non-systemd

Edit `/etc/init.d/aerospike.conf` and change the value of the following line.

```plaintext
ulimit -n 100000
```

The non-systemd best practice is checked at server startup.

#### systemd

The systemd best practice is checked at server startup.

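1.  Create an `override.conf` drop-in file that sets the open file limit for the Aerospike service.
    
    ```plaintext
    # Create the drop-in directory if it does not exist yet.
    mkdir -p /etc/systemd/system/aerospike.service.d
    cat <<EOF > /etc/systemd/system/aerospike.service.d/override.conf
    [Service]
    LimitNOFILE=<MAX NUMBER OF FILE DESCRIPTORS>
    EOF
    ```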
    
2.  Reload the systemd daemon.
    
    ```plaintext
    systemctl daemon-reload
    ```
    
3.  Restart the Aerospike server to apply the new value.
    
4.  (Optional) You can apply this change dynamically to the `asd` process if `prlimit` is available:
    
    ```plaintext
    prlimit --pid $(pgrep asd) --nofile=200000
    ```
    

### somaxconn

The limit of the socket `listen()` backlog, known in userspace as `SOMAXCONN`. Defaults to 4096. (Prior to Linux kernel 5.4.0, the default was 128.) See `tcp_max_syn_backlog` for additional tuning for TCP sockets.

```plaintext
echo 4096 > /proc/sys/net/core/somaxconn
```

The `somaxconn` best practice is checked at startup.

### rmem-max

The maximum receive socket buffer size in bytes.

```plaintext
echo 15728640 > /proc/sys/net/core/rmem_max
```

The `rmem-max` best practice is checked at startup only in Aerospike Database Enterprise Edition (EE).

### wmem-max

The maximum send socket buffer size in bytes.

```plaintext
echo 5242880 > /proc/sys/net/core/wmem_max
```

The `wmem-max` best practice is checked at startup only in EE.
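
The `echo` commands in the `somaxconn`, `rmem-max`, and `wmem-max` sections change only the running kernel. One way to keep the values across reboots is a sysctl drop-in file, sketched below. The function name `render_net_sysctl` and the file name `99-aerospike.conf` are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch: print sysctl settings matching the network values on this page.
render_net_sysctl() {
    cat <<'EOF'
net.core.somaxconn=4096
net.core.rmem_max=15728640
net.core.wmem_max=5242880
EOF
}

# Show what would be written.
render_net_sysctl

# To persist and load the values (as root), something like:
#   render_net_sysctl > /etc/sysctl.d/99-aerospike.conf
#   sysctl --system
```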

### shmall

The total amount of shared memory, in pages, that can be used system-wide.

The `shmall` best practice is checked at startup only in EE.

### shmmax

The maximum size of a single shared memory segment.

The `shmmax` best practice is checked at startup only in EE.
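
In the Linux kernel, `shmall` is expressed in pages while `shmmax` is expressed in bytes, so the two are related through the page size: `shmall` should be at least `shmmax` divided by the page size, rounded up. The following sketch shows that arithmetic on an example value and prints the current settings. The function name `min_shmall_pages` is hypothetical, and the 1 GiB figure is only an example, not a recommended setting.

```bash
#!/usr/bin/env bash
# Sketch: smallest shmall (in pages) that can hold one segment of SHMMAX bytes.
# Usage: min_shmall_pages SHMMAX_BYTES PAGE_SIZE_BYTES
min_shmall_pages() {
    local shmmax="$1" page_size="$2"
    echo $(( (shmmax + page_size - 1) / page_size ))   # ceiling division
}

page_size=$(getconf PAGE_SIZE 2>/dev/null || echo 4096)
echo "Page size: $page_size bytes"
echo "Example: a 1 GiB shmmax needs shmall >= $(min_shmall_pages 1073741824 "$page_size") pages"

# Print the live values without doing arithmetic on them, because the kernel
# default for shmmax can exceed the range of shell integer arithmetic.
for name in shmall shmmax; do
    if [ -r "/proc/sys/kernel/$name" ]; then
        echo "kernel.$name = $(cat "/proc/sys/kernel/$name")"
    fi
done
```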