Best practices for Aerospike and Linux
This page describes stability and performance best practices for Aerospike and the Linux operating system.
Overview
When the Aerospike Database starts it verifies certain best practices and logs a warning for each violation it finds.
For production environments, set enforce-best-practices to true so that the server shuts down if any best practices are violated during startup.
When enforce-best-practices is set to false, you can still monitor violations with the failed_best_practices Boolean statistic or the best-practices info command.
The failed_best_practices statistic reports true if any best practices are violated during startup. The best-practices info command returns the list of best practices that failed.
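As a rough sketch of how this monitoring might look from a shell, the snippet below extracts failed_best_practices from the semicolon-separated output of the statistics info command. It assumes the asinfo tool is installed and can reach a local node, and that the best-practices info command is invoked by its name. The helper stat_value is hypothetical.

```shell
# Hypothetical helper: print the value of one key from a
# semicolon-separated list of key=value pairs.
stat_value() {
    # $1 = key to look up, $2 = "k1=v1;k2=v2;..." string
    printf '%s\n' "$2" | tr ';' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Only query the server when asinfo is actually available.
if command -v asinfo >/dev/null 2>&1; then
    stats=$(asinfo -v 'statistics' 2>/dev/null || true)
    echo "failed_best_practices: $(stat_value failed_best_practices "$stats")"
    # Assumption: the info command is called by the name used in this page.
    asinfo -v 'best-practices' 2>/dev/null || true
fi
```

The parsing is kept separate from the asinfo call so that it can be tried against a saved statistics string without a running node.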
Best practices checked at startup
The following best practices are checked at startup:
Aerospike database best practices
service-threads
The service-threads best practice is checked at server startup.
The recommended value depends on the configuration of the namespaces in the aerospike.conf file:
- We suggest and default to 5 per CPU/vCPU in the following configuration: any namespace has storage-engine set to device and data-in-memory set to false, or data-in-memory is false and commit-to-device is true. In that case, the recommended value for service-threads is at least 3 per CPU/vCPU.
- Otherwise, when storage-engine is set to either pmem or memory, or storage-engine is device with data-in-memory set to true and commit-to-device set to false, the recommended value for service-threads is at least 1 per CPU/vCPU, which is also the default for such configurations.
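The per-CPU minimums can be turned into concrete numbers for a given host. The following is a minimal sketch, assuming nproc reports the CPU/vCPU count that the server sees. The helper name min_service_threads is hypothetical.

```shell
# Hypothetical helper: minimum service-threads for a CPU count
# and a per-CPU multiplier (3 or 1, per the list above).
min_service_threads() {
    # $1 = number of CPUs/vCPUs, $2 = threads per CPU
    echo $(( $1 * $2 ))
}

cpus=$(nproc 2>/dev/null || echo 1)   # fall back to 1 if nproc is missing
echo "device-backed namespaces: at least $(min_service_threads "$cpus" 3)"
echo "pmem or in-memory namespaces: at least $(min_service_threads "$cpus" 1)"
```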
indexes-memory-budget
The indexes-memory-budget best practice is checked at server startup.
memory-size is deprecated in Database 7.0. For more information, see Aerospike Database 7.0 Release Notes.
We recommend that the cumulative sum of the memory-size configuration not exceed the total memory on the machine.
Namespace device size
All the namespace storage devices should be the same size, within an 8 MiB range of tolerance. This best practice is checked at server startup.
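A quick way to sanity-check a set of device sizes before startup might look like the sketch below. It assumes that "within an 8 MiB range" means the largest and smallest devices differ by at most 8 MiB, which is an interpretation rather than the server's documented algorithm. The helper name and the sample sizes are made up.

```shell
# Hypothetical helper: succeed if all sizes (in bytes) lie within
# 8 MiB of each other. Assumes the check is max - min <= 8 MiB.
sizes_within_tolerance() {
    tolerance=$(( 8 * 1024 * 1024 ))
    min=$1
    max=$1
    for size in "$@"; do
        [ "$size" -lt "$min" ] && min=$size
        [ "$size" -gt "$max" ] && max=$size
    done
    [ $(( max - min )) -le "$tolerance" ]
}

# Example with made-up sizes; real sizes could come from
# blockdev --getsize64 <device>, which typically needs root.
if sizes_within_tolerance 400088457216 400088457216 400084262912; then
    echo "device sizes are consistent"
else
    echo "device sizes differ by more than 8 MiB"
fi
```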
Linux best practices
All-Flash deployment
In an All-Flash deployment, the following kernel parameters are required.
enforce-best-practices verifies that these kernel parameters are at least the expected values.
/proc/sys/vm/dirty_bytes = 16777216
/proc/sys/vm/dirty_background_bytes = 1
/proc/sys/vm/dirty_expire_centisecs = 1
/proc/sys/vm/dirty_writeback_centisecs = 10
- When running as non-root, you must set these values before running the Aerospike server.
- When running as root, the server configures them automatically.
Either way, if these parameters can't be correctly set manually or automatically by the server, the node will not start.
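When running as non-root, it can help to compare the current values against the expected ones before starting the server. The sketch below does only that comparison and changes nothing. The helper meets_minimum is hypothetical, and the expected values are copied from the list above.

```shell
# Hypothetical helper: succeed if a current value meets the expected minimum.
meets_minimum() {
    # $1 = current value, $2 = expected minimum
    [ "$1" -ge "$2" ]
}

# Expected values copied from the list above.
for entry in dirty_bytes=16777216 dirty_background_bytes=1 \
             dirty_expire_centisecs=1 dirty_writeback_centisecs=10; do
    name=${entry%%=*}
    expected=${entry#*=}
    file=/proc/sys/vm/$name
    [ -r "$file" ] || continue          # skip on systems without this file
    current=$(cat "$file")
    if meets_minimum "$current" "$expected"; then
        echo "$name=$current ok"
    else
        echo "$name=$current is below $expected"
    fi
done
```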
RAM reserved for Linux operating system resources
To help prevent out-of-memory issues with host hardware, keep 10-15% of total physical memory reserved for Linux system resources.
The following may influence memory usage:
- Overhead from the Linux OS and services.
- Overhead caused by memory fragmentation.
- Overhead from Aerospike indexes (primary & secondary).
- Namespace data for in-memory namespaces. For more information, see Capacity planning.
- Overhead from cache and queue-related configurations, including max-write-cache (per device) and post-write-cache (per device). See Block size and cache size for more information.
- Overhead from the Aerospike process.
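To put the 10-15% guideline into numbers for a particular host, a sketch like the following could be used. It reads MemTotal from /proc/meminfo, and the helper name reserved_kb is hypothetical.

```shell
# Hypothetical helper: kB to reserve for the OS, given total kB and a percentage.
reserved_kb() {
    # $1 = total memory in kB, $2 = percentage to reserve (10 to 15)
    echo $(( $1 * $2 / 100 ))
}

if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "total: ${total_kb} kB"
    echo "reserve between $(reserved_kb "$total_kb" 10) and $(reserved_kb "$total_kb" 15) kB"
fi
```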
min_free_kbytes
The min_free_kbytes best practice is checked at server startup.
The min_free_kbytes kernel parameter controls how much memory to keep free from filesystem caches.
Normally, the kernel occupies almost all free RAM with
filesystem caches and frees up memory for allocation by processes as required. As
Aerospike performs large allocations in shared memory (1GB chunks), the default
kernel value may result in an unexpected OOM (out-of-memory kill).
We recommend that you configure the parameter to a minimum of 1.1GB, preferably 1.25GB if using cloud vendor drivers as these can make large allocations. This ensures that Linux always keeps enough memory available and free for large allocations.
If min_free_kbytes is set too high, it is likely to cause an out-of-memory error in Aerospike.
Check the parameter value.
cat /proc/sys/vm/min_free_kbytes
If the value is lower than recommended, apply the new value to the running kernel and persist it across reboots.
echo 3 > /proc/sys/vm/drop_caches
echo 1310720 > /proc/sys/vm/min_free_kbytes
echo "vm.min_free_kbytes=1310720" >> /etc/sysctl.conf
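The value 1310720 in the commands above is 1.25 GiB expressed in the kB unit that min_free_kbytes uses (1280 MiB × 1024). The sketch below shows that arithmetic and a read-only comparison against the current setting. The helper name mib_to_kb is hypothetical.

```shell
# Hypothetical helper: convert a size in MiB to the kB unit used by min_free_kbytes.
mib_to_kb() {
    echo $(( $1 * 1024 ))
}

recommended_kb=$(mib_to_kb 1280)     # 1.25 GiB = 1280 MiB
echo "recommended: $recommended_kb"  # 1310720

if [ -r /proc/sys/vm/min_free_kbytes ]; then
    current_kb=$(cat /proc/sys/vm/min_free_kbytes)
    if [ "$current_kb" -lt "$recommended_kb" ]; then
        echo "min_free_kbytes=$current_kb is below $recommended_kb"
    fi
fi
```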
swappiness
The swappiness best practice is checked at server startup.
For low-latency operations, using swap to any extent drastically slows down
performance. We recommend that you disable swap with swapoff -a and remove the
swap partition from /etc/fstab.
If that's not possible for operational reasons, set the swappiness to 0:
echo 0 > /proc/sys/vm/swappiness
echo "vm.swappiness=0" >> /etc/sysctl.conf
THP - Transparent Huge Pages
The best practices startup check permits thp-enabled and thp-defrag to be set to
either madvise or never.
Aerospike recommends disabling Transparent Huge Pages (THP) before the Aerospike service starts. While the Linux kernel uses THP to improve overall system responsiveness and allocation speed, it can be counterproductive for high-throughput and low-latency databases, which perform multiple small allocations. THP can cause the system to run out of RAM, with symptoms similar to a memory leak. Another issue is latency caused by THP defragmentation page locking.
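On most Linux systems the current THP settings are exposed under /sys/kernel/mm/transparent_hugepage, with the active choice shown in square brackets. The sketch below reads those files and reports whether each setting is one the startup check permits. The helper thp_active is hypothetical, and the mapping of thp-enabled and thp-defrag to the enabled and defrag files is an assumption.

```shell
# Hypothetical helper: print the active setting from a THP sysfs line,
# where the kernel marks the active choice with square brackets.
thp_active() {
    # $1 = line such as "always madvise [never]"
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z+]*\)\].*/\1/p'
}

for knob in enabled defrag; do
    file=/sys/kernel/mm/transparent_hugepage/$knob
    [ -r "$file" ] || continue
    mode=$(thp_active "$(cat "$file")")
    case $mode in
        madvise|never) echo "thp-$knob: $mode (passes the check)" ;;
        *)             echo "thp-$knob: $mode (not permitted by the check)" ;;
    esac
done

# To disable THP on the running kernel (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/defrag
```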
Zone reclaim mode
The zone_reclaim_mode best practice is checked at server startup.
For NUMA architectures, zone_reclaim_mode causes aggressive reclaims and memory scans when enabled.
We recommend that you disable zone_reclaim_mode by setting /proc/sys/vm/zone_reclaim_mode to 0.
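A read-only check along the following lines could confirm the setting and print the commands needed to change it, in the same style as the other sections of this page. The helper zone_reclaim_disabled is hypothetical.

```shell
# Hypothetical helper: succeed when a zone_reclaim_mode value means "disabled".
zone_reclaim_disabled() {
    [ "$1" -eq 0 ]
}

file=/proc/sys/vm/zone_reclaim_mode
if [ -r "$file" ]; then
    if zone_reclaim_disabled "$(cat "$file")"; then
        echo "zone_reclaim_mode is disabled"
    else
        echo "zone_reclaim_mode is enabled; as root, run:"
        echo "  echo 0 > $file"
        echo "  echo \"vm.zone_reclaim_mode=0\" >> /etc/sysctl.conf"
    fi
fi
```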
NVMe partitioning
NVMe devices are normally capable of 4 simultaneous I/O operations. Due to their connection design, these occupy 4 PCIe I/O lanes. On raw devices, Aerospike suggests that you partition each NVMe device in use into at least 4 partitions. This allows 4 write threads to operate in Aerospike and greatly improves the disk speed.
If using a single partition with Aerospike as a raw device, iostat may show 100% disk utilization (%util), while the await operation queueing statistic shows no queueing (await <1 means no queueing is happening). This indicates that the disk itself can do
more, while the PCIe lanes that are used are already saturated.
See Partition your flash devices for details on device partitioning.
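As an illustration only, the sketch below prints a plan for splitting one device into 4 equal partitions with parted. It deliberately prints the commands rather than running them, because creating a new partition table destroys existing data. The device name /dev/nvme0n1 is a placeholder, the helper print_partition_plan is hypothetical, and the use of parted with a GPT label is an assumption, not a method this page prescribes.

```shell
# Hypothetical helper: print (not run) parted commands that split a
# device into N equal partitions by percentage.
print_partition_plan() {
    # $1 = device path, $2 = number of partitions (should divide 100 evenly)
    step=$(( 100 / $2 ))
    echo "parted -s $1 mklabel gpt"
    i=0
    while [ "$i" -lt "$2" ]; do
        start=$(( i * step ))
        end=$(( start + step ))
        echo "parted -s $1 mkpart primary ${start}% ${end}%"
        i=$(( i + 1 ))
    done
}

# /dev/nvme0n1 is a placeholder device name.
print_partition_plan /dev/nvme0n1 4
```

Review the printed commands against the actual device layout before running any of them.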
vm.max_map_count
If you use Kubernetes or Docker, we recommend that you raise the max_map_count
parameter, which controls the maximum number of memory map areas that a process
may have. If max_map_count is low, it may result in memory
allocation issues during normal operation.
To change this parameter:
echo "vm.max_map_count=262144" >> /etc/sysctl.conf
echo 262144 > /proc/sys/vm/max_map_count
You may need to restart the Docker daemon and all its containers
for the changes to take effect after modifying max_map_count.
Containers - networks
When using Kubernetes or Docker, the default behavior is to use the EXPOSE and
PUBLISH features to publish ports from a container through the host to the
outside world. This causes the Docker process to listen on a given port on
the host and forward all packets to the container itself. This is highly
inefficient and may cause latency, packet drops, and crashes within the
containers under heavy loads.
If using containers, it is advisable to configure them in one of the following ways:
- Use bridged networking, rather than Docker-only NAT.
- Use iptables to forward packets to the NAT network Aerospike containers, rather than the Docker EXPOSE port feature.
- If using a Docker container, run it with the --net=host flag to inherit the /proc/sys/net/core/*mem_max files. Without that flag, the maximums cannot be modified from within that environment.
See the Docker configuration manuals for details.
Maximum open file limits
Aerospike clients perform dynamic connections to the database nodes as required. This may result in many active connections. These connections, on a Linux system, hold a file descriptor and are treated as open files.
The Aerospike configuration parameter proto-fd-max specifies the maximum number of allowed client connections. The Aerospike server does
not start if proto-fd-max is higher than the Linux system's maximum open files
configuration for the process.
After installing Aerospike, verify that the asd process is configured with a maximum open files value higher than proto-fd-max, to
allow for fabric and heartbeat connections as well as any open files.
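One way to see the limit that actually applies to a running asd process is to read its /proc/<pid>/limits file. The sketch below assumes a single asd process found with pgrep. The helper max_open_files is hypothetical, and the result still has to be compared by hand with the proto-fd-max value in aerospike.conf.

```shell
# Hypothetical helper: print the soft "Max open files" limit from
# the text of a /proc/<pid>/limits file read on stdin.
max_open_files() {
    awk '/^Max open files/ { print $4 }'
}

pid=$(pgrep -o -x asd 2>/dev/null || true)
if [ -n "$pid" ] && [ -r "/proc/$pid/limits" ]; then
    echo "asd soft open-file limit: $(max_open_files < "/proc/$pid/limits")"
fi
```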
Non-systemd
Edit /etc/init.d/aerospike.conf and change the value of the following line.
ulimit -n 100000
systemd
Create an override.conf file to control this.
cat <<EOF > /etc/systemd/system/aerospike.service.d/override.conf
[Service]
LimitNOFILE=<MAX NUMBER OF FILE DESCRIPTORS>
EOF
Reload the systemd daemon.
systemctl daemon-reload
Restart the Aerospike server to apply the new value.
(Optional) You can apply this change dynamically to the asd process if prlimit is available:
prlimit --pid $(pgrep asd) --nofile=200000
somaxconn
Limit of the socket listen() backlog, known in userspace as SOMAXCONN. Defaults to 4096 (was 128 before Linux kernel 5.4). See also tcp_max_syn_backlog for additional tuning for TCP sockets.
echo 4096 > /proc/sys/net/core/somaxconn
rmem-max
The maximum receive socket buffer size in bytes. Checked at startup in EE only.
echo 15728640 > /proc/sys/net/core/rmem_max
wmem-max
The maximum send socket buffer size in bytes. Checked at startup in EE only.
echo 5242880 > /proc/sys/net/core/wmem_max
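The echo commands above change only the running kernel. To persist the socket buffer maximums across reboots in the same way other sections of this page do, the matching /etc/sysctl.conf lines can be derived from the /proc/sys paths. The sketch below only prints those lines, and the helper sysctl_key is hypothetical.

```shell
# Hypothetical helper: turn a /proc/sys path into its sysctl key name.
sysctl_key() {
    # /proc/sys/net/core/rmem_max -> net.core.rmem_max
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

for entry in /proc/sys/net/core/rmem_max=15728640 \
             /proc/sys/net/core/wmem_max=5242880; do
    path=${entry%%=*}
    value=${entry#*=}
    # Print the line that would be appended to /etc/sysctl.conf.
    echo "$(sysctl_key "$path")=$value"
done
```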
shmall
The sum of all shared memory segments on the whole system. Checked at startup in EE only.
shmmax
The maximum size of a single shared memory segment. Checked at startup in EE only.
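When inspecting these two parameters, note that the kernel reports shmall in pages and shmmax in bytes, so they cannot be compared directly. The sketch below prints both and shows the page-to-byte conversion on a sample figure. It avoids multiplying the live shmall value, because kernel defaults can be large enough to overflow shell arithmetic. The helper pages_to_bytes is hypothetical.

```shell
# Hypothetical helper: convert a page count to bytes.
pages_to_bytes() {
    # $1 = number of pages, $2 = page size in bytes
    echo $(( $1 * $2 ))
}

page_size=$(getconf PAGE_SIZE 2>/dev/null || echo 4096)
if [ -r /proc/sys/kernel/shmall ] && [ -r /proc/sys/kernel/shmmax ]; then
    echo "shmall: $(cat /proc/sys/kernel/shmall) pages"
    echo "shmmax: $(cat /proc/sys/kernel/shmmax) bytes"
fi
# Sample conversion using this host's page size.
echo "1048576 pages = $(pages_to_bytes 1048576 "$page_size") bytes"
```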