Configure the secondary index

Aerospike's secondary indexes can be stored in three different ways in Enterprise Edition (EE): shared memory (SHMem) by default, persistent memory (PMem), and flash (on NVMe SSDs). Separate namespaces within the same cluster can use different secondary index storage methods.

Note: Secondary indexes are stored in volatile process memory in the Community Edition (CE).

The sindex-type configuration parameter

To specify a secondary index (sindex) storage method, use the namespace context configuration item sindex-type.

  • The default value is shmem, with the indexes stored in shared memory.
  • To specify a persistent memory sindex, use sindex-type pmem.
  • To specify a sindex in flash, use sindex-type flash.
  • Aerospike CE doesn't support the sindex-type configuration.

For sizing information, see Capacity Planning for Secondary Indexes.

Cautions for systemd

In a systemd environment you might need to increase TimeoutSec from the default of 15s. This setting is in /usr/lib/systemd/system/aerospike.service. This prevents systemd from killing the asd process prematurely while the service is shutting down. The secondary index clean-up process during shutdown might take longer than the default 15s. In Aerospike, this default value was increased to 10 minutes as of version 4.6.0.2.
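
One way to raise the timeout without editing the packaged unit file is a systemd drop-in. The following is a minimal sketch, not an official Aerospike procedure: the helper name write_timeout_dropin and the file name timeout.conf are hypothetical, and the 600-second value simply mirrors the 10-minute default mentioned above.

```shell
# Hypothetical helper: write a systemd drop-in that overrides TimeoutSec.
# $1 = drop-in directory, $2 = timeout in seconds.
write_timeout_dropin() {
    dropin_dir=$1
    timeout_sec=$2
    mkdir -p "$dropin_dir" || return 1
    # TimeoutSec belongs in the [Service] section of the unit.
    printf '[Service]\nTimeoutSec=%s\n' "$timeout_sec" > "$dropin_dir/timeout.conf"
}

# Example (as root), assuming the unit is named aerospike.service:
#   write_timeout_dropin /etc/systemd/system/aerospike.service.d 600
#   systemctl daemon-reload
```

A drop-in is likely to survive package upgrades, whereas edits made directly to /usr/lib/systemd/system/aerospike.service may be overwritten.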

Persistent memory sindex

Aerospike's persistent memory sindex feature stores secondary indexes in Intel Optane persistent memory (PMem) instead of the default shared memory segments.

Aerospike requires the persistent memory to be accessible using fsdax, that is, using block devices such as /dev/pmem0:

  • The NVDIMM regions must be configured as AppDirect regions, as in the following example from a machine with a 750-GiB AppDirect region:
sudo ipmctl show -region
SocketID ISetID PersistentMemoryType Capacity FreeCapacity HealthState
0 0x59727f4821b32ccc AppDirect 750.0 GiB 0.0 GiB Healthy
  • The NVDIMM regions must be turned into fsdax namespaces, as in the following example from the same machine:
sudo ndctl list
[
{
"dev":"namespace0.0",
"mode":"fsdax",
"blockdev":"pmem0",
...
}
]

Filesystem configuration

The PMem block device must contain a filesystem that is capable of DAX (Direct Access), such as XFS or EXT4. On the machine in the above example, this could be accomplished in the usual way:

XFS filesystem:

sudo mkfs.xfs -f -d su=2m,sw=1 /dev/pmem0

EXT4 filesystem:

sudo mkfs.ext4 /dev/pmem0

Finally, mount the file system with the dax mount option. Without this option, the Linux page cache is involved in all I/O to and from persistent memory, which drastically reduces performance.

In the following example, we use /mnt/pmem0 as the mount point.

sudo mount -o dax /dev/pmem0 /mnt/pmem0

Remember to make the mount persistent to survive system reboots by adding it to /etc/fstab. The mount point config line can be copied from /etc/mtab to /etc/fstab.
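
Because a missing dax option degrades performance silently, it may be worth verifying the mount after each reboot. The following sketch reads the mount table and succeeds only if the given mount point lists the dax option. The helper name mounted_with_dax is hypothetical, and treating both dax and dax=always as acceptable spellings is an assumption about how different kernel versions report the option.

```shell
# Hypothetical helper: succeed if the mount point is mounted with the dax option.
# $1 = mount point, $2 = mount table to read (defaults to /proc/mounts).
mounted_with_dax() {
    awk -v mp="$1" '
        # Field 2 is the mount point, field 4 the comma-separated options.
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "dax" || opts[i] == "dax=always")
                    found = 1
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}

# Example:
#   mounted_with_dax /mnt/pmem0 || echo "/mnt/pmem0 is not mounted with dax" >&2
```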

Secondary index on PMem

The secondary index type is configured per namespace. To enable a PMem index for a namespace, add a sindex-type subsection with an index type of pmem to its namespace section. The added sindex-type subsection must contain:

  • One or more mount directives to indicate the mount points of the persistent memory to be used for the PMem index.

    A single namespace can use persistent memory across multiple mount points and will evenly distribute allocations across all of them.

    Conversely, mount points can be shared across multiple namespaces. The file names underlying namespaces' persistent memory allocations are namespace-specific, which avoids file name clashes between namespaces when they share mount points.

  • A mounts-budget (or mounts-size-limit before Database 7.0) directive to indicate this namespace's share of the space available across the given mount points.

    When multiple namespaces share mount points, this configuration directive tells Aerospike how much of the total available memory across mount points each namespace is expected to use.

    Ensure mounts-budget (or mounts-size-limit before Database 7.0) is lower than or equal to the size of the filesystem.

    If mount points are not shared between namespaces, then simply specify the total available space.

  • The specified value, along with configuration item evict-mounts-pct (or mounts-high-water-pct before Database 7.0), which is disabled by default, forms the basis for calculating the eviction threshold.

The following configuration snippet extends the earlier example and makes all of /mnt/pmem0 memory (for example, 750 GiB) available to the namespace:

Database 7.0 and later

namespace test {
sindex-type pmem {
mount /mnt/pmem0
mounts-budget 750G
}
}

Prior to Database 7.0

namespace test {
sindex-type pmem {
mount /mnt/pmem0
mounts-size-limit 750G
}
}
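
The requirement that the budget not exceed the filesystem size can be checked before starting the server. The following is a rough sketch only. The helper name budget_fits is hypothetical, the budget is assumed to be given in whole GiB (so 750G becomes 750), and each mount point is assumed to sit on its own filesystem, because a filesystem listed twice would be counted twice.

```shell
# Hypothetical helper: succeed if the budget fits within the combined size of
# the given mount points. $1 = budget in GiB, remaining arguments = mount points.
budget_fits() {
    budget_kib=$(( $1 * 1024 * 1024 ))
    shift
    total_kib=0
    for mp in "$@"; do
        # df -Pk prints one header line, then the size in 1024-byte blocks in column 2.
        size_kib=$(df -Pk "$mp" 2>/dev/null | awk 'NR == 2 { print $2 }')
        [ -n "$size_kib" ] || return 2
        total_kib=$(( $total_kib + $size_kib ))
    done
    [ "$budget_kib" -le "$total_kib" ]
}

# Example:
#   budget_fits 750 /mnt/pmem0 || echo "mounts-budget exceeds filesystem size" >&2
```

The same check applies to a flash sindex by passing every mount point listed in the sindex-type subsection.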

Secondary index on Flash

The Aerospike All Flash feature stores secondary indexes on NVMe SSDs.

Cautions

  • While it is advisable to adjust the kernel's min_free_kbytes parameter in any configuration, it is especially important to do so when storing the primary index on flash memory (All Flash). The Linux kernel will attempt to use all free space by caching disk writes. With All Flash configuration, this may result in an out of memory (OOM) kill if there isn't enough free memory left for normal system operations. For this reason, Aerospike recommends setting min_free_kbytes=1153434 (1.1GB). For more information, see How to Tune the Linux Kernel.

All Flash kernel parameters

Note: The following Linux kernel parameters are required in an All Flash deployment. enforce-best-practices verifies that these kernel parameters have the expected values.

/proc/sys/vm/dirty_bytes = 16777216
/proc/sys/vm/dirty_background_bytes = 1
/proc/sys/vm/dirty_expire_centisecs = 1
/proc/sys/vm/dirty_writeback_centisecs = 10
  • When running as non-root, you must prepare these values before running the Aerospike server.
  • When running as root, the server configures them automatically.

Either way, if these parameters can't be correctly set manually, or automatically by the server, the node will not start.
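
When running as non-root, a pre-start check can catch a wrong value before the node refuses to start. The following sketch compares the four parameters listed above against their required values and reports any mismatch. The helper name check_flash_kernel_params is hypothetical, and the optional directory argument exists only so the check can be pointed at a copy of the files.

```shell
# Hypothetical helper: report any kernel parameter that differs from the
# required values. $1 = directory holding the parameters (default /proc/sys/vm).
check_flash_kernel_params() {
    vm_dir=${1:-/proc/sys/vm}
    rc=0
    while read -r name expected; do
        actual=$(cat "$vm_dir/$name" 2>/dev/null)
        if [ "$actual" != "$expected" ]; then
            echo "$name: expected $expected, found ${actual:-<unreadable>}"
            rc=1
        fi
    done <<EOF
dirty_bytes 16777216
dirty_background_bytes 1
dirty_expire_centisecs 1
dirty_writeback_centisecs 10
EOF
    return "$rc"
}

# Example:
#   check_flash_kernel_params || echo "fix the values above before starting asd" >&2
```

A reported mismatch can then be corrected as root, for example with sysctl -w vm.dirty_bytes=16777216.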

Enable flash secondary index for a namespace

To enable a flash secondary index for a namespace, in the configuration file, add a sindex-type subsection with an index type of flash to its namespace section. The added sindex-type subsection must contain:

  • One or more mount directives to indicate the mount points on the flash storage to be used for the flash sindex.

    A single namespace can use flash sindex storage across multiple mount points and will evenly distribute allocations across all of them.

    Conversely, mount points can be shared across multiple namespaces. The file names underlying namespaces' flash sindex allocations are namespace-specific, which avoids file name clashes between namespaces when they share mount points.

  • A mounts-budget (or mounts-size-limit before Database 7.0) directive to indicate this namespace's share of the space available across the given mount points.

    When multiple namespaces share mount points, this configuration directive tells Aerospike how much of the total available space across mount points each namespace is expected to use.

    Ensure mounts-budget (or mounts-size-limit before Database 7.0) is smaller than or equal to the size of the mounted filesystem.

    If mount points are not shared between namespaces, then simply specify the total available space.

  • The specified value, along with configuration item evict-mounts-pct (or mounts-high-water-pct before Database 7.0), which is disabled by default, forms the basis for calculating the eviction threshold.

Aerospike recommends an XFS file system because it has been shown to provide better concurrent access to files compared to EXT4.

Recommendation for multiple physical devices

  • Having more physical devices increases parallelism across devices and improves performance.

  • Adding more partitions per physical device doesn't necessarily improve performance.

  • Aerospike instantiates at least 4 different arena allocations (files) and will allocate more if more devices (logical partitions or physical devices) are present.

Instantiating more than one arena at a time reduces contention on any single arena, which is important during heavy insertion loads.

Database 7.0 and later

namespace test {
sindex-type flash {
mount /mnt/nvme0
mount /mnt/nvme1
mount /mnt/nvme2
mount /mnt/nvme3
mounts-budget 1T
}
}

Prior to Database 7.0

namespace test {
sindex-type flash {
mount /mnt/nvme0
mount /mnt/nvme1
mount /mnt/nvme2
mount /mnt/nvme3
mounts-size-limit 1T
}
}

Short query optimizations

Short queries have a short runtime duration and return a small number of records. Aerospike Database can apply several optimizations to improve the latency and throughput of short queries.

On the application side, use the client's QueryPolicy.expectedDuration (or QueryPolicy.shortQuery in older clients) to differentiate between short and long queries. By default, all queries are understood to be 'long'.

Database 6.3 added the namespace configuration inline-short-queries. As long as your short queries consistently return a small number of records, and if your use case prioritizes short query latency over single-record transaction latency, you should consider setting this configuration to true.

To leverage further built-in optimizations for short queries, you should upgrade your cluster to Database 6.4 or higher.