Primary Index
Overview
Aerospike's primary key index is a blend of distributed hash table technology with a distributed tree structure in each server. The entire keyspace in the namespace is separated using a robust hash function into partitions. A total of 4096 partitions are equally distributed across cluster nodes. See data-distribution for details on hashing and partitioning.
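The mapping from a key to one of the 4096 partitions can be sketched in a few lines of Python. This is an illustrative sketch only: SHA-1 stands in for the server's hash function simply because it yields 20 bytes, the choice of which 12 bits select the partition is an assumption, and the round-robin `owner` function is a hypothetical placeholder for the real partition map.

```python
import hashlib
from collections import Counter

N_PARTITIONS = 4096  # partition count stated in the text


def digest_of(set_name: str, user_key: bytes) -> bytes:
    """Return a 20-byte stand-in digest.

    SHA-1 is used only because it produces 20 bytes and ships with
    hashlib; it is not claimed to be the hash the server uses.
    """
    return hashlib.sha1(set_name.encode("utf-8") + user_key).digest()


def partition_id(digest: bytes) -> int:
    """Assumption: 12 bits of the digest pick the partition (2**12 == 4096)."""
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def owner(pid: int, nodes: list[str]) -> str:
    """Hypothetical round-robin assignment of partitions to nodes."""
    return nodes[pid % len(nodes)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    d = digest_of("users", b"user-42")
    pid = partition_id(d)
    print(len(d), pid, owner(pid, nodes))
    # Each node ends up owning the same number of partitions.
    print(Counter(owner(p, nodes) for p in range(N_PARTITIONS)))
```

Because the partition ID depends only on the digest, any client or node can compute it without coordination.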
Aerospike uses a red-black tree structure called a sprig. You can configure the number of sprigs for each partition. Configuring the right number of sprigs is a trade-off between extra space overhead and optimized parallel access.
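To get a feel for the trade-off, the following sketch estimates how many records land in each sprig and roughly how deep a balanced tree of that size is. It assumes records spread evenly across partitions and sprigs, and it uses `ceil(log2(n + 1))` as an approximation of tree depth (a red-black tree can be up to about twice as deep). The per-sprig space overhead is not modeled because the text does not quantify it.

```python
import math

N_PARTITIONS = 4096  # partition count stated in the text


def sprig_stats(records: int, sprigs_per_partition: int) -> tuple[float, int]:
    """Return (average records per sprig, approximate balanced-tree depth)."""
    total_sprigs = N_PARTITIONS * sprigs_per_partition
    per_sprig = records / total_sprigs
    # Minimum height of a binary tree holding per_sprig entries.
    depth = math.ceil(math.log2(per_sprig + 1))
    return per_sprig, depth


if __name__ == "__main__":
    for sprigs in (256, 4096):
        per_sprig, depth = sprig_stats(2**32, sprigs)
        print(f"{sprigs} sprigs/partition: {per_sprig:.0f} records/sprig, depth ~{depth}")
```

More sprigs mean shorter trees and less contention on any single tree, at the cost of more per-sprig bookkeeping.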
The index-type configuration parameter determines where sprigs are stored. For more information, see Index storage.
Most Aerospike deployments use hybrid storage, with indexes in memory and data on SSD.
The primary index is keyed on the 20-byte hash, called the digest, of the specified primary key. Although this expands the key size for some records (for example, an integer key that is only 8 bytes), it is beneficial because the index code behaves predictably regardless of input key size or distribution.
When a server fails, the indexes on another server are immediately available. If the failed server remains down, data starts rebalancing, and replicated indexes are built on new nodes.
Index metadata
Currently, each index entry requires 64 bytes. In addition to the 20-byte digest, the following metadata is also stored in the index:
- Generation count: Tracks all writes to the record; used for resolving conflicting updates.
- Expiration time or TTL: Tracks time when a key expires. The eviction subsystem uses this metadata.
- Last Update Time: Tracks the time of the last write to the key (in Citrusleaf epoch). Used for conflict resolution during cold restart, conflict resolution during migration (depending on your configuration settings), Filter Expressions, incremental backup scans, and the truncate and truncate-namespace commands.
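Because every entry has a fixed size of 64 bytes, the primary index footprint is simple arithmetic. The sketch below assumes a replication factor parameter (each replica keeps its own index entry) and an even spread of records across nodes. The function names are illustrative, not part of any Aerospike tool.

```python
INDEX_ENTRY_BYTES = 64  # per-entry size stated in the text


def primary_index_bytes(records: int, replication_factor: int = 2) -> int:
    """Cluster-wide primary index size in bytes.

    Assumption: every replica of a record has its own 64-byte entry.
    """
    return records * replication_factor * INDEX_ENTRY_BYTES


def primary_index_bytes_per_node(records: int, replication_factor: int, nodes: int) -> float:
    """Per-node share, assuming records are spread evenly across nodes."""
    return primary_index_bytes(records, replication_factor) / nodes


if __name__ == "__main__":
    total = primary_index_bytes(1_000_000_000, replication_factor=2)
    per_node = primary_index_bytes_per_node(1_000_000_000, 2, nodes=4)
    print(f"cluster: {total / 1e9:.0f} GB, per node: {per_node / 1e9:.0f} GB")
```

For one billion records with two copies, this comes to 128 GB across the cluster, or 32 GB per node on a four-node cluster.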
Index persistence
The primary index is derived from the data itself and can be rebuilt from that data. Whether a rebuild is needed on restart depends on the configuration setting for fast restart (also known as warmstart).
Fast restart feature
Aerospike's fast restart feature enables upgrades with minimal downtime in Aerospike Database Enterprise Edition (EE) and Aerospike Database Standard Edition (SE). Fast restart allocates index memory from a shared memory segment (shmem). For planned shutdowns and restarts, such as an upgrade, the server re-attaches to the shared memory segment on restart and activates the primary indexes without scanning the data in storage.
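The idea of re-attaching to a shared memory segment can be illustrated with a toy example built on Python's standard `multiprocessing.shared_memory` module. This is a conceptual sketch, not the server's implementation: the segment name, the 64-byte slot layout holding only a digest, and the helper functions are all assumptions made for illustration.

```python
import os
from multiprocessing import shared_memory

ENTRY_BYTES = 64   # per-entry size stated in the text
DIGEST_BYTES = 20  # digest size stated in the text


def demo_name() -> str:
    """Hypothetical segment name, made unique per process for the demo."""
    return f"pi_sketch_{os.getpid()}"


def create_index(name: str, digests: list[bytes]) -> None:
    """Write one digest per 64-byte slot, then detach without deleting."""
    shm = shared_memory.SharedMemory(name=name, create=True,
                                     size=ENTRY_BYTES * len(digests))
    for i, digest in enumerate(digests):
        start = i * ENTRY_BYTES
        shm.buf[start:start + DIGEST_BYTES] = digest
    shm.close()  # the segment outlives this handle, like a planned shutdown


def reattach(name: str, count: int) -> list[bytes]:
    """Attach to the existing segment and read entries with no data scan."""
    shm = shared_memory.SharedMemory(name=name)
    digests = [bytes(shm.buf[i * ENTRY_BYTES:i * ENTRY_BYTES + DIGEST_BYTES])
               for i in range(count)]
    shm.close()
    return digests


def destroy(name: str) -> None:
    """Remove the segment; after this the index would need a rebuild."""
    shm = shared_memory.SharedMemory(name=name)
    shm.close()
    shm.unlink()


if __name__ == "__main__":
    name = demo_name()
    create_index(name, [bytes([i]) * DIGEST_BYTES for i in range(3)])
    try:
        print(reattach(name, 3))
    finally:
        destroy(name)
```

The point of the sketch is that the segment survives after the writer detaches, so a later attach finds the index ready to use instead of rebuilding it from the data.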
To enable fast restarts, set the index-type configuration parameter to shmem (shared memory) or pmem (persistent memory).
Index storage
The index-type configuration parameter determines where the server stores a primary index. The following options are available:
Type | Description |
---|---|
shmem | Linux shared memory. |
flash | A block storage device (typically NVMe SSD). |
pmem | Persistent memory (e.g. Intel Optane DC Persistent Memory). |
The index-type configuration option is available only in Aerospike Database Enterprise Edition (EE).
Community Edition (CE) stores primary and secondary indexes in volatile process memory.
For more information about primary index storage methods, see Configure the Primary Index.