
Hybrid storage

Overview

Aerospike Database's hybrid storage gives you control over the storage media used for data, the primary index, and secondary indexes.

Storage media

Storage media for data or indexes includes:

  • Dynamic random access memory (DRAM), which we'll simply refer to as memory.
  • Non-volatile memory express (NVMe) Flash, also known as solid state drives (SSDs).
  • Intel Optane™ Persistent Memory (PMem).

In Enterprise Edition (EE), memory refers to shared memory, which persists between restarts of the Aerospike Database daemon process (asd). In Community Edition (CE), memory refers to process memory, which is volatile and is lost when asd shuts down.

Storing the primary index in memory and data on SSD is the default configuration for Aerospike, and is referred to as a hybrid memory architecture (HMA) deployment.

Analyzing hybrid storage for your needs

The diagram below visualizes a few hybrid storage configurations.

Hybrid storage engines

This table provides a high-level comparison of hybrid storage options.

| Primary index | Data on SSD | Data in memory | Data on PMem* |
| --- | --- | --- | --- |
| SSD* | Lowest cost. Most efficient for datasets with tiny objects. | Disallowed | Disallowed |
| Memory | Best cost-performance. High performance. | High performance. Fast restarts in EE, except when rebooting the host. | Uncommon |
| PMem* | Fast restarts after reboot. High performance. | Not recommended. | Fast restarts after reboot. Higher performance. |
* The starred options are only available in EE.

About namespaces, records, and storage

Each namespace can have a different storage media configuration. For example, you can keep small, frequently accessed namespaces in memory and put larger namespaces on less expensive, high-performance SSDs.

In Aerospike:

  • Record data is stored contiguously.
  • A record can be as large as 8MiB.
  • New record versions are always copy-on-write persisted.
  • Free space is continuously reclaimed through defragmentation.
  • Each namespace has a fixed amount of storage. Each node holds an equal share of the data and requires the same amount of storage as the other cluster nodes.

Aerospike achieves high reliability by storing multiple copies of each record. Since Aerospike automatically reshards and replicates data on failure or during cluster node management, k-safety is maintained at a high level.

Aerospike uses random data distribution to keep data unavailability very small when several nodes are lost. For example, in a 10-node cluster with two copies of the data, if two nodes are simultaneously lost, the amount of unavailable data before replication is approximately 2% or 1/50th of the data.
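The 2% figure can be checked with a short calculation. The sketch below assumes a simplified model in which each piece of data places its copies on a uniformly random set of distinct nodes; the function name `unavailable_fraction` and its parameters are illustrative and not part of any Aerospike API.

```python
from fractions import Fraction
from math import comb


def unavailable_fraction(nodes: int, copies: int, lost: int) -> Fraction:
    """Expected fraction of data with every copy on a lost node.

    Simplified model (an assumption): the copies of each piece of data
    sit on a uniformly random set of `copies` distinct nodes.
    """
    if not 0 < copies <= nodes:
        raise ValueError("copies must be between 1 and nodes")
    if not 0 <= lost <= nodes:
        raise ValueError("lost must be between 0 and nodes")
    # Ways to place all copies on lost nodes, over all possible placements.
    return Fraction(comb(lost, copies), comb(nodes, copies))


frac = unavailable_fraction(nodes=10, copies=2, lost=2)
print(frac, f"{float(frac):.1%}")  # prints: 1/45 2.2%
```

Under this model the exact value is 1/45, which the text rounds to roughly 2% or 1/50th. Losing a single node leaves nothing unavailable, since the second copy always lives on a different node.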

The Aerospike defragmenter tracks the number of active records on each block in data storage and reclaims blocks that fall below a minimum level of use.
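As a rough illustration of that selection rule, the sketch below picks blocks whose share of active records has dropped below a threshold. It is not Aerospike's implementation: the `Block` class, the `blocks_to_reclaim` function, and the 50% threshold are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    capacity: int        # records the block can hold (hypothetical unit)
    active_records: int  # records in the block that are still live


def blocks_to_reclaim(blocks, min_use_pct=50.0):
    """Return IDs of blocks whose use has fallen below min_use_pct.

    min_use_pct is an illustrative threshold, not a documented default.
    """
    eligible = []
    for block in blocks:
        use_pct = 100.0 * block.active_records / block.capacity
        if use_pct < min_use_pct:
            eligible.append(block.block_id)
    return eligible


blocks = [Block(0, 100, 80), Block(1, 100, 30), Block(2, 100, 0)]
print(blocks_to_reclaim(blocks))  # prints: [1, 2]
```

A real defragmenter would also need to relocate any records still live in a selected block before freeing it; that step is left out here.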