# Capacity planning guide

This page describes how to calculate the capacity requirements of your namespaces.

In Aerospike Database Enterprise and Standard Editions:

-   Namespace data and metadata are stored in the data storage engine.
-   Metadata is stored in the [primary index](https://aerospike.com/docs/database/8.0.0/learn/architecture/data-storage/primary-index.md).
-   Optional bin data is stored in [secondary indexes](https://aerospike.com/docs/database/8.0.0/learn/architecture/data-storage/secondary-index.md).

The data storage engine, the primary index, and secondary indexes can be configured as follows:

-   In-memory data is stored in shared memory (shmem).
-   The primary index and secondary indexes can be configured independently to use solid state drives (SSD), shared memory (shmem), or Intel Optane Persistent Memory (PMem).
-   Optional [set indexes](https://aerospike.com/docs/database/8.0.0/learn/architecture/data-storage/set-index.md) are always stored in memory.

Aerospike Database Community Edition (CE) is limited to storing namespace data on SSD or in volatile process memory; CE primary and secondary indexes store their metadata in process memory.

### Required memory

Provision enough memory to avoid losing a node to an out-of-memory (OOM) crash. Calculate the memory you need for the OS, namespace overhead, and other software running on the machine.

You can reserve memory with a configurable stop-writes threshold for namespaces, and optionally configure an eviction threshold. See [Configuring namespace data retention](https://aerospike.com/docs/database/8.0.0/manage/namespace/retention.md) for details.

::: caution
For versions prior to Database 7.0.0, verify that the combined [`memory-size`](https://aerospike.com/docs/database/reference/config#namespace__memory-size.md) of your namespaces does not exceed the available RAM on the machine.
:::

### Capacity considerations with transactions

For namespaces in CP mode (strong consistency) with transactions enabled, each transaction that includes writes creates extra tombstones, so tombstones accumulate at the rate of transactions ended per second. Take this into account when setting tombstone configuration such as [`tomb-raider-period`](https://aerospike.com/docs/database/reference/config#namespace__tomb-raider-period.md).

The monitor record is removed by a durable delete when done. All deletes within a transaction must be durable deletes as well. With extreme transaction throughput (overall transactions, not individual writes within a transaction) the tombstone accumulation could be significant. Default tomb-raider configuration values may not be sufficient to clear these, and more aggressive settings might be necessary.

Other than tombstone considerations, transactions are short-lived and transient, and don’t have significant storage implications. Be aware that transactions consume extra IO, which is detailed in the [FAQ](https://aerospike.com/docs/database/reference/faq#what-are-the-extra-costs-associated-with-transactions.md).

## Calculating primary index storage

Metadata is stored in the primary index. The calculation for primary index storage is:

``` plaintext
    64 bytes × (replication factor) × (number of records)
```

-   Each record uses 64 bytes of metadata in the primary index.

-   The [replication factor (RF)](https://aerospike.com/docs/database/8.0.0/learn/architecture/clustering/data-distribution.md) is the number of copies each record has within the namespace. The default [`replication-factor`](https://aerospike.com/docs/database/reference/config#namespace__replication-factor.md) for a namespace is 2 - a master copy and a single replica copy for each record.

See [Primary index configuration](https://aerospike.com/docs/database/8.0.0/manage/namespace/primary-index.md) for more configuration details.
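The formula above can be sketched as a small helper. This is a minimal illustration rather than an official sizing tool; the function name `primary_index_bytes` is hypothetical, and the record count is the 4 billion record example used later on this page.

```python
def primary_index_bytes(num_records: int, replication_factor: int = 2) -> int:
    """Cluster-wide primary index size: 64 bytes per copy of each record."""
    return 64 * replication_factor * num_records


# 4 billion records at the default replication factor of 2.
total = primary_index_bytes(4_000_000_000, replication_factor=2)
print(f"{total / 2**30:.1f} GiB")  # 476.8 GiB
```

Divide the result by the number of nodes to estimate the per-node share, assuming records are distributed evenly.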

### Primary index on flash

It is important to understand the subtleties of All Flash sizing, specifically the issue of sprigs. The primary index of each namespace is partitioned into 4096 partitions, and each partition is structured as a group of shallow red-black trees called [sprigs](#primary-index-on-flash). Scaling up an All Flash namespace may require an increase of [`partition-tree-sprigs`](https://aerospike.com/docs/database/reference/config#namespace__partition-tree-sprigs.md), which would require a rolling [cold restart](https://aerospike.com/docs/database/8.0.0/manage/database/cold-start.md). Additional nodes increase capacity, but performance can be impacted as sprigs fill up and overflow their initial 4KiB disk allocation.

When you configure a namespace with [`index-type flash`](https://aerospike.com/docs/database/reference/config#namespace__index-type.md) (All Flash), the 64 bytes of record metadata is stored as part of a 4KiB block on an index device. Each sprig held by a node consumes 10 bytes of RAM, or 13 bytes per sprig prior to Database 5.7.0.

To reduce the number of read operations to the index device, consider the ‘fill fraction’ of an index block. For optimal performance, each sprig should contain fewer than 64 records because 64 x 64B is 4KiB.

If the namespace is projected to grow rapidly, use a lower fill fraction to leave room for future records. Full sprigs span more than a single 4KiB index block, and will likely require more than a single index device read. Modifying the number of sprigs to mitigate such a situation requires a cold start to rebuild the primary index, so it’s better to determine the fill fraction in advance.

``` plaintext
    sprigs per partition = (total unique records / (64 x fill fraction)) / 4096
```

`partition-tree-sprigs` must be a power of 2, so whatever this calculation yields, pick the nearest power of 2.

For example, with 4 billion unique objects and a fill fraction of 1/2, the sprigs per partition should be:

``` plaintext
    (4 x 10^9) / (64 x 0.5) / 4096 = ~ 30,517 -> nearest power of 2 = 32,768
```
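The sprig calculation can be sketched in Python as follows. The helper name `sprigs_per_partition` is hypothetical, and "nearest power of 2" is interpreted here as the power of 2 with the smallest absolute difference from the raw result, which is an assumption. Check the configuration reference for the permitted range of `partition-tree-sprigs` before using a value.

```python
PARTITIONS = 4096        # partitions per namespace
RECORDS_PER_BLOCK = 64   # 64 entries x 64 bytes fill one 4KiB index block


def sprigs_per_partition(unique_records: int, fill_fraction: float) -> int:
    """Raw sprigs per partition, snapped to the nearest power of 2."""
    raw = unique_records / (RECORDS_PER_BLOCK * fill_fraction) / PARTITIONS
    raw = max(raw, 1.0)
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of 2 <= raw
    upper = lower * 2
    return lower if (raw - lower) < (upper - raw) else upper


print(sprigs_per_partition(4_000_000_000, 0.5))  # 32768
```

If growth is expected, rounding up rather than to the nearest value leaves more headroom, at the cost of more index device space.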

After calculating the number of required sprigs, make sure the disk provides enough space for the primary index. Set the primary index size with [`mounts-budget`](https://aerospike.com/docs/database/reference/config#namespace__mounts-budget.md) in Database 7.0.0 and later, or with [`mounts-size-limit`](https://aerospike.com/docs/database/reference/config#namespace__mounts-size-limit.md) in earlier versions. The following formula gives the minimum size needed for this configuration parameter.

``` plaintext
    primary index size = ((4096 x replication-factor / min-cluster-size) x partition-tree-sprigs) x 4KiB
```

The formula breaks down as follows:

-   4096 is the number of master partitions. Multiplying by `replication-factor` gives the total number of partitions.
-   Dividing by the minimum cluster size you will have ([`min-cluster-size`](https://aerospike.com/docs/database/reference/config#service__min-cluster-size.md)) gives the maximum number of partitions per node.
-   Multiplying by the number of sprigs (`partition-tree-sprigs`) gives the maximum number of sprigs per node.
-   Multiplying by 4KiB gives the required space, because each sprig occupies a minimum of 4KiB.

The result is the minimum usable mount size for your primary indexes, and therefore the minimum value for `mounts-budget` (or `mounts-size-limit`). Also take file system overhead into account when partitioning the disk for the All Flash mounts.
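As a rough sketch of the mount size formula, the following Python helper multiplies out the terms. The function name is hypothetical, the minimum cluster size of 4 is an illustrative value that does not come from this page, and rounding partitions per node up to a whole number is an assumption.

```python
import math

SPRIG_BLOCK_BYTES = 4096  # each sprig occupies at least one 4KiB block


def primary_index_mount_bytes(replication_factor: int,
                              min_cluster_size: int,
                              partition_tree_sprigs: int) -> int:
    """Minimum usable mount size per node for an index on flash."""
    # Assumption: round partitions per node up to a whole partition.
    partitions_per_node = math.ceil(4096 * replication_factor / min_cluster_size)
    return partitions_per_node * partition_tree_sprigs * SPRIG_BLOCK_BYTES


# RF2, 32,768 sprigs per partition, hypothetical minimum cluster of 4 nodes.
size = primary_index_mount_bytes(2, 4, 32_768)
print(f"{size / 2**30:.0f} GiB")  # 256 GiB
```

File system overhead is not included and has to be added separately.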

In addition, when a node shuts down, the sprig roots (5 bytes per sprig) are rewritten to optimize the subsequent fast restart. You must leave sufficient disk space for this, or the node will not shut down cleanly. The following formula calculates this extra disk space requirement for the cluster:

``` plaintext
    5 bytes x partition-tree-sprigs x 4096 x replication-factor
```

Using this example, this would be:

``` plaintext
    5 x 32,768 x 4096 x 2 = 1342177280 bytes or 1.25GiB
```

This space does not have to be included within `mounts-budget` (or `mounts-size-limit`).
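The shutdown space requirement can be checked with a one-line helper. This is only a sketch of the formula above; the name `sprig_root_shutdown_bytes` is hypothetical.

```python
def sprig_root_shutdown_bytes(partition_tree_sprigs: int,
                              replication_factor: int) -> int:
    """Cluster-wide disk space for sprig roots rewritten at shutdown (5 bytes each)."""
    return 5 * partition_tree_sprigs * 4096 * replication_factor


extra = sprig_root_shutdown_bytes(32_768, 2)
print(extra, f"bytes = {extra / 2**30} GiB")  # 1342177280 bytes = 1.25 GiB
```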

If the size of the primary index exceeds 2TiB, you must change [`index-stage-size`](https://aerospike.com/docs/database/reference/config#namespace__index-stage-size.md) from the default value of 1GiB. Index space is allocated in arenas, the size of which are defined by the `index-stage-size` configuration parameter. The maximum number of arenas is 2048, so if the index needs to be bigger than 2TiB the `index-stage-size` must be increased.
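Since at most 2048 arenas are available, the smallest workable arena size follows from dividing the index size by 2048. The sketch below shows only that arithmetic; the helper name is hypothetical, and the result still has to be matched against the values that `index-stage-size` accepts in the configuration reference.

```python
MAX_ARENAS = 2048  # maximum number of index arenas, per the text above


def min_index_stage_size(primary_index_bytes: int) -> int:
    """Smallest arena size in bytes that lets 2048 arenas cover the index."""
    return -(-primary_index_bytes // MAX_ARENAS)  # ceiling division


TIB = 2**40
GIB = 2**30
print(min_index_stage_size(3 * TIB) / GIB)  # 1.5
```

A 3TiB index therefore needs arenas of at least 1.5GiB, while a 2TiB index fits exactly within the 1GiB default.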

::: caution
When filling up records with an index on flash, Aerospike instantiates sprigs that consume 4KiB each, and index disk usage can climb rapidly. In this scenario, and with an improper configuration, it is possible to run out of space with very few records.

For more information, see [this FAQ](https://support.aerospike.com/s/article/FAQ-Why-does-index-disk-usage-climb-so-rapidly-with-all-flash) and consider the following example.
:::

Each sprig requires 10 bytes of RAM overhead, so:

``` plaintext
    total sprigs overhead = 10 bytes x total unique sprigs x replication factor
                          = 10 bytes x ((number of records x replication factor) / (64 x fill fraction))
```

The total sprigs are then divided evenly over the number of nodes.

Using the previous example, the amount of memory consumed by the primary index sprigs is:

``` plaintext
    10 bytes x 32768 x 4096 x 2 = 2.5GiB
```

Prior to Database 5.7.0:

``` plaintext
    13 bytes x 32768 x 4096 x 2 = 3.25GiB
```

With 4 billion objects and a replication factor of 2 (RF2), the memory consumed in association with the primary index across the cluster in All Flash is 2.5GiB. Using the same example in a Hybrid Memory configuration, where the primary index is in memory, 476.8GiB of memory would be used.
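The sprig RAM overhead can be reproduced with the sketch below. The function name is hypothetical, and the 8-node cluster used for the per-node figure is an illustrative value, not one taken from this page.

```python
def sprig_ram_bytes(partition_tree_sprigs: int,
                    replication_factor: int,
                    bytes_per_sprig: int = 10) -> int:
    """Cluster-wide RAM used by sprigs; pass 13 bytes per sprig before Database 5.7.0."""
    return bytes_per_sprig * partition_tree_sprigs * 4096 * replication_factor


GIB = 2**30
current = sprig_ram_bytes(32_768, 2)                      # 10 bytes per sprig
legacy = sprig_ram_bytes(32_768, 2, bytes_per_sprig=13)   # prior to 5.7.0
print(current / GIB, legacy / GIB)  # 2.5 3.25

# Sprigs are divided evenly over the nodes (hypothetical 8-node cluster).
print(current / 8 / 2**20, "MiB per node")  # 320.0 MiB per node
```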

## Calculating storage for the set index

::: note
Set indexes are always stored in memory.

See [Adding and removing a set index](https://aerospike.com/docs/database/8.0.0/manage/namespace/sets#adding-a-set-index.md) for more configuration details.
:::

-   For each set index, 4MiB x RF of memory overhead is used, distributed across the cluster nodes.
-   Each record in an indexed set costs 16 bytes of memory, multiplied by the namespace replication factor.
    -   16MiB x RF of memory is pre-allocated for each set index, divided across the cluster nodes, as soon as the set index is created. This allocation is reserved for the first million records in the set.
    -   Memory for indexing additional records in the set is allocated in 4KiB micro-arena stage increments. Each additional 4KiB micro-arena stage enables set-indexing of 256 records in a specific partition.

Example: If a namespace has 1000 sets, each with a set index, and RF2:

-   The overhead is `4MiB x 1000 x 2 = 8000MiB` (approximately 7.8GiB), divided across the nodes of the cluster.
-   The initial stage pre-allocates `16MiB x 1000 x 2 = 31.25GiB`, also divided across the nodes of the cluster.
-   Once the number of records in an indexed set passes one million, an additional 4KiB (holding up to 256 records) is allocated in the partition that’s being written to.
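The set index figures can be reproduced with a short sketch. The function name `set_index_memory` is hypothetical, and the helper covers only the fixed overhead and the initial pre-allocation, not the additional 4KiB stages allocated after the first million records.

```python
MIB = 2**20


def set_index_memory(num_set_indexes: int, replication_factor: int):
    """Return (overhead_bytes, preallocated_bytes), both cluster-wide."""
    overhead = 4 * MIB * num_set_indexes * replication_factor
    preallocated = 16 * MIB * num_set_indexes * replication_factor
    return overhead, preallocated


overhead, preallocated = set_index_memory(1000, 2)
print(overhead // MIB, "MiB overhead")          # 8000 MiB overhead
print(preallocated // MIB, "MiB preallocated")  # 32000 MiB preallocated

# Each 4KiB micro-arena stage holds 4096 / 16 = 256 index entries.
print(4096 // 16)  # 256
```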

## Calculating secondary index storage

See [Secondary index capacity planning](https://aerospike.com/docs/database/8.0.0/manage/planning/capacity/secondary-indexes.md) for how to calculate storage needs.

## Calculating data storage

::: note
-   Starting in Database 7.1.0, you can limit how much memory the indexes are allowed to use with [`indexes-memory-budget`](https://aerospike.com/docs/database/reference/config#namespace__indexes-memory-budget.md).

-   Starting in Database 7.0.0, in-memory data storage is pre-allocated and static.

-   Prior to Database 7.0.0, data storage grew progressively and was bound by the `memory-size` configuration parameter, which was removed in Database 7.0.0. In-memory namespaces had a distinct storage format. To calculate the memory size associated with a pre-7.0.0 in-memory namespace [see Calculating in-memory data storage prior to Database 7.0.0](#calculating-in-memory-data-storage-prior-to-database-700).
:::

The storage requirement for a single record is the sum of the following:

-   Overhead for each record:

    `39 bytes` in Database 6.0.0 and later. Four bytes were added for a record end mark. Prior to Database 6.0.0, the overhead was `35 bytes`.

-   If using a non-zero void-time (TTL). Tombstones have no expiration:

    `+ 4 bytes`

-   If using a set name:

    `+ 1 byte overhead + set name length in bytes`

-   If storing the record’s key. Flat key size is the exact opaque bytes sent by client:

    -   `1-3 bytes overhead`
        -   `1 byte for key size <128 bytes`
        -   `2 bytes if 128 bytes <= key size < 16KB`
        -   `3 bytes if key size >= 16KB`
    -   `1 byte (key type overhead)` `+ flat key size`

-   Bin count overhead. No overhead for single-bin (removed in 6.4) and tombstone records:

    `+1 byte for count < 128, +2 bytes for < 16K, or +3 bytes for >= 16K`

-   General overhead for each bin. No overhead for single-bin:

    `+ 1 byte + bin name length in bytes` and

    `+ 6 bytes` for LUT depending on XDR [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md) and

    `+ 1 byte` for src-id if XDR [bin convergence](https://aerospike.com/docs/database/8.0.0/manage/xdr/convergence#overhead.md) is enabled.

-   Type-dependent overhead for each bin:

    `+ 1 byte for bin tombstone` (see [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md)) or

    `+ 2 bytes + (1, 2, 4, or 8 bytes) for integer data values 0 to 255, 256 to 64Ki - 1, 64Ki to 4Gi - 1, 4Gi to 16Ei - 1` or

    `+ 1 byte + 1 byte for boolean data values` or

    `+ 1 byte + 8 bytes for double data` or

    `+ 5 bytes + data size for all other data types`

The resulting storage size should then be rounded up to a multiple of 16 bytes. For example, in Database 6.0.0 and later, a tombstone record with a set name 10 characters long and no stored key needs:

``` plaintext
  39 + (1 + 10) = 50 -> rounded up = 64 bytes
```

Or, for a record in the same set with no TTL and two bins (8-character names) containing an integer stored in 8 bytes and a string of 20 characters:

``` plaintext
  39 + (1 + 10) + 1 + (2 × (1 + 8)) + (2 + 8) + (5 + 20) = 104 -> rounded up = 112 bytes
```
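The per-record rules above can be combined into a small calculator. This is a simplified sketch: the function names are hypothetical, stored keys and XDR-related bin overheads are left out, and each bin is described by its name length plus its type-dependent overhead and data size, taken from the list above.

```python
def round_up(n: int, multiple: int = 16) -> int:
    """Round n up to the next multiple of 16 bytes."""
    return -(-n // multiple) * multiple


def count_overhead(count: int) -> int:
    """Bin count overhead: 1, 2, or 3 bytes depending on the count."""
    if count < 128:
        return 1
    if count < 16 * 1024:
        return 2
    return 3


def record_storage_bytes(set_name_len=0, has_ttl=False, bins=(), record_overhead=39):
    """bins: (bin_name_len, type_overhead_plus_data) pairs; empty means tombstone."""
    size = record_overhead            # 39 bytes in Database 6.0.0 and later
    if has_ttl:
        size += 4                     # non-zero void-time
    if set_name_len:
        size += 1 + set_name_len      # set name overhead plus name
    bins = list(bins)
    if bins:                          # tombstones carry no bin count overhead
        size += count_overhead(len(bins))
        for name_len, particle_bytes in bins:
            size += 1 + name_len + particle_bytes
    return round_up(size)


print(record_storage_bytes(set_name_len=10))                               # 64
print(record_storage_bytes(set_name_len=10, bins=[(8, 2 + 8), (8, 5 + 20)]))  # 112
```

Pass `record_overhead=35` to approximate versions prior to Database 6.0.0.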

::: note
Aerospike reserves 8 write-blocks of storage per device when using `storage-engine device` or `storage-engine memory` (data on SSD or data in memory with storage-backed persistence). This means 64 MiB (`8 * 8 MiB`) of storage is reserved per device. The recommended minimum value is 128 MiB.

Prior to Database 7.1.0, the `filesize` parameter must be at least 8MiB + 2 \* `write-block-size`.
:::

### Defragmentation considerations

Your storage engine needs a portion of the total storage space available to the namespace for [defragmentation](https://aerospike.com/docs/database/8.0.0/manage/namespace/storage/defrag.md), as determined by the [`defrag-lwm-pct`](https://aerospike.com/docs/database/reference/config#namespace__defrag-lwm-pct.md) configuration parameter.

By default, you should plan to use no more than 50% of your storage space. Raising the `defrag-lwm-pct` makes more space accessible to data storage, at the cost of more CPU when using an in-memory namespace in Database 7.0.0 and later, or device IO for data on SSD or PMem.

Write-blocks (wblocks) that are still in the `post-write-cache` (Database 7.1.0 and later) or the `post-write-queue` (prior to Database 7.1.0) are not candidates for defragmentation, even if the percent of live records in those write-blocks drops below the `defrag-lwm-pct`.

Keep the `post-write-cache` small compared to the overall device size, because the space allocated to it is not defragmented.

### Calculating in-memory data storage prior to Database 7.0.0

Prior to Database 7.0.0, a namespace configured to store data in memory had the following calculation:

-   Overhead for each record:

    `2 bytes`

-   If the [key](https://aerospike.com/docs/database/8.0.0/learn/architecture/data-storage/data-model#records.md) is saved for the record:

    `+ 12 bytes overhead + 1 byte (key type overhead) + (8 bytes (integer key) OR length of string/blob (string/blob key))`

-   General overhead for each bin:

    `+ 12 bytes` (Aerospike Database prior to 5.4.0, or XDR [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md) set so as to incur overhead) or

    `+ 11 bytes` (Aerospike Database 5.4.0 or later and XDR [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md) not incurring overhead) and

    `+ 6 bytes` for LUT if XDR [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md) is set so as to incur overhead and

    `+ 1 byte` for src-id if XDR [bin convergence](https://aerospike.com/docs/database/8.0.0/manage/xdr/convergence#overhead.md) is enabled.

-   Type-dependent overhead for each bin:

    `+ 0 bytes for bin tombstone` (see [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md)) or

    `+ 0 bytes for integer, double or boolean data` or

    `+ 5 bytes for string, blob, list/map, geojson data`

-   Data: size of data in all the record’s bins (0 bytes for integer, double and boolean data, which is stored by replacing some of the general overhead).

    `+ data size`

For example, for a record with two bins containing an integer and a string of length 20 characters, and Aerospike Database prior to 5.4.0, we find:

``` plaintext
    2 + (2 × 12) + (0 + 0) + (5 + 20) = 51 bytes.
```

Or for the same type of record, and Database 5.4.0 or later (and [`bin-policy`](https://aerospike.com/docs/database/8.0.0/manage/xdr/bin-policy#overhead.md) not incurring overhead), we find:

``` plaintext
    2 + (2 × 11) + (0 + 0) + (5 + 20) = 49 bytes.
```

This memory is actually split into different allocations — the record overhead plus all general bin overhead are in one allocation, and the type-dependent bin overhead plus data are in separate allocations per bin.

::: note
Integer data does not need the per-bin allocation. The system heap rounds allocation sizes, so there may be a few more bytes used than the above calculation implies.
:::
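The pre-7.0.0 in-memory calculation can be sketched as follows. The function name is hypothetical, stored keys and XDR overheads are omitted, and heap rounding is ignored, so real usage may be slightly higher.

```python
def in_memory_record_bytes(bins, per_bin_overhead=12, record_overhead=2):
    """bins: (type_overhead, data_size) pairs, for example (0, 0) for an integer."""
    return record_overhead + sum(
        per_bin_overhead + type_overhead + data_size
        for type_overhead, data_size in bins
    )


# One integer bin and one 20-character string bin.
example_bins = [(0, 0), (5, 20)]
print(in_memory_record_bytes(example_bins, per_bin_overhead=12))  # 51
print(in_memory_record_bytes(example_bins, per_bin_overhead=11))  # 49
```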

## List

The list data type is serialized as a [MessagePack array](https://github.com/msgpack/msgpack/blob/master/spec.md#array-format-family), with 1, 3 or 5 header bytes, and each element serialized as well.

Example

For a list of 3 integer elements `[0, 1000, 255]`:

-   `1 byte header for 3 elements`

-   `+1 byte for integer 0`

-   `+3 bytes for integer 1000`

-   `+2 bytes for integer 255`

    1 + 1 + 3 + 2 = 7 bytes.

If this list is stored in-memory, we need to add 10 bytes for metadata.

## Map

The map data type is serialized as a [MessagePack map](https://github.com/msgpack/msgpack/blob/master/spec.md#map-format-family), with 1, 3 or 5 header bytes, and with map-key/map-value pairs serialized as well.

### On disk metadata

When Aerospike maps are stored on disk, there is a flat 4 byte cost to the associated metadata, unless the map is unordered. There is no advantage to choosing to use an unordered map, and key ordered has better performance. See [Development guidelines and tips](https://aerospike.com/docs/develop/data-types/collections/map#development-guidelines-and-tips.md).

Example

A K-ordered map with 3 elements `{a: 1, bb: 2000, ccc: 300000}`

-   `1 byte header for 3 pairs`

-   `2 bytes for 'a' and 1 byte for 1`

-   `3 bytes for 'bb' and 3 bytes for 2000`

-   `4 bytes for 'ccc' and 5 bytes for 300000`

    1 + 3 + 6 + 9 = 19 bytes for the data itself + 4 bytes metadata = 23 bytes.
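The serialized sizes in the list and map examples can be reproduced with a small size estimator that follows the MessagePack format rules for integers, strings, arrays, and maps. This is a sketch under the assumption that plain MessagePack sizes apply, which matches the figures on this page; the function names are hypothetical and only integers, strings, lists, and dicts are handled.

```python
def int_size(n: int) -> int:
    """Bytes used by a MessagePack integer."""
    if n >= 0:
        for limit, size in ((2**7, 1), (2**8, 2), (2**16, 3), (2**32, 5)):
            if n < limit:
                return size
        return 9
    for limit, size in ((-32, 1), (-2**7, 2), (-2**15, 3), (-2**31, 5)):
        if n >= limit:
            return size
    return 9


def str_size(s: str) -> int:
    """Bytes used by a MessagePack string: header plus UTF-8 payload."""
    n = len(s.encode("utf-8"))
    header = 1 if n < 32 else 2 if n < 2**8 else 3 if n < 2**16 else 5
    return header + n


def container_header(count: int) -> int:
    """Array and map headers take 1, 3, or 5 bytes."""
    return 1 if count < 16 else 3 if count < 2**16 else 5


def packed_size(value) -> int:
    if isinstance(value, bool):
        return 1
    if isinstance(value, int):
        return int_size(value)
    if isinstance(value, str):
        return str_size(value)
    if isinstance(value, list):
        return container_header(len(value)) + sum(packed_size(v) for v in value)
    if isinstance(value, dict):
        return container_header(len(value)) + sum(
            packed_size(k) + packed_size(v) for k, v in value.items())
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(packed_size([0, 1000, 255]))                              # 7
print(packed_size({"a": 1, "bb": 2000, "ccc": 300000}) + 4)     # 23, with map metadata
```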

## HyperLogLog

The HyperLogLog data type has an array of 2\^**n\_index\_bits** registers.

Each register contains 6 bits of HyperLogLog value and **n\_minhash\_bits** optional bits of MinHash value. Adding MinHash bits enables HyperMinHash functionality, a superset of HyperLogLog.

The storage size of the registers is rounded up to the nearest byte.

`hll = 11 bytes + roundUpToByte(2^n_index_bits * (6 + n_minhash_bits))`

Example

A HyperLogLog bin with 12 index bits (2^12 = 4096 registers) and no MinHash bits uses the following approximate memory, where 8 bits in a byte is the rounding factor:

``` plaintext
11 bytes + ((2^12 * 6) bits / 8) = 3083 bytes.
```
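The HyperLogLog formula translates directly into code. The function name `hll_bytes` is hypothetical; the parameters mirror `n_index_bits` and `n_minhash_bits` from the formula above.

```python
def hll_bytes(n_index_bits: int, n_minhash_bits: int = 0) -> int:
    """11 bytes of overhead plus the registers, rounded up to a whole byte."""
    register_bits = (2 ** n_index_bits) * (6 + n_minhash_bits)
    return 11 + (register_bits + 7) // 8


print(hll_bytes(12))  # 3083
```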

## Throughput (bytes)

``` plaintext
    (number of records to be accessed in a second) × (the size of each record)
```

Calculate throughput so that the cluster continues to work even if one node goes down: the remaining nodes must be able to handle the full traffic load between them.
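A minimal sketch of this sizing rule is shown below, assuming traffic is spread evenly across nodes. The function name and the workload figures (100,000 records per second, 1KiB records, 5 nodes) are illustrative only.

```python
def per_node_throughput_bytes(records_per_sec: int, record_bytes: int, nodes: int) -> float:
    """Bytes per second each surviving node must sustain with one node down."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to tolerate a node failure")
    total = records_per_sec * record_bytes
    return total / (nodes - 1)


print(per_node_throughput_bytes(100_000, 1024, 5))  # 25600000.0
```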

## Provisioning

See [Provisioning a cluster](https://aerospike.com/docs/database/8.0.0/manage/planning/capacity/provisioning.md) for examples.

## Pre-7.0.0 in-memory indexing

When Aerospike maps are stored in an in-memory namespace, an additional amount of memory storage is taken up by key and value indexes.

-   msgpack-ext = header + offset-index + value-index
-   index = element-count \* size/element
-   element-count = number of elements in the map

| Type                  | Indexes        |
|-----------------------|----------------|
| unordered             | None           |
| key ordered           | offset         |
| key and value ordered | offset + value |

### Index size/element

*var* is msgpack-size for offset-index and element-count for value-index.

| var         | size/element |
|-------------|--------------|
| &lt; 2\^8   | 1            |
| &lt; 2\^16  | 2            |
| &lt; 2\^24  | 3            |
| &gt;= 2\^24 | 4            |
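The two tables above can be combined into a rough estimate of the index portion of `msgpack-ext`. This sketch uses hypothetical function names and leaves out the header term, because its size is not given on this page.

```python
def size_per_element(var: int) -> int:
    """Bytes per index element, selected by the thresholds in the table above."""
    if var < 2**8:
        return 1
    if var < 2**16:
        return 2
    if var < 2**24:
        return 3
    return 4


def map_index_bytes(element_count: int, msgpack_size: int, order: str) -> int:
    """order: 'unordered', 'key', or 'key_value'. Header bytes are not included."""
    if order == "unordered":
        return 0
    offset_index = element_count * size_per_element(msgpack_size)
    if order == "key":
        return offset_index
    if order == "key_value":
        value_index = element_count * size_per_element(element_count)
        return offset_index + value_index
    raise ValueError(f"unknown order: {order}")


# A 1000-element map whose serialized size is 70,000 bytes.
print(map_index_bytes(1000, 70_000, "key"))        # 3000
print(map_index_bytes(1000, 70_000, "key_value"))  # 5000
```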

## Pre-6.4.0 single-bin namespaces

Single-bin namespaces were removed in Database 6.4.0. See the [special upgrade instructions](https://aerospike.com/docs/database/8.0.0/advanced/special-upgrades/640-upgrade.md).

### Memory required for data in single-bin namespaces

If a namespace configured to store data in memory is also configured as `single-bin true`, the record overhead and the general bin overhead (the first allocation) described above are not needed — this overhead is stored in the index. The only allocation needed is for the type-dependent overhead plus data. Therefore, numeric data (integer, double) and booleans have no memory storage cost — both the overhead and data are stored in the index. If it is known that all the data in a single-bin namespace is a numeric data type, the namespace can be configured to indicate this by setting `data-in-index true`. This will enable fast restart for this namespace, despite the fact that it is configured to store data in memory.

## Capacity planning for specific data types

| Type          | On storage (memory/device/file) (bytes)         | On-disk metadata (bytes)              |
|---------------|-------------------------------------------------|---------------------------------------|
| Bin tombstone | 1                                               | n/a                                   |
| Boolean       | 1                                               | n/a                                   |
| Float         | 9, including the bin type byte                  | n/a                                   |
| GeoJSON       | string-len + 12                                 | n/a                                   |
| HyperLogLog   | 11 + hll                                        | n/a                                   |
| Integer       | 0-255: 1, 256-64K: 2, 64K-4B: 4, 4B-2\^64: 8    | n/a                                   |
| List          | msgpack-array                                   | n/a                                   |
| Map           | msgpack-map                                     | 4 (no metadata if the map is unordered) |
| String        | string-len                                      | n/a                                   |