Capacity Planning for Specific Data Types
Summary
Type | In Memory | In Memory Indexing | On Disk | On Disk Metadata |
---|---|---|---|---|
Boolean | 0 | n/a | 1 | n/a |
Integer/float | 0 | n/a | 0-255: 1, 256-64K: 2, 64K-4B: 4, 4B-2^64: 8 | n/a |
String | string-len | n/a | string-len | n/a |
GeoJSON | string-len + 12 | n/a | string-len + 12 | n/a |
List | 10 + msgpack-array [1] | ⌈element-count / 128⌉ * 4 | msgpack-array | n/a |
Map | msgpack-map | msgpack-ext + 1 | msgpack-map | 4 [2] |
HyperLogLog | 11 + hll | n/a | 11 + hll | n/a |
Note: All sizes are in bytes unless otherwise noted.
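The on-disk integer ranges in the summary table can be turned into a small lookup. The following is a minimal sketch, not Aerospike code: the function name `integer_disk_bytes` is hypothetical, "64K" and "4B" are assumed to mean 2^16 and 2^32, the last range is read as covering values from 4B up to 2^64, and negative values are rejected because the table does not list them.

```python
def integer_disk_bytes(value: int) -> int:
    """Approximate on-disk size in bytes of a non-negative integer,
    following the ranges listed in the summary table."""
    if value < 0:
        # Assumption: the table only lists non-negative ranges.
        raise ValueError("negative values are not covered by the table")
    if value < 2**8:
        return 1
    if value < 2**16:
        return 2
    if value < 2**32:
        return 4
    if value < 2**64:
        return 8
    raise ValueError("value does not fit in 64 bits")


for v in (0, 255, 256, 70_000, 5_000_000_000):
    print(v, integer_disk_bytes(v))
```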
List
The list data type is serialized as a MessagePack array, with 1, 3 or 5 header bytes, and each element serialized as well.
Example
For a list of 3 integer elements, [0, 1000, 255]:
1 byte header for 3 elements
+1 byte for integer 0
+3 bytes for integer 1000
+2 bytes for integer 255
1 + 1 + 3 + 2 = 7 bytes.
If this list is stored in memory, add 10 bytes for metadata, for a total of 17 bytes.
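The list arithmetic above can be reproduced with a short calculator. This is a sketch under stated assumptions: it handles only lists of non-negative integers, it sizes elements with the standard MessagePack integer and array-header widths, and the helper names (`msgpack_uint_size`, `msgpack_array_header_size`, `list_bytes`) are illustrative rather than part of any Aerospike client.

```python
IN_MEMORY_LIST_METADATA = 10  # bytes, from the summary table


def msgpack_uint_size(value: int) -> int:
    """Size in bytes of a non-negative integer encoded as MessagePack."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if value <= 0x7F:
        return 1  # positive fixint
    if value <= 0xFF:
        return 2  # uint 8
    if value <= 0xFFFF:
        return 3  # uint 16
    if value <= 0xFFFFFFFF:
        return 5  # uint 32
    if value <= 0xFFFFFFFFFFFFFFFF:
        return 9  # uint 64
    raise ValueError("value does not fit in 64 bits")


def msgpack_array_header_size(count: int) -> int:
    """Header size in bytes of a MessagePack array with `count` elements."""
    if count <= 15:
        return 1  # fixarray
    if count <= 0xFFFF:
        return 3  # array 16
    return 5  # array 32


def list_bytes(values, in_memory: bool = False) -> int:
    """Estimated size of a list of non-negative integers."""
    size = msgpack_array_header_size(len(values))
    size += sum(msgpack_uint_size(v) for v in values)
    if in_memory:
        size += IN_MEMORY_LIST_METADATA
    return size


print(list_bytes([0, 1000, 255]))                  # 7
print(list_bytes([0, 1000, 255], in_memory=True))  # 17
```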
Map
The map data type is serialized as a MessagePack map, with 1, 3 or 5 header bytes, and with map-key/map-value pairs serialized as well.
On Disk Metadata
When an Aerospike map is stored on disk, its metadata adds a flat 4 bytes, unless the map is unordered. Choosing an unordered map offers no advantage, and a key-ordered map performs better. See Development guidelines and tips.
Example
A K-ordered map with 3 elements, {a: 1, bb: 2000, ccc: 300000}:
1 byte header for 3 pairs
2 bytes for 'a' and 1 byte for 1
3 bytes for 'bb' and 3 bytes for 2000
4 bytes for 'ccc' and 5 bytes for 300000
1 + 3 + 6 + 9 = 19 bytes for the data itself + 4 bytes metadata = 23 bytes.
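The same calculation can be scripted for maps with string keys and non-negative integer values. This is a rough sketch, assuming plain MessagePack string, integer, and map-header widths as used in the example above; the names `msgpack_str_size`, `msgpack_uint_size`, and `map_disk_bytes` are hypothetical helpers, not Aerospike APIs.

```python
ORDERED_MAP_DISK_METADATA = 4  # bytes; unordered maps carry no metadata


def msgpack_uint_size(value: int) -> int:
    """Size in bytes of a non-negative integer encoded as MessagePack."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    for limit, size in ((0x7F, 1), (0xFF, 2), (0xFFFF, 3),
                        (0xFFFFFFFF, 5), (0xFFFFFFFFFFFFFFFF, 9)):
        if value <= limit:
            return size
    raise ValueError("value does not fit in 64 bits")


def msgpack_str_size(text: str) -> int:
    """Size in bytes of a string encoded as MessagePack (header + UTF-8)."""
    length = len(text.encode("utf-8"))
    if length <= 31:
        return 1 + length  # fixstr
    if length <= 0xFF:
        return 2 + length  # str 8
    if length <= 0xFFFF:
        return 3 + length  # str 16
    return 5 + length      # str 32


def map_disk_bytes(pairs: dict, ordered: bool = True) -> int:
    """Estimated on-disk size of a map of str keys to non-negative ints."""
    count = len(pairs)
    header = 1 if count <= 15 else 3 if count <= 0xFFFF else 5
    data = header + sum(msgpack_str_size(k) + msgpack_uint_size(v)
                        for k, v in pairs.items())
    return data + (ORDERED_MAP_DISK_METADATA if ordered else 0)


example = {"a": 1, "bb": 2000, "ccc": 300000}
print(map_disk_bytes(example))                 # 23
print(map_disk_bytes(example, ordered=False))  # 19
```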
In Memory Indexing
When Aerospike maps are stored in an in-memory namespace, their key and value indexes consume additional memory:
msgpack-ext = header + offset-index + value-index
index = element-count * size/element
element-count = number of elements in the map
Type | Indexes |
---|---|
unordered | None |
key ordered | offset |
key and value ordered | offset + value |
Index Size/Element
var [3] | size/element |
---|---|
< 2^8 | 1 |
< 2^16 | 2 |
< 2^24 | 3 |
>= 2^24 | 4 |
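As an illustration, the index formulas and the size/element table can be combined as follows. This is only a sketch: the function names and the `order` strings are hypothetical, the msgpack-ext header is left out because its size is not given here, and *var* is taken to be the map's msgpack size for the offset index and the element count for the value index.

```python
def index_bytes_per_element(var: int) -> int:
    """Bytes per index entry, per the Index Size/Element table."""
    if var < 2**8:
        return 1
    if var < 2**16:
        return 2
    if var < 2**24:
        return 3
    return 4


def map_index_bytes(element_count: int, msgpack_size: int, order: str):
    """Return (offset_index_bytes, value_index_bytes) for a map.

    `order` is one of "unordered", "key_ordered", "key_value_ordered"
    (labels chosen for this sketch). The msgpack-ext header is NOT
    included; add it separately once its size is known.
    """
    if order not in ("unordered", "key_ordered", "key_value_ordered"):
        raise ValueError(f"unknown order: {order}")
    offset_index = 0
    value_index = 0
    if order in ("key_ordered", "key_value_ordered"):
        # var is the msgpack size for the offset index
        offset_index = element_count * index_bytes_per_element(msgpack_size)
    if order == "key_value_ordered":
        # var is the element count for the value index
        value_index = element_count * index_bytes_per_element(element_count)
    return offset_index, value_index


# 1000 elements whose serialized map takes 70,000 bytes
print(map_index_bytes(1000, 70_000, "key_value_ordered"))  # (3000, 2000)
```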
HyperLogLog
The HyperLogLog data type has an array of 2^n_index_bits registers.
Each register contains 6 bits of HyperLogLog value and n_minhash_bits optional bits of MinHash value. Adding MinHash bits enables HyperMinHash functionality, a superset of HyperLogLog.
The storage size of the registers is rounded up to the nearest byte.
hll = 11 bytes + roundUpToByte(2^n_index_bits * (6 + n_minhash_bits))
Example
A HyperLogLog bin with 12 index bits (2^12 = 4096 registers) and no MinHash bits uses the following approximate memory, where dividing by 8 converts bits to bytes:
11 bytes + ((2^12 * 6) bits / 8) = 3083 bytes.
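The formula can be checked for other configurations with a small helper. This is a minimal sketch of the sizing formula only; the function name `hll_bytes` is hypothetical, and the code does not validate which values of `n_index_bits` or `n_minhash_bits` the server accepts.

```python
def hll_bytes(n_index_bits: int, n_minhash_bits: int = 0) -> int:
    """Approximate size in bytes of a HyperLogLog bin:
    11 bytes + register storage rounded up to a whole byte."""
    register_count = 2 ** n_index_bits
    total_bits = register_count * (6 + n_minhash_bits)
    register_bytes = (total_bits + 7) // 8  # round up to the nearest byte
    return 11 + register_bytes


print(hll_bytes(12))      # 3083
print(hll_bytes(12, 10))  # 8203
```

With 10 MinHash bits each register takes 16 bits, so the same 4096 registers need 8192 bytes plus the 11 byte overhead.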
[2] No metadata if map is unordered.
[3] *var* is msgpack-size for offset-index and element-count for value-index.