Frequently Asked Questions (FAQ)
General information
Is Aerospike based on an Open Source product?
No. Aerospike was developed from the ground up to be a highly scalable, low-latency, enterprise-class distributed database. Aerospike Database Community Edition (CE) is a free, open source version that was first released in 2014. Aerospike Database Enterprise Edition (EE), and Standard Edition (SE) are built on top of Aerospike CE, with added enterprise features. While CE shares the same developer APIs as EE and SE (with the exception of durable deletes), they differ in scalability, security, ease of operation, connectivity, and much more. Refer to product matrix for details.
Do I need a trial key for Aerospike Database?
Starting with Database 6.1, Enterprise Edition (EE) comes bundled with a single-node, all-features key file for evaluation, and starts up in the evaluation mode unless a different feature key file is configured. You can download Aerospike EE and immediately start using it without any further steps.
Enterprise customers receive separate feature key files for their production and development environments. For a free multi-node EE evaluation see Try Now.
You do not need a feature key file to use the Community Edition of Aerospike Database (CE).
Can I use Community Edition (CE) and Enterprise Edition (EE) at the same time?
No. Aerospike's license agreement with its enterprise customers prohibits running CE clusters alongside EE clusters.
What is the upgrade path from Aerospike CE to EE? Is downtime required during the upgrade?
No downtime is required when you upgrade from CE to EE or SE through a rolling upgrade (one node at a time). See Upgrade/Repair Server and Upgrading from Community to Enterprise version.
How is unique data counted?
Aerospike charges primarily by unique production data. This means that development, testing, staging, and failover clusters are not included.
A production cluster's unique data is roughly the uncompressed data stored in each namespace divided by its replication factor, excluding index metadata. See the Unique Data Agent for more details.
An enterprise customer's unique data is the sum of each production cluster's unique data. The method of writing data to a production cluster (local client, connector or XDR writes) does not change the method by which unique data is counted.
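As a rough illustration of the arithmetic described above, the following Python sketch sums unique data across production clusters. The namespace names, sizes, and helper names are made up for the example, and the result is only an approximation of what the Unique Data Agent reports.

```python
from dataclasses import dataclass

@dataclass
class Namespace:
    name: str                 # hypothetical namespace name
    uncompressed_bytes: int   # stored data across all replicas, excluding index metadata
    replication_factor: int

def cluster_unique_bytes(namespaces):
    """Approximate unique data: each namespace's stored data divided by its replication factor."""
    return sum(ns.uncompressed_bytes / ns.replication_factor for ns in namespaces)

def customer_unique_bytes(production_clusters):
    """Sum the unique data of every production cluster."""
    return sum(cluster_unique_bytes(cluster) for cluster in production_clusters)

TB = 10**12
cluster_a = [Namespace("users", 4 * TB, 2), Namespace("cache", 1 * TB, 1)]
cluster_b = [Namespace("events", 6 * TB, 3)]
total = customer_unique_bytes([cluster_a, cluster_b])
print(total / TB)  # 4/2 + 1/1 + 6/3 terabytes
```

Development, testing, staging, and failover clusters are simply left out of the list passed to `customer_unique_bytes`.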
Hardware
What hardware does Aerospike support?
In production, Aerospike Database runs on servers powered by 64-bit Intel or, as of Database 6.2, ARM processors running a Linux operating system. Developers can run Aerospike Database on compatible processors using Docker in their macOS and Windows development environments, such as Intel-based and Apple M1 MacBook Pro laptops.
Refer to the System Requirements at Planning Deployment.
Which versions of ARM does Aerospike Database support?
Aerospike runs natively on Linux operating systems, such as Red Hat Enterprise Linux, Ubuntu, Debian, and Amazon Linux 2023. The Aerospike ARM port was tested on AWS Graviton2 EC2 instances.
- In Database 6.2, Aerospike introduced support for ARM processors compatible with the ARMv8.2-A instruction set (Neoverse N1 microarchitecture).
- Some RHEL 7 and RHEL 8 variants for 64-bit ARM have changed their default kernels to a 64K memory page size, and are therefore no longer supported. Aerospike users with deployments on these operating system versions must upgrade to one whose kernel has a 4K page size. Amazon Linux 2 and Amazon Linux 2023 for 64-bit ARM are safe.
How do I load balance across the nodes in a cluster?
Aerospike automatically distributes both data and traffic to all the nodes in a cluster. There is no need to add additional load balancing. In fact, load balancers tend to cause interruption and reduce performance.
Can I have mixed hardware configurations for nodes?
Yes. Aerospike does not require every node to have the same hardware. However, the cluster distributes data randomly and evenly across the nodes, so overall cluster performance is limited by the node with the least capacity.
After a power outage, should all the nodes be restarted at the same time?
If the application that uses the Aerospike database is not running (for example, if the whole cluster is down), you should be able to bring the cluster nodes up all at the same time. Otherwise, bring up one cluster node at a time, while disabling migrations. As soon as a node joins the cluster, move to bring up the next cluster node.
When a node goes down, how do you reconfigure the system to reroute traffic?
You don’t. With Aerospike, the cluster automatically detects when a node has left the cluster. It automatically responds by rebalancing data and changing the configuration so that the clients know how to communicate with the cluster.
Database
Map key restrictions
Starting in Database 7.1, map keys are restricted to simple types -- integer, string, and blob.
In what programming language is Aerospike written?
Aerospike is written in C for performance and predictability. Aerospike controls its own use of memory and is not affected by garbage collection issues, which are common to products written in other languages, such as Java.
What programming languages does Aerospike support?
You can learn about our client libraries, read developer blogs, and access the sandbox at the Developer Hub.
Can I run multiple instances of the Aerospike server on one machine?
On a server with a multi-socket CPU, you can run multiple instances of Aerospike, each pinned to a unique NUMA node. See the Knowledge Base article How to run multiple instances of asd with systemd, or contact Aerospike support for instructions.
How do I decide how to separate data into namespaces and sets?
In general, you can think of a “set” in Aerospike as you would a “table” in a relational database. You can also think of an Aerospike “namespace” as a “tablespace” in a relational database.
The best way to start is to understand the sets you need. Typically, each record in a set represents an entity such as a user, a URL, or a server. Sets that have similar requirements often belong in the same namespace.
Since Aerospike does not have a set schema, even completely different sets can exist in the same namespace. For example, a namespace may contain sets for people, servers, and URLs. The bins from one namespace do not need to exist in the other namespaces. You may also find that similar items need to be in different namespaces because they have different needs for how the data is synchronized between different data centers. For example, you may want a set of users in one namespace that is synchronized across the US and a set of users in a different namespace that is synchronized across Asia.
How do I define a schema in Aerospike?
With Aerospike, there is no need to define a schema. Every row can have a different set of bins (similar to columns in a relational database). In fact, even the same bin in one record, or row, does not have to have the same data type as the same bin in another record. This flexibility allows you to create applications without the limitations inherent in relational databases.
What query language do you support?
Aerospike has its own API to handle requests to the database. These requests are in the form of get/put/updates to the database, and also include atomic operations on data types, such as Integer, Double, Boolean, List, Map, HyperLogLog and Blob. Language specific clients implement the API for C, Java, C#, Python, Go, Node.js, and others.
How does Aerospike distribute data/traffic?
In Aerospike Database every record is randomly assigned to a logical partition and partitions are evenly distributed among the database cluster nodes. As a result, both the data volume and traffic are evenly distributed. Any time a cluster changes, data distribution and redistribution happens automatically.
With other solutions, you must manually redistribute data.
Do you support batch gets and puts?
Yes. Support for batch writes, deletes, and UDFs was added in Database 6.0. Previous versions of the server only supported batch reads.
How do I find an API call to get the key for a given row?
By default, Aerospike stores only the digest of a record's key, not the key itself. To have the key stored with the record so that it can be returned later, set Policy.sendKey to true when writing the record.
What is Aerospike Available Percent?
Available Percent is the amount of defragmented storage that is available for writing to Aerospike, expressed as a percentage of the total space on disk. It is not the same as free disk space: it is the storage available for streaming writes on that particular Aerospike namespace.
If I need to make a configuration change, do I have to restart the server?
No. Generally you can change the configuration of a node dynamically by issuing some commands from the command line. Refer to the Configuration Reference for details. You can make changes to most parameters, even memory settings, while the server is running. However, any permanent changes must be made in the configuration file (/etc/aerospike/aerospike.conf), and they are not read dynamically. Changes in the configuration file are only delivered to the server on restart.
If a node goes down during a read or write, what happens?
It is possible for a cluster node to go down due to hardware failure. The cluster identifies that the node is no longer sending heartbeats, triggering the formation of a new cluster with a new partition map.
- The clients regularly check for state changes in the Aerospike server through a cluster-tending thread. On the next tend interval after the new cluster is formed (one second by default), the clients learn of the new partition map and the new peer list. Until then, the clients may try to perform some operations against a node that is no longer there.
- If a client tries to read from such a node, the operation times out. When this happens, the client automatically tries reading from a node holding a replica of the record. From a coding standpoint, the developer does not need to be aware of this event, as the client handles the additional attempts at communicating with the database.
- If the client tries to write to such a node, the operation times out. What happens next depends on write policies given to the client. The clients can be configured to retry the operation a specified number of times with a given interval between them. The application may choose to catch the timeout and either retry, defer, or ignore.
- When a node is taken down for planned maintenance, it should first be quiesced. This prevents applications from experiencing read and write timeouts. Quiescence is an Enterprise Edition feature.
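The application-side "catch the timeout and retry" option for writes can be sketched as follows. This is a minimal, self-contained Python illustration and not the Aerospike client API: `OperationTimeout`, `write_with_retry`, and `flaky_write` are hypothetical names, and the clients provide their own configurable retry settings that you would normally use first.

```python
import time

class OperationTimeout(Exception):
    """Stand-in for a client timeout error (hypothetical name)."""

def write_with_retry(write_fn, max_retries=2, sleep_between=0.0):
    """Call write_fn; on timeout, retry up to max_retries times, then re-raise."""
    attempt = 0
    while True:
        try:
            return write_fn()
        except OperationTimeout:
            if attempt >= max_retries:
                raise  # the application decides: retry later, defer, or ignore
            attempt += 1
            time.sleep(sleep_between)

# Simulated write that times out twice (the node is gone) and then succeeds,
# as if the client had learned the new partition map in the meantime.
calls = {"n": 0}

def flaky_write():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise OperationTimeout("node unreachable")
    return "ok"

result = write_with_retry(flaky_write, max_retries=2)
print(result, calls["n"])
```

Whether a write is safe to retry depends on the operation; a non-idempotent update such as an increment may need different handling than a simple put.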
How do I back up the database? Will this impact performance?
The Aerospike backup program, asbackup, gathers data from all the nodes and puts them into files. This can be done while the cluster is up and serving requests. The machine with the backup does not need to be a node within the cluster, but it must have network access to the cluster. The backup process is configurable and Aerospike has recommendations to ensure that the backup does not affect normal transactions. Most customers can take backups within a few hours with these settings. The backup system runs at a lower priority than the front end service, so a backup can take varying amounts of time.
Is there a way to delete all content from a namespace?
We recommend using asadm's manage truncate command to perform truncations rather than info commands when possible.
In the Enterprise Edition, truncation is durable and preserves record deletions through a cold-restart.
In the Community Edition, truncation is not durable, similar to record deletes: records in previously truncated sets can return after a cold start.
Refer to the truncate command at Info Command Reference - Truncate and also the truncate-namespace command at Info Command Reference - Truncate Namespace.
Storage
Have you tested any SSDs? Which ones do you recommend?
Aerospike has benchmarked many SSDs with the open-source Aerospike Certification Tool (ACT). ACT validates drives under realistic production conditions. Although some SSDs perform well for a short time, Aerospike has discovered that some may experience issues only after many hours of use. In order to pass Aerospike's strict ACT requirements, an SSD must show excellent performance over an extended period of time. For more information, refer to the flash/SSD certification guide.
Does Aerospike require the use of the TRIM command for flash/SSDs?
No. When an Aerospike namespace is configured to use the filesystem, the filesystem takes care of block management. When an Aerospike namespace is configured to use a raw flash device (SSD), Aerospike controls the device directly. Due to the difference in how they operate, Aerospike has optimized the use of SSDs as a NAND device. These optimizations include functionality similar to the TRIM command. The effect is that the performance is improved and garbage collection gets distributed, while also getting much improved longevity from the drives.
Can I store data in RAM?
Yes. Although Aerospike database makes optimal use of flash storage (SSDs), it can also use RAM and Intel Optane™ Persistent Memory as storage devices. Within the same cluster, you can configure one namespace to store its data in RAM and another namespace to store its data on SSD.
Can I store data on hard disk rather than SSD?
Storing data on a hard disk is not supported. Aerospike database uses many SSD optimizations to achieve predictable low latency. The physical limitations of rotational disks add an unpredictable and unacceptable amount of latency.
How do I calculate the amount of space needed in RAM and/or flash (SSD)?
If you want an exact algorithm for calculating the amount of space needed, refer to Capacity Planning. We suggest that Aerospike customers contact support regarding capacity planning.
I have used databases in the past that reclaim space using a process called “compaction”. This process takes tremendous resources and sometimes results in instability. How does Aerospike handle reclamation of space?
Aerospike was designed from the start as an enterprise-class, distributed database. Its architecture focuses on using flash drives (SSDs) as primary data storage, without heavy dependence on RAM for performance. Instead of appending writes to large log files and deferring compaction to a CPU and disk I/O intensive operation, Aerospike uses SSD optimized raw-device block writes and lightweight defragmentation. This leads to greater predictability and reliability, without experiencing long latencies often seen in LSM-tree databases.
Storage is split into two areas: the primary index, which is stored either in RAM, Intel Optane™ persistent memory, or on flash, and data, typically configured to be stored on SSDs. When a new record is written to a node, a metadata entry is made in the primary index and its data is streamed in blocks to the SSD. The metadata entry points to the exact device and r-block offset from where the record data is stored contiguously. This facilitates low latency, concurrent reads.
When the record is updated, its primary index entry is updated to point to a different block on the SSD, where the new record data is persisted. If the record is deleted the metadata entry is removed. Since Aerospike does not use a filesystem to store records, it doesn’t need to compact data files. Rather, the defragmentation process continuously reclaims space in small increments, without causing latency spikes.
A separate process traverses the primary index and removes metadata that has aged beyond its configurable time-to-live (TTL). Subsequently, defragmentation reclaims storage blocks that have no index metadata pointing to them.
How do deletes work?
A standard delete operation (also known as an expunge) only removes the record's primary index metadata entry. The record data is reclaimed asynchronously from the namespace data storage (typically SSD) through a separate defragmentation process. Defragmented write blocks are later overwritten with new records. Since an expunge only removes a 64-byte metadata entry from the index (usually stored in RAM), the approach is fast and well suited to flash devices, which have limited write I/O capacity compared to read I/O.
The alternative durable delete operation writes a new metadata entry to the primary index pointing to a minimal disk storage marker called a tombstone. Tombstones prevent situations where a cold restart of the node may recover previously deleted records based on their yet to be defragmented stale version on SSD.
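The difference between the two delete types can be illustrated with a toy model. The Python sketch below is not how the server is implemented: `ToyNode` is a hypothetical class, the "device" is an append-only list, and the cold restart simply lets the most recently written version of each key win. It only shows why an expunged record can return after a cold restart while a durably deleted one cannot.

```python
class ToyNode:
    """Toy model of one node: an in-memory index plus append-only device storage."""

    def __init__(self):
        self.index = {}    # key -> position of the current version on the device
        self.device = []   # (key, value, is_tombstone) entries, oldest first

    def put(self, key, value):
        self.device.append((key, value, False))
        self.index[key] = len(self.device) - 1

    def expunge(self, key):
        # Standard delete: drop only the index entry; stale data stays until defrag.
        self.index.pop(key, None)

    def durable_delete(self, key):
        # Durable delete: persist a tombstone and point the index at it.
        self.device.append((key, None, True))
        self.index[key] = len(self.device) - 1

    def get(self, key):
        pos = self.index.get(key)
        if pos is None or self.device[pos][2]:
            return None
        return self.device[pos][1]

    def cold_restart(self):
        # Rebuild the index by scanning the device; the newest entry per key wins.
        self.index = {}
        for pos, (key, _value, _is_tombstone) in enumerate(self.device):
            self.index[key] = pos

node = ToyNode()
node.put("a", 1)
node.put("b", 2)
node.expunge("a")
node.durable_delete("b")
before = (node.get("a"), node.get("b"))
node.cold_restart()
after = (node.get("a"), node.get("b"))
print(before, after)
```

In this model the expunged record "a" reappears after the restart because its stale copy had not yet been defragmented, while the tombstone keeps "b" deleted.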
Download
What is a hotfix?
A hotfix is a patch release of a specific server version that has no API changes and no regressions. For example, you can safely upgrade from server 5.5.0.4 to 5.5.0.25, or the latest hotfix for Database 5.5.0. It builds on the initial release of the server, applying layers of subsequent bug fixes.
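A deployment script might check whether a target version is a later hotfix of the same release, using the four-part version numbers shown above. This is a minimal sketch; `parse_version` and `is_hotfix_upgrade` are illustrative helper names, and it assumes that a hotfix shares the first three version components.

```python
def parse_version(version):
    """Split '5.5.0.25' into a tuple of integers: (5, 5, 0, 25)."""
    return tuple(int(part) for part in version.split("."))

def is_hotfix_upgrade(current, target):
    """True when target is a later hotfix of the same release as current."""
    cur, tgt = parse_version(current), parse_version(target)
    return (
        len(cur) == 4
        and len(tgt) == 4
        and cur[:3] == tgt[:3]   # same release, for example 5.5.0
        and tgt[3] > cur[3]      # compared numerically, so 25 > 4
    )

print(is_hotfix_upgrade("5.5.0.4", "5.5.0.25"))
print(is_hotfix_upgrade("5.5.0.4", "5.6.0.1"))
```

Comparing the components as integers matters: a plain string comparison would rank "5.5.0.4" above "5.5.0.25".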
How do I manually download Aerospike software?
Download the latest Aerospike Database from Aerospike Downloads. The available packages are limited to the supported database versions with the latest hotfix of each version, if available.
How do I automate database downloads?
To automate Aerospike Database downloads, use the artifact repository, which contains the latest and archived versions of the database. Do not use the URLs associated with the manual download page, as they could change.
The URLs of the artifacts do not change, and are easy to define programmatically. For example, you can get:
- The newest version of Aerospike EE at https://download.aerospike.com/artifacts/aerospike-server-enterprise/latest/
- The latest Aerospike EE 6.1.0.x release at https://download.aerospike.com/artifacts/aerospike-server-enterprise/6.1.0/
- A specific release at https://download.aerospike.com/artifacts/aerospike-server-enterprise/6.0.0.8/
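Building these URLs programmatically could look like the following Python sketch. The function name `artifact_url` is illustrative, and the pattern is inferred only from the three Enterprise Edition URLs listed above.

```python
BASE = "https://download.aerospike.com/artifacts"

def artifact_url(version="latest", package="aerospike-server-enterprise"):
    """Return the artifact directory URL for a version.

    version may be "latest", a release such as "6.1.0", or a specific
    hotfix such as "6.0.0.8".
    """
    return f"{BASE}/{package}/{version}/"

newest = artifact_url()
latest_hotfix = artifact_url("6.1.0")
pinned = artifact_url("6.0.0.8")
print(newest, latest_hotfix, pinned, sep="\n")
```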
A new package naming convention will affect download automation for new versions of the server, tools, Prometheus Exporter, and C client.
Aerospike clusters are tolerant of heterogeneous database versions, so you can safely roll out hotfixes for the same database version. We recommend that you use the latest hotfix link for the database version you are deploying.
What is the public key for verifying Aerospike packages?
Aerospike Public Key download location: https://download.aerospike.com/artifacts/aerospike_public_key.asc
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v2
mQINBGGmrUsBEAD1MUBVur8nyrGOfZBaFCSwRtkUwA3oVVRvGnDKgTHpnP3JN65N
dYyMvCRcyZR3T+prDO9tNP4ute+xsButp/+SmdVmfPU6GsVnj5K3eFxn0c7J/GjS
LsrHW58DhHMsR++fvUjGFpe1eyI/PSpW3Sgn/tJc0R9IRRi7wP3mUrEritU8mPz2
WgdnBOkTIxfWu7NKDUbgWfyeeXmh0/C/UmeE56aRT143g0ieSeRmg2LZ8oEJKFXu
cN8j10PMOdrWPGRJZhhOHeVpzLHluXftNG9UaEGZS2j6cUxz57x4Or2TaelyPYTv
Q6BL+bWA6OOf7BRFBadEBf6IDFGeqXrjp/sRUfWtj7c3naR4rBxE6KCc5D2chhs+
bM065TlYZwLCIy44oGQouk01qg+rNbPnaJcqjuzN0X5pTaOrqNgiDrwnm9keJiG6
7ITjQCbmQVoPvCpo2MDseuKuPK57LUi8XU/QVPLMiPsFMrGUd0gwqckGiq/szGt8
BCGwgMAXuTUaccz3cOKyxGOd3+kbiYxDlcmQoxXsJxFplOgQx3tEpUfvcKDb8kAI
m2Kwp0hnirSrlnk8jqtJkCREywPw6+phro2pOxZ4Hvsglx2hI4OFeh5s40jep2Ff
EI9dZAW7KTlqF0W+0PFCk1DIiI8P+IXMqFVLs+doJl3kS17uGHuS/z5jhQARAQAB
tFpBZXJvc3Bpa2UgKEtleSBQYWlyIGZvciBzaWduaW5nIGFydGlmYWN0IHNpZ25h
dHVyZSBzaWRlY2Fycy4pIDxhcnRpZmFjdC1zaWdAYWVyb3NwaWtlLmNvbT6JAjcE
EwEIACEFAmGmrUsCGwMFCwkIBwIGFQgJCgsCBBYCAwECHgECF4AACgkQ8OF3BVq9
EJnpzxAAy16wGTy3Le6xRsl4hd+K5o48TupPzx6rTWHyQ2qUCPbUMZb8ItjJabN5
3R+cZrJIFuynYniq2JNGJ9XiJ6EKX7wqMR40Z/0P1Kqc1JqyosaXP+G53j1KnriR
0z3EFApmSJunH84oS9tpdqx7bxBlahpBPcZNmREam2Q5tDCHVH1bG8urRYe7jz1o
893jZl/RMD2CVSsLtZvBl7QQimoIyb8N4+bh2aIG2L5XVMZWuIvKp7thJinrtOt8
QCZ9HYdqEJzAissgp57jzVzcn5LxxzdG4tQ+rPIIqOa5TtIrBNtmJdaC7Z7FF9Nt
ls+9zhBDb1IvkqnpcQVhJTtzelDDo+ZYNMXY+fZ9dzhL1q9SYW2Cm+FMPv3EMy0a
+v/zX5AV/pb1uPSzKEKzHzKxK/20T+9TtNTpyylKKCTcGMSirE1c0c6K4TUD9JZ4
T+/sSTKm8dRFpX0u8wD2RmoV4ZQyEb1tP9UCBJscqIjZpfsvMp/L+4NuteqTJ6Ak
xV1EQB6VzubfLITx1oecm5jLNPymbVTrHNoJ4O0fyS1vLnaEpVW8O4uyyQ91yzx/
LihBXY66McEd6AKo+HPR/pczFUeeIn38xRMzgTfUAu5F7ElZfolHjJjCMj3t4E9G
fdCQdG+SCsPzNxlwLQDkvQlsmc4XTm+xmN9+T1vYOZwhBFFeX7M=
=V6iT
-----END PGP PUBLIC KEY BLOCK-----
How do I validate an Aerospike Database package?
Here is the current list of GPG Signed packages.
Product | Signed |
---|---|
Enterprise and Community Database | 5.7.0.30+, 6.0.0.14+, 6.1.0.12+, 6.2.0.6+, 6.3.0.0+, 6.4.0.0+, 7.0.0.0+ |
Federal Database | 6.0.0.13+, 6.1.0.11+, 6.2.0.6+, 6.3.0.0+, 6.4.0.0+, 7.0.0.0+ |
Tools | 8.2.0+ |
Prometheus Exporter | 1.10.0 |
Aerospike Shared-Memory Tool | 1.2.2+ |
The following steps validate the Database package. You can use the same steps to validate other Aerospike packages.
Verify checksums
Aerospike software packages are checksummed using the SHA-256 cryptographic hash function. You can verify the checksum using shasum, openssl, or certutil.
Using shasum:
shasum --check aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256
aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz: OK
Using openssl:
openssl dgst -sha256 aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz
SHA256(aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz)= 3a1ea17a531e1cd8c4fc4838516164463b4b0ab67325b60ae76efd41a4b04797
# check if this matches the SHA256 checksum file
cat aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256
3a1ea17a531e1cd8c4fc4838516164463b4b0ab67325b60ae76efd41a4b04797 aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz
On Windows:
certutil -hashfile aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz sha256
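If you prefer to script the comparison, the same check can be sketched in Python with the standard library. The helper names `sha256_of` and `verify_checksum` are illustrative, and the sketch assumes the checksum file has the layout shown above, with the hex digest as the first field.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_checksum(package_path, checksum_path):
    """Compare the package's SHA-256 with the first field of the .sha256 file."""
    expected = Path(checksum_path).read_text().split()[0].lower()
    return sha256_of(package_path) == expected
```

For example, `verify_checksum("aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz", "aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256")` would return `True` when the digests match. This only checks integrity; the signature verification that follows is still needed to confirm that the checksum file itself came from Aerospike.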
Verify signatures
Import the Aerospike public key that you previously saved.
gpg --import aerospike_public_key.asc
Verify the signature of the checksum file using the public key.
gpg --verify aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256.asc aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256
gpg: Signature made Mon Mar 21 18:32:44 2022 PDT using RSA key F0E177055ABD1099
gpg: Good signature from "Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature! There is no indication that the signature belongs to the owner.
Primary key fingerprint: 0504 6BC9 2786 D0DC FA11 F519 F0E1 7705 5ABD 1099
Authorize the key
The warning is expected because you have not yet certified the public key that Aerospike provided. To certify the key, perform the following:
gpg --list-keys
---------------------------------
pub rsa4096 2021-11-30 [SC]
05046BC92786D0DCFA11F519F0E177055ABD1099
uid [ unknown] Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>
Edit the keys
gpg --edit-key 05046BC92786D0DCFA11F519F0E177055ABD1099
gpg (GnuPG) 2.3.4; Copyright (C) 2021 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
pub rsa4096/F0E177055ABD1099
created: 2021-11-30 expires: never usage: SC
trust: unknown validity: unknown
[ unknown] (1). Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>
gpg> trust
pub rsa4096/F0E177055ABD1099
created: 2021-11-30 expires: never usage: SC
trust: unknown validity: unknown
[ unknown] (1). Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>
Decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)
1 = I don't know or won't say
2 = I do NOT trust
3 = I trust marginally
4 = I trust fully
5 = I trust ultimately
m = back to the main menu
Your decision? 5
pub rsa4096/F0E177055ABD1099
created: 2021-11-30 expires: never usage: SC
trust: ultimate validity: ultimate
[ultimate] (1). Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>
gpg> save
Verify the key again to make sure the key is certified
gpg --verify aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256.asc aerospike-server-enterprise-5.7.0.12-ubuntu20.04.tgz.sha256
gpg: Signature made Mon Mar 21 18:32:44 2022 PDT using RSA key F0E177055ABD1099
gpg: Good signature from "Aerospike (Key Pair for signing artifact signature sidecars.) <artifact-sig@aerospike.com>" [ultimate]
You have now received a valid signature from Aerospike.
Verify signature on RPM package (Red Hat-based Linux)
Aerospike uses GPG to sign server and tools binaries. Use these steps to verify the signature of the downloaded binaries.
Run:
sudo rpm --import aerospike_public_key.asc
rpm --checksig <*.rpm>
Expected output:
digests signatures OK
Example:
$ rpm --checksig aerospike-server-enterprise-6.4.0.1-1.el8.x86_64.rpm
aerospike-server-enterprise-6.4.0.1-1.el8.x86_64.rpm: digests signatures OK
Verify signature on DEB package (Debian, Ubuntu Linux)
Run:
dpkg-sig --verify <*.deb>
Expected output:
GOODSIG _gpgbuilder
Example:
$ dpkg-sig --verify aerospike-server-enterprise_6.4.0.1-1debian12_arm64.deb
Processing aerospike-server-enterprise_6.4.0.1-1debian12_arm64.deb...
GOODSIG _gpgbuilder 05046BC92786D0DCFA11F519F0E177055ABD1099 1692152324
Data Migration and Synchronization
How long does a migration take?
Migrations vary depending on factors such as available network bandwidth, drive speed and the amount of data on each Aerospike cluster node. The rate of migration can be configured, as discussed in What is Migration?.
If I am synchronizing 2 datacenters using the Aerospike XDR product (available in the Enterprise Edition), what happens if the network connection is severed? Will I lose data or will it be stored somewhere?
You will not lose data. Aerospike XDR does not store a record's data for shipment to a remote destination. It records digests and Last Update Times (LUTs). It also persists the Last Ship Time (LST). In case of recovery from a failure, such as a network outage, XDR uses the LST to determine if a record needs to be shipped.
If I am synchronizing two data centers using Aerospike Enterprise Edition (EE) Cross Datacenter Replication (XDR), what happens if the network connection is severed?
You will not lose data. XDR continues operating as soon as the connection is reestablished. It may switch to a recovery process that compares the record last update time (LUT) with the partition last ship time (LST) to the remote destination, to determine if the record needs to be shipped. After the system catches up, XDR continues operating normally.
Definitions
What is Citrusleaf?
Citrusleaf is the former name of Aerospike. It was changed to Aerospike in the summer of 2012, but there are many labels that still refer to citrusleaf (cl).
What is the definition of [Bin, Cluster, Namespace, Record, Set, Digest]?
See the Glossary.
What is a Distributed Hash Table or DHT?
The Distributed Hash Table is the accumulated information on where data is stored within the cluster. It includes the indexes that are distributed throughout the cluster and how they map to partitions (see “partition” and “partition map” below) and nodes.
What is the Generation Number?
When a record is written to a cluster, part of the metadata associated with the record is the generation. It is incremented each time the record is altered. The generation resolves any conflicts that may occur if two records have different values.
What is Replication Factor?
Every Aerospike namespace is configured with a replication factor that determines the number of copies of the data. The number of copies is referred to as the “replication factor.” A replication factor of 1 means that the data is not replicated and does not have a hot backup. Most Aerospike customers use a replication factor of 2.
What is the range of values allowed for the replication factor?
The minimum replication factor is 1 (no replication). The upper limit is the number of nodes (a copy of the data in each node).
What is a Partition?
“Partitions” are buckets of records that have been grouped together for the purpose of data distribution.
In order to evenly distribute data between the Aerospike cluster nodes, every record is mapped to one of 4096 partitions, based on a digest computed with a RIPEMD-160 based hashing algorithm. In turn, each partition is mapped to a node in the cluster based on an agreed-upon partition map. Whenever the number of nodes in the cluster changes, the partitions get remapped and transferred to the appropriate node in a process called “migration.”
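The idea of hashing every key into one of 4096 partitions can be sketched in Python. Two things here are assumptions made for portability and illustration, not statements about the server: SHA-1 stands in for the RIPEMD-160 based digest (RIPEMD-160 is not available in every Python build), and the partition ID is derived from 12 bits of the first two digest bytes. The helper names are hypothetical.

```python
import hashlib

N_PARTITIONS = 4096  # 2**12

def stand_in_digest(set_name, user_key):
    # Assumption: SHA-1 is only a portable 20-byte stand-in for the real digest.
    return hashlib.sha1(f"{set_name}:{user_key}".encode()).digest()

def partition_id(digest):
    # Assumption: use 12 bits taken from the first two digest bytes.
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS

# Hash 100,000 made-up keys and count how many land in each partition.
counts = [0] * N_PARTITIONS
for i in range(100_000):
    counts[partition_id(stand_in_digest("users", f"user-{i}"))] += 1

print(min(counts), max(counts), sum(counts) / N_PARTITIONS)
```

Because the digest behaves like a uniform random value, the per-partition counts cluster around the average of about 24 records, which is what lets the cluster balance both data volume and traffic without manual sharding.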
What is Migration?
During normal operation, an Aerospike cluster has a number of copies of the data, as defined by the replication factor. This data is evenly distributed across the cluster in master and replica partitions. If a node goes down, some portion of the data is no longer at the full replication factor. The Aerospike cluster responds by automatically promoting replica partitions to master partitions where a copy has been lost. It also creates empty replica partitions on other cluster nodes, and re-establishes the full replication factor by copying data between nodes to fill the empty replicas. This data motion is referred to as a “fill migration.” A different migration process focuses on rebalancing the data by moving partitions around the newly formed cluster.
There are configuration settings for how many threads to use for migration and how many partitions should be migrating in and out of the node at the same time, as well as a setting to delay fill migrations in the event of a planned rolling upgrade.
What is a Set?
An Aerospike “set” is similar to a table in a relational database. One of the big differences is that with Aerospike, you do not need to predefine a schema. You may add bins (or columns) to one record in a set without needing to add them to any other record in the set.
What is CITRUSLEAF_EPOCH Time?
Aerospike compacts dates by subtracting out the CITRUSLEAF_EPOCH time. The epoch starts at 00:00:00 GMT on January 1, 2010, which is Unix time 1262304000. Time in the Citrusleaf epoch can be calculated by the following formula:
Current time - 1262304000 = Time in CITRUSLEAF_EPOCH
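#include <time.h>
#define CITRUSLEAF_EPOCH 1262304000
// Illustrative helper: current time in seconds since the Citrusleaf epoch.
time_t citrusleaf_now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return ts.tv_sec - CITRUSLEAF_EPOCH;
}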
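The same conversion, in both directions, can be sketched in Python. The function names are illustrative; only the constant comes from the formula above.

```python
from datetime import datetime, timezone

CITRUSLEAF_EPOCH = 1262304000  # 2010-01-01 00:00:00 UTC, in Unix seconds

def to_citrusleaf(unix_seconds):
    """Convert Unix time to seconds since the Citrusleaf epoch."""
    return unix_seconds - CITRUSLEAF_EPOCH

def from_citrusleaf(cl_seconds):
    """Convert Citrusleaf-epoch seconds back to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(cl_seconds + CITRUSLEAF_EPOCH, tz=timezone.utc)

epoch_start = from_citrusleaf(0)
print(epoch_start.isoformat())
```

Converting back to a `datetime` is handy when reading raw timestamps from logs or record metadata.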
What does the expiration date of a record returned from a query callback mean?
The expiration date field returned from a query is the actual expiry time (Citrusleaf epoch time) for that record, which is computed from the record TTL value and the time the record was written to the database. It is not mandatory for a record to have a TTL.
It is possible to configure a default TTL for a given namespace. Any TTL set by a client supersedes the namespace default.
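How the expiration time follows from the write time and the TTL can be sketched as below. This is a simplified model: `expiration_time` is a hypothetical helper, and using `None` to mean "no TTL set" is a convention of this example, not of the client APIs.

```python
CITRUSLEAF_EPOCH = 1262304000  # Unix seconds at the start of the Citrusleaf epoch

def expiration_time(write_time_unix, client_ttl=None, namespace_default_ttl=None):
    """Return the expiry in Citrusleaf-epoch seconds, or None if the record has no TTL.

    A TTL supplied by the client takes precedence over the namespace default.
    """
    ttl = client_ttl if client_ttl is not None else namespace_default_ttl
    if ttl is None:
        return None
    return (write_time_unix - CITRUSLEAF_EPOCH) + ttl

# A record written exactly at the epoch start, with a one-hour client TTL
# and a one-day namespace default: the client TTL wins.
print(expiration_time(1262304000, client_ttl=3600, namespace_default_ttl=86400))
```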
What is a write-master?
When data is written to the cluster, it is first written to the master node for that record. This is referred to as a “write-master.” Statistics on these writes are available and are different from writes to a node for a replica copy. These replica writes are known as a “write-prole” (see “write-prole” below). Write-masters are used to determine how many unique records have been written to the cluster.
What is the difference between a write-master and a write-prole?
After a record has been written to the master for that record (see “write-master” above), there are subsequent writes to the nodes that have the replica. These are known as “write-proles”. If the replication factor has been set at "1", there are no write-proles.
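The relationship between the two counters can be shown with simple arithmetic. The sketch below assumes every client write succeeds and is replicated to all replicas; `write_counts` is an illustrative helper, not a server statistic.

```python
def write_counts(client_writes, replication_factor):
    """Expected master and replica (prole) writes for a number of client writes."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return {
        "write_master": client_writes,
        "write_prole": client_writes * (replication_factor - 1),
    }

print(write_counts(1000, 2))
print(write_counts(1000, 1))
```

With a replication factor of 2, each client write produces one write-master and one write-prole; with a replication factor of 1, there are no write-proles.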