Configuring namespace data retention
This page describes how to configure Aerospike namespaces to ensure sufficient memory for continuous operation.
Overview
Configure the following to ensure sufficient memory for continuous operation:
- Configure the Namespace Supervisor (NSUP) to expire or evict records.
- Configure a stop-writes mechanism to stop new client writes from filling namespace storage beyond a designated threshold.
Eviction, expiration, void-time and time-to-live
The Namespace Supervisor (NSUP) is the subsystem that removes expired records from the primary index. Earlier versions of a record that remain in storage are subsequently cleaned up by the continuous defragmentation process. Records with a void-time of 0 do not expire or get evicted.
Aerospike records include void-time metadata: the future timestamp at which they automatically expire. As far as applications are concerned, an expired record does not exist, but its metadata still occupies a 64-byte slot in the primary index.
Independent of expiration, if namespace storage exceeds a configurable high-water mark, NSUP deletes the records with non-zero void-time that are nearest to their expiration. This early expiration process is called eviction. Eviction continues until sufficient space has been recovered.
If NSUP is not running, expired records are not removed unless they're manually deleted or replaced.
Applications optionally send a time-to-live (TTL) with writes, which declares the number of seconds from now until the record expires. By default, namespaces reject writes with a TTL, and NSUP does not run, but this behavior is configurable (see below). If client writes are allowed to send a TTL, the record's void-time is set according to client TTL values. In this case, you should also configure NSUP to check for expired records.
Starting with Aerospike Database 7.1, a read operation may extend the TTL of a record when the read occurs within a specified percentage of the record's TTL from its void-time. This least-recently-used (LRU) eviction behavior is controlled by the namespace-level or set-level configuration parameter default-read-touch-ttl-pct. As with write operation TTLs, the client may override the server configuration and provide an explicit read-touch policy, expressed as a percentage. For example, Java client 8.1.0 added Policy.readTouchTtlPercent.
NSUP and TTL Configuration Parameters
The following parameters are configurable per namespace, and focus on NSUP and TTL.
Configuration parameter | Description | Default value | Notes |
---|---|---|---|
nsup-period | The time interval in seconds between successive starts of NSUP scans of the primary index for expired records. If NSUP takes longer to traverse the primary index than the nsup-period, it effectively runs continuously. | 0 | By default, NSUP does not run (nsup-period of 0). Once NSUP is running, it does not stop until it has traversed the entire primary index of the namespace. When nsup-period is 0, to allow writes that have a positive integer TTL, you must set allow-ttl-without-nsup to true (see below). Regardless of the setting of nsup-period, writes with a non-positive TTL (<= 0) are always allowed. |
allow-ttl-without-nsup | A parameter for testing only. Use it to measure the impact of NSUP in a use case where TTL is non-0. Allows records to be written to the namespace with positive integer TTLs, even if NSUP is disabled. | false | See Warning: records that have a TTL when NSUP is not running. |
default-ttl | The default TTL value to use for the namespace, whenever a client writes a record with a TTL of 0. | 0 | If the value of default-ttl is non-0 and nsup-period is 0 (its default), the Aerospike server will not start. A default-ttl of 0 sets a void-time of 0. Aerospike never expires or evicts records that have a void-time of 0. default-ttl cannot be set higher than 10 years (3650D). |
default-read-touch-ttl-pct | For the namespace configuration, 0 means that read operations never touch a record. | 0 | For the namespace configuration, 0 means read operations never extend a record's TTL. Values 1-100 specify a percentage of the most recent record expiration time, so that a read within this interval of the record’s end of life will generate a touch. The touch uses the previous record TTL to extend the record’s life. For the set configuration, 0 means use the namespace value. A set-level configuration can explicitly override the default namespace value: -1 means reads never touch a record. Values 1-100 are the same as the namespace configuration. Clients may also send a read-touch TTL percent: 0 instructs the server to use its configuration. Other values override the server configuration. -1 means reads never touch a record, values 1-100 are the same as the server configurations. |
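The interaction between nsup-period, default-ttl and allow-ttl-without-nsup described in the table can be summarized in a small Python sketch. The function names are hypothetical and this is not server code. The startup check mirrors the default-ttl row literally and does not model any interaction with allow-ttl-without-nsup.

```python
def server_starts(nsup_period=0, default_ttl=0):
    """Mirror the default-ttl row: a non-0 default-ttl with
    nsup-period 0 prevents the server from starting."""
    return not (default_ttl != 0 and nsup_period == 0)


def positive_ttl_writes_allowed(nsup_period=0, allow_ttl_without_nsup=False):
    """Writes with a positive TTL need NSUP running, or the testing-only
    allow-ttl-without-nsup override."""
    return nsup_period > 0 or allow_ttl_without_nsup


def write_ttl_accepted(client_ttl, nsup_period=0, allow_ttl_without_nsup=False):
    """Non-positive TTLs (<= 0) are always allowed, whatever nsup-period is."""
    if client_ttl <= 0:
        return True
    return positive_ttl_writes_allowed(nsup_period, allow_ttl_without_nsup)
```

With the defaults (nsup-period 0, default-ttl 0) the server starts, writes with TTL 0, -1 or -2 are accepted, and writes with a positive TTL are rejected.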
The following example shows additional namespace data retention parameters to tune data expiration and eviction, with comments describing usage.
namespace NSNAME {
    nsup-period 600       # Maximum time between starting successive
                          # rounds of expiration or eviction - a value
                          # of 0 disables expiration and eviction.
    nsup-threads 2        # How many threads per round of expiration or eviction
    evict-tenths-pct 5    # Fraction of evictable records to delete
                          # per round of eviction. For example, 5 means
                          # delete 0.5 percent of evictable records.
}
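To make the evict-tenths-pct arithmetic concrete, here is a simplified Python model of one eviction round. It is an illustration only: the function name is hypothetical, and the server's actual selection mechanism may differ from a plain sort. It assumes only what this page states, namely that records with a void-time of 0 are never evicted and that the records nearest to expiration go first.

```python
def select_evictions(void_times, evict_tenths_pct):
    """Return the void-times of the records one eviction round would delete.

    void_times: iterable of record void-times (0 = never expires).
    evict_tenths_pct: tenths of a percent of evictable records to delete.
    """
    # Records with void-time 0 are not evictable.
    evictable = sorted(vt for vt in void_times if vt != 0)
    # 5 tenths of a percent of 1000 records -> 5 records.
    count = len(evictable) * evict_tenths_pct // 1000
    # The soonest-to-expire records are evicted first.
    return evictable[:count]
```

For 1,000 evictable records and evict-tenths-pct 5, one round of this model selects the 5 records closest to their void-time.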
Client TTL values
Write operations
When a namespace is configured to allow writes with a TTL, a client may send a positive TTL value, which determines how many seconds the record has until it expires. The void-time is set to the timestamp of now plus the TTL.
Additionally, there are three special TTL values that can always be used, regardless of namespace configuration:
- A TTL of 0 instructs the server to use the default-ttl of the namespace when setting the void-time.
- A TTL of -1 sets the record's void-time to 0, which means that the record will not expire.
- A TTL of -2 instructs the server not to modify the void-time if the write is an update operation. If the write creates a new record, the default-ttl determines the void-time.
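The rules above can be sketched as a single Python function. This is a model of the documented behavior, not client or server code: resolve_void_time, TtlForbidden and the parameter names are hypothetical, and times are plain integer seconds.

```python
FORBIDDEN = 22  # error code this page gives for rejected TTL writes


class TtlForbidden(Exception):
    """Raised when a positive TTL is sent but the namespace rejects TTLs."""


def resolve_void_time(client_ttl, now, default_ttl=0,
                      existing_void_time=None, positive_ttl_allowed=True):
    """Return the void-time for a client write (0 means never expires).

    existing_void_time is None when the write creates a new record.
    """
    def from_default():
        # A default-ttl of 0 sets a void-time of 0.
        return now + default_ttl if default_ttl > 0 else 0

    if client_ttl > 0:
        if not positive_ttl_allowed:
            raise TtlForbidden(FORBIDDEN)
        return now + client_ttl          # void-time = now + TTL
    if client_ttl == 0:
        return from_default()            # use the namespace default-ttl
    if client_ttl == -1:
        return 0                         # never expire
    if client_ttl == -2:
        # Updates keep the current void-time; creates use default-ttl.
        if existing_void_time is None:
            return from_default()
        return existing_void_time
    raise ValueError(f"unsupported TTL: {client_ttl}")
```

For example, with now = 1000 and a default-ttl of 100 seconds, a TTL of 60 gives a void-time of 1060, a TTL of 0 gives 1100, and a TTL of -1 gives 0.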
A durable delete from the client will create a tombstone, which always has a void-time of 0, regardless of the record's previous void-time.
Writing a record with a TTL value less than the current remaining life will reduce the record's void-time to less than its current void-time. This may have undesirable side effects upon a cold restart. For further details, see Issues with cold-start resurrecting deleted records.
Read operations
Aerospike Database 7.1 introduced an LRU eviction behavior. Client versions that implement this functionality can control whether reads extend a record's void-time, regardless of namespace configuration, through Policy.readTouchTtlPercent:
- A value of 0 instructs the server to use the default-read-touch-ttl-pct of the namespace or set.
- A value of -1 states that this read operation will never modify the record's TTL.
- A value of 1-100 specifies that this read should also touch the record (extending its TTL) if the read occurs within this percentage of the record's TTL from its void-time.
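The following Python sketch models how the client, set and namespace values combine, and when a read triggers a touch. It is an interpretation of the descriptions on this page, not server code: the function names and the last_ttl parameter (the TTL applied by the most recent write) are hypothetical.

```python
def effective_read_touch_pct(client_pct=0, set_pct=0, namespace_pct=0):
    """Resolve the percentage that applies; 0 means reads never touch."""
    if client_pct != 0:            # client override: -1 or 1-100
        return max(client_pct, 0)
    if set_pct != 0:               # set override: -1 or 1-100
        return max(set_pct, 0)
    return namespace_pct           # namespace value: 0 or 1-100


def should_touch(now, void_time, last_ttl, pct):
    """True if a read at `now` falls within pct percent of last_ttl
    before the record's void-time."""
    if pct <= 0 or void_time == 0 or now >= void_time:
        return False
    remaining = void_time - now
    return remaining * 100 <= last_ttl * pct


def void_time_after_read(now, void_time, last_ttl, pct):
    """A touch reuses the previous TTL to extend the record's life."""
    if should_touch(now, void_time, last_ttl, pct):
        return now + last_ttl
    return void_time
```

With a 10-hour TTL (36,000 seconds) and a value of 80, a read one hour after the write leaves the void-time unchanged, while a read two hours after the write falls within the last 8 hours of the record's life and resets the TTL to 10 hours.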
Warning: records that have a TTL when NSUP is not running
The following side-effects occur when NSUP is not running:
- Trying to write a record with a TTL gets rejected with error code 22 (AS_ERR_FORBIDDEN).
- Expired records do not get removed from the indexes and storage.
- An attempt to read an expired record returns error code 2 (AS_ERR_NOT_FOUND).
- An attempt to update an expired record creates a new record with its metadata in the same primary index slot. The previous version of the record is ignored, and the generation count is reset to 1.
- Deleting an expired record removes it from the index. The defragmentation process will reclaim its storage.
Ensuring that NSUP is keeping up
If NSUP cannot keep up with a node's expiration and eviction workload, the node might take a long time to restart, because it first removes expired records before rejoining the cluster. Also, if a large percentage of records is removed at startup, the server has to handle a temporary large increase in its defragmentation load. As of Database 6.3, if an NSUP cycle takes longer than 2 hours and deletes more than 1% of the namespace, a warning line is written to the server log. If you see this warning, we recommend tuning nsup-period and nsup-threads so that NSUP keeps up with expirations.
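As a rough illustration of the warning condition, the Python sketch below applies the two thresholds quoted above to figures you might collect from monitoring. The function name and its inputs are hypothetical, and interpreting "1% of the namespace" as 1% of the record count is an assumption.

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def nsup_falling_behind(cycle_seconds, deleted_records, total_records):
    """True when a cycle took longer than 2 hours AND deleted more than
    1% of the records (assumed meaning of "1% of the namespace")."""
    took_too_long = cycle_seconds > TWO_HOURS
    deleted_too_many = deleted_records * 100 > total_records
    return took_too_long and deleted_too_many
```

For example, a three-hour cycle that deletes 20,000 out of 1,000,000 records meets both conditions, while the same cycle deleting 5,000 records does not.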
Eviction and stop-writes configuration parameters
Database 7.0 and later
Starting with Database 7.0, the configuration parameters below apply to every storage-engine type, because all of them share the same write-block-based storage format.
Configuration parameter | Description | Default value | Notes |
---|---|---|---|
evict-indexes-memory-pct | Eviction threshold defined as the percentage of the indexes-memory-budget, which is the stop-writes threshold for the namespace, based on total memory used for indexes (primary, secondary and set indexes). | 0 | The default value of 0 disables this threshold. |
evict-used-pct | Eviction threshold defined as the ratio of used storage (data_used_bytes) to the total namespace storage capacity (data_total_bytes). | 0 | The default value of 0 disables this threshold. |
evict-mounts-pct | Eviction threshold defined as the ratio of index mount utilization (index_mounts_used_pct or sindex_mounts_used_pct) to the namespace mounts-budget. | 0 | Only applies when the namespace primary or secondary indexes are configured to be stored in flash or persistent memory. |
indexes-memory-budget | The maximum memory budget (in bytes) used by the namespace for indexing (primary, secondary and set indexes). Acts as a stop-writes threshold. | 0 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
stop-writes-sys-memory-pct | Percentage threshold at which client writes are refused, defined as the ratio of total memory usage (across all applications) to the system memory. | 90 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
stop-writes-avail-pct | Stops client writes when the namespace storage engine has its reserve of write-blocks drop under a minimum, defined as the ratio of free write-blocks to the storage engine capacity. | 5 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
stop-writes-used-pct | Stops client writes when the ratio of used storage space to total storage space (in bytes) exceeds the given max percentage. | 70 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
The following configuration file snippet shows an example of namespace data retention with TTLs used and eviction enabled.
namespace NSNAME {
    stop-writes-sys-memory-pct 90    # Stop-writes threshold based on memory usage
                                     # across the host machine
    storage-engine device {
        device /dev/nvme0n1p1
        device /dev/nvme0n1p2
        stop-writes-avail-pct 5      # Stop-writes threshold as a percentage
                                     # of the total device size.
        stop-writes-used-pct 70      # Stop-writes threshold as a percentage
                                     # of the total device size.
        evict-used-pct 60            # Eviction threshold as a percentage
                                     # of the total device size.
    }
    index-type flash {               # Primary index on flash (AKA All Flash)
        mounts-budget 64G
        evict-mounts-pct 80          # Eviction threshold based on the primary index
                                     # mounts budget
    }
}
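To show how the thresholds in this example relate to each other, here is a minimal Python model that evaluates a namespace's state from three utilization figures. It is a sketch under stated assumptions: the class, function and field names are hypothetical, the exact boundary behavior (strictly above or strictly below a threshold) is assumed, and the index-related thresholds are left out.

```python
from dataclasses import dataclass


@dataclass
class StorageThresholds:
    stop_writes_sys_memory_pct: int = 90
    stop_writes_avail_pct: int = 5
    stop_writes_used_pct: int = 70
    evict_used_pct: int = 0  # 0 disables eviction on used storage


def namespace_state(cfg, used_pct, avail_pct, sys_memory_pct):
    """Return which protections would be active for the given utilization."""
    stop_writes = (
        used_pct > cfg.stop_writes_used_pct            # too much used storage
        or avail_pct < cfg.stop_writes_avail_pct       # too few free write-blocks
        or sys_memory_pct > cfg.stop_writes_sys_memory_pct  # host memory pressure
    )
    evicting = cfg.evict_used_pct > 0 and used_pct > cfg.evict_used_pct
    return {"stop_writes": stop_writes, "evicting": evicting}
```

With evict-used-pct 60 and stop-writes-used-pct 70, a namespace at 65% used storage is evicting but still accepts client writes. Keeping the eviction threshold below the stop-writes threshold gives eviction a chance to recover space first.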
Prior to Database 7.0
Configuration parameter | Description | Default value | Notes |
---|---|---|---|
high-water-disk-pct | Percentage threshold at which the eviction process starts, defined as the ratio of namespace disk consumption to its device storage capacity. | 0 | The default value of 0 disables the threshold. |
high-water-memory-pct | Percentage threshold at which the eviction process starts, defined as the ratio of namespace memory consumption to its memory-size. | 0 | The default value of 0 disables the threshold. |
mounts-high-water-pct | Percentage threshold at which the eviction process starts, defined as the ratio of index mount utilization (index_flash_used_pct or index_pmem_used_pct) to the namespace mounts-size-limit. | 0 | Only applies when the namespace primary index is configured to be stored in flash or persistent memory. |
stop-writes-pct | Percentage threshold at which client writes are refused, defined as the ratio of namespace memory consumption to its memory-size. | 90 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
stop-writes-sys-memory-pct | Percentage threshold at which client writes are refused, defined as the ratio of total memory usage (across all applications) to the system memory. | 90 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
min-avail-pct | Stops client writes when any namespace storage device (SSD or PMem) has its reserve of write blocks drop under a minimum, defined as the ratio of free write blocks to the device storage capacity. | 5 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
max-used-pct | Stops client writes when the ratio of used storage space to total storage space (in bytes) exceeds the given max percentage. | 70 | Deletions, replica writes, and migration writes are still allowed when the namespace is in stop-writes mode. |
The following configuration file snippet shows an example of namespace data retention with TTLs used and eviction enabled.
namespace NSNAME {
    stop-writes-sys-memory-pct 90    # Stop-writes threshold based on memory usage
                                     # across the host machine
    memory-size 256G
    stop-writes-pct 90               # Stop-writes threshold based on namespace memory-size
    storage-engine device {
        device /dev/nvme0n1p1
        device /dev/nvme0n1p2
        min-avail-pct 5              # Stop-writes threshold as a percentage
                                     # of the total device size.
        max-used-pct 70              # Stop-writes threshold as a percentage
                                     # of the total device size.
    }
    high-water-disk-pct 60           # Eviction threshold based on namespace device-size
    high-water-memory-pct 70         # Eviction threshold based on namespace memory-size
    index-type flash {               # Primary index on flash (AKA All Flash)
        mounts-size-limit 64G
        mounts-high-water-pct 80     # Eviction threshold based on mounts-size-limit
    }
}
Static set configurations
Disabling eviction on sets
You can protect a set from evictions using the disable-eviction true configuration parameter.
namespace NSNAME {
    set SETNAME {
        disable-eviction true
    }
}
Read further on dynamically disabling set evictions.
Defining a data size cap on a set
You can define a stop-writes-size on a set to limit the amount of storage it can occupy.
namespace <namespace-name> {
    set <set-name> {
        stop-writes-size 500M    # Limit this set's storage to 500MB
    }
}
See how to dynamically configure a set size cap.
Defining an object count limit on a set
A set can have a stop-writes-count to limit the number of records that can be written to it.
namespace <namespace-name> {
    set <set-name> {
        stop-writes-count 5000    # Limit the number of records that can
                                  # be written to this set to 5000.
    }
}
See how to dynamically configure an object-count limit on a set.
Prior to Aerospike Database 5.6, this configuration parameter was called set-stop-writes-count.
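The two set-level caps can be pictured with the Python sketch below. It is a model only: the function name is hypothetical, treating a value of 0 as "no limit" is an assumption, and so is refusing writes once a limit has been reached rather than strictly exceeded.

```python
def set_accepts_writes(object_count, data_bytes,
                       stop_writes_count=0, stop_writes_size=0):
    """Return True if a new client write to the set would be accepted.

    A limit of 0 is treated as "no limit" (assumption).
    """
    if stop_writes_count and object_count >= stop_writes_count:
        return False   # record-count cap reached
    if stop_writes_size and data_bytes >= stop_writes_size:
        return False   # storage-size cap reached
    return True
```

With stop-writes-count 5000, a set holding 4,999 records still accepts a write in this model, and a set holding 5,000 records does not.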
Specifying a set-level default TTL
If you specify the default-ttl configuration option at the set level, it overrides any default-ttl option specified at the namespace level.
set test-set {
    default-ttl 60D
}
See how to dynamically configure a default TTL for a set.
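The override rule can be sketched in Python as follows. The helper names are hypothetical, None stands for "not specified at the set level", and the S, M, H and D unit suffixes are assumed to mean seconds, minutes, hours and days.

```python
UNIT_SECONDS = {"S": 1, "M": 60, "H": 3600, "D": 86400}


def parse_duration(text):
    """Convert a value such as '60D' or '600' to seconds."""
    text = text.strip().upper()
    if text and text[-1] in UNIT_SECONDS:
        return int(text[:-1]) * UNIT_SECONDS[text[-1]]
    return int(text)  # a bare number is taken as seconds


def effective_default_ttl(namespace_ttl, set_ttl=None):
    """A set-level default-ttl, when specified, overrides the namespace one."""
    chosen = namespace_ttl if set_ttl is None else set_ttl
    return parse_duration(chosen)
```

For a namespace default-ttl of 30D and the set-level 60D shown above, records written to test-set with a TTL of 0 get 5,184,000 seconds (60 days), while records in other sets get 2,592,000 seconds (30 days).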
Specifying a set-level LRU eviction behavior
If you specify the default-read-touch-ttl-pct configuration option at the set level, it overrides any default-read-touch-ttl-pct option specified at the namespace level.
set test-set {
    default-read-touch-ttl-pct 80    # A percentage (1-100), or -1 so that
                                     # reads never touch records in this set.
}
How to list non-expirable records
Use the asadm command to determine the number of non-expirable records:
show stat like non_expirable_objects
To find all those records, create a backup and grep for the pattern ^+ t 0 in the backup files. Refer to asbackup command-line options and Backup File Format for more information.
Alternatively, write a user-defined function (UDF) to scan records based on the record.ttl field. Note that this could turn into an intensive operation that may affect a production system’s performance. How to Modify TTL using UDF provides some examples.
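If grep is not convenient, the same pattern match can be done with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the sample lines are made up to exercise the match rather than taken from a real backup file.

```python
def count_non_expirable(lines):
    """Count lines that start with '+ t 0', the pattern quoted above."""
    return sum(1 for line in lines if line.startswith("+ t 0"))


# Made-up sample lines; a real backup file contains many other line types.
sample = [
    "+ t 0",
    "+ t 443557199",
    "+ t 0",
    "some other line",
]
non_expirable = count_non_expirable(sample)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `count_non_expirable(open(path))`, where path is the location of a backup file.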
Where to Next?
- Configure Storage Engine, which determines whether and where records are persisted.
- Configure Data Durability Policy, which determines how many replica copies of a record to keep in the cluster.
- Or return to Configure Page.