Defragmentation
Aerospike can absorb large amounts of data in a short time. The following examples describe options for configuring Aerospike to manage storage, without requiring continual intervention by administrators.
Defragmentation
Aerospike runs a continuous background defragmentation process to maximize the amount of available storage. When write-block (wblock) usage drops below the defrag-lwm-pct limit, storage space occupied by stale data is reclaimed.
Write-blocks that are still in the post-write-queue (before Database 7.1) or post-write-cache (the name in Database 7.1 and later) are not candidates for defragmentation, even if the percentage of live records in those write-blocks drops below defrag-lwm-pct. Therefore, keep the post-write-cache (or post-write-queue) small compared to the overall device size, because the space allocated to it is not defragmented.
Aerospike requires free storage space in order to efficiently defragment the storage device while also performing a high volume of commands at low latency.
When defragmentation cannot keep up with storage requirements, you may have to increase the defragmentation rate.
You can use asadm to check storage statistics.
- In Database 7.0 the metrics data_avail_pct and data_used_pct are common to all storage engines.
- Prior to Database 7.0 metrics included device_available_pct, pmem_available_pct, device_free_pct, and pmem_free_pct.
The following command shows the device_available_pct for the test namespace in Aerospike 6.x:
asadm --enable -e "show statistics like device_available_pct for test"
The Aerospike defragmentation mechanism
Aerospike writes data to namespace storage-engine in blocks.
- In Database 7.0 and earlier, the size is configured in the write-block-size parameter. Each wblock is filled with incoming write transactions and then flushed to a persistent storage device.
- In default conditions, blocks are sent for defragmentation when they are 50% occupied or less. As Aerospike reaches the end of these blocks, they are immediately sent for defragmentation unless a post-write queue is configured.
As records are updated or deleted, the share of each wblock occupied by live records decreases. When a block's usage level falls below the value set by the defrag-lwm-pct parameter, the block becomes eligible for defragmentation and is queued in storage-engine.device[ix].defrag_q. The default value of defrag-lwm-pct is 50%.
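The eligibility rule above can be illustrated with a minimal sketch. This is not Aerospike's implementation: the function names, the wblock size, and the per-block live-byte values are all hypothetical, and the comparison is taken to be strictly "below" the threshold, following the wording of the paragraph above.

```python
from collections import deque

WBLOCK_SIZE = 1024 * 1024  # hypothetical wblock size in bytes, for illustration only


def is_defrag_eligible(live_bytes, wblock_size=WBLOCK_SIZE, lwm_pct=50):
    """Return True when live data occupies less than lwm_pct of the wblock."""
    return live_bytes * 100 < wblock_size * lwm_pct


def build_defrag_queue(wblocks, lwm_pct=50):
    """Queue the ids of wblocks whose usage is below the threshold."""
    return deque(
        wblock_id
        for wblock_id, live_bytes in wblocks.items()
        if is_defrag_eligible(live_bytes, lwm_pct=lwm_pct)
    )


# Live bytes per wblock id (made-up values).
wblocks = {0: 900_000, 1: 600_000, 2: 500_000, 3: 100_000}
print(list(build_defrag_queue(wblocks)))              # threshold 50%
print(list(build_defrag_queue(wblocks, lwm_pct=60)))  # threshold 60%
```

Raising the threshold from 50 to 60 makes wblock 1 eligible as well, which is the "more blocks are scheduled to be reclaimed" effect of a higher defrag-lwm-pct.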
The following four configuration parameters can be tuned for the defragmentation sub-system. You can set them dynamically, or in the aerospike.conf server configuration file for a persistent configuration:
- defrag-lwm-pct (default: 50%). A higher percentage means more blocks are scheduled to be reclaimed, and denser data on the device. The default value provides a good balance between space usage and write amplification.
  - For a given use case it may be desirable to increase defrag-lwm-pct and gain more usable space on the storage devices. In such instances, for example when the workload is read-heavy, write amplification may be less of a factor. This should be tested, particularly to observe the effect on defragmentation load during commands which generate a lot of deletions, such as truncation or partitions dropping during migration.
  - In Database 7.0, for an in-memory namespace without storage-backed persistence you can similarly tune defrag-lwm-pct higher, but here the trade-off is between space usage and CPU consumption. This should be adjusted carefully and observed.
- defrag-sleep: The default sleep time is 1000 microseconds after each wblock is defragmented.
- defrag-startup-minimum: Defaults to 10%. If a minimum of 10% of data storage is not writable, the server will not join the cluster or open a service port.
- defrag-queue-min: The default is 0. Use a value greater than zero to define how many wblocks must be in the defrag queue before defragmentation starts.
The server log captures the defragmentation profile:
NAMESPACE-NAME /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
The details for each parameter are described in the log reference manual. The following metrics capture device statistics:
storage-engine.device[ix].used_bytes
storage-engine.device[ix].free_wblocks
storage-engine.device[ix].write_q
storage-engine.device[ix].writes
storage-engine.device[ix].defrag_q
storage-engine.device[ix].defrag_reads
storage-engine.device[ix].defrag_writes
storage-engine.device[ix].shadow_write_q
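To track these values over time, the device line can be parsed into named fields. The following is a sketch only: the function name and regular expression are illustrative, it assumes the line has the same layout as the example above, and it assumes each parenthesized pair is a cumulative total followed by a per-second rate (check the log reference manual for the authoritative field meanings).

```python
import re

# Example line copied from the log excerpt above.
LINE = (
    "NAMESPACE-NAME /dev/sda: used-bytes 296160983424 free-wblocks 885103 "
    "write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) "
    "defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)"
)

# A field name followed by either "(total,rate)" or a plain integer.
FIELD = re.compile(r"([a-z][a-z-]*) (?:\((\d+),([\d.]+)\)|(\d+))")


def parse_device_line(line):
    """Return a dict of field name -> int, or (total, rate) for paired fields."""
    # Everything after the last ": " is treated as the statistics portion.
    stats = line.rpartition(": ")[2]
    fields = {}
    for name, total, rate, value in FIELD.findall(stats):
        if total:
            fields[name] = (int(total), float(rate))
        else:
            fields[name] = int(value)
    return fields


fields = parse_device_line(LINE)
write_rate = fields["write"][1]
defrag_write_rate = fields["defrag-write"][1]
# The write rate includes defrag writes, so the difference approximates
# the rate of writes that are not caused by defragmentation.
print(fields["free-wblocks"], fields["defrag-q"])
print(round(write_rate - defrag_write_rate, 1))
```

Sampling free-wblocks and defrag-q periodically with a helper like this makes it easier to see whether free wblocks are trending down or the defrag queue is growing.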
In the example log line, the writes per second are greater than the defragmentation writes per second. The writes-per-second figure includes the defrag writes. Initially, this may not pose a problem, but over time the node may run low on available wblocks. Also monitor the defrag-q, which should not be constantly increasing. If you determine the node is falling behind and the logs show an empty defragmentation queue, consider raising defrag-lwm-pct slightly. Be aware that raising defrag-lwm-pct increases write amplification non-linearly.
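The non-linear effect can be seen with a simplified steady-state model. It assumes every wblock is defragmented exactly when its live data reaches the threshold, so freeing one wblock requires rewriting that fraction of a wblock. Under that assumption, write amplification is roughly 100 / (100 - defrag-lwm-pct). This is an approximation for illustration, not a measurement, and the function name is hypothetical.

```python
def estimated_write_amplification(lwm_pct):
    """Approximate write amplification under the simplified model above."""
    if not 0 <= lwm_pct < 100:
        raise ValueError("lwm_pct must be in the range [0, 100)")
    return 100.0 / (100.0 - lwm_pct)


for pct in (50, 60, 70, 80, 90):
    print(pct, round(estimated_write_amplification(pct), 2))
```

In this model the default of 50 gives a factor of about 2, while each further step of 10 adds progressively more, which is why small increases are recommended.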
Search for write and defrag-write in your server logs to see more useful information:
tail -f /var/log/aerospike/aerospike.log | grep -i -e write -e defrag-write
Increasing the defragmentation rate
You may need to temporarily decrease the defrag-sleep and increase the defrag-lwm-pct parameters.
Use the asadm command-line interface to change defrag-sleep:
Admin> enable
Admin+> manage config namespace TEST storage-engine param defrag-sleep to 500 with 10.0.0.1:3000
Expected output:
~Set Namespace Param defrag-sleep to 500~
         Node|Response
10.0.0.1:3000|ok
Number of rows: 1
Change defrag-lwm-pct:
Admin+> manage config namespace TEST storage-engine param defrag-lwm-pct to 60 with 10.0.0.1:3000
Expected output:
~Set Namespace Param defrag-lwm-pct to 60~
         Node|Response
10.0.0.1:3000|ok
Number of rows: 1
The new values will not persist after a server restart. Add your desired values to aerospike.conf, in the namespace storage-engine section, to make them persistent:
defrag-sleep 500
defrag-lwm-pct 60
Nodes will not start if there is not enough storage
If the database does not have enough contiguous storage to start, and does not have enough space to defragment to get the space it needs, it will not start.
For in-memory databases that use a persistence file, specify the size of the persistence file (in contrast to an SSD, where the entire device is used). The persistence file can also run out of space, and the same rules apply as for SSDs.