Defragmentation

Aerospike is designed to append writes sequentially into empty, fixed-size write blocks. Over time, updates and deletes cause some blocks to contain stale records. Defragmentation compacts those blocks by copying the live records out into a fresh block, so the old block can be reused for fast, sequential appends. Depending on the insert and update patterns of your application, you can adjust performance with the defragmentation settings.

Aerospike writes data to a namespace's storage-engine in write-blocks, also known as wblocks.

In Database 7.0.0 and earlier, the wblock size is configured with the write-block-size parameter. Each wblock is filled with incoming write transactions and then flushed to a persistent storage device.
By default, blocks are sent for defragmentation when they are 50% occupied or less. Once Aerospike finishes filling a block, the block can be sent for defragmentation immediately unless a post-write queue is configured.

As records are updated or deleted, the share of each wblock occupied by live records decreases. When a block's usage level falls below the value set by the defrag-lwm-pct parameter, the block becomes eligible for defragmentation and is queued in the storage-engine.device[ix].defrag_q. The default value of defrag-lwm-pct is 50%.
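The eligibility rule above can be sketched in a few lines of Python. This is a minimal illustration, not Aerospike's implementation: the WBlock class, its field names, and the is_defrag_eligible helper are hypothetical, and the comparison is assumed to be strictly "below" the threshold.

```python
from dataclasses import dataclass


@dataclass
class WBlock:
    """Hypothetical model of a write block: its size and the bytes still live."""
    size_bytes: int
    live_bytes: int

    @property
    def usage_pct(self) -> float:
        # Percentage of the block still occupied by live records.
        return 100.0 * self.live_bytes / self.size_bytes


def is_defrag_eligible(block: WBlock, defrag_lwm_pct: float = 50.0) -> bool:
    """Return True when usage has fallen below the low water mark (assumed strict)."""
    return block.usage_pct < defrag_lwm_pct


# Three 1 MiB blocks with different amounts of live data.
blocks = [
    WBlock(1_048_576, 900_000),  # about 86% live: stays in place
    WBlock(1_048_576, 400_000),  # about 38% live: eligible
    WBlock(1_048_576, 0),        # fully stale: eligible
]
defrag_q = [b for b in blocks if is_defrag_eligible(b)]
print(len(defrag_q))  # 2
```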

Simulation

In this section, you can launch an interactive demo of the defragmentation process. This visualization shows ~800 Insert writes to storage blocks, then switches to Update mode.

Each block is a grid of tiny cells, each representing one record. New records are green and updated records are red.

After the switch from Insert mode to Update mode, some green cells turn red, representing updates to existing records. This lowers a block’s live ratio, or the ratio of new records to updated records. When a block’s live ratio drops below the low water mark (LWM), it gets queued for defrag and marked with a blue border. The defrag process drains live green records from the blue queue block into the yellow destination block.

When a source block is fully drained, it becomes empty and eventually re-enters the free queue. If a destination fills during migration, a new destination is reserved immediately and the migration continues.
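The drain step described above can be sketched as follows. This is a toy model under stated assumptions: blocks are plain Python lists holding only live records, the capacity of four records per block is arbitrary, and the defragment function and queue names are hypothetical rather than server internals.

```python
from collections import deque

BLOCK_CAPACITY = 4  # records per block; deliberately tiny for illustration


def defragment(defrag_q, free_q, capacity=BLOCK_CAPACITY):
    """Drain live records from queued source blocks into destination blocks.

    Each source block is a list of its live records. Drained sources are
    emptied and appended to free_q. Returns the destination blocks written.
    """
    destinations = []
    dest = []
    while defrag_q:
        source = defrag_q.popleft()
        for record in source:
            if len(dest) == capacity:
                # Destination is full: keep it and reserve a new one.
                destinations.append(dest)
                dest = []
            dest.append(record)
        source.clear()
        free_q.append(source)  # the emptied source can be reused
    if dest:
        destinations.append(dest)
    return destinations


defrag_q = deque([["a"], ["b", "c"], ["d", "e"]])
free_q = deque()
result = defragment(defrag_q, free_q)
print(result)       # [['a', 'b', 'c', 'd'], ['e']]
print(len(free_q))  # 3
```

In this run, three sparse source blocks are consolidated into two destination blocks, so one block's worth of space is recovered.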

The simulation has two modes, Insert and Update, and controls for the write rate (Writes/s, initially 200) and defrag-lwm-pct (initially 50%). It reports counters for App writes, Defrag writes, and Total records.

Legend: New, Updated, Current App SWB, Eligible for defrag, Current Defrag SWB, Pristine*

*Pristine refers to completely unwritten blocks on the disk. When writing records, Aerospike reuses blocks and only writes to a pristine block when necessary.

Tune defragmentation parameters

You can set the following four configuration parameters dynamically, or in the aerospike.conf configuration file for a persistent configuration. Always test changes before deploying to production.

defrag-lwm-pct (default: 50%). A higher percentage means more blocks are scheduled for reclamation and data is packed more densely on the device. The default value provides a good balance between space usage and write amplification.
- You can increase defrag-lwm-pct to gain more usable space on the storage devices, depending on your deployment details. When the workload is read-heavy, write amplification may be less of a factor.
- In Database 7.0.0, for an in-memory namespace without storage-backed persistence, you can tune defrag-lwm-pct higher, but here the trade-off is between space usage and CPU consumption.

defrag-sleep (default: 1000 microseconds). The time the defragmentation process sleeps after each wblock is defragmented.

defrag-startup-minimum (default: 10%). If less than this percentage of data storage is writable, the server will not join the cluster or open a service port.

defrag-queue-min (default: 0). Use a value greater than zero to define how many wblocks must be in the defrag queue before defragmentation starts.
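To make the trade-off between space usage and write amplification concrete, here is a simplified steady-state model. It is an assumption for illustration, not a formula taken from this page: if every block is defragmented exactly when its live fraction equals defrag-lwm-pct, then freeing one block's worth of space requires rewriting live/(1 - live) blocks, so total device writes per application write come to 1/(1 - live). Real workloads usually behave better than this worst case. The write_amplification function name is hypothetical.

```python
def write_amplification(defrag_lwm_pct: float) -> float:
    """Estimate write amplification under a simplified worst-case model.

    Assumes every block is defragmented exactly at the low water mark.
    """
    if not 0 <= defrag_lwm_pct < 100:
        raise ValueError("defrag_lwm_pct must be in the range [0, 100)")
    live = defrag_lwm_pct / 100.0
    return 1.0 / (1.0 - live)


for pct in (50, 60, 75):
    print(pct, round(write_amplification(pct), 2))
# 50 2.0
# 60 2.5
# 75 4.0
```

Under this model, moving from 50% to 60% adds half a device write per application write, while moving from 60% to 75% adds one and a half more, which is why small increases are advisable.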

When write-block usage drops below the defrag-lwm-pct limit, storage space occupied by stale data is reclaimed.

Write-blocks that are still in the post-write-queue (before Database 7.1.0) or post-write-cache (the name in Database 7.1.0 and later) are not candidates for defragmentation, even if the percent of live records in those write-blocks drops below the defrag-lwm-pct. Therefore, the post-write-cache (or post-write-queue) should be kept small compared to the overall device size as the size allocated to the post-write-cache will not be defragmented.

If avail-pct (tracked as data_avail_pct in Database 7.0.0 and later, or device_available_pct / pmem_available_pct in earlier versions) is below the defrag-startup-minimum threshold and defragmentation cannot raise it, the database will not start.

For in-memory databases that use persistence files, you specify the size of the persistence file (in contrast to an SSD, where the entire device is used). A persistence file can also run out of space, and the same rules apply as for SSDs.

Increase the defragmentation rate

Aerospike requires free storage space in order to efficiently defragment the storage device while also performing a high volume of commands at low latency.

When defragmentation cannot keep up with storage requirements, you may have to increase the defragmentation rate.

You can use asadm to check storage statistics.

In Database 7.0.0 and later, the metrics data_avail_pct and data_used_pct are common to all storage engines.
Prior to Database 7.0.0, the metrics included device_available_pct, pmem_available_pct, device_free_pct, and pmem_free_pct.

The following command shows the device_available_pct for the test namespace in Aerospike 6.x.y.z:

asadm --enable -e "show statistics like device_available_pct for test"

You may need to temporarily decrease the defrag-sleep and increase the defrag-lwm-pct parameters.
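As a rough guide to how much headroom a lower defrag-sleep gives, the sketch below computes an upper bound on the defragmentation rate. It assumes the sleep is taken once per wblock by a single defragmentation thread and ignores the time spent reading and rewriting records, so real rates will be lower. The function name is hypothetical.

```python
def max_defrag_wblocks_per_sec(defrag_sleep_us: int) -> float:
    """Upper bound on wblocks defragmented per second for one defrag thread.

    Only the sleep between wblocks is counted; I/O and CPU time are ignored.
    """
    if defrag_sleep_us < 0:
        raise ValueError("defrag_sleep_us must not be negative")
    if defrag_sleep_us == 0:
        return float("inf")  # no sleep: bounded only by I/O and CPU
    return 1_000_000 / defrag_sleep_us


print(max_defrag_wblocks_per_sec(1000))  # 1000.0
print(max_defrag_wblocks_per_sec(500))   # 2000.0
```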

Use the asadm command-line interface to change defrag-sleep:

Admin> enable
Admin+> manage config namespace TEST storage-engine param defrag-sleep to 500 with 10.0.0.1:3000

Expected output:

~Set Namespace Param defrag-sleep to 500~
         Node|Response
10.0.0.1:3000|ok
Number of rows: 1

Change defrag-lwm-pct:

Admin+> manage config namespace TEST storage-engine param defrag-lwm-pct to 60 with 10.0.0.1:3000

Expected output:

~Set Namespace Param defrag-lwm-pct to 60~
         Node|Response
10.0.0.1:3000|ok
Number of rows: 1

The new values will not persist after a server restart. Add your desired values to aerospike.conf, in the namespace storage-engine section, to make them persistent:

defrag-sleep 500
defrag-lwm-pct 60

Defragmentation logs

The server log captures the defragmentation profile:

NAMESPACE-NAME /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)

The details for each field are described in the log reference manual.

In the example log line, the write rate (43.3 per second) is greater than the defrag-write rate (10.2 per second). The write rate includes the defrag writes. Initially, this may not pose a problem, but over time the node may run low on available wblocks. Also monitor defrag-q, which should not increase constantly. If you determine the node is falling behind and the logs show an empty defragmentation queue, consider raising defrag-lwm-pct slightly. Be aware that raising defrag-lwm-pct increases write amplification non-linearly.
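The comparison above can be automated with a small parser. The following sketch assumes that each parenthesized pair in the log line is (cumulative total, rate per second) and that the remaining fields are plain integers; the parse_device_line helper and the FIELD pattern are illustrative, not part of any Aerospike tool.

```python
import re

LOG_LINE = (
    "NAMESPACE-NAME /dev/sda: used-bytes 296160983424 free-wblocks 885103 "
    "write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) "
    "defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)"
)

# A lowercase field name followed by either "(total,rate)" or a plain integer.
FIELD = re.compile(r"([a-z][a-z-]*) (?:\((\d+),([\d.]+)\)|(\d+))")


def parse_device_line(line):
    """Return a dict mapping field name to int, or to a (total, rate) tuple."""
    stats = {}
    for key, total, rate, scalar in FIELD.findall(line):
        if scalar:
            stats[key] = int(scalar)
        else:
            stats[key] = (int(total), float(rate))
    return stats


stats = parse_device_line(LOG_LINE)
# The write rate includes defrag writes, so subtract them to get application writes.
app_write_rate = stats["write"][1] - stats["defrag-write"][1]
print(stats["defrag-q"], stats["free-wblocks"], round(app_write_rate, 1))
# 0 885103 33.1
```

Collecting these values over time shows whether free-wblocks is trending down or defrag-q is growing, which are the warning signs described above.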

Search for write and defrag-write in your server logs to see more useful information:

tail -f /var/log/aerospike/aerospike.log | grep -i -e write -e defrag-write