Metrics Reference
See the Metrics command examples for information on usage.
Namespace
aerospike_namespace_appeals_records_exonerated  Number of records that were marked replicated as a result of an appeal. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.
counter  integer  aerospike_namespace_appeals_rx_active  Number of partition appeals currently being received. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.
gauge  integer  aerospike_namespace_appeals_tx_active  Number of partition appeals currently being sent. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.
gauge  integer  aerospike_namespace_appeals_tx_remaining  Number of partition appeals not yet sent. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster. Appeals occur after a node has been cold-started. The replication state of each record is lost on cold-start and all records must assume an unreplicated state. An appeal resolves replication state from the partition’s acting master. These are important for performance; an unreplicated record will need to re-replicate to be read which adds latency. During a rolling cold-restart, an operator may want to wait for the appeal phase to complete after each restart to minimize the performance impact of the procedure.
gauge  integer  aerospike_namespace_auto_revived_partitions  Number of partitions that the auto-revive feature revived at startup.
gauge  integer  aerospike_namespace_available_bin_names  Remaining number of unique bins that the user can create for this namespace. 
  The formula for the associated metrics is as follows:
 bin_names_quota - bin_names = available_bin_names
gauge  integer  aerospike_namespace_batch_sub_delete_error  Number of batch-index delete sub-batches that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.
counter  integer  aerospike_namespace_batch_sub_delete_filtered_out  Number of batch-index delete sub-batches that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_batch_sub_delete_not_found  Number of batch-index delete sub-batches that resulted in not found.
counter  integer  aerospike_namespace_batch_sub_delete_success  Number of records successfully deleted by batch-index sub-batches.
counter  integer  aerospike_namespace_batch_sub_delete_timeout  Number of batch-index delete sub-batches that timed out.
counter  integer  aerospike_namespace_batch_sub_lang_delete_success  Number of successful batch-index UDF delete sub-batches.
counter  integer  aerospike_namespace_batch_sub_lang_error  Number of language (Lua) batch-index errors for UDF sub-transactions.
counter  integer  aerospike_namespace_batch_sub_lang_read_success  Number of successful batch-index UDF read sub-batches.
counter  integer  aerospike_namespace_batch_sub_lang_write_success  Number of successful batch-index UDF write sub-batches.
counter  integer  aerospike_namespace_batch_sub_proxy_complete  Number of proxied batch-index sub-batches that completed.
counter  integer  aerospike_namespace_batch_sub_proxy_error  Number of proxied batch-index sub transactions that failed with an error.
counter  integer  aerospike_namespace_batch_sub_proxy_timeout  Number of proxied batch-index sub-batches that timed out.
counter  integer  aerospike_namespace_batch_sub_read_error  Number of batch-index read subtransactions that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.
counter  integer  aerospike_namespace_batch_sub_read_filtered_out  Number of batch-index read sub-batches that were skipped because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_batch_sub_read_not_found  Number of batch-index read subtransactions that resulted in not found.
counter  integer  aerospike_namespace_batch_sub_read_success  Number of records successfully read by batch-index sub-batches.
counter  integer  aerospike_namespace_batch_sub_read_timeout  Number of batch-index read sub-batches that timed out.
counter  integer  aerospike_namespace_batch_sub_tsvc_error  Number of batch-index sub-batches that failed with an error in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatches. In strong-consistency enabled namespaces, this includes transactions against unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes, and they are counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_batch_sub_tsvc_timeout  Number of batch-index sub-batches that timed out in the transaction service, before attempting to handle the transaction.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes, and they are counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_batch_sub_udf_complete  Number of completed batch-index UDF sub-batches for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.
counter  integer  aerospike_namespace_batch_sub_udf_error  Number of failed batch-index UDF sub-batches for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.
counter  integer  aerospike_namespace_batch_sub_udf_filtered_out  Number of batch-index UDF sub-batches that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_batch_sub_udf_timeout  Number of batch-index UDF sub-batches that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.
counter  integer  aerospike_namespace_batch_sub_write_error  Number of batch-index write sub-batches that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.
counter  integer  aerospike_namespace_batch_sub_write_filtered_out  Number of batch-index write sub-batches that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_batch_sub_write_success  Number of records successfully written by batch-index sub-batches.
counter  integer  aerospike_namespace_batch_sub_write_timeout  Number of batch-index write sub-batches that timed out.
counter  integer  aerospike_namespace_bin_names  Number of bin names used for the namespace. 
 The formula for the associated metrics is as follows:
 bin_names_quota - bin_names = available_bin_names
gauge  integer  aerospike_namespace_bin_names_quota  Quota of bin names for the namespace. Starting with Database 7.0, there is no limit on bin names per namespace. In Database 5.0 and 6.0, the limit was 65,535.
The formula for the associated metrics is as follows:
bin_names_quota - bin_names = available_bin_names
If you have met the quota, see KB article How to clear up bin names when they exceed the limits.
gauge  integer  aerospike_namespace_cache_read_pct  Percentage of read commands that are hitting the post-write-cache or the blocks in the max-write-cache and will save an IO to the underlying storage device.
See the post-write-cache and read-page-cache documentation for ways to improve latency for read-intensive workloads by leveraging these two caching options.
Reads from update commands as well as migrations, scans, XDR reads and anything that tries to load a record off the device are accounted for in the cache_read_pct figures.
gauge  integer  aerospike_namespace_client_delete_error  Number of client delete commands that failed with an error.
counter  integer  Compare client_delete_error to client_delete_success.
If ratio is higher than acceptable, alert operations to investigate.
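The comparison above can be sketched in a few lines. This is a minimal illustration, not an official alerting rule: it assumes two samples of the namespace counters taken some interval apart, and it defines the ratio as errors divided by errors plus successes over that interval. The function name `error_ratio`, the dict layout, and that definition of the ratio are assumptions made for the example.

```python
from typing import Mapping, Optional


def error_ratio(prev: Mapping[str, int], curr: Mapping[str, int],
                op: str = "delete") -> Optional[float]:
    """Return errors / (errors + successes) between two counter samples.

    Counters are cumulative, so the ratio is computed on the difference
    between the two samples rather than on the raw values.
    """
    err_key = f"client_{op}_error"
    ok_key = f"client_{op}_success"
    errors = curr[err_key] - prev[err_key]
    successes = curr[ok_key] - prev[ok_key]
    if errors < 0 or successes < 0:
        # A counter went backwards, for example after a node restart.
        return None
    total = errors + successes
    return errors / total if total else 0.0


previous = {"client_delete_error": 10, "client_delete_success": 990}
current = {"client_delete_error": 15, "client_delete_success": 1085}
ratio = error_ratio(previous, current)  # 5 errors out of 100 commands
```

What counts as "higher than acceptable" depends on the workload, so the threshold to compare `ratio` against is left to the operator. The same helper applies to the read and write counters by passing `op="read"` or `op="write"`.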
aerospike_namespace_client_delete_filtered_out  Number of client delete commands that did not happen because the record was filtered out with Filter Expression.
counter  integer  aerospike_namespace_client_delete_not_found  Number of client delete commands that resulted in a not found.
counter  integer  aerospike_namespace_client_delete_success  Number of successful client delete commands.
counter  integer  aerospike_namespace_client_delete_timeout  Number of client delete commands that timed out.
counter  integer  aerospike_namespace_client_lang_delete_success  Number of UDF commands that successfully deleted a record.
counter  integer  aerospike_namespace_client_lang_error  Number of UDF commands that failed with a language (Lua) error during UDF execution.
counter  integer  aerospike_namespace_client_lang_read_success  Number of successful record reads caused by a UDF command.
counter  integer  aerospike_namespace_client_lang_write_success  Number of successful record writes caused by a UDF command.
counter  integer  aerospike_namespace_client_proxy_complete  Number of client commands proxied to another node.
counter  integer  aerospike_namespace_client_proxy_error  Number of client commands that failed to proxy to another node.
counter  integer  aerospike_namespace_client_proxy_timeout  Number of client commands that timed out while being proxied to another node.
counter  integer  aerospike_namespace_client_read_error  Number of read commands that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.
counter  integer  Compare client_read_error to client_read_success.
If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_client_read_filtered_out  Number of read commands that did not happen because they were filtered out.
counter  integer  aerospike_namespace_client_read_not_found  Number of client read commands that resulted in not found.
counter  integer  aerospike_namespace_client_read_success  Number of successful client read commands. Does not include records read by batch reads or scans. Batch reads have the separate batch_sub_read_success metric. Scans have separate metrics depending on the scan type: scan_basic_complete, scan_aggr_complete, scan_ops_bg_complete, and scan_udf_bg_complete.
counter  integer  aerospike_namespace_client_read_timeout  Number of client read commands that timed out.
counter  integer  aerospike_namespace_client_tsvc_error  Number of client commands that failed in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this includes commands against unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_client_tsvc_timeout  Number of client commands that timed out while in the transaction service, before attempting to handle the command. At this stage the command has not yet been identified as a read or a write, but the namespace is known. A likely cause is that there are not enough service threads to keep pace with the workload. Other common situations in this category are commands that have to be retried after waiting in the rw-hash (for example, hot keys) and use cases where the timeout set by the client is too aggressive.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_client_udf_complete  Number of completed UDF commands initiated by the client.
counter  integer  aerospike_namespace_client_udf_error  Number of failed UDF commands initiated by the client. Does not include timeouts. Error is also returned to the client.
counter  integer  Compare client_udf_error to client_udf_complete.
If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_client_udf_filtered_out  Number of client UDF commands that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_client_udf_timeout  Number of UDF commands initiated by the client that timed out. The timeout error is returned to the client.
counter  integer  aerospike_namespace_client_write_error  Number of client write commands that failed with an error. Includes common errors like fail_generation, fail_key_busy, fail_record_too_big, fail_xdr_forbidden and some less common errors. Includes xdr_client_write_error. See Why is my client_write_error metrics incrementing? for details on the type of errors that increment this statistic.
counter  integer  Compare client_write_error to client_write_success.
If ratio is higher than acceptable, alert operations to investigate.
For more details, see the knowledge base article Why is my client_write_error metrics incrementing?.
aerospike_namespace_client_write_filtered_out  Number of client write commands that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_client_write_success  Number of successful client write commands. Includes xdr_client_write_success.
counter  integer  aerospike_namespace_client_write_timeout  Number of client write commands that timed out on the server. On a stable cluster with no migrations in progress, this metric indicates the number of replica write timeouts. A timeout error is returned to the client. In strong-consistency enabled namespaces, the record is marked as unreplicated and will re-replicate. Includes xdr_client_write_timeout.
counter  integer  The following conditions can cause this metric to increment:
- Every single write replica failure (master failing to replicate) increments the client_write_timeout metric.
- If duplicate resolution is enabled for writes (default), during migrations, the client_write_timeout metric also increments if there is a timeout during duplicate resolution. This can occur before the write is applied on the master side.
- See transaction-max-ms for details on when the server checks for timeout. Transactions can also time out earlier in the transaction flow, in which case the client_tsvc_timeout statistic increments.
aerospike_namespace_clock_skew_stop_writes  Namespace will stop accepting client writes when true. 
 For strong-consistency enabled namespaces, will be true if the clock skew is outside of tolerance, typically 20 seconds.
For Available mode (AP) namespaces running Database 4.5.1 or later, and where NSUP is enabled (nsup-period not zero), will be true if the cluster clock skew exceeds 40 seconds. In such occurrences, NSUP will also not run, disabling record expirations and evictions until the clock skew falls back in the tolerated range.
gauge  boolean  If clock_skew_stop_writes is true, it is a critical ALERT.
Verify that clocks are synchronized across the cluster.
aerospike_namespace_current_time  Current time represented as Aerospike epoch time.
gauge  integer  If cluster_max(current_time) and cluster_min(current_time) differ by more than 10 seconds, critical ALERT.
 Server time skew might indicate that NTP or similar service is not running on this node.
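The cluster-wide comparison above can be sketched as follows. This is a minimal illustration that assumes the current_time value has already been collected from every node into a dict keyed by node name. The collection step, the node names, and the function names are hypothetical. Only the 10-second threshold comes from the text.

```python
from typing import Mapping


def current_time_skew(samples: Mapping[str, int]) -> int:
    """Return cluster_max(current_time) - cluster_min(current_time) in seconds."""
    values = list(samples.values())
    return max(values) - min(values)


def skew_is_critical(samples: Mapping[str, int], threshold_s: int = 10) -> bool:
    """True when the spread across nodes exceeds the alert threshold."""
    return current_time_skew(samples) > threshold_s


# Hypothetical per-node readings, in Aerospike epoch seconds.
readings = {"node-a": 500000000, "node-b": 500000004, "node-c": 500000012}
alert = skew_is_critical(readings)  # spread is 12 seconds
```

Because the nodes are not sampled at exactly the same instant, part of any measured spread is collection delay, so a production check would likely need some tolerance for that.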
aerospike_namespace_data_avail_pct  Measures the minimum contiguous storage-engine device, pmem, or memory storage file space across all such files in a namespace. The namespace is read-only if this value falls below stop-writes-avail-pct. It is important for all configured storage files in a namespace to have the same size, otherwise, data_avail_pct could be low even when a lot of space is available across other files.
gauge  integer  Example: Consider 5 files of 96MiB each for a given namespace, where each file has 24MiB of data spread across 6 write blocks (with an 8MiB write-block size):
- The data_used_pct is 25%.
- The data_avail_pct is 50%.
- If the distribution is not perfectly uniform (which is usual), data_avail_pct represents the file that has the fewest free blocks.
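The arithmetic behind this example can be worked through in a short sketch. It is a simplified model and not how the server computes the metric: it assumes a file is an array of equally sized write blocks, and that the available percentage is the share of blocks holding no data at all. The function name `file_stats` is made up for the illustration.

```python
MIB = 1024 * 1024


def file_stats(file_size: int, used_bytes: int, blocks_in_use: int,
               write_block_size: int) -> tuple:
    """Return (used_pct, avail_pct) for one storage file.

    used_pct follows data_used_bytes * 100 / data_total_bytes.
    avail_pct counts only write blocks that are completely free.
    """
    total_blocks = file_size // write_block_size
    free_blocks = total_blocks - blocks_in_use
    used_pct = used_bytes * 100 // file_size
    avail_pct = free_blocks * 100 // total_blocks
    return used_pct, avail_pct


# One 96MiB file holding 24MiB of data spread across 6 blocks of 8MiB.
used_pct, avail_pct = file_stats(96 * MIB, 24 * MIB, 6, 8 * MIB)
free_pct = 100 - used_pct
print(used_pct, free_pct, avail_pct)  # 25 75 50
```

The gap between the 75% of bytes that are free and the 50% of blocks that are available is fragmentation: six blocks are only partly filled, yet none of them can be counted as free.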
aerospike_namespace_data_compression_ratio  Measures the average compressed size to uncompressed size ratio. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size). device_compression_ratio is not included if the compression configuration parameter is set to none.
gauge  integer  The compression ratio is a moving average calculated based on the most recently written records. Read records do not factor into the ratio. Records that don’t try to compress are not included in the moving average. If the written data changes over time, then the compression ratio changes with it. In case of a sudden change in data, the indicated compression ratio may lag. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.
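To make the ratio easier to read, the following sketch converts it into the size reduction it implies. The function names are hypothetical. The uncompressed-size estimate is only an extrapolation, since the ratio is a moving average over recently written records and older data may compress differently.

```python
def size_reduction_pct(compression_ratio: float) -> float:
    """Convert a compressed/uncompressed ratio into a percent size reduction."""
    return (1.0 - compression_ratio) * 100.0


def estimated_uncompressed_bytes(stored_bytes: int,
                                 compression_ratio: float) -> float:
    """Rough estimate of the uncompressed size, assuming the ratio is uniform."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return stored_bytes / compression_ratio


reduction = size_reduction_pct(0.100)   # about 90 percent smaller
no_change = size_reduction_pct(1.000)   # no compression
```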
aerospike_namespace_data_total_bytes  Regardless of storage-engine, the total allocated storage.
gauge  integer  aerospike_namespace_data_used_bytes  Regardless of storage-engine, the total storage allocated is data_total_bytes, and the amount of data used in that storage is data_used_bytes, which includes both user data and record overhead. For more details, see Calculating data storage.
gauge  integer  aerospike_namespace_data_used_pct  Percentage of used storage capacity for this namespace. Calculated as data_used_bytes * 100 / data_total_bytes. Evictions will be triggered when this percentage crosses the configured evict-used-pct.
gauge  integer  aerospike_namespace_dead_partitions  Number of dead partitions for this namespace when using strong-consistency. This is the number of partitions that are unavailable when all roster nodes are present. Requires the use of the revive command to make them available again. Revived nodes restore availability only when all nodes are trusted.
gauge  integer  If dead_partitions is not zero, critical ALERT. If you are certain that there are no potential data inconsistencies or if data inconsistencies are acceptable, consider issuing revive and recluster commands.
aerospike_namespace_deleted_last_bin  Number of objects deleted because their last bin was deleted.
counter  integer  aerospike_namespace_device_available_pct  Measures the minimum contiguous disk space across all devices in a namespace. The namespace will be read only (stop writes) if this value falls below min-avail-pct. It is important for all configured devices in a namespace to have the same size, otherwise, the device_available_pct could be low even when a lot of space is available across other devices.
gauge  integer  - If device_available_pct drops below 20%, warn your operations group. This condition might indicate that defrag is unable to keep up with the current load.
- If device_available_pct drops below 15%, critical ALERT.
- If device_available_pct drops below 5%, usable disk resources are critically low. This condition might result in stop_writes.
Not to be confused with device_free_pct which represents the amount of free space across all devices in a namespace and does not take account of the fragmentation. Here is an example to represent the difference between device_free_pct and device_available_pct.  Assume 5 devices of 100MiB each for a given namespace, where each device has 20MiB of data that are spread across 5 write-blocks (where each write-block is 8MiB):
- The device_free_pct would be 80%.
- The device_available_pct would be 60%.
- If the distribution is not uniform (it usually is not perfectly uniform), the device_available_pct would represent the device that has the fewest free blocks.
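The three alert thresholds listed for device_available_pct can be expressed as a small classifier. This is a sketch only: the severity labels and the function name are assumptions, and "drops below" is read as a strict less-than comparison.

```python
def classify_device_available_pct(pct: int) -> str:
    """Map a device_available_pct reading to an alert level."""
    if pct < 5:
        return "critically_low"  # stop_writes may follow
    if pct < 15:
        return "critical"
    if pct < 20:
        return "warning"         # defrag may not be keeping up
    return "ok"


levels = [classify_device_available_pct(p) for p in (25, 19, 14, 4)]
# levels == ["ok", "warning", "critical", "critically_low"]
```

The checks run from the most severe threshold down, so each reading maps to exactly one level.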
aerospike_namespace_device_compression_ratio  Measures the average compressed size to uncompressed size ratio. 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).  device_compression_ratio will not be included if compression is set to none.
moving average  decimal  The compression ratio is a moving average. It is calculated based on the most recently written records. Read records do not factor into the ratio. Records that don’t try to compress are not included in the moving average. If the written data changes over time then the compression ratio will change with it. In case of a sudden change in data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.
aerospike_namespace_device_free_pct  Percentage of disk capacity free for this namespace. This is the amount of free storage across all devices in the namespace. Evictions will be triggered when the used percentage across all devices (which is represented by 100 - device_free_pct) crosses the configured high-water-disk-pct.
gauge  integer  Not to be confused with device_available_pct which represents the amount of free contiguous space on the device that has the least contiguous free space across the namespace. Here is an example to represent the difference between device_free_pct and device_available_pct.  Assume 5 devices of 100MB each for a given namespace, where each device has 25MB of data that are spread across 50 write blocks (let’s assume a 1MB write-block-size):
- The device_free_pct would be 75%.
- The device_available_pct would be 50%.
- If the distribution is not uniform (it usually is not perfectly uniform), the device_available_pct would represent the device that has the fewest free blocks.
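The difference between the two metrics, including the non-uniform case, can be modeled as below. This is a simplified sketch and not the server's implementation: it assumes free space is summed across all devices, while available space is the share of completely free write blocks on the worst device. The `Device` class and both function names are made up for the illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    size_mb: int        # device capacity
    used_mb: int        # bytes of data stored, in MB
    blocks_in_use: int  # write blocks holding any data


def device_free_pct(devices: List[Device]) -> int:
    """Free space across all devices, ignoring fragmentation."""
    total = sum(d.size_mb for d in devices)
    used = sum(d.used_mb for d in devices)
    return (total - used) * 100 // total


def device_available_pct(devices: List[Device], write_block_mb: int = 1) -> int:
    """Free write blocks on the device that has the fewest of them."""
    def available(d: Device) -> int:
        total_blocks = d.size_mb // write_block_mb
        return (total_blocks - d.blocks_in_use) * 100 // total_blocks
    return min(available(d) for d in devices)


# The uniform case from the text: 5 devices, 25MB of data in 50 blocks each.
uniform = [Device(100, 25, 50) for _ in range(5)]
# Same data volume, but one device has its data spread over 60 blocks.
skewed = [Device(100, 25, 50) for _ in range(4)] + [Device(100, 25, 60)]

print(device_free_pct(uniform), device_available_pct(uniform))  # 75 50
print(device_free_pct(skewed), device_available_pct(skewed))    # 75 40
```

In the skewed case the free percentage does not move, but the available percentage drops to that of the most fragmented device.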
aerospike_namespace_device_total_bytes  Total bytes of disk space allocated to this namespace on this node.
gauge  integer  aerospike_namespace_device_used_bytes  Total bytes of disk space used by this namespace on this node.
gauge  integer  Trending device_used_bytes provides operations insight into how disk usage changes over time for this namespace.
aerospike_namespace_dup_res_ask  Number of duplicate resolution requests made by the node to other individual nodes.
counter  integer  aerospike_namespace_dup_res_respond_no_read  Number of duplicate resolution requests handled by the node without reading the record.
counter  integer  aerospike_namespace_dup_res_respond_read  Number of duplicate resolution requests handled by the node where the record was read.
counter  integer  aerospike_namespace_effective_active_rack  The effective active-rack for the namespace. The configured active rack owns all of the master partition copies.
For strong consistency-enabled namespaces, this is the roster’s current active rack. Otherwise, it is the configured active-rack.
gauge  integer  aerospike_namespace_effective_is_quiesced  Reports ‘true’ when the namespace has rebalanced after previously receiving a quiesce info request.
gauge  integer  aerospike_namespace_effective_prefer_uniform_balance  Applies only to Enterprise Edition. Value can be true or false. If Aerospike applied the uniform balance algorithm for the current cluster state, the value returned is true. If any node having this namespace isn’t configured with prefer-uniform-balance true, the value returned is false and uniform balance algorithm is disabled for this namespace on all participating nodes.
gauge  integer  aerospike_namespace_effective_replication_factor  The effective replication factor for the namespace, included with the namespace info command metrics.
The effective replication factor is less than the replication-factor if the cluster size is smaller than the RF, in which case the effective replication factor would match the cluster size.
In Database 5.7 and earlier, if the paxos-single-replica-limit size is reached, the effective replication factor is 1.
The effective replication factor is 0 for a node that has been orphaned by the cluster. For example, if a node tries to join a cluster but that node is unable to communicate with every other node in the cluster, the principal node rejects the request and the node marks itself as an orphan.
gauge  integer  For AP namespaces in Database 7.1 and earlier, the effective replication factor drops when a node is shut down or crashes, and the remaining nodes are fewer than the RF. In Database 5.7 and earlier, if the paxos-single-replica-limit size is reached, the effective replication factor is 1.
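The basic rules above reduce to a short function. This sketch covers only two of the stated cases: the cluster being smaller than the configured replication factor, and an orphaned node. It deliberately ignores paxos-single-replica-limit and any version-specific behavior, and the function and parameter names are hypothetical.

```python
def effective_replication_factor(configured_rf: int, cluster_size: int,
                                 orphaned: bool = False) -> int:
    """Approximate the effective replication factor for a namespace."""
    if orphaned:
        # An orphaned node reports 0.
        return 0
    # The effective value cannot exceed the number of nodes in the cluster.
    return min(configured_rf, cluster_size)


normal = effective_replication_factor(2, 5)                 # 2
small_cluster = effective_replication_factor(3, 2)          # 2
orphan = effective_replication_factor(2, 5, orphaned=True)  # 0
```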
aerospike_namespace_evict_ttl  The current eviction depth, or the highest ttl of records that have been evicted, in seconds.
gauge  integer  aerospike_namespace_evict_void_time  The current eviction depth, expressed as a void time in seconds since 1 January 2010 UTC.
gauge  integer  aerospike_namespace_evicted_objects  Number of objects evicted from this namespace on this node since the server started.
counter  integer  aerospike_namespace_fail_client_lost_conflict  Number of non-XDR write commands that failed because some bin’s last-update-time is greater than the write command’s time. Error code 28 is returned. This can happen only when the XDR bin convergence feature is enabled. This can happen due to either:
- a clock skew across DCs causing XDR write commands to write bins with a future timestamp compared to local time.
- a race condition between an incoming XDR write command and a local client write command.
See fail_xdr_lost_conflict and cluster_max_compatibility_id.
counter  integer  aerospike_namespace_fail_generation  Number of read/write commands failed on generation check.
counter  integer  aerospike_namespace_fail_key_busy  Number of read/write commands that failed on ‘hot keys’, meaning there were already a number of commands queued up higher than transaction-pending-limit for the same record waiting in the rw-hash or rw_in_progress. For read this can only happen when duplicate resolution is necessary.
counter  integer  If the application is not expected to have hot keys and fail_key_busy rate of change exceeds expectations, this condition might indicate a problem with the application.
Detail level logging for the rw context will log transactions (digest) triggering this error. Read transactions would only fail if they had to go through the rw-hash (for example, if duplicate resolution is in effect).
aerospike_namespace_fail_mrt_blocked  Number of transactions or read/write commands blocked by an ongoing transaction.
gauge  integer  aerospike_namespace_fail_mrt_version_mismatch  Number of version mismatches - usually in verify reads, but also individual commands (reads/writes/deletes/UDFs) where version checks occur if the record had previously been read in the transaction.
gauge  integer  aerospike_namespace_fail_record_too_big  Number of write commands that failed because a record was larger than max-record-size. Only counts client write failures on the master side.
counter  integer  Detail level logging for the rw context will log transactions (digest) triggering this error (originating from client side master writes). Enabling detail level logging for the drv_ssd context will log all attempts at writing records that are too big, including replica-writes, immigration (migrations) writes and applying duplicate resolution winners. See “How do I change the write-block-size configuration?” for more information.
aerospike_namespace_fail_xdr_forbidden  Number of read/write commands that failed due to configuration restriction. Error code 22 is returned. This counts any of the traffic rejected due to either of the following:
- incoming XDR traffic (xdr-write stat) and allow-xdr-writes set to false.
- non-XDR write traffic and allow-nonxdr-writes set to false.
counter  integer  aerospike_namespace_fail_xdr_key_busy  Number of XDR key-busy errors (code 32) that have occurred. This error is raised if either of the following occurs:
- ship-versions-policy is all and a new write is attempted before the most recent update to the record successfully shipped to the destination.
- ship-versions-policy is interval and a new write is attempted before at least one version has shipped in the most recent ship-versions-interval.
counter  integer  aerospike_namespace_fail_xdr_lost_conflict  Number of XDR write commands that did not succeed in updating all the attempted bins. Only a subset of bin updates might have failed or all the bin updates might have failed. This can happen only when the XDR bin convergence feature is enabled. If a conflicting write happens on the same record across two or more data centers, the bin with the earlier last update time will lose during XDR shipping. An XDR retry due to a timeout, where a record that has already been successfully updated at a destination is received again, would fail and this metric will be updated. In other retry scenarios, such as key busy or device busy, the remote record will not be updated. Only a timeout-based retry can lead to this situation. See fail_client_lost_conflict.
counter  integer  aerospike_namespace_from_proxy_batch_sub_delete_error  Number of batch-index delete subtransactions proxied from another node that failed with an error.
counter  integer  aerospike_namespace_from_proxy_batch_sub_delete_filtered_out  Number of batch-index delete subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_batch_sub_delete_not_found  Number of batch-index delete subtransactions proxied from another node that resulted in not found.
counter  integer  aerospike_namespace_from_proxy_batch_sub_delete_success  Number of records successfully deleted by batch-index subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_delete_timeout  Number of batch-index delete subtransactions proxied from another node that timed out.
counter  integer  aerospike_namespace_from_proxy_batch_sub_lang_delete_success  Number of successful batch-index UDF delete subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_lang_error  Number of language (Lua) batch-index errors for UDF subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_lang_read_success  Number of successful batch-index UDF read subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_lang_write_success  Number of successful batch-index UDF write subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_read_error  Number of batch-index read subtransactions proxied from another node that failed with an error.
counter  integer  aerospike_namespace_from_proxy_batch_sub_read_filtered_out  Number of batch-index read subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_batch_sub_read_not_found  Number of batch-index read subtransactions proxied from another node that resulted in not found.
counter  integer  aerospike_namespace_from_proxy_batch_sub_read_success  Number of records successfully read by batch-index subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_read_timeout  Number of batch-index read subtransactions proxied from another node that timed out.
counter  integer  aerospike_namespace_from_proxy_batch_sub_tsvc_error  Number of batch-index subtransactions proxied from another node that failed with an error in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this will include transactions against unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_from_proxy_batch_sub_tsvc_timeout  Number of batch-index subtransactions proxied from another node that timed out in the transaction service, before attempting to handle the transaction.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_from_proxy_batch_sub_udf_complete  Number of completed batch-index UDF subtransactions proxied from another node for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.
counter  integer  aerospike_namespace_from_proxy_batch_sub_udf_error  Number of failed batch-index UDF subtransactions proxied from another node for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.
counter  integer  aerospike_namespace_from_proxy_batch_sub_udf_filtered_out  Number of batch-index UDF subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_batch_sub_udf_timeout  Number of batch-index UDF subtransactions proxied from another node that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.
counter  integer  aerospike_namespace_from_proxy_batch_sub_write_error  Number of batch-index write subtransactions proxied from another node that failed with an error.
counter  integer  aerospike_namespace_from_proxy_batch_sub_write_filtered_out  Number of batch-index write subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_batch_sub_write_success  Number of records successfully written by batch-index subtransactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_batch_sub_write_timeout  Number of batch-index write subtransactions proxied from another node that timed out.
counter  integer  aerospike_namespace_from_proxy_delete_error  Number of errors for delete transactions proxied from another node. This includes xdr_from_proxy_delete_error.
counter  integer  aerospike_namespace_from_proxy_delete_filtered_out  Number of delete transactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_delete_not_found  Number of delete transactions proxied from another node that resulted in not found. This includes xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_from_proxy_delete_success  Number of successful delete transactions proxied from another node. This includes xdr_from_proxy_delete_success.
counter  integer  aerospike_namespace_from_proxy_delete_timeout  Number of timeouts for delete transactions proxied from another node. This includes xdr_from_proxy_delete_timeout.
counter  integer  aerospike_namespace_from_proxy_lang_delete_success  Number of successful UDF delete transactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_lang_error  Number of language (Lua) errors for UDF transactions proxied from another node.
counter  integer  aerospike_namespace_from_proxy_lang_read_success  Number of successful UDF read commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_lang_write_success  Number of successful UDF write commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_read_error  Number of errors for read commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_read_filtered_out  Number of read commands proxied from another node that did not happen because they were filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_read_not_found  Number of read commands proxied from another node that resulted in not found.
counter  integer  aerospike_namespace_from_proxy_read_success  Number of successful read commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_read_timeout  Number of timeouts for read commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_tsvc_error  Number of commands proxied from another node that failed in the transaction service, before attempting to handle the commands. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this will include commands against unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_from_proxy_tsvc_timeout  Number of commands proxied from another node that timed out while in the transaction service, before attempting to handle the commands. At this stage the command has not yet been identified as a read or a write, but the namespace is known. There could be congestion in the internal transaction queue, or it could be that the timeout set by the client is too aggressive.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_from_proxy_udf_complete  Number of successful UDF commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_udf_error  Number of errors for UDF commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_udf_filtered_out  Number of UDF commands proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_udf_timeout  Number of timeouts for UDF commands proxied from another node.
counter  integer  aerospike_namespace_from_proxy_write_error  Number of errors for write commands proxied from another node. This includes xdr_from_proxy_write_error.
counter  integer  aerospike_namespace_from_proxy_write_filtered_out  Number of write commands proxied from another node that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_from_proxy_write_success  Number of successful write commands proxied from another node. This includes xdr_from_proxy_write_success.
counter  integer  aerospike_namespace_from_proxy_write_timeout  Number of timeouts for write commands proxied from another node. This includes xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_geo_region_query_cells  Number of cell coverings for the queried region.
counter  integer  aerospike_namespace_geo_region_query_falsepos  Number of points outside the region. Total query result points is geo_region_query_points + geo_region_query_falsepos.
gauge  integer  aerospike_namespace_geo_region_query_points  Number of points within the region. Total query result points is geo_region_query_points + geo_region_query_falsepos.
gauge  integer  aerospike_namespace_geo_region_query_reqs  Number of geo queries on the system since the uptime of the node.
counter  integer  aerospike_namespace_hwm_breached  If true, Aerospike has breached ‘high-water-[disk|memory]-pct’ for this namespace.
gauge  boolean  If hwm_breached is true, alert your operations group that memory or disk resources are strained. This condition might indicate the need to increase cluster capacity.
aerospike_namespace_index-type.mount[ix].age  Applies only to Enterprise Edition configured with index-type flash. This shows the percentage of lifetime (total usage) claimed by the OEM for the underlying device. Value is -1 unless the underlying device is NVMe, and may exceed 100. ‘ix’ is the device index. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for 2 files specified in the configuration.
gauge  integer  aerospike_namespace_index_flash_alloc_bytes  Applies only to Enterprise Edition configured with index-type flash. Total bytes allocated on the mount(s) for the primary index used by this namespace on this node. This statistic represents entire 4KiB chunks which have at least one element in use.  Also available in the log on the index-flash-usage ticker entry.
gauge  integer  aerospike_namespace_index_flash_alloc_pct  Applies only to Enterprise Edition configured with index-type flash. Percentage of the mount(s) allocated for the primary index used by this namespace on this node. Prior to Database 7.0, calculated as (index_flash_alloc_bytes / index-type.mounts-size-limit) * 100.  In Database 7.0 and later, calculated as (index_flash_alloc_bytes / index-type.mounts-budget) * 100. This statistic represents entire 4KiB chunks which have at least one element in use.  Also available in the log on the index-flash-usage ticker entry.
gauge  integer  If index_flash_alloc_pct gets close to or greater than 100%, alert operations to review the sizing of the namespace.
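The index_flash_alloc_pct calculation described above can be illustrated with a minimal Python sketch. The function name, the sample values, and the use of integer truncation are assumptions made for illustration; the denominator stands in for index-type.mounts-size-limit before Database 7.0 and index-type.mounts-budget in Database 7.0 and later.

```python
GIB = 1024 ** 3


def index_flash_alloc_pct(alloc_bytes: int, mounts_limit_bytes: int) -> int:
    """Return allocated flash index space as a whole-number percentage.

    mounts_limit_bytes stands in for index-type.mounts-size-limit (before
    Database 7.0) or index-type.mounts-budget (Database 7.0 and later).
    """
    if mounts_limit_bytes <= 0:
        raise ValueError("mounts_limit_bytes must be positive")
    # Integer truncation is an assumption; the server's rounding may differ.
    return alloc_bytes * 100 // mounts_limit_bytes


# Hypothetical sample: 3 GiB allocated against a 4 GiB limit.
print(index_flash_alloc_pct(3 * GIB, 4 * GIB))
```

A result close to or above 100 corresponds to the sizing review condition noted above.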
aerospike_namespace_index_flash_used_bytes  Applies only to Enterprise Edition configured with index-type flash. Total bytes in-use on the mount(s) for the primary index used by this namespace on this node. This is the same value memory_used_index_bytes would have if the index were not persisted.
gauge  integer  aerospike_namespace_index_flash_used_pct  Applies only to Enterprise Edition configured with index-type flash. Percentage of the mount(s) in-use for the primary index used by this namespace on this node. Calculated as (index_flash_used_bytes / index-type.mounts-size-limit) * 100.
gauge  integer  aerospike_namespace_index_mounts_used_pct  Applies only to Enterprise Edition configured with index-type pmem or flash. Percentage of the mount(s) in-use for the primary index used by this namespace on this node.
gauge  integer  aerospike_namespace_index_pmem_used_bytes  Applies only to Enterprise Edition configured with index-type pmem. Total bytes in-use on the mount(s) for the primary index used by this namespace on this node. This is the same value memory_used_index_bytes would have if the index were not persisted.
gauge  integer  aerospike_namespace_index_pmem_used_pct  Applies only to Enterprise Edition configured with index-type pmem. Percentage of the mount(s) in-use for the primary index used by this namespace on this node. Calculated as (index_pmem_used_bytes / index-type.mounts-size-limit) * 100.
gauge  integer  aerospike_namespace_index_used_bytes  Amount of memory occupied by the primary index for this namespace. Applies to all types of index storage (index-type).
gauge  integer  aerospike_namespace_indexes_memory_used_pct  Combined RAM indexes’ size as a percentage of indexes-memory-budget when indexes-memory-budget is configured nonzero.
gauge  integer  aerospike_namespace_master_tombstones  Number of tombstones on this node which are active masters.
gauge  integer  aerospike_namespace_max-evicted-ttl  The highest record TTL that Aerospike has evicted from this namespace.
gauge  integer  aerospike_namespace_max_void_time  Maximum record TTL ever inserted into this namespace.
gauge  integer  aerospike_namespace_memory_free_pct  Percentage of memory capacity free for this namespace.
gauge  integer  If memory_free_pct approaches the configured value for high-water-memory-pct or stop-writes-pct, alert operations to investigate the cause. Might indicate a need to reduce the object count or increase capacity and may require further investigation into memory_used_sindex_bytes if secondary indexes are in use, into memory_used_set_index_bytes if set indexes are used, or into heap_efficiency_pct if data is stored in memory.
aerospike_namespace_memory_used_bytes  Total bytes of memory used by this namespace on this node. Used against the high-water-memory-pct and stop-writes-pct thresholds. It represents the sum of the following values:
- memory_used_data_bytes
- memory_used_index_bytes
- memory_used_sindex_bytes
- memory_used_set_index_bytes (Database 5.6 and later)
See heap_allocated_kbytes for the total amount of memory allocated on a node other than primary index shared memory in Enterprise Edition and, for Database 6.1 and later, secondary index shared memory in Enterprise Edition.
gauge  integer  Trending used-bytes-memory provides operations insight into how memory usage changes over time for this namespace.
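The sum listed for memory_used_bytes can be checked against a scraped set of namespace statistics. The following Python sketch assumes the statistics are already available as a dictionary keyed by statistic name; the function name and sample numbers are hypothetical, and a missing memory_used_set_index_bytes (servers earlier than Database 5.6) is treated as 0.

```python
COMPONENTS = (
    "memory_used_data_bytes",
    "memory_used_index_bytes",
    "memory_used_sindex_bytes",
    "memory_used_set_index_bytes",  # Database 5.6 and later
)


def expected_memory_used_bytes(stats: dict) -> int:
    """Sum the component statistics that make up memory_used_bytes."""
    # Components absent from the scrape are counted as 0 (an assumption).
    return sum(int(stats.get(name, 0)) for name in COMPONENTS)


# Hypothetical sample values, in bytes.
sample = {
    "memory_used_data_bytes": 1000,
    "memory_used_index_bytes": 640,
    "memory_used_sindex_bytes": 128,
    "memory_used_set_index_bytes": 32,
}
print(expected_memory_used_bytes(sample))
```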
aerospike_namespace_memory_used_data_bytes  Amount of memory occupied by data. See memory_used_bytes for the total memory accounted for the namespace.
gauge  integer  aerospike_namespace_memory_used_index_bytes  Amount of memory occupied by the index for this namespace. Allocated in shared memory by default (index-type shmem) for the Enterprise Edition. 
 If your index is persisted, either in block storage (index-type flash) or in persistent memory (index-type pmem, Database 4.5 and later), refer instead to index_flash_used_bytes or index_pmem_used_bytes. For these persisted index configurations, the value of memory_used_index_bytes is 0.
See memory_used_bytes for the total memory accounted for the namespace.
gauge  integer  aerospike_namespace_memory_used_set_index_bytes  Amount of memory occupied by set indexes for this namespace on this node. See memory_used_bytes for the total memory accounted for the namespace.
gauge  integer  aerospike_namespace_memory_used_sindex_bytes  Amount of memory occupied by secondary indexes for this namespace on this node. See memory_used_bytes for the total memory accounted for the namespace.
gauge  integer  aerospike_namespace_migrate_fresh_partitions  Number of partitions that are created fresh or empty because a number of nodes, greater than the replication factor, have left the cluster. Applies to AP and SC namespaces.
gauge  integer  aerospike_namespace_migrate_record_receives  Number of record insert requests received by immigration.
counter  integer  aerospike_namespace_migrate_record_retransmits  Number of times emigration has retransmitted records.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_migrate_records_skipped  Number of times emigration did not ship a record because the remote node was already up-to-date.
counter  integer  aerospike_namespace_migrate_records_transmitted  Number of records emigration has read and sent.
counter  integer  aerospike_namespace_migrate_records_unreadable  Number of records skipped during migration because they were unreadable when migrate-skip-unreadable is enabled.
counter  integer  aerospike_namespace_migrate_rx_instance_count  Number of instance objects managing immigrations.
gauge  integer  aerospike_namespace_migrate_rx_partitions_active  Number of partitions currently immigrating to this node. If migrate_rx_partitions_active is greater than 0 and cluster is not in maintenance, Operations needs to identify why migrations are running.
gauge  integer  aerospike_namespace_migrate_rx_partitions_initial  Total number of migrations this node will receive during the current migration cycle for this namespace.
gauge  integer  aerospike_namespace_migrate_rx_partitions_remaining  Number of migrations this node has not yet received during the current migration cycle for this namespace.
gauge  integer  aerospike_namespace_migrate_signals_active  For finished partition migrations on this node, number of outstanding clean-up signals, sent to participating member nodes, waiting for clean-up acknowledgment. Signals are messages that are sent from a partition’s master node to all other nodes that currently have data for the partition. The signals are used to notify all nodes that migrations have completed for this partition and if they aren’t a replica they can now drop the partition.
gauge  integer  aerospike_namespace_migrate_signals_remaining  For unfinished partition migrations on this node, number of clean-up signals to send to participating member nodes, as migration completes. Signals are messages that are sent from a partition’s master node to all other nodes that currently have data for the partition. The signals are used to notify all nodes that migrations have completed for this partition and if they aren’t a replica they can now drop the partition.
gauge  integer  aerospike_namespace_migrate_tx_instance_count  Number of instance objects managing emigrations.
gauge  integer  aerospike_namespace_migrate_tx_partitions_active  Number of partitions currently emigrating from this node. If migrate_tx_partitions_active is greater than 0 and cluster is not in maintenance, Operations needs to identify why migrations are running.
gauge  integer  aerospike_namespace_migrate_tx_partitions_imbalance  Number of partition migration failures which could lead to partitions being imbalanced. For each increment there will also be a warning logged.
counter  integer  aerospike_namespace_migrate_tx_partitions_initial  Total number of migrations this node will send during the current migration cycle for this namespace.
gauge  integer  aerospike_namespace_migrate_tx_partitions_lead_remaining  Number of initially scheduled emigrations which are not delayed by the migrate-fill-delay configuration. Lead migrations are typically delta-migrations addressing non-empty partition replica nodes. Delta-migrations generally consume far less storage IO.
gauge  integer  aerospike_namespace_migrate_tx_partitions_remaining  Number of migrations this node has not yet sent during the current migration cycle for this namespace.
gauge  integer  aerospike_namespace_mrt_monitor_roll_back_error  Subset of mrt_roll_back_error where monitor did the roll back.
gauge  integer  aerospike_namespace_mrt_monitor_roll_back_success  Subset of mrt_roll_back_success where monitor did the roll back.
gauge  integer  aerospike_namespace_mrt_monitor_roll_back_timeout  Subset of mrt_roll_back_timeout where monitor did the roll back.
gauge  integer  aerospike_namespace_mrt_monitor_roll_forward_error  Subset of mrt_roll_forward_error where monitor did the roll forward.
gauge  integer  aerospike_namespace_mrt_monitor_roll_forward_success  Subset of mrt_roll_forward_success where monitor did the roll forward.
gauge  integer  aerospike_namespace_mrt_monitor_roll_forward_timeout  Subset of mrt_roll_forward_timeout where monitor did the roll forward.
gauge  integer  aerospike_namespace_mrt_monitor_roll_tombstone_creates  Number of times monitor transaction rolls (forward or back) generated tombstones from nothing. This is rare but normal.
gauge  integer  aerospike_namespace_mrt_monitors  The number of mrt_monitors records in a namespace.
gauge  integer  aerospike_namespace_mrt_monitors_active  Number of monitors currently driving roll forwards or roll backs after a transaction timeout.
gauge  integer  aerospike_namespace_mrt_provisionals  Number of provisional records in a transaction.
gauge  integer  aerospike_namespace_mrt_roll_back_error  Number of roll back transactions that failed.
gauge  integer  aerospike_namespace_mrt_roll_back_success  Number of roll back transactions that succeeded.
gauge  integer  aerospike_namespace_mrt_roll_back_timeout  Number of roll back transactions that timed out.
gauge  integer  aerospike_namespace_mrt_roll_forward_error  Number of roll forward transactions that failed.
gauge  integer  aerospike_namespace_mrt_roll_forward_success  Number of roll forward transactions that succeeded.
gauge  integer  aerospike_namespace_mrt_roll_forward_timeout  Number of roll forward transactions that timed out.
gauge  integer  aerospike_namespace_mrt_verify_read_error  Number of verify read commands that failed.
gauge  integer  aerospike_namespace_mrt_verify_read_success  Number of verify read commands that succeeded.
gauge  integer  aerospike_namespace_mrt_verify_read_timeout  Number of verify read commands that timed out.
gauge  integer  aerospike_namespace_nodes_quiesced  The number of nodes observed to be quiesced as of the most recent reclustering event. If a single node received the quiesce command, on the subsequent reclustering event, all nodes return 1 for this metric, and when the quiesced node is shut down, triggering a new reclustering event, this metric returns to 0.
gauge  integer  aerospike_namespace_non_expirable_objects  Number of records in this namespace with non-expirable TTLs (TTLs of value 0).
gauge  integer  aerospike_namespace_non_replica_objects  Number of records on this node which are neither master nor replicas. This number is non-zero during migration, representing additional versions or copies of records. Those are records beyond the replication factor line and would be potentially used during migrations to duplicate resolve. This is not true for quiesced nodes, which retain their partitions after migrations have completed.
gauge  integer  aerospike_namespace_non_replica_tombstones  Number of tombstones on this node which are neither master nor replicas. This number is non-zero only during migration. This is not true for quiesced nodes, which retain their partitions after migrations have completed.
gauge  integer  aerospike_namespace_nsup_cycle_deleted_pct  Percent of records removed by NSUP in its last cycle.
gauge  float  nsup_cycle_deleted_pct is calculated when the NSUP (Namespace SUPervisor) cycle finishes (nsup-done is logged). It is calculated based on the total objects present at the beginning of the NSUP cycle and the number of objects that got deleted in that cycle (nsup_cycle_deleted_pct = (objects removed by NSUP in its last cycle * 100) / number of total objects when the NSUP cycle started [expirable + non expirable]).
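The nsup_cycle_deleted_pct formula above can be written out as a short Python sketch. The function and parameter names are hypothetical, and returning 0.0 for an empty namespace is an assumption rather than documented server behavior.

```python
def nsup_cycle_deleted_pct(objects_removed: int, objects_at_cycle_start: int) -> float:
    """Percent of records removed by NSUP in one cycle.

    objects_at_cycle_start counts expirable and non-expirable objects
    present when the NSUP cycle started.
    """
    if objects_at_cycle_start == 0:
        # Assumption: report 0.0 when there was nothing to delete.
        return 0.0
    return objects_removed * 100 / objects_at_cycle_start


# Hypothetical cycle: 2,500 of 1,000,000 objects expired or were evicted.
print(nsup_cycle_deleted_pct(2_500, 1_000_000))
```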
aerospike_namespace_nsup_cycle_duration  Length of the last NSUP cycle in seconds.
gauge  integer  aerospike_namespace_nsup_xdr_key_busy  Number of NSUP deletes (expirations and evictions) that had to wait for a previous version to ship. This error is raised if either of the following occurs:
- ship-versions-policy is all and the most recent update to the record has not yet successfully shipped to the destination.
- ship-versions-policy is interval and XDR hasn’t successfully shipped at least one version of the record in the most recent ship-versions-interval in seconds.
counter  integer  aerospike_namespace_objects  Number of records in this namespace for this node. Includes non-replica. Does not include tombstones.
gauge  integer  Trending objects provides operations insight into this namespace’s record fluctuations over time.
aerospike_namespace_ops_sub_tsvc_error  Number of times a background query operate command failed to access a record. For example, due to protocol or permission errors. Does not include timeouts. In strong-consistency enabled namespaces, this includes attempts to access records in unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_ops_sub_tsvc_timeout  Number of records accessed by a background query operate command that timed out in the transaction service.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_ops_sub_write_error  Number of records accessed by a background query operate command write subtransactions that failed with an error. Does not include timeouts.
counter  integer  aerospike_namespace_ops_sub_write_filtered_out  Number of records accessed by a background query operate command write subtransactions for which the write did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_ops_sub_write_success  Number of successful records accessed by a background query operate command write subtransactions.
counter  integer  aerospike_namespace_ops_sub_write_timeout  Number of records accessed by a background query operate command write subtransactions that timed out.
counter  integer  aerospike_namespace_pending_quiesce  Reports ‘true’ when the quiesce info command has been received by a node, or if stay-quiesced is true for the node. When true, the next clustering event will cause this node to quiesce. To trigger a clustering event, issue the recluster info command. To disable, issue the quiesce-undo info command.
gauge  integer  aerospike_namespace_pi_query_aggr_abort  Number of primary index query aggregations that were aborted.
counter  integer  aerospike_namespace_pi_query_aggr_complete  Number of primary index query aggregations that completed.
counter  integer  aerospike_namespace_pi_query_aggr_error  Number of primary index query aggregations that failed.
counter  integer  Compare pi_query_aggr_error to pi_query_aggr_complete.
If ratio is higher than acceptable, alert operations to investigate.
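One way to apply the error-to-complete comparison is sketched below in Python. The function names and the 0.01 threshold are hypothetical; what counts as an acceptable ratio depends on the deployment. The same check applies to the other error and complete counter pairs in this section.

```python
ACCEPTABLE_RATIO = 0.01  # hypothetical threshold, tune per deployment


def error_to_complete_ratio(errors: int, completes: int) -> float:
    """Ratio of failed to completed operations for a pair of counters."""
    if completes == 0:
        # No completions: any error makes the ratio unbounded.
        return float("inf") if errors > 0 else 0.0
    return errors / completes


def should_alert(errors: int, completes: int,
                 acceptable: float = ACCEPTABLE_RATIO) -> bool:
    """True when the ratio is higher than the acceptable value."""
    return error_to_complete_ratio(errors, completes) > acceptable


# Hypothetical sample: pi_query_aggr_error=5, pi_query_aggr_complete=100.
print(should_alert(errors=5, completes=100))
```

Because both inputs are counters, feeding in the increase over a recent interval rather than lifetime totals keeps old errors from masking or exaggerating the current ratio.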
aerospike_namespace_pi_query_long_basic_abort  Number of basic long primary index queries that were aborted.
counter  integer  aerospike_namespace_pi_query_long_basic_complete  Number of basic long primary index queries that completed.
counter  integer  aerospike_namespace_pi_query_long_basic_error  Number of basic long primary index queries that failed.
counter  integer  Compare pi_query_long_basic_error to pi_query_long_basic_complete.
If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_pi_query_ops_bg_abort  Number of ops background primary index queries that were aborted.
counter  integer  aerospike_namespace_pi_query_ops_bg_complete  Number of ops background primary index queries that completed.
counter  integer  aerospike_namespace_pi_query_ops_bg_error  Number of ops background primary index queries that failed.
counter  integer  Compare pi_query_ops_bg_error to pi_query_ops_bg_complete. If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_pi_query_short_basic_complete  Number of basic short primary index queries that completed.
counter  integer  aerospike_namespace_pi_query_short_basic_error  Number of basic short primary index queries that failed.
counter  integer  Compare pi_query_short_basic_error to pi_query_short_basic_complete.
If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_pi_query_short_basic_timeout  Short primary index queries are not monitored, so they cannot be aborted. They might time out, which is reflected in this statistic.
counter  integer  aerospike_namespace_pi_query_udf_bg_abort  Number of UDF background primary index queries that were aborted.
counter  integer  aerospike_namespace_pi_query_udf_bg_complete  Number of UDF background primary index queries that completed.
counter  integer  aerospike_namespace_pi_query_udf_bg_error  Number of UDF background primary index queries that failed.
counter  integer  Compare pi_query_udf_bg_error to pi_query_udf_bg_complete.
If ratio is higher than acceptable, alert operations to investigate.
aerospike_namespace_pmem_available_pct  Measures the minimum contiguous pmem storage file space across all such files in a namespace. The namespace will be read only (stop writes) if this value falls below min-avail-pct. It is important for all configured pmem storage files in a namespace to have the same size, otherwise, the pmem_available_pct could be low even when a lot of space is available across other files.
gauge  integer  If pmem_available_pct drops below 20%, warn your operations group.
This condition might indicate that defrag is unable to keep up with the current load.
If pmem_available_pct drops below 15%, critical ALERT.
If pmem_available_pct drops below 5%, usable PMem resources are critically low. This condition might result in stop_writes.
Not to be confused with pmem_free_pct which represents the amount of free space across all PMem storage files in a namespace and does not take account of the fragmentation.
 Here is an example to represent the difference between pmem_free_pct and pmem_available_pct. Assume 5 files of 96MiB each for a given namespace, where each file has 24MiB of data that are spread across 6 write-blocks (with the 8MiB write-block-size):
- The pmem_free_pct would be 75%.
- The pmem_available_pct would be 50%.
- If the distribution is not uniform (it usually is not perfectly uniform) the pmem_available_pct would represent the file that has the least free blocks.
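The numbers in this example can be reproduced with a simplified Python model. It assumes pmem_free_pct is the free space across all files and pmem_available_pct is the share of unused write-blocks in the file with the fewest; the function names are hypothetical and the server's internal accounting may differ in detail.

```python
MIB = 1024 * 1024


def pmem_free_pct(file_size: int, data_bytes_per_file: list) -> int:
    """Free space across all files, ignoring fragmentation."""
    total = file_size * len(data_bytes_per_file)
    used = sum(data_bytes_per_file)
    return (total - used) * 100 // total


def pmem_available_pct(file_size: int, write_block_size: int,
                       used_blocks_per_file: list) -> int:
    """Share of unused write-blocks in the file that has the fewest."""
    blocks_per_file = file_size // write_block_size
    return min((blocks_per_file - used) * 100 // blocks_per_file
               for used in used_blocks_per_file)


# 5 files of 96MiB, each holding 24MiB of data spread over 6 blocks of 8MiB.
print(pmem_free_pct(96 * MIB, [24 * MIB] * 5))
print(pmem_available_pct(96 * MIB, 8 * MIB, [6] * 5))

# Non-uniform case: one file has 9 blocks in use, so it sets the minimum.
print(pmem_available_pct(96 * MIB, 8 * MIB, [6, 6, 6, 6, 9]))
```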
aerospike_namespace_pmem_compression_ratio  Measures the average compressed size to uncompressed size ratio for PMem storage. 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size). pmem_compression_ratio is not included if the compression configuration parameter is set to none.
moving average  integer  The compression ratio is a moving average, calculated based on the most recently written records. Read records do not factor into the ratio. If the written data changes over time then the compression ratio will change with it. In case of a sudden change in data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.
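To read a pmem_compression_ratio value as a size reduction, the conversion is simple arithmetic, sketched here in Python. The function name is hypothetical.

```python
def size_reduction_pct(compression_ratio: float) -> float:
    """Convert a compressed/uncompressed size ratio to a percent reduction."""
    if not 0.0 <= compression_ratio <= 1.0:
        raise ValueError("compression_ratio is expected to be between 0 and 1")
    # Rounded to three decimals to match the precision of the reported ratio.
    return round((1.0 - compression_ratio) * 100.0, 3)


print(size_reduction_pct(0.100))  # 1:10 compression
print(size_reduction_pct(1.000))  # no compression
```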
aerospike_namespace_pmem_free_pct  Percentage of pmem storage capacity free for this namespace. This is the amount of free storage across all pmem storage files in the namespace. Evictions will be triggered when the used percentage across all storage files (which is represented by 100 - pmem_free_pct) crosses the configured high-water-disk-pct.
gauge  integer  Not to be confused with pmem_available_pct which represents the amount of free contiguous space on the PMem storage file that has the least contiguous free space across the namespace.
 Here is an example to represent the difference between pmem_free_pct and pmem_available_pct. Assume 5 files of 96MiB each for a given namespace, where each file has 24MiB of data that are spread across 6 write-blocks (with the 8MiB write-block size):
 - The pmem_free_pct would be 75%. - The pmem_available_pct would be 50%. - If the distribution is not uniform (it usually is not perfectly uniform) the pmem_available_pct would represent the file that has the least free blocks.
aerospike_namespace_pmem_total_bytes  Total bytes of pmem storage file space allocated to this namespace on this node.
gauge  integer  aerospike_namespace_pmem_used_bytes  Total bytes of pmem storage file space used by this namespace on this node.
gauge  Trending pmem_used_bytes provides operations insight into how pmem storage usage changes over time for this namespace.
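One simple way to trend pmem_used_bytes is to compute the average growth rate between the oldest and newest sample. The sketch below assumes samples have already been collected as (unix_time_seconds, pmem_used_bytes) pairs. The sample values and the function name are made up for illustration.

```python
def growth_bytes_per_sec(samples):
    """Average growth rate between the first and last sample.

    samples: list of (unix_time_seconds, pmem_used_bytes), oldest first.
    """
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (b1 - b0) / (t1 - t0)

# Hypothetical samples taken one minute apart.
samples = [(0, 1_000_000), (60, 1_600_000), (120, 2_200_000)]
rate = growth_bytes_per_sec(samples)

print(rate)
```

A negative result means usage shrank over the interval, for example after expirations or deletes.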
aerospike_namespace_prole_objects  Number of records on this node which are proles (replicas). Does not include tombstones.
gauge  integer  aerospike_namespace_prole_tombstones  Number of tombstones on this node which are proles (replicas).
gauge  integer  aerospike_namespace_query_agg  Number of query aggregations attempted. Removed in Database 5.7. Use query_aggr_complete + query_aggr_error + query_aggr_abort instead.
counter  integer  aerospike_namespace_query_agg_abort  Number of query aggregations aborted by the user seen by this node. Renamed to query_aggr_abort in Database 5.7.
counter  integer  aerospike_namespace_query_agg_avg_rec_count  Average number of records returned by the aggregations underlying query. Renamed to query_aggr_avg_rec_count in Database 5.7.
gauge  integer  aerospike_namespace_query_agg_error  Number of query aggregation errors due to an internal error. Renamed to query_aggr_error in Database 5.7.
counter  integer  aerospike_namespace_query_agg_success  Number of query aggregations completed. Renamed to query_aggr_complete in Database 5.7.
counter  integer  aerospike_namespace_query_aggr_abort  Number of query aggregations aborted by the user seen by this node. Removed in Database 6.0, use si_query_aggr_abort.
counter  integer  aerospike_namespace_query_aggr_avg_rec_count  Average number of records returned by the aggregations underlying query.
gauge  integer  aerospike_namespace_query_aggr_complete  Number of query aggregations completed. Removed in Database 6.0, use si_query_aggr_complete.
counter  integer  aerospike_namespace_query_aggr_error  Number of query aggregation errors due to an internal error. Removed in Database 6.0, use si_query_aggr_error.
counter  integer  aerospike_namespace_query_basic_abort  Number of secondary index basic queries that were aborted by a user. Removed in Database 6.0, use si_query_long_basic_abort.
counter  integer  aerospike_namespace_query_basic_avg_rec_count  Average number of records returned by all secondary index basic queries.
gauge  integer  aerospike_namespace_query_basic_complete  Number of secondary index basic queries which completed successfully.
counter  integer  aerospike_namespace_query_basic_error  Number of secondary index basic queries that returned an error. Removed in Database 6.0, use si_query_long_basic_error.
counter  integer  aerospike_namespace_query_fail  Number of queries which failed due to an internal error. These are failures that are not part of query lookup (see query_lookup_error), query aggregation (see query_agg_error), or query background UDF (see query_udf_bg_failure).
counter  aerospike_namespace_query_false_positives  Number of entries that were shortlisted from the secondary index but the bin values are not matching the query clause. This might happen when the bin value changes during query execution.
counter  integer  aerospike_namespace_query_long_queue_full  Number of long running queries queue full errors.
counter  integer  aerospike_namespace_query_long_reqs  Number of long running queries ever attempted in the system (queries that selected more records than query_threshold).
counter  integer  aerospike_namespace_query_lookup_abort  Number of user aborted secondary index queries. Renamed to query_basic_abort in Database 5.7.
counter  integer  aerospike_namespace_query_lookup_avg_rec_count  Average number of records returned by all secondary index query look-ups. Renamed to query_basic_avg_rec_count in Database 5.7.
gauge  integer  aerospike_namespace_query_lookup_error  Number of secondary index query look-up errors. Renamed to query_basic_error in Database 5.7.
counter  integer  aerospike_namespace_query_lookup_success  Number of secondary index look-ups which succeeded. Renamed to query_basic_complete in Database 5.7.
counter  integer  aerospike_namespace_query_lookups  Number of secondary index lookups attempted. Removed in Database 5.7. Use query_basic_complete + query_basic_error + query_basic_abort instead.
counter  integer  aerospike_namespace_query_ops_bg_abort  Number of ops background queries that were aborted. Removed in Database 6.0, use si_query_ops_bg_abort.
counter  integer  aerospike_namespace_query_ops_bg_complete  Number of ops background queries that completed. Removed in Database 6.0, use si_query_ops_bg_complete.
counter  integer  aerospike_namespace_query_ops_bg_error  Number of ops background queries that returned an error. Removed in Database 6.0, use si_query_ops_bg_error.
counter  integer  aerospike_namespace_query_ops_bg_failure  Number of ops background queries that failed. Removed from Database 5.7 and later, use query_ops_bg_error + query_ops_bg_abort instead.
counter  integer  aerospike_namespace_query_ops_bg_success  Number of ops background queries that completed. Renamed to query_ops_bg_complete in Database 5.7.
counter  integer  aerospike_namespace_query_proto_compression_ratio  Measures the average compressed size to uncompressed size ratio for protocol message data in query responses to the client. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).
moving average  decimal  The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_query_proto_uncompressed_pct  Measures the percentage of query responses to the client with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).
gauge  instantaneous  The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_query_reqs  Number of query requests ever attempted on this node. Even very early failures would be counted here, as opposed to query_short_running and query_long_running which would increment a bit later.
counter  aerospike_namespace_query_short_queue_full  Number of short running queries queue full errors.
counter  integer  aerospike_namespace_query_short_reqs  Number of short running queries ever attempted in the system (queries that selected fewer records than query_threshold).
counter  integer  aerospike_namespace_query_udf_bg_abort  Number of UDF background queries that were aborted. Removed in Database 6.0, use si_query_udf_bg_abort.
counter  integer  aerospike_namespace_query_udf_bg_complete  Number of UDF background queries that completed. Removed in Database 6.0, use si_query_udf_bg_complete.
counter  integer  aerospike_namespace_query_udf_bg_error  Number of UDF background queries which returned an error. Removed in Database 6.0, use si_query_udf_bg_error.
counter  integer  aerospike_namespace_query_udf_bg_failure  Number of UDF background queries that failed. Removed from Database 5.7 and later, use query_udf_bg_error + query_udf_bg_abort instead.
counter  integer  aerospike_namespace_query_udf_bg_success  Number of UDF background queries that completed. Renamed to query_udf_bg_complete in Database 5.7.
counter  integer  aerospike_namespace_re_repl_error  Number of re-replication errors which were not timeout. Re-replications would happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt.
counter  integer  aerospike_namespace_re_repl_success  Number of successful re-replications. Re-replications would happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt.
counter  integer  aerospike_namespace_re_repl_timeout  Number of re-replications that ended in timeout. Re-replications would happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt. Starting with Database 6.3 this stat only counts timeouts that happened during the actual re-replication.
counter  integer  The transaction-ttl of a re-replication is 1 second by default (configurable through the transaction-max-ms configuration parameter).
aerospike_namespace_re_repl_tsvc_error  Number of re-replication errors happening in the transaction queue which were not re_repl_tsvc_timeout (before the re-replication attempt). Re-replications occur for namespaces operating under strong-consistency mode when a record does not successfully replicate on the initial attempt.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_re_repl_tsvc_timeout  Number of re-replications that time out early in the internal transaction queue, while waiting to be picked up by a service thread. Re-replications occur for namespaces operating under strong-consistency mode when a record does not successfully replicate on the initial attempt.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_record_proto_compression_ratio  Measures the average compressed size to uncompressed size ratio for protocol message data in single-record transaction client responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).
gauge  decimal  The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_record_proto_uncompressed_pct  Measures the percentage of single-record transaction client responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).
moving average  decimal  The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_retransmit_all_batch_sub_delete_dup_res  Number of retransmits that occurred during batch delete subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_delete_repl_write  Number of retransmits that occurred during batch delete subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_dup_res  Obsolete as of Database 6.0. In case of a failure to replicate a write transaction across all replicas, the record will be left in the ‘un-replicated’ state, forcing a ‘re-replication’ transaction prior to any subsequent read or write transaction on the record.
Number of retransmits that occurred during batch subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Starting with Database 6.0, when batch-writes were introduced, “repl-write retransmits” for batch writes are counted as “dup-res retransmits”, which are included in the metric retransmit_all_batch_sub_dup_res.
aerospike_namespace_retransmit_all_batch_sub_read_dup_res  Number of retransmits that occurred during batch read subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_read_repl_ping  Number of retransmits that occurred during SC linearized read subtransactions within batched commands. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_udf_dup_res  Number of retransmits that occurred during batch UDF subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_udf_repl_write  Number of retransmits that occurred during batch UDF subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_write_dup_res  Number of retransmits that occurred during batch write subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_batch_sub_write_repl_write  Number of retransmits that occurred during batch write (insert/update/upsert/replace) subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_delete_dup_res  Number of retransmits that occurred during delete transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_delete_repl_write  Number of retransmits that occurred during delete transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_read_dup_res  Number of retransmits that occurred during read commands that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_read_repl_ping  Number of retransmits that occurred during SC linearized reads. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_udf_dup_res  Number of retransmits that occurred during client initiated UDF transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_udf_repl_write  Number of retransmits that occurred during client initiated UDF transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_write_dup_res  Number of retransmits that occurred during write transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_all_write_repl_write  Number of retransmits that occurred during write transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_nsup_repl_write  Number of retransmits that occurred during NSUP initiated delete transactions that were being replica written.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_ops_sub_dup_res  Number of retransmits that occurred during write subtransactions of background ops scan/query jobs that were being duplicate-resolved.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_ops_sub_repl_write  Number of retransmits that occurred during write subtransactions of background ops scan/query jobs that were being replica written.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_udf_sub_dup_res  Number of retransmits that occurred during UDF subtransactions of scan/query background UDF jobs that were being duplicate-resolved.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_retransmit_udf_sub_repl_write  Number of retransmits that occurred during UDF subtransactions of scan/query background UDF jobs that were being replica written.
counter  integer  Retransmission statistics are collected in the retransmits ticker log line.
aerospike_namespace_scan_aggr_abort  Number of scan aggregations that were aborted. Removed in Database 6.0, use pi_query_aggr_abort.
counter  integer  aerospike_namespace_scan_aggr_complete  Number of scan aggregations that completed. Removed in Database 6.0, use pi_query_aggr_complete.
counter  integer  aerospike_namespace_scan_aggr_error  Number of scan aggregations that failed.
counter  integer  Compare scan_aggr_error to scan_aggr_complete.
If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_aggr_error.
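The comparison described above can be sketched in Python as follows. Because both metrics are counters, the sketch compares their increase between two samples rather than their absolute values. The ratio definition (errors divided by errors plus completions), the 5% threshold, and the function names are assumptions for illustration, not values prescribed by this reference.

```python
def error_ratio(errors, completes):
    """Fraction of finished jobs that ended in error; 0.0 when nothing finished."""
    finished = errors + completes
    return errors / finished if finished else 0.0

def should_alert(prev, curr, error_key, complete_key, max_ratio=0.05):
    """Compare counter deltas between two samples against an acceptable ratio.

    prev, curr: dicts mapping metric name to counter value (hypothetical input).
    max_ratio: acceptable error ratio; 0.05 is an arbitrary example value.
    """
    d_err = curr[error_key] - prev[error_key]
    d_ok = curr[complete_key] - prev[complete_key]
    return error_ratio(d_err, d_ok) > max_ratio

prev = {"scan_aggr_error": 2, "scan_aggr_complete": 100}
curr = {"scan_aggr_error": 5, "scan_aggr_complete": 130}
alert = should_alert(prev, curr, "scan_aggr_error", "scan_aggr_complete")

print(alert)
```

The same helper applies to any error and complete counter pair by passing different metric names. Counters restart from zero when a node restarts, so a production check would also need to handle negative deltas.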
aerospike_namespace_scan_basic_abort  Number of basic scans that were aborted. Removed in Database 6.0, use pi_query_long_basic_abort.
counter  integer  aerospike_namespace_scan_basic_complete  Number of basic scans that completed. Removed in Database 6.0, use pi_query_long_basic_complete.
counter  integer  aerospike_namespace_scan_basic_error  Number of basic scans that failed.
counter  integer  Compare scan_basic_error to scan_basic_complete.
If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_long_basic_error.
aerospike_namespace_scan_ops_bg_abort  Number of ops background scans that were aborted. Removed in Database 6.0, use pi_query_ops_bg_abort.
counter  integer  aerospike_namespace_scan_ops_bg_complete  Number of ops background scans that completed. Removed in Database 6.0, use pi_query_ops_bg_complete.
counter  integer  aerospike_namespace_scan_ops_bg_error  Number of ops background scans that failed.
counter  integer  Compare scan_ops_bg_error to scan_ops_bg_complete. If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_ops_bg_error.
aerospike_namespace_scan_proto_compression_ratio  Measures the average compressed size to uncompressed size ratio for protocol message data in basic scan or aggregation scan client responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).
moving average  decimal  The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_scan_proto_uncompressed_pct  Measures the percentage of basic scan or aggregation scan client responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).
gauge  decimal  The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.
aerospike_namespace_scan_udf_bg_abort  Number of UDF background scans that were aborted. Removed in Database 6.0, use pi_query_udf_bg_abort.
counter  integer  aerospike_namespace_scan_udf_bg_complete  Number of UDF background scans that completed. Removed in Database 6.0, use pi_query_udf_bg_complete.
counter  integer  aerospike_namespace_scan_udf_bg_error  Number of UDF background scans that failed.
counter  integer  Compare scan_udf_bg_error to scan_udf_bg_complete.
If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_udf_bg_error.
aerospike_namespace_set-evicted-objects  Number of records evicted by a set.
counter  integer  aerospike_namespace_set_index_used_bytes  Amount of memory occupied by set indexes for this namespace on this node. See Finding total namespace memory for the total memory accounted for the namespace.
gauge  integer  aerospike_namespace_si_query_aggr_abort  Number of secondary index query aggregations aborted by the user seen by this node.
counter  integer  aerospike_namespace_si_query_aggr_complete  Number of secondary index query aggregations completed.
counter  integer  aerospike_namespace_si_query_aggr_error  Number of secondary index query aggregation errors due to an internal error.
counter  integer  aerospike_namespace_si_query_long_basic_abort  Number of basic long secondary index queries aborted for this namespace.
counter  integer  aerospike_namespace_si_query_long_basic_complete  Number of basic long secondary index queries completed for this namespace.
counter  integer  aerospike_sindex_si_query_long_basic_error  Number of basic long secondary index queries that returned an error for this namespace.
counter  integer  aerospike_namespace_si_query_ops_bg_abort  Number of ops background secondary index queries that were aborted.
counter  integer  aerospike_namespace_si_query_ops_bg_complete  Number of ops background secondary index queries that completed.
counter  integer  aerospike_namespace_si_query_ops_bg_error  Number of ops background secondary index queries that returned an error.
counter  integer  aerospike_namespace_si_query_udf_bg_abort  Number of UDF background secondary index queries that were aborted.
counter  integer  aerospike_namespace_si_query_udf_bg_complete  Number of UDF background secondary index queries that completed.
counter  integer  aerospike_namespace_si_query_udf_bg_error  Number of UDF background secondary index queries which returned an error.
counter  integer  aerospike_namespace_sindex-type.mount[ix].age  Applies only to Enterprise Edition configured to sindex-type flash. This shows the percentage of lifetime (total usage) claimed by OEM for underlying device. Value is -1 unless underlying device is NVMe and may exceed 100. ‘ix’ is the device index. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for 2 files specified in the configuration.
gauge  integer  aerospike_namespace_sindex_flash_used_bytes  Applies only to Enterprise Edition configured with sindex-type flash. Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node. This is the same value memory_used_sindex_bytes would have if the secondary indexes were not persisted.
gauge  integer  aerospike_namespace_sindex_flash_used_pct  Applies only to Enterprise Edition configured with sindex-type flash. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_flash_used_bytes / sindex-type.mounts-size-limit) * 100
gauge  integer  aerospike_namespace_sindex_gc_cleaned  Number of secondary index entries cleaned by sindex GC.
counter  integer  aerospike_namespace_sindex_mounts_used_pct  Applies only to Enterprise Edition configured with sindex-type pmem or flash. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_used_bytes / sindex-type.mounts-budget) * 100
gauge  integer  aerospike_namespace_sindex_pmem_used_bytes  Applies only to Enterprise Edition configured with sindex-type pmem. Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node. This is the same value memory_used_sindex_bytes would have if the secondary indexes were not persisted.
gauge  integer  aerospike_namespace_sindex_pmem_used_pct  Applies only to Enterprise Edition configured with sindex-type pmem. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_pmem_used_bytes / sindex-type.mounts-size-limit) * 100
gauge  integer  aerospike_namespace_sindex_used_bytes  Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node.
gauge  integer  aerospike_namespace_smd_evict_void_time  The cluster-wide specified eviction depth, expressed as a void time in seconds since 1 January 2010 UTC. This is distributed to all nodes via SMD. This may be larger than evict_void_time — evict_void_time will eventually advance to this value.
gauge  integer  aerospike_namespace_stop_writes  If true, this namespace is currently not allowing client-originated writes. Migration writes and prole writes are still allowed. Error code 22 is returned if any one of the following are breached: Prior to Database 7.0:
gauge  integer  If stop-writes is true, critical ALERT.
 Until the cause is corrected, the system will reject all writes.
aerospike_namespace_storage_engine_device_age  Shows percentage of lifetime (total usage) claimed by OEM for underlying storage-engine.device[ix] (may exceed 100). Value will be -1 unless underlying device is NVMe. It is a measure of how much of the drive’s projected lifetime according to the manufacturer has been used at any point in time. When the SSD is brand new, its value will report ‘0’ and when its projected lifetime has been reached, it shows ‘100’, reporting that 100% of the projected lifetime has been used. When the value gets over 100%, the SSD has reached the lifetime specified by the OEM.
gauge  integer  aerospike_namespace_storage_engine_device_defrag_partial_writes  The number of wblocks partially flushed to storage-engine.device[ix] by defrag.
counter  integer  aerospike_namespace_storage_engine_device_defrag_q  Number of wblocks queued to be defragged on storage-engine.device[ix].
gauge  integer  Measured per-device or per-file depending on the storage configuration.
If storage-engine.device[ix].defrag_q or storage-engine.file[ix].defrag_q continues to increase over time, alert operations to investigate.
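A check for a defrag_q that keeps increasing could be sketched as below. The sketch assumes the gauge has been sampled at regular intervals into a list. The minimum sample count and the function name are illustrative assumptions.

```python
def is_steadily_increasing(samples, min_samples=5):
    """True if the queue depth never decreases and ends higher than it started.

    samples: defrag_q values, oldest first (hypothetical input).
    min_samples: minimum number of samples required before judging a trend.
    """
    if len(samples) < min_samples:
        return False
    non_decreasing = all(b >= a for a, b in zip(samples, samples[1:]))
    return non_decreasing and samples[-1] > samples[0]

growing = is_steadily_increasing([10, 12, 15, 15, 20, 31])
bursty = is_steadily_increasing([10, 40, 5, 30, 8, 12])

print(growing, bursty)
```

A queue that spikes and drains again is treated as normal here; only a queue that never drains over the sampled window is flagged.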
aerospike_namespace_storage_engine_device_defrag_reads  The number of wblocks that have been sent to the defrag_q from storage-engine.device[ix].
Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.
counter  integer  aerospike_namespace_storage_engine_device_defrag_writes  The number of wblocks defrag has written to storage-engine.device[ix].
counter  integer  aerospike_namespace_storage_engine_device_free_wblocks  The number of wblocks (write blocks) free on storage-engine.device[ix].
gauge  integer  aerospike_namespace_storage_engine_device_partial_writes  The number of wblocks partially flushed to storage-engine.device[ix].
counter  integer  aerospike_namespace_storage_engine_device_read_errors  Number of read errors encountered on storage-engine.device[ix].
counter  integer  aerospike_namespace_storage_engine_device_shadow_write_q  The number of wblocks queued to be written to the shadow device of storage-engine.device[ix].
gauge  integer  aerospike_namespace_storage_engine_device_used_bytes  The number of bytes used for data on storage-engine.device[ix].
gauge  integer  aerospike_namespace_storage_engine_device_write_q  The number of wblocks queued to be written to storage-engine.device[ix]. Includes blocks written by the defragmentation sub-system.
gauge  integer  aerospike_namespace_storage_engine_device_writes  Number of wblocks written to storage-engine.device[ix] since Aerospike started. Does not include defragmentation writes.
counter  integer  Label "device" and "device_index" in all aerospike_namespace_storage_engine_device_* metrics  The raw device that is configured in device configuration in namespace context and storage-engine subcontext. ‘ix’ is the device index. The index value starts from 0. For example, storage-engine.device[0]=/dev/xvd1 and storage-engine.device[1]=/dev/xvc1 for two devices specified in the configuration.
gauge  integer  aerospike_namespace_storage_engine_file_age  Shows the percentage of lifetime (total usage) claimed by OEM for the underlying device of storage-engine.file[ix]. Value will be -1 unless underlying device is NVMe and may exceed 100.
gauge  integer  aerospike_namespace_storage_engine_file_defrag_partial_writes  The number of wblocks partial flushed to storage-engine.file[ix]  by defrag.
counter  integer  aerospike_namespace_storage_engine_file_defrag_q  The number of wblocks queued to be defragged on storage-engine.file[ix].
gauge  integer  aerospike_namespace_storage_engine_file_defrag_reads  Number of wblocks that have been sent to the defrag_q from storage-engine.file[ix].
Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.
counter  integer  aerospike_namespace_storage_engine_file_defrag_writes  The number of wblocks defrag has written to storage-engine.file[ix].
counter  integer  aerospike_namespace_storage_engine_file_free_wblocks  The number of wblocks (write blocks) free on storage-engine.file[ix].
gauge  integer  aerospike_namespace_storage_engine_file_partial_writes  The number of wblocks partial flushed to storage-engine.file[ix]  by writes.
counter  integer  aerospike_namespace_storage_engine_file_shadow_write_q  The number of wblocks queued to be written to the shadow file of storage-engine.file[ix].
gauge  integer  aerospike_namespace_storage_engine_file_used_bytes  Number of bytes used for data on storage-engine.file[ix].
gauge  integer  aerospike_namespace_storage_engine_file_write_q  Number of wblocks queued to be written to storage-engine.file[ix].
gauge  integer  Measured per-device or per-file depending on the storage configuration.
If storage-engine.device[ix].write_q or storage-engine.file[ix].write_q is greater than 1, alert operations to investigate.
aerospike_namespace_storage_engine_file_writes  The number of wblocks written to storage-engine.file[ix]  since Aerospike started.  When running with commit-to-device set to true, this counter will only account for full blocks written and therefore will only count blocks written through the defragmentation process as client writes would write to disk individually rather than at a block level. Includes defragmentation writes.
counter  integer  Label "file" and "file_index" in all aerospike_namespace_storage_engine_file_* metrics  The data file path that is configured in file configuration in namespace context and storage-engine subcontext. ‘ix’ is the file index. The index value starts from 0. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for two files specified in the configuration.
gauge  integer  aerospike_namespace_storage_engine_stripe_age  Shows the percentage of lifetime (total usage) claimed by OEM for the respective storage-backed persistence device of storage-engine.stripe[ix]. The value will be -1 unless the underlying device is NVMe, and may exceed 100; see storage-engine.device[ix].age. This statistic is not available in the log ticker and is only applicable if a storage-backed persistence exists.
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_backing_write_q  The number of wblocks queued to be written to the respective storage-backed persistence of storage-engine.stripe[ix]. This statistic is available in the log ticker as write-q, and is only applicable if a storage-backed persistence exists.
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
Log ticker example with storage-backed persistence:
INFO (drv-mem): (drv_mem.c:3158) {bar} stripe-0.0xad001000: used-bytes 146499360 free-wblocks 492 write (18,0.2) defrag-q 0 defrag-read (1,0.0) defrag-write (0,0.0) write-q 0
Log ticker example without storage-backed persistence:
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-2.0xad002002: used-bytes 887120 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-5.0xad002005: used-bytes 915280 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-1.0xad002001: used-bytes 900080 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-3.0xad002003: used-bytes 896720 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-0.0xad002000: used-bytes 909120 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-7.0xad002007: used-bytes 898960 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-6.0xad002006: used-bytes 897040 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-4.0xad002004: used-bytes 895680 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
aerospike_namespace_storage_engine_stripe_defrag_partial_writes  The number of wblocks partial flushed to storage-engine.stripe[ix] by defrag.
counter  integer  aerospike_namespace_storage_engine_stripe_defrag_q  The number of wblocks queued to be defragged on storage-engine.stripe[ix].
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_defrag_reads  Number of wblocks that have been sent to the defrag_q from storage-engine.stripe[ix].
Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.
counter  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_defrag_writes  The number of wblocks defrag has written to storage-engine.stripe[ix].
counter  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_free_wblocks  Number of wblocks (write blocks) free on storage-engine.stripe[ix].
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_partial_writes  The number of wblocks partial flushed to storage-engine.stripe[ix] by writes.
counter  integer  aerospike_namespace_storage_engine_stripe_used_bytes  Number of bytes used for data on storage-engine.stripe[ix].
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_storage_engine_stripe_writes  The number of wblocks written to storage-engine.stripe[ix] since Aerospike started.
When running with commit-to-device set to true, this counter will only account for full blocks written and therefore will only count blocks written through the defragmentation process as the client writes would write to disk individually rather than at a block level. Includes defragmentation writes.
counter  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
Label "stripe" and "stripe_index" in all aerospike_namespace_storage_engine_stripe_* metrics  Stripe is a shared memory segment. Each stripe will have its respective shared memory key, which is internally determined by the server. ‘ix’ is the stripe index. For example, if there are eight stripes, the index(ix) value will be from 0 to 7. So, storage-engine.stripe[0]=stripe-0.0xad002000 and storage-engine.stripe[1]=stripe-1.0xad002001 will show two shared memory segments (stripes) and their keys. This statistic applies to the namespaces configured with storage-engine memory.
gauge  integer  More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.
aerospike_namespace_sub_objects  Number of LDT sub objects. Also aggregated at the service statistic level under the same name.
counter  integer  aerospike_namespace_tombstones  Total number of tombstones in this namespace on this node.
gauge  integer  aerospike_namespace_truncate_lut  The most covering truncate_lut for this namespace. See truncate or truncate-namespace.
gauge  integer  aerospike_namespace_truncated_records  The total number of records deleted by truncation for this namespace (includes set truncations). See truncate or truncate-namespace.
counter  integer  aerospike_namespace_truncating  Indicates when the namespace is in the process of being truncated.
gauge  boolean  aerospike_namespace_ttl_reductions_applied  Incremented when apply-ttl-reduction is true and a command reduces the TTL.
gauge  integer  aerospike_namespace_ttl_reductions_ignored  Incremented when apply-ttl-reduction is false and a command’s attempt to reduce the TTL is ignored. Ignored means that the transaction continues and the TTL remains unchanged on the resulting record update.
gauge  integer  aerospike_namespace_udf_sub_lang_delete_success  Number of successful UDF delete sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.
counter  integer  aerospike_namespace_udf_sub_lang_error  Number of UDF sub-transactions errors for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.
counter  integer  aerospike_namespace_udf_sub_lang_read_success  Number of successful UDF read sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.
counter  integer  aerospike_namespace_udf_sub_lang_write_success  Number of successful UDF write sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.
counter  integer  aerospike_namespace_udf_sub_tsvc_error  Number of UDF subtransactions that failed with an error in the transaction service, before attempting to handle the transaction for scan/query background UDF jobs. For example, protocol errors or security permission mismatch. Does not include timeouts. In strong-consistency enabled namespaces, this includes transactions against unavailable_partitions and dead_partitions.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_udf_sub_tsvc_timeout  Number of UDF subtransactions that timed out in the transaction service, before attempting to handle the transaction for scan/query background UDF jobs.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_namespace_udf_sub_udf_complete  Number of completed UDF subtransactions for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.
counter  integer  aerospike_namespace_udf_sub_udf_error  Number of failed UDF subtransactions for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.
counter  integer  aerospike_namespace_udf_sub_udf_filtered_out  Number of UDF subtransactions that did not happen because the record was filtered out with Filter Expressions.
counter  integer  aerospike_namespace_udf_sub_udf_timeout  Number of UDF subtransactions that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.
counter  integer  aerospike_namespace_unavailable_partitions  Number of unavailable partitions for this namespace (when using strong-consistency). This is the number of partitions that are unavailable when roster nodes are missing. Will turn into dead_partitions if still unavailable when all roster nodes are present.
gauge  integer  IF unavailable_partitions is not zero, critical ALERT.
 Check for network issues and make sure the cluster forms properly.
aerospike_namespace_unreplicated_records  Number of unreplicated records in the namespace. Applicable only for namespaces operating under the strong-consistency mode.
gauge  integer  - When a re-replication is triggered, the unreplicated_records stat is decremented as the record goes into the “replicating” state. It is incremented back if the re-replication attempt fails, and the record gets into an unreplicated state again.
- Re-replication could have already been triggered even if a client tsvc timeout happens for the respective transaction that triggered it.
aerospike_namespace_write-smoothing-period  Removed
gauge  integer  aerospike_namespace_xdr_bin_cemeteries  Number of tombstones with bin tombstones. They are generated when bin convergence is enabled and a record is durably deleted.
gauge  integer  aerospike_namespace_xdr_client_delete_error  Number of delete requests initiated by XDR that failed on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_client_delete_not_found  Number of delete requests initiated by XDR that failed on the namespace on this node due to the record not being found. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_client_delete_success  Number of delete requests initiated by XDR that succeeded on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_client_delete_timeout  Number of delete requests initiated by XDR that timed out on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_client_write_error  Number of write requests initiated by XDR that failed on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_client_write_success  Number of write requests initiated by XDR that succeeded on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_client_write_timeout  Number of write requests initiated by XDR that timed out on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_from_proxy_delete_error  Number of errors for XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_from_proxy_delete_not_found  Number of XDR delete commands proxied from another node that resulted in not found. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_from_proxy_delete_success  Number of successful XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_from_proxy_delete_timeout  Number of timeouts for XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
counter  integer  aerospike_namespace_xdr_from_proxy_write_error  Number of errors for XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_from_proxy_write_success  Number of successful XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_from_proxy_write_timeout  Number of timeouts for XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
counter  integer  aerospike_namespace_xdr_tombstones  Number of tombstones on this node which are created by XDR for non-durable client deletes. This includes both master and prole.
gauge  integer  For namespaces configured with XDR, non-durable delete transactions create XDR tombstones (not to be confused with the durable delete tombstones).
XDR tombstones are deleted after they have been shipped via XDR. The XDR tomb raider runs as specified in xdr-tomb-raider-period and uses xdr-tomb-raider-threads to reduce the index and delete XDR tombstones where the last update time (LUT) is older than the current global last ship time (GLST). The GLST is computed as the lowest value across the last ship time (LST) of all the partitions for the namespace. This is done by having each node send the LST for each partition they own to the principal node which then determines the lowest value and sends it back to all nodes in the cluster via the system metadata (SMD) fabric channel.
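The GLST rule described above can be illustrated with a small Python sketch. It assumes timestamps are plain integers and that the per-partition LST values have already been gathered in one place; in the server this exchange happens between nodes through the principal node, so the function and variable names here are hypothetical.

```python
def global_last_ship_time(partition_lst):
    """GLST is the lowest last ship time (LST) across all partitions.

    `partition_lst` maps a partition id to its LST (hypothetical layout).
    """
    return min(partition_lst.values())


def removable_xdr_tombstones(tombstone_luts, glst):
    """Return the keys of tombstones whose LUT is older than the GLST."""
    return [key for key, lut in tombstone_luts.items() if lut < glst]


# Illustrative values only.
partition_lst = {0: 1050, 1: 1020, 2: 1100}
glst = global_last_ship_time(partition_lst)   # 1020
tombstones = {"a": 1000, "b": 1020, "c": 1090}
print(removable_xdr_tombstones(tombstones, glst))  # ['a']
```

Because the GLST is a minimum, a single partition that lags in shipping holds back tombstone removal for the whole namespace.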
Node_stats
aerospike_node_stats_batch_index_complete  Number of batch index requests completed.
counter  integer  aerospike_node_stats_batch_index_created_buffers  Number of 128KB response buffers created.  Response buffers are created when there are no buffers left in the pool. If this number consistently increases and there is available memory, you should increase batch-max-unused-buffers.
counter  integer  aerospike_node_stats_batch_index_delay  Number of times a batch index response buffer has been delayed (WOULDBLOCK on the send). The number of times a batch index transaction is completely abandoned because it went over its overall allocated time after being delayed is counted under the batch_index_error statistic and will have a WARNING log message associated.
counter  integer  aerospike_node_stats_batch_index_destroyed_buffers  Number of 128KB response buffers destroyed.  Response buffers are destroyed when there is no slot left to put the buffer back into the pool. The maximum response buffer pool size is batch-max-unused-buffers.
counter  integer  aerospike_node_stats_batch_index_error  Number of batch index requests that completed with an error when, for example, the client has timed out but the server is still attempting to send response buffers back. Another occurrence is if the server abandons the transaction due to encountering delays (WOULDBLOCK on send) of more than twice the total timeout set by the client, or 30 seconds if not set when sending response buffers back. This is accompanied by a WARNING log message. Starting with version 6.4, this statistic is incremented when a transaction experiences delays exceeding the client timeout by a factor of 1. Each encountered delay is counted under the batch_index_delay statistic.
counter  integer  Compare batch_index_error to batch_index_complete. If ratio is higher than acceptable, alert Operations to investigate.
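Because both statistics are counters, the comparison is usually more meaningful over an interval than over lifetime totals. Here is a minimal Python sketch, assuming two scrapes taken without a node restart in between. The dictionary layout and the function name are illustrative assumptions, and what counts as an acceptable ratio is left to the operator.

```python
def batch_index_error_ratio(prev, curr):
    """Ratio of new batch index errors to new completions between scrapes.

    `prev` and `curr` are dicts with 'error' and 'complete' counter
    readings (hypothetical layout for this example).
    """
    errors = curr["error"] - prev["error"]
    completes = curr["complete"] - prev["complete"]
    if completes == 0:
        # No completions in the interval: any error is treated as unbounded.
        return float("inf") if errors > 0 else 0.0
    return errors / completes


prev = {"error": 10, "complete": 1000}
curr = {"error": 15, "complete": 1500}
ratio = batch_index_error_ratio(prev, curr)  # 5 errors against 500 completions
```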
aerospike_node_stats_batch_index_huge_buffers  Number of temporary response buffers created that exceeded 128KB.  Huge buffers are created when one of the records is retrieved that is greater than 128KB. Huge records do not benefit from batching and can result in excessive memory thrashing on the server. The batch_index_created_buffers and batch_index_destroyed_buffers do include the huge buffers created and destroyed.
counter  integer  aerospike_node_stats_batch_index_initiate  Number of batch index requests received.
counter  integer  aerospike_node_stats_batch_index_proto_compression_ratio  Measures the average compressed size to uncompressed size ratio for protocol message data in batch index responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).
moving average  decimal  The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
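The relationship between the reported ratio and the size reduction quoted above (0.100 corresponding to a 90% reduction) is simple arithmetic. A small Python sketch, with a hypothetical helper name:

```python
def size_reduction_pct(compression_ratio):
    """Convert a compressed/uncompressed ratio into a percent reduction.

    A ratio of 1.000 means no compression (0% reduction); smaller ratios
    mean the compressed data is a smaller share of the original size.
    """
    return (1.0 - compression_ratio) * 100.0


reduction = size_reduction_pct(0.100)  # roughly 90 (percent)
```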
aerospike_node_stats_batch_index_proto_uncompressed_pct  Measures the percentage of batch index responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).
gauge  decimal  The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.
aerospike_node_stats_batch_index_queue  Number of batch index requests (transactions count) processed and response buffer blocks used on each batch queue.
Format: Q1_REQUESTS:Q1_BUFFERS, Q2_REQUESTS:Q2_BUFFERS, ...
The buffer block counter is actually decremented on batch responses before the transaction count is decremented. Therefore, it is possible for a buffer slot to become available on the queue and for a new batch transaction count to be incremented before the previous batch command count is decremented. It is also possible that multiple transactions came in for a thread for which none of the response buffers has been created yet. Finally, batch_index_huge_buffers are counted as part of the buffer blocks used on each batch queue.
gauge  integer  aerospike_node_stats_batch_index_timeout  Number of batch index requests that timed out on the server before being processed. Those would be caused by a batch subtransaction that has timed out for this batch index transaction. The overall time allowed for a batch-index transaction on the server is not bound, except if a delay is encountered (WOULDBLOCK on send).
For Database 4.1 through 6.3, the overall batch index transaction max delay time is twice the total timeout set by the client, or 30 seconds if there is no timeout set by the client.
For Database 6.4 and later, the overall batch index transaction max delay time is the same as set by the client, or 30 seconds if there is no timeout set by the client.
counter  integer  aerospike_node_stats_batch_index_unused_buffers  Number of available 128 KB response buffers currently in buffer pool.
gauge  integer  aerospike_node_stats_client_connections  Number of active client connections to this node. Also available in the log on the fds proto ticker line.
gauge  integer  - If client_connections is below an expected low value, then this condition might indicate a problem with the network between clients and server.
- If client_connections is greater than an expected high value, then this condition might indicate a problem with clients rapidly opening and closing sockets.
- If client_connections is at or near proto_fd_max, then the server is either currently unable to accept new connections or might soon be unable to do so.
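The three conditions above can be combined into one check. The following Python sketch is illustrative only: the expected low and high values depend on your deployment, and the `near_fraction` cutoff used to decide what "near" proto_fd_max means is an assumption introduced for this example.

```python
def check_client_connections(client_connections, proto_fd_max,
                             expected_low, expected_high,
                             near_fraction=0.9):
    """Return a list of findings for a client_connections reading.

    `expected_low`, `expected_high`, and `near_fraction` are operator
    chosen values (assumptions for this sketch), not Aerospike settings.
    """
    findings = []
    if client_connections < expected_low:
        findings.append("below expected low: possible client/server network problem")
    if client_connections > expected_high:
        findings.append("above expected high: clients may be rapidly opening and closing sockets")
    if client_connections >= near_fraction * proto_fd_max:
        findings.append("at or near proto_fd_max: new connections may be refused")
    return findings
```

An empty list means none of the three conditions applies to the reading.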
aerospike_node_stats_client_connections_closed  Number of client connections that have been closed. One of client_connections_opened or client_connections_closed should be closely monitored or alerted against. Also available in the log on the fds proto ticker line.
counter  integer  aerospike_node_stats_client_connections_opened  Number of client connections created to this node since the node was started. One of client_connections_opened or client_connections_closed should be closely monitored or alerted against. Also available in the log on the fds proto ticker line.
counter  integer  If client_connections_opened changes unexpectedly without clients having been added or removed, or a significant change in workload having occurred, this condition might indicate a slow down on a node or a connectivity issue on the node.
aerospike_node_stats_cluster_clock_skew_ms  Current maximum clock skew in milliseconds between nodes in a cluster. Will trigger clock_skew_stop_writes when breaching the cluster_clock_skew_stop_writes_sec threshold. This threshold is normally 20 seconds for strong-consistency namespaces on any Aerospike version, or 40 seconds for AP namespaces where NSUP is enabled (nsup-period is not zero) in Database 4.5.1 or later.
gauge  integer  aerospike_node_stats_cluster_clock_skew_stop_writes_sec  The threshold at which any namespace that is set to strong-consistency stops accepting writes due to clock skew (cluster_clock_skew_ms).
This value is in seconds, not milliseconds.
Although this value shows as 0 for AP namespaces, starting with Database 4.5.1, these namespaces stop accepting writes if NSUP is enabled (nsup-period is not zero) and the clock skew exceeds 40 seconds.
gauge  integer  aerospike_node_stats_cluster_generation  A 64 bit unsigned integer incremented on a node for every successful cluster partition re-balance or transition to orphan state. This is a node local value and does not need to be the same across the cluster.
counter  integer  aerospike_node_stats_cluster_integrity  When false, indicates integrity issues within the cluster, meaning that some nodes are either faulty or dead. A node in the succession list is deemed faulty if the node is alive and it reports being an orphan or is part of some other cluster. Another condition for a faulty node would be for it to be alive but have a clustering protocol identifier that does not match the rest of the cluster. When true, indicates that the cluster is in a whole and complete state (as far as the nodes that it sees and is able to connect to are concerned). Information about a cluster integrity fault is also logged to the server log file repeatedly.
gauge  integer  aerospike_node_stats_cluster_is_member  When false, indicates that the node is not joined to a cluster; that is, it is an orphan. When true, indicates that the node is joined to a cluster.
gauge  integer  aerospike_node_stats_cluster_key  Randomly generated 64 bit hexadecimal string used to name the last Paxos cluster state agreement.
gauge  integer  aerospike_node_stats_cluster_max_compatibility_id  Each node has a compatibility ID that is an integer based on the node’s database version. During upgrades, this value is used to determine software compatibility. cluster_max_compatibility_id indicates the cluster’s maximum software version. See cluster_min_compatibility_id.
gauge  integer  aerospike_node_stats_cluster_min_compatibility_id  Each node has a compatibility ID that is an integer based on the node’s database version. During upgrades, this value is used to determine software compatibility. cluster_min_compatibility_id indicates the cluster’s minimum software version. See cluster_max_compatibility_id.
gauge  aerospike_node_stats_cluster_principal  This specifies the Node ID of the current cluster principal. Will be ‘0’ on an orphan node.
gauge  integer  aerospike_node_stats_cluster_size  Size of the cluster. Can be checked to make sure the size of the cluster is the expected one after adding or removing a node. Check across all nodes in a cluster.
gauge  integer  If cluster_size does not equal the expected cluster size and the cluster is not undergoing maintenance, your operations group needs to investigate.
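The cross-node check described above can be sketched as follows. This assumes cluster_size has already been collected from every node into a dictionary; the function name, the dictionary shape, and the in_maintenance flag are hypothetical.

```python
def cluster_size_problems(sizes_by_node, expected_size, in_maintenance=False):
    """Return the nodes whose reported cluster_size differs from expected.

    sizes_by_node maps a node identifier to the cluster_size it reports.
    During planned maintenance, mismatches are not treated as problems.
    """
    if in_maintenance:
        return {}
    return {node: size for node, size in sizes_by_node.items()
            if size != expected_size}
```

An empty result means every node agrees with the expected size; any entry in the result is a node to investigate.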
aerospike_node_stats_demarshal_error  Number of errors during the demarshal step.
counter  integer  aerospike_node_stats_deprecated_requests  Number of times a deprecated feature has been used.
counter  integer  aerospike_node_stats_early_tsvc_batch_sub_error  Number of errors early in the transaction for batch subtransactions. For example, bad/unknown namespace name or security authentication errors.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_early_tsvc_client_error  Number of errors early in the transaction for direct client requests. Those include transactions hitting the proto-fd-max, transactions with a bad/unknown namespace name or security authentication errors. Those also include cases where partitions are unavailable in AP mode, when clients attempt transactions against an orphan node.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_early_tsvc_from_proxy_batch_sub_error  Number of errors early in the commands for batch subtransactions proxied from another node. For example, bad or unknown namespace name or security authentication errors.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_early_tsvc_from_proxy_error  Number of errors early in the commands for commands, other than batch subtransactions, proxied from another node, for example, bad or unknown namespace name or security authentication errors.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_early_tsvc_ops_sub_error  Number of errors early in an internal ops subtransaction (records accessed by a background query operate command). For example, bad or unknown namespace name or security authentication errors.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_early_tsvc_udf_sub_error  Number of errors early in the transaction for UDF subtransactions. For example, bad or unknown namespace name or security authentication errors.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_entries_per_bval  Ratio of entries to unique bvals (bin values) for a given secondary index on the node. The value is an integer (rounded to the nearest integer) and is calculated using hyperloglog estimates for unique bvals. The stat is generated by a background process. A value of 0 means the stat is not yet generated. The process runs when the secondary index is created and populated, at startup and every hour thereafter. A low value means that the index is highly selective.
gauge  integer  This stat appears in the response to the sindex-stat info command to retrieve statistics for a specified namespace and index. For example, asinfo -v 'sindex-stat:ns=namespace1;indexname=index21'.
aerospike_node_stats_entries_per_rec  Ratio of entries to unique records for a given secondary index on the node. This value will always be 1 if it is not a list or map secondary index. The value is an integer (rounded to the nearest integer) and is calculated using hyperloglog estimates for unique recs. The stat is generated by a background process. A value of 0 means the stat is not yet generated. The process runs at startup, every hour thereafter, and when a secondary index is created and populated.
gauge  integer  This stat appears in the response to the ‘sindex-stat’ info command to retrieve statistics for a specified namespace and index.  For example, asinfo -v 'sindex-stat:ns=namespace1;indexname=index21'.
aerospike_node_stats_err_storage_defrag_fd_get  Removed
counter  integer  aerospike_node_stats_err_sync_copy_null_node  Number of errors during cluster state exchange because of missing general node information.
counter  integer  aerospike_node_stats_fabric_bulk_recv_rate  Rate of traffic (bytes/sec) received by the fabric bulk channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_bulk_send_rate  Rate of traffic (bytes/sec) sent by the fabric bulk channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_connections  Number of active fabric connections to this node. Also available in the log on the fds proto ticker line.
gauge  integer  aerospike_node_stats_fabric_connections_closed  Number of fabric connections that have been closed. Also available in the log on the fds proto ticker line.
counter  integer  aerospike_node_stats_fabric_connections_opened  Number of fabric connections created to this node since the node was started. Also available in the log on the fds proto ticker line.
counter  integer  If fabric_connections_opened is unexpectedly changing, alert as this condition would indicate a connectivity problem with a node or a cluster change.
aerospike_node_stats_fabric_ctrl_recv_rate  Rate of traffic (bytes/sec) received by the fabric ctrl channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_ctrl_send_rate  Rate of traffic (bytes/sec) sent by the fabric ctrl channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_meta_recv_rate  Rate of traffic (bytes/sec) received by the fabric meta channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_meta_send_rate  Rate of traffic (bytes/sec) sent by the fabric meta channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_rw_recv_rate  Rate of traffic (bytes/sec) received by the fabric rw channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_fabric_rw_send_rate  Rate of traffic (bytes/sec) sent by the fabric rw channel during the last ticker-interval (every 10 seconds by default).
gauge  integer  aerospike_node_stats_failed_best_practices  Indicates true if any of the best-practices, which are checked when the server starts, were violated, otherwise failed_best_practices will indicate false. Each failed best-practice will log a unique warning message and a list of failed best-practices can be queried using the best-practices info command.
gauge  boolean  aerospike_node_stats_heap_active_kbytes  The amount of memory in in-use pages, in KiB. An in-use page is a page that has some allocated memory (either partial or full).
gauge  integer  aerospike_node_stats_heap_allocated_kbytes  The amount of memory, in KiB, allocated by the asd daemon. The heap_allocated_kbytes / heap_active_kbytes ratio (6.0 or later) and heap_allocated_kbytes / heap_mapped_kbytes ratio (prior to 6.0) (also provided under heap_efficiency_pct) provide a picture of the fragmentation of the heap. This is for all memory usage except for the shared memory parts (for the primary index in the Enterprise Edition).
gauge  integer  aerospike_node_stats_heap_efficiency_pct  Provides an indication of the jemalloc heap fragmentation. This represents the heap_allocated_kbytes / heap_active_kbytes ratio. A lower number indicates a higher fragmentation rate.
gauge  integer  If heap_efficiency_pct goes below 60% or 50% (depending on configuration), advise your operations group to investigate.
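The ratio behind heap_efficiency_pct can be reproduced from the two underlying stats. The sketch below assumes the value is the heap_allocated_kbytes / heap_active_kbytes ratio expressed as a whole-number percentage; the rounding, the function names, and the default threshold of 60 are assumptions for illustration.

```python
def heap_efficiency_pct(heap_allocated_kbytes, heap_active_kbytes):
    """Allocated-to-active ratio as a percentage, or None if undefined."""
    if heap_active_kbytes <= 0:
        return None
    return round(100 * heap_allocated_kbytes / heap_active_kbytes)


def heap_needs_investigation(efficiency_pct, threshold_pct=60):
    """True when efficiency has dropped below the chosen threshold."""
    return efficiency_pct is not None and efficiency_pct < threshold_pct
```

A lower percentage means more memory sits in in-use pages without being allocated, which is the fragmentation signal described above.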
aerospike_node_stats_heap_mapped_kbytes  Amount of memory in mapped pages in KiB, such as the amount of memory that JEM received from the Linux kernel. Should be a multiple of 4, which is the typical page size (4096 bytes).
gauge  integer  aerospike_node_stats_heap_site_count  Number of distinct sites in the server code (specific locations in server functions) that have allocated heap memory designated for tracking as governed by the debug-allocations setting from the time when the server was started. The heap_site_count is only nonzero when debug-allocations is set to a value other than  none. The heap_site_count value can only increase.
counter  integer  aerospike_node_stats_heartbeat_connections  Number of active heartbeat connections to this node. Also available in the log on the fds proto ticker line.
gauge  integer  aerospike_node_stats_heartbeat_connections_closed  Number of heartbeat connections that have been closed. Also available in the log on the fds proto ticker line.
counter  integer  aerospike_node_stats_heartbeat_connections_opened  Number of heartbeat connections created to this node since the node was started. Also available in the log on the fds proto ticker line.
counter  integer  If heartbeat_connections_opened is unexpectedly changing, alert as this condition would indicate a connectivity problem with a node or a cluster change.
aerospike_node_stats_heartbeat_received_foreign  Total number of heartbeats received from remote nodes.
counter  integer  aerospike_node_stats_heartbeat_received_self  Total number of multicast heartbeats from this node received by this node. Will be 0 for mesh.
counter  integer  aerospike_node_stats_info_complete  Number of info requests completed.
counter  integer  aerospike_node_stats_info_queue  Number of info requests pending in info queue.
gauge  integer  aerospike_node_stats_info_timeout  Tracks total timed-out info transactions. Related to info-max-ms.
counter  integer  aerospike_node_stats_long_queries_active  Number of queries currently active (formerly queries_active or scans_active). The long_queries_active stat is shared by both primary index (PI) queries and secondary index (SI) queries. Only long queries are monitored.
gauge  integer  aerospike_node_stats_migrate_allowed  Indicates whether migrations are allowed on a node: true when allowed, false when not. When there is a change in a cluster, this statistic’s value changes to false until the rebalance is completed across all namespaces. The rebalance is the step that determines all partition migrations that need to be scheduled. The rebalance is not the migrations themselves but the process that precedes the partition migrations. A migrate_allowed value of true indicates that all migration-related statistics have been set and can be used programmatically (for example, migrate_partitions_remaining to check whether migrations are ongoing).
gauge  integer  aerospike_node_stats_migrate_partitions_remaining  This is the number of partitions remaining to migrate (in either direction). When migrate_allowed is true, this is the stat which will accurately determine if migrations are complete for a single node across all namespaces. There could be a short period after a reclustering event when this statistic shows 0 but the migrations have not started yet. During such time, migrate_allowed would return false.
gauge  integer  aerospike_node_stats_objects  Total number of replicated objects on this node. Includes master and replica objects.
gauge  integer  Trending objects provides operations insight into object fluctuations over time.
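The relationship between migrate_allowed and migrate_partitions_remaining described earlier lends itself to a programmatic check. The following is a minimal sketch that assumes the node statistics have already been parsed into a dictionary; the function names are hypothetical, and accepting both "true" and 1 for migrate_allowed is an assumption to cover text and numeric exports.

```python
def _as_bool(value):
    """Interpret 'true'/'false' strings or 1/0 values as a boolean."""
    return str(value).strip().lower() in ("true", "1")


def migrations_complete(stats):
    """True only when migrations are allowed and none remain.

    A remaining count of 0 while migrate_allowed is false is not trusted,
    because migrations may simply not have been scheduled yet.
    """
    allowed = _as_bool(stats.get("migrate_allowed", "false"))
    remaining = int(stats.get("migrate_partitions_remaining", -1))
    return allowed and remaining == 0
```

Checking both values avoids the short window after a reclustering event in which migrate_partitions_remaining shows 0 before migrations have started.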
aerospike_node_stats_paxos_principal  Identifier of the node that this node believes to be the Paxos principal.
gauge  integer  aerospike_node_stats_process_cpu_pct  Percentage of CPU usage by the asd process.
gauge  integer  aerospike_node_stats_proxy_in_progress  Number of proxies in progress. Also called proxy hash. The command’s TTL (client set timeout or transaction-max-ms) is checked every 5 ms (Database 6.0 and later) when waiting in the proxy-hash.
gauge  integer  aerospike_node_stats_queries_active  Number of queries currently active (formerly scans_active). The queries_active stat is shared by both primary index (PI) queries and secondary index (SI) queries. Only long queries are monitored. Removed in Database 6.1, use long_queries_active.
gauge  integer  aerospike_node_stats_query_bad_records  Number of false positive entries in secondary index queries.
counter  integer  aerospike_node_stats_query_long_running  Number of long running queries currently in process.
gauge  integer  aerospike_node_stats_query_short_running  Number of short running queries currently in process.
gauge  integer  aerospike_node_stats_query_tracked  Number of queries tracked by the system. (Number of queries which ran more than query untracked_time (default 1 sec)).
counter  integer  aerospike_node_stats_read_touch_error  Number of read touch errors which were not timeouts.
counter  integer  aerospike_node_stats_read_touch_skip  Number of touches abandoned upon finding that another write (including an earlier touch) has taken place or is taking place, removing the need to proceed with the touch.
counter  integer  aerospike_node_stats_read_touch_success  Number of successful read touches.
counter  integer  aerospike_node_stats_read_touch_timeout  Number of touches that ended in timeout.
counter  integer  aerospike_node_stats_read_touch_tsvc_error  Number of read touch subtransactions that failed with an error in the internal transaction queue. Does not include timeouts.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_read_touch_tsvc_timeout  Number of read touches that time out early in the internal transaction queue, while waiting to be picked up by a service thread.
The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.
counter  integer  aerospike_node_stats_reaped_fds  Number of idle client connections closed.
counter  integer  If reaped_fds is growing more rapidly than normal, it may indicate that clients are opening and closing sockets too rapidly, which points to a potential application issue.
aerospike_node_stats_rw_err_dup_write_cluster_key  Removed
counter  integer  aerospike_node_stats_rw_err_dup_write_internal  Removed
counter  integer  aerospike_node_stats_rw_in_progress  Number of rw transactions in progress. Also called rw hash. This tracks transactions parked on the rw hash while processing on other nodes (all write replicas, read duplicate resolutions). The transaction’s TTL (client set timeout or transaction-max-ms) is checked every 5 ms in Database 6.0 and later when waiting in the rw-hash.
gauge  integer  Depends on expected workload.
If rw_in_progress is higher than expected, or if it deviates more than is acceptable from the established baseline over time, alert operations to investigate the cause. This may indicate a slowdown on a particular node or overloading on the fabric.
While a transaction is parked in the rw-hash, other transactions for the same record are queued (those queued transactions are not counted in this metric). Once a transaction completes, queued transactions for the same record are restarted, as tracked in the xxxx-restart benchmark histograms (such as write-restart). At that point, the first transaction to be processed takes the rw-hash slot and the others wait for the next round. Transactions that need to be serialized (such as writes for the same record, a read transaction in strong consistency mode while a write transaction is in progress, or any transaction requiring duplicate resolution) do not proceed until they get their slot in the rw-hash.
aerospike_node_stats_scans_active  Number of scans currently active. Removed in Database 6.0, use queries_active.
gauge  integer  aerospike_node_stats_sindex_gc_garbage_cleaned  Sum of secondary index garbage entries cleaned by sindex GC. Moved to namespace level as sindex_gc_cleaned in Database 5.7.
counter  integer  aerospike_node_stats_sindex_gc_garbage_found  Sum of secondary index garbage entries found by sindex GC.
counter  integer  aerospike_node_stats_sindex_gc_list_creation_time  Sum of time spent in finding secondary index garbage entries by sindex GC (millisecond).
counter  integer  aerospike_node_stats_sindex_gc_list_deletion_time  Sum of time spent in cleaning sindex garbage entries by sindex GC (millisecond).
counter  integer  aerospike_node_stats_sindex_gc_objects_validated  Number of secondary index entries processed by sindex GC.
counter  integer  aerospike_node_stats_sindex_gc_retries  Number of retries when sindex GC cannot get sprigs lock. Replaced sindex_gc_locktimedout.
counter  integer  aerospike_node_stats_sindex_ucgarbage_found  Number of un-cleanable garbage entries in the sindexes encountered through queries.
counter  integer  aerospike_node_stats_stat_cluster_key_err_ack_rw_trans_reenqueue  Number of Read/Write trans re-enqueued because of cluster key mismatch.
counter  integer  aerospike_node_stats_stat_cluster_key_partition_transaction_queue_count  Removed/unused
counter  integer  aerospike_node_stats_stat_cluster_key_prole_retry  Number of times a prole write was retried as a result of a cluster key mismatch.
counter  integer  aerospike_node_stats_stat_cluster_key_regular_processed  Number of successful transactions that passed the cluster key test.
counter  integer  aerospike_node_stats_stat_cluster_key_trans_to_proxy_retry  Number of times a proxy was redirected.
counter  integer  aerospike_node_stats_stat_cluster_key_transaction_reenqueue  Removed/unused
counter  integer  aerospike_node_stats_stat_evicted_set_objects  Number of objects evicted from a Set due to set limits defined in Aerospike configuration.
counter  integer  aerospike_node_stats_stat_single_bin_records  Removed: Number of single bin records.
counter  integer  aerospike_node_stats_stat_slow_trans_queue_batch_pop  Number of times we moved a batch of trans from slow queue to fast queue.
counter  integer  aerospike_node_stats_stat_slow_trans_queue_pop  Number of trans that were moved from slow queue to fast queue.
counter  integer  aerospike_node_stats_stat_slow_trans_queue_push  Number of trans that we pushed onto the slow queue.
counter  integer  aerospike_node_stats_storage_defrag_wait  Number of times the defrag waited (called sleep).
counter  integer  aerospike_node_stats_sub_objects  Number of LDT sub objects. Aggregated over the sub_objects stat at the namespace level.
counter  integer  aerospike_node_stats_system_free_mem_kbytes  Amount of free system memory in kilobytes. Includes buffers and caches, but not shared memory.
gauge  integer  If system_free_mem_kbytes is abnormally low, this could indicate that the server is approaching the limits of the available RAM. Operations should investigate and potentially add nodes or increase per-node RAM.
aerospike_node_stats_system_free_mem_pct  Percentage of free system memory.
gauge  integer  If system_free_mem_pct is abnormally low, this could indicate that the server is approaching the limits of the available RAM. Operations should investigate and potentially add nodes or increase per-node RAM.
aerospike_node_stats_system_kernel_cpu_pct  Percentage of CPU usage by processes running in kernel mode.
gauge  integer  aerospike_node_stats_system_thp_mem_kbytes  Amount of memory in use by the Transparent Huge Page mechanism, in kilobytes.
gauge  integer  aerospike_node_stats_system_total_cpu_pct  Percentage of CPU usage by all running processes. Equal to system_user_cpu_pct + system_kernel_cpu_pct.
gauge  integer  aerospike_node_stats_system_user_cpu_pct  Percentage of CPU usage by processes running in user mode.
gauge  integer  aerospike_node_stats_threads_detached  Number of detached server threads currently running.
gauge  integer  aerospike_node_stats_threads_joinable  Number of joinable server threads currently running.
gauge  integer  aerospike_node_stats_threads_pool_active  Number of currently active threads in the server thread pool.
gauge  integer  aerospike_node_stats_threads_pool_total  Total number of threads in the server thread pool.
gauge  integer  aerospike_node_stats_time_since_rebalance  Number of seconds since the last reclustering event, either triggered by the recluster info command or by a cluster disruption (such as a node being added or removed, or a network disruption).
gauge  integer  aerospike_node_stats_tree_gc_queue  This is the number of trees queued up, ready to be completely removed (partitions drop). Corresponds to the tree-gc-q entry in the log ticker.
gauge  integer  aerospike_node_stats_tscan_aborted  Number of scans that were aborted. Removed as of 3.6.0.
counter  integer  aerospike_node_stats_tscan_initiate  Number of new scan requests initiated. Removed as of 3.6.0.
counter  integer  aerospike_node_stats_tscan_pending  Number of scan requests pending. Removed as of 3.6.0.
gauge  integer  aerospike_node_stats_tscan_succeeded  Number of scan requests that have successfully finished. Removed as of 3.6.0.
counter  integer  aerospike_node_stats_uptime  Time in seconds since last server restart.
gauge  integer  If uptime is below 300 and the cluster is not undergoing maintenance, this node restarted within the last 5 minutes. Advise operations to investigate.
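The uptime rule above reduces to a one-line predicate. This is a minimal sketch; the function name and the in_maintenance flag are hypothetical, and the 300-second default simply mirrors the threshold stated above.

```python
def recently_restarted(uptime_seconds, in_maintenance=False,
                       threshold_seconds=300):
    """True when a node restarted unexpectedly within the threshold window."""
    return (not in_maintenance) and uptime_seconds < threshold_seconds
```

Passing in_maintenance=True suppresses the signal during planned restarts, such as a rolling upgrade.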
Sets
aerospike_sets_device_data_bytes  Device storage used by this set in bytes, for the data part (does not include index part). Value will be 0 if data is not stored on device. For size used in memory, see memory_data_bytes.
gauge  integer  aerospike_sets_memory_data_bytes  Memory used by this set in bytes, for the data part (does not include index part). Value will be 0 if data is not stored in memory. For size used on disk, see device_data_bytes (available in Database 5.2 and later), or the set level object size histogram.
gauge  integer  aerospike_sets_ns  Namespace name this set belongs to.
gauge  integer  aerospike_sets_objects  Total number of objects (master and all replicas) in this set on this node. This is updated in real time and is not dependent on the nsup-period or nsup-hist-period configurations.
gauge  integer  aerospike_sets_set  Name of this set.
gauge  integer  aerospike_sets_tombstones  Total number of tombstones (master and all replicas) in this set on this node.
gauge  integer  aerospike_sets_truncate_lut  The most covering truncate_lut for this set. See truncate or truncate-namespace.
gauge  integer  Sindex
aerospike_sindex_delete_error  Number of errors while processing a delete transaction for this secondary index.
counter  integer  aerospike_sindex_delete_success  Number of successful delete transactions processed for this secondary index.
counter  integer  aerospike_sindex_entries  Number of secondary index entries for this secondary index. This is the number of records that have been indexed by this secondary index.
gauge  integer  aerospike_sindex_ibtr_memory_used  Amount of memory, in bytes, the secondary index is consuming for the keys, as opposed to nbtr_memory_used which is the amount of memory the secondary index is consuming for the entries. The total being reported by si_accounted_memory.
gauge  integer  aerospike_sindex_keys  Number of secondary keys for this secondary index.
gauge  integer  aerospike_sindex_load_pct  Progress in percentage of the creation of secondary index.
gauge  integer  aerospike_sindex_load_time  Time it took for the secondary index to be fully created.
gauge  integer  aerospike_sindex_loadtime  Time it took for the secondary index to be fully created.
gauge  integer  aerospike_sindex_memory_used  Amount of memory, in bytes, consumed by the secondary index. Renamed to used_bytes in Database 6.3. Do not use memory_used in Database 6.3 and later.
gauge  integer  aerospike_sindex_nbtr_memory_used  Amount of memory, in  bytes, the secondary index is consuming for the entries, as opposed to ibtr_memory_used which is the amount of memory the secondary index is consuming for the keys. The total being reported by si_accounted_memory.
gauge  integer  aerospike_sindex_query_agg  Number of query aggregations attempted for this secondary index on this node.
counter  integer  aerospike_sindex_query_agg_avg_rec_count  Average number of records returned by the aggregations underlying queries against this secondary index.
gauge  integer  aerospike_sindex_query_agg_avg_record_size  Average size of the records returned by the aggregations underlying queries against this secondary index.
gauge  integer  aerospike_sindex_query_avg_rec_count  Average number of records returned by all the queries against this secondary index (combines query_agg_avg_rec_count and query_lookup_avg_rec_count).
gauge  integer  aerospike_sindex_query_avg_record_size  Average size of the records returned by all the queries against this secondary index (combines query_agg_avg_record_size and query_lookup_avg_record_size).
gauge  integer  aerospike_sindex_query_basic_abort  Number of basic queries aborted for this secondary index. Removed in Database 6.0, use si_query_long_basic_abort.
counter  integer  aerospike_sindex_query_basic_avg_rec_count  Average number of records returned by the lookup queries against this secondary index.
gauge  integer  aerospike_sindex_query_basic_complete  Number of basic queries completed for this secondary index. Removed in Database 6.0, use si_query_long_basic_complete.
counter  integer  aerospike_sindex_query_basic_error  Number of basic queries that returned error for this secondary index. Removed in Database 6.0, use si_query_long_basic_error.
counter  integer  aerospike_sindex_query_lookup_avg_rec_count  Average number of records returned by the lookup queries against this secondary index. Renamed to query_basic_avg_rec_count in Database 5.7.
gauge  integer  aerospike_sindex_query_lookup_avg_record_size  Average size of the records returned by the lookup queries against this secondary index.
gauge  integer  aerospike_sindex_query_lookups  Number of lookup queries ever attempted for this secondary index on this node. Removed in Database 5.7. Use query_basic_complete  + query_basic_error +  query_basic_abort instead.
counter  integer  aerospike_sindex_query_reqs  Number of query requests ever attempted for this secondary index on this node (combines   query_lookups and query_agg).
counter  integer  aerospike_sindex_si_accounted_memory  Amount of memory, in bytes, the secondary index is consuming: the sum of ibtr_memory_used and nbtr_memory_used. Removed in Database 5.7.
gauge  integer  aerospike_sindex_si_query_short_basic_complete  Number of basic short secondary index queries completed for this secondary index.
counter  integer  aerospike_sindex_si_query_short_basic_error  Number of basic short secondary index queries that returned error for this secondary index.
counter  integer  aerospike_sindex_si_query_short_basic_timeout  Short queries are not monitored, so they cannot be aborted. They might time out, which is reflected in this statistic.
counter  integer  aerospike_sindex_stat_gc_recs  Number of records that have been garbage collected out of the secondary index memory. See sindex-gc-period and sindex-gc-max-rate configuration parameters for tuning the secondary index garbage collection.
counter  integer  aerospike_sindex_stat_gc_time  Amount of time spent processing garbage collection for the secondary index. See sindex-gc-period and sindex-gc-max-rate configuration parameters for tuning the secondary index garbage collection.
counter  integer  aerospike_sindex_used_bytes  Amount of memory, in bytes, consumed by the secondary index.
NOTE: Renamed from memory_used in Database 6.3.
gauge  integer  aerospike_sindex_write_error  Number of errors while processing a write transaction for this secondary index.
counter  integer  Users
aerospike_users_conns_in_use  Number of client connections for a given user.
gauge  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled, per node user metrics are available from the security protocol.
aerospike_users_limitless_read_scan_query  Limitless read query requests per second for a given user.
moving average  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.
aerospike_users_limitless_write_scan_query  Limitless write query requests per second for a given user.
moving average  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.
aerospike_users_read_scan_query_rps  Read query requests per second for a given user.
gauge  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. See Enable access control for more information about these metrics.
aerospike_users_read_single_record_tps  Read transactions per second for a given user.
moving average  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.
aerospike_users_write_scan_query_rps  Write query requests per second for a given user.
moving average  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.
aerospike_users_write_single_record_tps  Write transactions per second for a given user.
moving average  integer  To see metrics from asadm use the command:
show users statistics
If you are using the Aerospike Prometheus Exporter, these metrics are shown in the Users View.
When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.
Xdr
aerospike_xdr_abandoned  Number of records abandoned because of permanent failure at the destination. The destination configuration must be changed for these records to be successfully shipped.
counter  integer  If abandoned is consistently higher than expected, alert operations to investigate.
aerospike_xdr_active_failed_node_sessions  Number of active failed node sessions pending. A failed node session keeps track of nodes at the local cluster that have left the cluster and need other nodes to ship on their behalf until they rejoin.
gauge  integer  aerospike_xdr_active_link_down_sessions  Number of active link down sessions pending. A link down session keeps track of destination clusters that are not reachable for a given time window.
gauge  integer  aerospike_xdr_bytes_shipped  Number of bytes shipped for a namespace to a DC by XDR.
counter  decimal  Use the asinfo command get-stats to report these metrics.
aerospike_xdr_compression_ratio  Running average compression ratio. Example: asinfo -h localhost -l -v  get-stats:context=xdr;dc=aerospike_b;namespace=test 
moving average  decimal  aerospike_xdr_dc_as_open_conn  Number of open connections to the Aerospike DC. If the DC accepts pipeline writes, there will be 64 connections per destination node. Replaced dc_open_conn starting with Database 4.4.
gauge  integer  aerospike_xdr_dc_as_size  The cluster size of the destination Aerospike DC. Replaced by dc_size starting with Database 4.4.
gauge  integer  aerospike_xdr_dc_http_good_locations  Number of URLs that are considered healthy and being used by the change notification system. Part of the change notification.
gauge  integer  aerospike_xdr_dc_http_locations  Number of URLs configured for the HTTP destination. Part of the change notification.
gauge  integer  aerospike_xdr_dc_ship_attempt  Number of records that XDR has attempted to ship, whether the attempt resulted in success or error. See dc_ship_success for successfully shipped records.
counter  integer  aerospike_xdr_dc_ship_bytes  Number of bytes shipped for this DC.
counter  integer  aerospike_xdr_dc_ship_delete_success  Number of delete transactions that have been successfully shipped. This is the per DC statistic for xdr_ship_delete_success.
counter  integer  aerospike_xdr_dc_ship_destination_error  Number of errors from the remote cluster(s) while shipping records for this DC. Errors include out-of-space, key-busy, etc. This is the per DC statistic for xdr_ship_destination_error.
counter  integer  aerospike_xdr_dc_ship_idle_avg  Average number of ms of sleep for each record being shipped. 0.000 if there is no throttling. Throttling will occur if the set throughput limit (xdr-max-ship-throughput) has been reached or in case of unexpected slowdown at the destination cluster. This is part of the rsas entry in the logs (xdr context).
gauge  integer  aerospike_xdr_dc_ship_idle_avg_pct  Representation in percent of total time spent for dc_ship_idle_avg. This is part of the rsas entry in the logs (xdr context).
gauge  integer  aerospike_xdr_dc_ship_inflight_objects  Number of records that are inflight (which have been shipped but for which a response from the remote DC has not yet been received).
gauge  integer  aerospike_xdr_dc_ship_latency_avg  Moving average of shipping latency for the specific DC.
moving average  integer  aerospike_xdr_dc_ship_source_error  Number of client layer errors while shipping records for this DC. Errors include timeout, bad network fd, etc. This is the per DC statistic for xdr_ship_source_error.
counter  integer  aerospike_xdr_dc_ship_success  Number of records that have been successfully shipped. This is the per DC statistic for xdr_ship_success.
counter  integer  aerospike_xdr_dc_state  State of the DC. Here are the different statuses: CLUSTER_INACTIVE, CLUSTER_UP, CLUSTER_DOWN, CLUSTER_WINDOW_SHIP. 
 - The CLUSTER_INACTIVE state is for a DC that has not been seeded (configured) in the XDR stanza and would be a place holder for a future dynamic seeding. 
 - The CLUSTER_UP state is the normal state for a DC that is able to receive records from an XDR client and is currently not having any records being shipped to it from a previous window where it was down (which would be the CLUSTER_WINDOW_SHIP state).
 - A cluster will be in CLUSTER_DOWN when the source (XDR client) cannot connect to it for over 30 seconds. This prevents the entries in the digestlog from being reclaimed. The XDR client will periodically try to reconnect and, upon succeeding, will spawn a window shipper to ‘catch up’ on the entries in the digestlog that were missed. The DC specific lag (dc_timelag) will increase in this state but will not be accounted for in the overall XDR timelag (xdr_timelag).
 - A cluster’s state switches to CLUSTER_WINDOW_SHIP when it can be reconnected to after being in the CLUSTER_DOWN state. The DC specific lag (dc_timelag) will be accounted for in the overall XDR timelag (xdr_timelag).
gauge  string  aerospike_xdr_dc_timelag  Time lag for this specific DC. See xdr_timelag for details of how this is calculated.
gauge  integer  If dc_timelag is consistently greater than a few seconds, it may indicate network connectivity issues or errors writing at a destination cluster.
aerospike_xdr_dlog_free_pct  Percentage of the digest log free and available for use.
gauge  integer  aerospike_xdr_dlog_logged  Number of records logged into digest log.
counter  integer  Trending stat_recs_logged allows operations insight into how many records are being enqueued for shipment over time.
aerospike_xdr_dlog_overwritten_error  Number of digest log entries that got overwritten.
counter  integer  aerospike_xdr_dlog_processed_link_down  Number of linkdown that were processed.
counter  integer  aerospike_xdr_dlog_processed_main  Number of records processed on the local Aerospike server.
counter  integer  aerospike_xdr_dlog_processed_replica  Number of records processed for a node in the cluster that is not the local node.
counter  integer  aerospike_xdr_dlog_relogged  Number of records relogged by this node into the digest log due to temporary issues when attempting to ship. A relogged digest log entry would be caused by one of three potential conditions:
 - An issue with the local client when attempting to ship (tracked by xdr_ship_source_error).
 - An issue with the network or the destination cluster itself (tracked by xdr_ship_destination_error).
 - An issue when reading the record on the local node (tracked by xdr_read_error), but those would actually end up relogged on the node now owning the record (see relogged_outgoing).
counter  integer  The XDR component typically processes only master record’s digest log entries on a given node (the exception being during failed node processing, when a node on the source cluster has failed). When relogging such master record’s dlog entry, the corresponding prole copy would also be relogged on the respective node holding the replicas. This would increment the relogged_outgoing statistic on the current node and the relogged_incoming on the receiving node. It is therefore expected to see the dlog_relogged and relogged_outgoing statistics matching for clusters that are stable (no migrations).
 The relogs happening due to master partition ownership changes (migrations) are also tracked through relogged_incoming and relogged_outgoing.
 Permanent errors will not be relogged but will have a WARNING log message at the destination cluster (for example, to name a few, invalid namespace, record too big if mismatched write-block-size between source and destination, authentication or permission error).
 Some Permanent Errors: AEROSPIKE_ERR_RECORD_TOO_BIG, AEROSPIKE_ERR_REQUEST_INVALID, AEROSPIKE_ERR_ALWAYS_FORBIDDEN.
 Some Transient Errors: AEROSPIKE_ERR_SERVER, AEROSPIKE_ERR_CLUSTER_CHANGE, AEROSPIKE_ERR_SERVER_FULL, AEROSPIKE_ERR_CLUSTER, AEROSPIKE_ERR_RECORD_BUSY, AEROSPIKE_ERR_DEVICE_OVERLOAD, AEROSPIKE_ERR_FAIL_FORBIDDEN.
 See the C client errors for the exhaustive list.
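The permanent versus transient distinction above can be sketched as a small lookup. This is only an illustration: the error names are copied from the two partial lists above, and the helper names classify_ship_error and is_relogged are hypothetical, not part of any Aerospike API. Because both lists are explicitly incomplete, any name not listed is reported as unknown rather than guessed.

```python
# Error names copied from the partial lists in the text above.
PERMANENT_ERRORS = frozenset({
    "AEROSPIKE_ERR_RECORD_TOO_BIG",
    "AEROSPIKE_ERR_REQUEST_INVALID",
    "AEROSPIKE_ERR_ALWAYS_FORBIDDEN",
})
TRANSIENT_ERRORS = frozenset({
    "AEROSPIKE_ERR_SERVER",
    "AEROSPIKE_ERR_CLUSTER_CHANGE",
    "AEROSPIKE_ERR_SERVER_FULL",
    "AEROSPIKE_ERR_CLUSTER",
    "AEROSPIKE_ERR_RECORD_BUSY",
    "AEROSPIKE_ERR_DEVICE_OVERLOAD",
    "AEROSPIKE_ERR_FAIL_FORBIDDEN",
})


def classify_ship_error(name: str) -> str:
    """Return 'permanent', 'transient', or 'unknown' (hypothetical helper)."""
    if name in PERMANENT_ERRORS:
        return "permanent"
    if name in TRANSIENT_ERRORS:
        return "transient"
    return "unknown"  # not in either partial list; consult the C client errors


def is_relogged(name: str) -> bool:
    """Per the text, only transient errors lead to a relog."""
    return classify_ship_error(name) == "transient"
```

For example, is_relogged("AEROSPIKE_ERR_RECORD_BUSY") is true, while AEROSPIKE_ERR_RECORD_TOO_BIG classifies as permanent and would only produce a WARNING log message.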
aerospike_xdr_dlog_used_objects  Total number of record slots used in the digest log.
gauge  integer  aerospike_xdr_filtered_out  Number of local records that are skipped after having been read but before actual shipment. Such records might be skipped because of the configured shipping rules. For example, if the rules exclude all bins of a record, the record is skipped.
 This counter does not include records not submitted to the XDR queue, such as a record that is not eligible for shipping because its set is disabled.
counter  integer  aerospike_xdr_global_lastshiptime  Minimum last ship time in milliseconds (epoch) for XDR across the cluster. Specifies up to what point slots in the digest log can be reclaimed, by tracking the oldest last ship time across all nodes in the cluster.
gauge  integer  aerospike_xdr_hot_keys  Number of times a record write is skipped from processing because that record is already pending processing. This value also includes the number of records skipped for replica partitions.
counter  integer  aerospike_xdr_hotkey_fetch  If there are hot keys in the system (same record updated quite frequently), XDR optimizes by not shipping all the updates. This stat represents the number of record digests that are actually shipped because their cache entries expired and were dirty. Interpret in conjunction with xdr_hotkey_skip. The timeout of the cache entries is controlled by xdr-hotkey-time-ms.
counter  integer  aerospike_xdr_hotkey_skip  Replaces noship_recs_dup_intrabatch and noship_recs_genmismatch. If there are hot keys in the system (same record updated quite frequently), XDR optimizes by not shipping all the updates. This stat represents the number of record digests that are skipped due to an already existing entry in the reader’s thread cache (meaning a version of this record was just shipped). Interpret in conjunction with xdr_hotkey_fetch. The timeout of the cache entries is controlled by xdr-hotkey-time-ms.
counter  integer  aerospike_xdr_in_progress  Number of records that are pending completion. Records can be in different stages like local read, network send, pending acknowledgment. If a record is being retried (see retry_conn_reset, retry_dest, and retry_no_node), it is not considered complete and repeats the cycle.
gauge  integer  aerospike_xdr_in_queue  Number of records in the in-memory transaction queue still to be processed. These are the records which have been written into the xdr transaction-queue but have not yet been picked up to be processed further by XDR.
gauge  integer  aerospike_xdr_lag  Lag in seconds between the destination and the source datacenters. This indicates how far behind the source is in shipping records or, in other terms, how long records have been waiting at the source before being shipped to that DC. 
 In more detail:
 The lag is the difference between the last update time of the records being shipped (called ‘last ship time’ or LST) and the current time. The LST is internally maintained per partition and aggregated at the namespace level (minimum across all partitions). The lag can seem unsettled (step function) while recoveries are in progress (See the recoveries_pending statistic). This is because the recovery for a partition can take a while and the LST is updated only on completion of a recovery pass (as opposed to per record). A recovery pass is considered complete only after the batch of records for a given partition is completely and successfully shipped (no elements left in the retry queue).
gauge  integer  If lag is consistently greater than a few seconds, this condition might indicate network connectivity issues or errors writing at a destination cluster.
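The lag calculation described above (minimum last ship time across partitions, subtracted from the current time) can be sketched as follows. This is a minimal illustration and not the server implementation; the function names and the use of epoch seconds as input are assumptions.

```python
import time
from typing import Iterable, Optional


def namespace_lst(partition_lsts: Iterable[float]) -> float:
    """Aggregate per-partition last ship times (LST) at the namespace
    level by taking the minimum, as described in the text."""
    return min(partition_lsts)  # raises ValueError if no partitions are given


def lag_seconds(partition_lsts: Iterable[float],
                now: Optional[float] = None) -> int:
    """Lag = current time minus the namespace-level LST (epoch seconds).
    Clamped at zero in case of clock skew (an assumption of this sketch)."""
    current = time.time() if now is None else now
    return max(0, int(current - namespace_lst(partition_lsts)))
```

With per-partition LSTs of 100, 95, and 99 and a current time of 105, the namespace LST is 95 and the lag is 10 seconds. A single slow partition therefore determines the reported lag, which is consistent with the step-function behavior seen while recoveries are in progress.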
aerospike_xdr_lap_us  Time in microseconds (μsecs) taken to process records across partitions in one lap (processing cycle). This is diagnostic information. A higher number indicates slowness of source in processing the records. 
 Available only at the dc level, not namespace level. Example: asinfo -h localhost -l -v  get-stats:context=xdr;dc=aerospike_b 
gauge  integer  If lap_us is consistently higher than expected, alert operations to investigate.
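The get-stats examples in this section return the statistics as text. The following sketch parses such a response into a dictionary so that values like lap_us can be trended. It assumes the response is a series of name=value pairs separated by semicolons, or by newlines when asinfo is run with -l; verify this against the output of your own deployment. The function names and the sample string are hypothetical.

```python
def parse_info_pairs(response: str) -> dict:
    """Split a 'name=value;name=value' response (or one pair per line)
    into a dict of strings. Tokens without '=' are ignored."""
    stats = {}
    for pair in response.replace("\n", ";").split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            stats[name] = value
    return stats


def to_number(value: str):
    """Convert a value to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


# Illustrative input only; not captured from a real server.
sample = "lag=0\nlap_us=120\nlatency_ms=1.5"
parsed = {k: to_number(v) for k, v in parse_info_pairs(sample).items()}
```

Keeping the parsing separate from the numeric conversion makes it simple to leave string-valued statistics untouched.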
aerospike_xdr_latency_ms  Average network latency for successfully shipped records. This value does not include timed-out shipment attempts or any other errors. Updated every log ticker interval (10 seconds by default).
Available only at the dc level, not namespace level. Example: asinfo -h localhost -l -v  get-stats:context=xdr;dc=aerospike_b 
gauge  moving average  Depending on configuration, latency_ms should be within the latency of the link between the DCs.
If latency_ms increases beyond the expectations based on the distance (or known link latency) between clusters, alert operations to investigate.
aerospike_xdr_local_recs_migration_retry  Number of records missing in a batch call, generally a result of migrations, but can also be caused by expiration and eviction.
counter  integer  aerospike_xdr_nodes  Number of nodes in the destination DC as seen by XDR. There may be some delay for the remote changes to be reflected in this stat, especially on node departure, as XDR gives some grace period before removing a node.
gauge  integer  aerospike_xdr_not_found  Number of local records not found by XDR when attempting to read them. Such records might have been expired, evicted, or deleted.
counter  integer  aerospike_xdr_queue_overflow_error  Number of XDR queue overflow errors. Typically happens when there is no physical space available on the storage holding the digest log, or if the writes are happening at such a rate that elements are not written fast enough to the digest log. The number of entries this queue can hold is 1 million.
counter  integer  aerospike_xdr_read_active_avg_pct  This statistic reflects how busy the XDR read threads are: it is the average time, as a percent of total time, that the XDR read threads spend actually processing something versus waiting for a new digest log entry to arrive on their queues from the dlogreader / failed node shippers / window shippers.
moving average  integer  aerospike_xdr_read_error  Number of read requests initiated by XDR that failed. Those are rare, but if present, would typically be caused by reservation failures (node lost master and/or prole ownership of the partition the record belonged to during migrations). This will cause the record’s digest log entry to be relogged to the node now owning the partition (tracked under relogged_outgoing). Other rare cases would be for example when running out of memory or failure to access the storage layer. For the total number of XDR initiated read requests, sum up the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.
counter  integer  aerospike_xdr_read_idle_avg_pct  This is a sister statistic to xdr_read_active_avg_pct and represents the average time, as a percent of total time, that the XDR read threads wait for a new digest log entry to arrive on their queues from the dlogreader / failed node shippers / window shippers.
moving average  integer  aerospike_xdr_read_latency_avg  Moving average latency in milliseconds for XDR to read a record.
moving average  integer  aerospike_xdr_read_notfound  Number of read requests initiated by XDR that were not found. These do not get relogged. This would typically happen if a record is updated and then deleted, but a lag caused the entry for the record update to be processed after the record has been deleted. For the total number of XDR initiated read requests, sum the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.
counter  integer  aerospike_xdr_read_reqq_used  How many digest log entries are currently in the XDR read threads queues. Each XDR read thread has an in-memory queue with a capacity of 1,000 log entries associated with it. See also related statistic xdr_read_reqq_used_pct. When the dlogreader / failed node shipper / window shipper cannot write to a queue, because the queue is full, it blocks, until there’s space in the queue again.
gauge  integer  aerospike_xdr_read_reqq_used_pct  Sister statistic to xdr_read_reqq_used to represent how full in percent the XDR read request queues are.
gauge  integer  aerospike_xdr_read_respq_used  How many entries are being used in the XDR read response queues. Those queues are used to hand back records after they have been locally fetched. Those queues are similar to the queues referred to in the xdr_read_reqq_used stat except for the fact that they are not bounded. The throttling would happen at the XDR read request queues.
gauge  integer  aerospike_xdr_read_success  Number of read requests initiated by XDR that succeeded. For the total number of XDR initiated read requests, sum up the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.
counter  integer  aerospike_xdr_read_txnq_used  Number of XDR read commands that are in flight in the local transaction queue. XDR limits to 10,000 the number of outstanding XDR read requests. The requests are placed in an internal transaction queue. See xdr_read_txnq_used_pct for the percent used in this queue.
gauge  integer  aerospike_xdr_read_txnq_used_pct  Percent used of the XDR read commands that are in flight (out of a maximum allowed of 10,000) in the transaction queue.  It is an internal transaction queue. See xdr_read_txnq_used for the number of XDR issued reads that are in flight.
gauge  integer  aerospike_xdr_recoveries  Number of partitions that are recovered by reducing the primary index of that partition. Recovery is done when the in-memory transaction queue of the partition is full or when necessary records are not present in the in-memory transaction queue.
 See also recoveries_pending.
counter  integer  If recoveries is consistently increasing, alert operations to investigate.
aerospike_xdr_recoveries_pending  Number of recoveries currently pending.
If recoveries_pending is zero, there are no recoveries in progress. Non-zero indicates the number of recoveries in progress.
gauge  integer  If recoveries_pending is unexpectedly increasing, alert operations to investigate.
aerospike_xdr_relogged_incoming  Number of records relogged into this node’s digest log by another node. This typically happens during the following situations:
 - migrations at the source cluster: when there are outstanding digest log entries and the partition ownership changes by the time they are processed, if the local node does not own the master or prole copy of the partition such a record belongs to, the node now owning the master copy of the partition gets an incoming digest log entry relogged to it.
 - when a node relogs record’s digest log entries to itself (dlog_relogged), it will also relog those for the node owning the prole counterpart.
counter  integer  The sending node will then have its relogged_outgoing statistic incremented.
aerospike_xdr_relogged_outgoing  Number of records relogged to another node’s digest log. This typically happens during the following situations:
 - migrations at the source cluster, when there are outstanding digest log entries for which the local node does not own either master or prole partition for the record anymore (xdr_read_error)
 - when a node relogs record’s digest log entries to itself (dlog_relogged), it will also relog those for the node owning the prole counterpart.
counter  integer  The receiving node will then have its relogged_incoming statistic incremented.
aerospike_xdr_retry_conn_reset  Number of records whose shipment is retried due to a reset of the connection to the remote datacenter. A connection can be reset due to timeouts (10s), network problems, or destination node restarts. 
 This statistic can increase in bursts. Because of the XDR pipeline, there can be many records that are retried when a connection is reset.
counter  integer  If retry_conn_reset is consistently higher than expected, alert operations to investigate.
aerospike_xdr_retry_dest  Number of records retried due to a temporary error returned by destination node. The destination node has responded with a specific error code; therefore, such errors are not related to the network. Such errors include key busy and device overload.
counter  integer  If retry_dest is consistently higher than expected, alert operations to investigate.
aerospike_xdr_retry_no_node  Number of records retried because XDR cannot determine which destination node is the master. 
 This typically happens when XDR does not discover the full cluster of the destination, perhaps due to firewall settings. In such a case, the master for all partitions cannot be known. The other possibility is that the entire namespace is not present on the destination cluster.
counter  integer  If retry_no_node is consistently higher than expected, alert operations to investigate.
aerospike_xdr_ship_bytes  Estimated number of bytes XDR has shipped to remote clusters.
counter  integer  aerospike_xdr_ship_compression_avg_pct  Used to determine how beneficial compression is (higher is better).
moving average  integer  aerospike_xdr_ship_delete_success  Number of delete operations that were successfully shipped.
aerospike_xdr_ship_destination_error  Number of errors from the remote cluster(s) while shipping records. Errors include timeout, out-of-space, key-busy, etc. Those would typically be relogged, except in case of permanent error (tracked under xdr_ship_destination_permanent_error; for example, records too big or some bad namespace configuration), in which case they trigger a WARNING log message at the destination. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. Those do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.
counter  integer  aerospike_xdr_ship_destination_permanent_error  Number of permanent errors from the remote cluster(s) while shipping records. Example errors include records too big or some bad namespace configuration, in which case they trigger a WARNING log message at the destination and will not be relogged. These do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error. For all errors while shipping to a destination, see xdr_ship_destination_error.
counter  integer  aerospike_xdr_ship_fullrecord  Number of records that did not take advantage of bin level shipping (see xdr-ship-bins).
gauge  integer  aerospike_xdr_ship_inflight_objects  Number of objects that are inflight (which have been shipped but for which a response from the remote DC has not yet been received).
gauge  integer  aerospike_xdr_ship_latency_avg  Moving average latency in milliseconds to ship a record to remote Aerospike clusters. This is computed by dividing time into 1 second intervals.
gauge  integer  Depending on configuration, xdr_ship_latency_avg should be within the latency of the link between the DCs.
If xdr_ship_latency_avg increases beyond the expectations based on the distance (or known link latency) between clusters, alert operations to investigate.
The average is calculated over each 1 second interval separately and then thrown into the exponential moving average. The exponential moving average is actually a moving average of independent 1-second averages. This is done to avoid having some time intervals where there is a much higher volume of transactions having a heavier weight compared to time intervals with much fewer transactions.
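The two-stage averaging described above can be sketched as follows: samples are first averaged within each 1-second interval, and those per-interval averages are then fed into an exponential moving average. This is only an illustration of the idea. The smoothing factor alpha and the function names are assumptions, since the actual weighting used by the server is not stated here.

```python
from collections import defaultdict
from typing import Iterable, List, Optional, Tuple


def per_second_averages(samples: Iterable[Tuple[float, float]]) -> List[float]:
    """Group (timestamp_seconds, latency_ms) samples into 1-second
    intervals and return one average per interval, in time order."""
    buckets = defaultdict(list)
    for timestamp, latency_ms in samples:
        buckets[int(timestamp)].append(latency_ms)
    return [sum(vals) / len(vals) for _, vals in sorted(buckets.items())]


def exponential_moving_average(values: Iterable[float],
                               alpha: float = 0.2) -> Optional[float]:
    """EMA over the per-interval averages; alpha is an assumed factor.
    Returns None when there are no values."""
    average = None
    for value in values:
        average = value if average is None else alpha * value + (1 - alpha) * average
    return average


# Three fast transactions in second 0, one slow transaction in second 1.
samples = [(0.1, 10.0), (0.5, 10.0), (0.9, 10.0), (1.2, 40.0)]
intervals = per_second_averages(samples)
smoothed = exponential_moving_average(intervals, alpha=0.5)
```

In this example the busy interval and the quiet interval each contribute a single value (10.0 and 40.0), so the quiet interval is not drowned out by the volume of the busy one, which is the motivation given in the text.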
aerospike_xdr_ship_outstanding_objects  Number of outstanding records not yet processed. This only applies to the main thread and will not account for digest log entries pending window shipper or failed node processing. It represents the difference between the write pointer position and the read pointer position. It also does not account for entries pending in the queue prior to being flushed to the digest log, which can go up to 100 entries or 500ms if not full by that time (configurable through xdr-digestlog-iowait-ms).
gauge  integer  Trending xdr_ship_outstanding_objects allows operations insight into how the XDR record transmit queue size changes over time.
aerospike_xdr_ship_source_error  Number of client layer errors while shipping records. Errors include connection errors, bad network fd, etc. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. Those do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.
counter  integer  aerospike_xdr_ship_success  Number of records successfully shipped to remote Aerospike clusters (across all datacenters configured, meaning one record successfully shipped to 3 different datacenters will increment this counter by 3). Includes xdr_ship_delete_success. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. Those do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.
counter  integer  aerospike_xdr_stat_pipe_reads_diginfo  Number of digest information read from the named pipe.
counter  integer  aerospike_xdr_success  Number of records successfully shipped to remote datacenters.
counter  integer  If success is consistently lower than expected, alert operations to investigate.
aerospike_xdr_throughput  Number of records successfully shipped per second. Updated every log ticker interval (10 secs by default).
gauge  integer  aerospike_xdr_timelag  Time, in seconds, from the moment the latest shipped record was first written at the source until XDR attempted to ship it to the destination cluster. This is equivalent to the time its digestlog entry waited in the digestlog before being processed. Each record written at the source is timestamped as it gets written into the XDR digestlog.
gauge  integer  [Removed in 5.0] If xdr_timelag is consistently greater than a few seconds, this condition might indicate network connectivity issues or errors writing at a destination cluster.
The knowledge base article on FAQ - What are the causes of XDR throttling might be helpful.
With multiple destination DCs, this represents the maximum time lag across all the remote DCs that are not in the CLUSTER_INACTIVE or CLUSTER_DOWN states (see dc_state). Under normal operations, though, the timelag for each DC that is in the CLUSTER_UP state will be the same, given that XDR ships records in lock-step. The timelag at each DC would be different when a DC is in the CLUSTER_DOWN or in the CLUSTER_WINDOW_SHIP state. This does not represent the time it will take for XDR to ‘catch up’, nor does it necessarily relate to the number of outstanding digests in the digest log still to be processed. For per DC time lag, see dc_timelag.
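The aggregation rule above (maximum dc_timelag across DCs that are neither CLUSTER_INACTIVE nor CLUSTER_DOWN) can be sketched as follows. The function name, the input shape, and the choice to return 0 when no DC qualifies are assumptions made for illustration.

```python
from typing import Dict, Tuple

# States excluded from the overall time lag, per the text above.
EXCLUDED_STATES = frozenset({"CLUSTER_INACTIVE", "CLUSTER_DOWN"})


def overall_timelag(dcs: Dict[str, Tuple[str, int]]) -> int:
    """dcs maps a DC name to (dc_state, dc_timelag in seconds).
    Returns the maximum lag over DCs not in an excluded state,
    or 0 when no DC qualifies (an assumption of this sketch)."""
    lags = [lag for state, lag in dcs.values() if state not in EXCLUDED_STATES]
    return max(lags, default=0)


# Hypothetical DC names and values.
example = {
    "dc_a": ("CLUSTER_UP", 2),
    "dc_b": ("CLUSTER_DOWN", 900),
    "dc_c": ("CLUSTER_WINDOW_SHIP", 45),
}
```

Here overall_timelag(example) is 45: the DC in CLUSTER_DOWN is ignored even though its own lag keeps growing, while the DC in CLUSTER_WINDOW_SHIP is counted.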
aerospike_xdr_uncompressed_pct  Running average percentage of records not compressed because they are below the compression threshold (100) or failed to be compressed at all. See also related parameter enable-compression.
moving average  decimal  aerospike_xdr_uninitialized_destination_error  Number of records in the digest log not shipped because the destination cluster has not been initialized for a DC that is configured for a namespace. This should not happen. Those errors are not counted as xdr_ship_*_error.
counter  integer  aerospike_xdr_unknown_namespace_error  Number of records in the digest log not shipped because they belong to a namespace that is unknown on the source cluster. One situation where this would happen is if a namespace is removed (or the order of namespaces is changed in the configuration) while there are some entries in the digest log not processed yet. This should not happen in most cases. Those errors are not counted as xdr_ship_*_error.
counter  integer  