Metrics Reference

See the Metrics command examples for information on usage.

Namespace

appeals_records_exonerated
optional
Context
namespace
Prometheus Name
aerospike_namespace_appeals_records_exonerated
Description

Number of records that were marked replicated as a result of an appeal. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
appeals_rx_active
optional
Context
namespace
Prometheus Name
aerospike_namespace_appeals_rx_active
Description

Number of partition appeals currently being received. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
appeals_tx_active
optional
Context
namespace
Prometheus Name
aerospike_namespace_appeals_tx_active
Description

Number of partition appeals currently being sent. Partition appeals will happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
appeals_tx_remaining
optional
Context
namespace
Prometheus Name
aerospike_namespace_appeals_tx_remaining
Description

Number of partition appeals not yet sent. Partition appeals happen for namespaces operating under the strong-consistency mode when a node needs to validate the records it has when joining the cluster. Appeals occur after a node has been cold-started. The replication state of each record is lost on cold-start, so all records must assume an unreplicated state. An appeal resolves replication state from the partition’s acting master. Appeals matter for performance: an unreplicated record must re-replicate before it can be read, which adds latency. During a rolling cold-restart, an operator may want to wait for the appeal phase to complete after each restart to minimize the performance impact of the procedure.
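
The wait described above could be scripted during a rolling cold-restart. The following is a minimal sketch, not an Aerospike tool. `read_pending` is a hypothetical callable that you supply, for example one that returns appeals_tx_remaining (or the sum of appeals_tx_remaining and appeals_tx_active) for the restarted node. The polling interval and timeout are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_for_appeals(
    read_pending: Callable[[], int],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the pending-appeal count reaches zero.

    read_pending is a hypothetical, caller-supplied function that returns
    the current pending-appeal count for the node that was restarted.
    Returns True when the count reaches 0, False if the timeout elapses.
    """
    waited = 0.0
    while True:
        if read_pending() == 0:
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

A restart script might call `wait_for_appeals(...)` after each node comes back and proceed to the next node only when it returns True.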

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
auto_revived_partitions
watch
Context
namespace
Prometheus Name
aerospike_namespace_auto_revived_partitions
Description

Number of partitions that the auto-revive feature revived at startup.

Introduced
7.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, ns
available_bin_names
optional
Context
namespace
Prometheus Name
aerospike_namespace_available_bin_names
Description

Remaining number of unique bins that the user can create for this namespace.

The formula for the associated metrics is as follows:

bin_names_quota - bin_names = available_bin_names

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_delete_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_delete_error
Description

Number of batch-index delete sub-batches that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_delete_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_delete_filtered_out
Description

Number of batch-index delete sub-batches that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_delete_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_delete_not_found
Description

Number of batch-index delete sub-batches that resulted in not found.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_delete_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_delete_success
Description

Number of records successfully deleted by batch-index sub-batches.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_delete_timeout
Description

Number of batch-index delete sub-batches that timed out.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_lang_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_lang_delete_success
Description

Number of successful batch-index UDF delete sub-batches.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_lang_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_lang_error
Description

Number of language (Lua) batch-index errors for UDF sub-transactions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_lang_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_lang_read_success
Description

Number of successful batch-index UDF read sub-batches.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_lang_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_lang_write_success
Description

Number of successful batch-index UDF write sub-batches.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_proxy_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_proxy_complete
Description

Number of proxied batch-index sub-batches that completed.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_proxy_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_proxy_error
Description

Number of proxied batch-index sub-transactions that failed with an error.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_proxy_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_proxy_timeout
Description

Number of proxied batch-index sub-batches that timed out.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_read_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_read_error
Description

Number of batch-index read subtransactions that failed with an error. For example: invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_read_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_read_filtered_out
Description

Number of batch-index read sub-batches that were skipped because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_read_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_read_not_found
Description

Number of batch-index read subtransactions that resulted in not found.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_read_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_read_success
Description

Number of records successfully read by batch-index sub-batches.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_read_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_read_timeout
Description

Number of batch-index read sub-batches that timed out.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_tsvc_error
Description

Number of batch-index read sub-batches that failed with an error in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatches. In strong-consistency enabled namespaces, this includes transactions against unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes, and they are counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_tsvc_timeout
Description

Number of batch-index read sub-batches that timed out in the transaction service, before attempting to handle the transaction.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes, and they are counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_udf_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_udf_complete
Description

Number of completed batch-index UDF sub-batches for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_udf_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_udf_error
Description

Number of failed batch-index UDF sub-batches for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_udf_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_udf_filtered_out
Description

Number of batch-index UDF sub-batches that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_udf_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_udf_timeout
Description

Number of batch-index UDF sub-batches that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_write_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_write_error
Description

Number of batch-index write sub-batches that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_write_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_write_filtered_out
Description

Number of batch-index write sub-batches that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_write_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_write_success
Description

Number of records successfully written by batch-index sub-batches.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
batch_sub_write_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_batch_sub_write_timeout
Description

Number of batch-index write sub-batches that timed out.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
bin_names
optional
Context
namespace
Prometheus Name
aerospike_namespace_bin_names
Description

Number of bin names used for the namespace.

The formula for the associated metrics is as follows:

bin_names_quota - bin_names = available_bin_names

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
bin_names_quota
optional
Context
namespace
Prometheus Name
aerospike_namespace_bin_names_quota
Description

Quota of bin names for the namespace. Starting with Database 7.0, there is no limit on bin names per namespace. In Database 5.0 and 6.0, the limit was 65,535.

The formula for the associated metrics is as follows:

bin_names_quota - bin_names = available_bin_names

If you have met the quota, see the KB article How to clear up bin names when they exceed the limits.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
cache_read_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_cache_read_pct
Description

Percentage of read commands that are hitting the post-write-cache or the blocks in the max-write-cache and will save an IO to the underlying storage device.

See the post-write-cache and read-page-cache documentation for ways to improve the latency of read-intensive workloads by using those two caching options.

Reads from update commands as well as migrations, scans, XDR reads and anything that tries to load a record off the device are accounted for in the cache_read_pct figures.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_delete_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_client_delete_error
Description

Number of client delete commands that failed with an error.

Monitoring

Compare client_delete_error to client_delete_success.

If the ratio is higher than acceptable, alert operations to investigate.
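
Because both metrics are counters, the comparison is usually made on the change between two samples rather than on the raw totals. The sketch below is one possible way to compute such a ratio. The function name, the choice of errors / (errors + successes) as the ratio, the counter-reset handling, and the example threshold are all assumptions, not part of the Aerospike tooling. The same helper could be pointed at the read, write, or UDF counter pairs by passing different keys.

```python
from typing import Mapping


def error_ratio(
    prev: Mapping[str, int],
    curr: Mapping[str, int],
    error_key: str = "client_delete_error",
    success_key: str = "client_delete_success",
) -> float:
    """Fraction of commands that failed between two counter samples."""
    errors = curr[error_key] - prev[error_key]
    successes = curr[success_key] - prev[success_key]
    if errors < 0 or successes < 0:
        # Assumption: a negative delta means the counters were reset
        # (for example by a node restart), so use the current values.
        errors, successes = curr[error_key], curr[success_key]
    total = errors + successes
    return errors / total if total else 0.0


# Hypothetical threshold; what counts as "acceptable" is site-specific.
ACCEPTABLE_ERROR_RATIO = 0.01

previous = {"client_delete_error": 10, "client_delete_success": 990}
current = {"client_delete_error": 15, "client_delete_success": 1085}
should_alert = error_ratio(previous, current) > ACCEPTABLE_ERROR_RATIO
```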

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_delete_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_delete_filtered_out
Description

Number of client delete commands that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_delete_not_found
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_delete_not_found
Description

Number of client delete commands that resulted in not found.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_delete_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_delete_success
Description

Number of successful client delete commands.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_delete_timeout
Description

Number of client delete commands that timed out.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_lang_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_lang_delete_success
Description

Number of UDF commands that successfully deleted a record.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_lang_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_lang_error
Description

Number of UDF commands that failed with a language (Lua) error during UDF execution.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_lang_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_lang_read_success
Description

Number of successful record reads caused by a UDF command.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_lang_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_lang_write_success
Description

Number of successful record writes caused by a UDF command.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_proxy_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_proxy_complete
Description

Number of client commands proxied to another node.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_proxy_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_proxy_error
Description

Number of client commands that failed to proxy to another node.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_proxy_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_proxy_timeout
Description

Number of client commands that timed out while being proxied to another node.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_read_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_client_read_error
Description

Number of read commands that failed with an error. For example, invalid set name, unavailable (if SC), failure to apply a predexp filter, key mismatch (if key was sent), device error (i/o error), key busy (duplicate resolution or if SC), problem during bitwise, HLL or CDT.

Monitoring

Compare client_read_error to client_read_success.

If the ratio is higher than acceptable, alert operations to investigate.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_read_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_read_filtered_out
Description

Number of read commands that did not happen because they were filtered out.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_read_not_found
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_read_not_found
Description

Number of client read commands that resulted in not found.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_read_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_read_success
Description

Number of successful client read commands. Does not include records read by batch reads or scans. Batch reads are counted by the separate batch_sub_read_success metric. Scans are counted by one of the scan_basic_complete, scan_aggr_complete, scan_ops_bg_complete, or scan_udf_bg_complete metrics, depending on the type of scan.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_read_timeout
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_read_timeout
Description

Number of client read commands that timed out.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_tsvc_error
Description

Number of client commands that failed in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this includes commands against unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_tsvc_timeout
Description

Number of client commands that timed out while in the transaction service, before attempting to handle the command. At this stage the command has not yet been identified as a read or a write, but the namespace is known. A likely cause is that there are not enough service threads to keep pace with the workload. Other common situations in this category are commands that have to be retried after waiting in the rw-hash (for example, hotkeys) and use cases where the timeout set by the client is too aggressive.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_udf_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_udf_complete
Description

Number of completed UDF commands initiated by the client.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_udf_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_client_udf_error
Description

Number of failed UDF commands initiated by the client. Does not include timeouts. Error is also returned to the client.

Monitoring

Compare client_udf_error to client_udf_complete.

If the ratio is higher than acceptable, alert operations to investigate.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_udf_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_udf_filtered_out
Description

Number of client UDF commands that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_udf_timeout
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_udf_timeout
Description

Number of UDF commands initiated by the client that timed out. The timeout error is returned to the client.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_write_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_client_write_error
Description

Number of client write commands that failed with an error. Includes common errors like fail_generation, fail_key_busy, fail_record_too_big, fail_xdr_forbidden and some less common errors. Includes xdr_client_write_error. See Why is my client_write_error metrics incrementing? for details on the type of errors that increment this statistic.

Monitoring

Compare client_write_error to client_write_success.

If the ratio is higher than acceptable, alert operations to investigate.

For more details, see the knowledge base article Why is my client_write_error metrics incrementing?.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_write_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_client_write_filtered_out
Description

Number of client write commands that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_write_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_write_success
Description

Number of successful client write commands. Includes xdr_client_write_success.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
client_write_timeout
watch
Context
namespace
Prometheus Name
aerospike_namespace_client_write_timeout
Description

Number of client write commands that timed out on the server. On a stable cluster with no migrations in progress, this metric indicates the number of replica write timeouts. A timeout error is returned to the client. In strong-consistency enabled namespaces, the record is marked as unreplicated and will re-replicate. Includes xdr_client_write_timeout.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The following conditions can cause this metric to increment:

  • Every single write replica failure (master failing to replicate) increments the client_write_timeout metric.

  • If duplicate resolution is enabled for writes (default), the client_write_timeout metric also increments during migrations when duplicate resolution times out, which can occur before the write is applied on the master side.

  • See transaction-max-ms for details on when the server checks for timeout. Transactions can also time out earlier in the transaction flow, in which case the client_tsvc_timeout statistic increments.

clock_skew_stop_writes
critical
Context
namespace
Prometheus Name
aerospike_namespace_clock_skew_stop_writes
Description

Namespace will stop accepting client writes when true.

For strong-consistency enabled namespaces, will be true if the clock skew is outside of tolerance, typically 20 seconds.

For Available mode (AP) namespaces running Database 4.5.1 or later, and where NSUP is enabled (nsup-period not zero), will be true if the cluster clock skew exceeds 40 seconds. In such occurrences, NSUP will also not run, disabling record expirations and evictions until the clock skew falls back in the tolerated range.

Monitoring

If clock_skew_stop_writes is true, it is a critical ALERT.

Verify that clocks are synchronized across the cluster.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
boolean
Labels
cluster_name, job, service, instance, longitude, latitude, ns
current_time
optional
Context
namespace
Prometheus Name
aerospike_namespace_current_time
Description

Current time represented as Aerospike epoch time.

Monitoring

If cluster_max(current_time) and cluster_min(current_time) differ by more than 10 seconds, critical ALERT.

Server time skew might indicate that NTP or a similar service is not running on this node.
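
The cluster-wide check above can be sketched as follows. This is an illustration under stated assumptions: `node_times` is a hypothetical mapping from node name to the current_time value collected from that node, and the helper names are invented here. Samples collected at different moments add their own offset to the spread, so in practice the values should be gathered as close together as possible.

```python
from typing import Mapping


def current_time_spread(node_times: Mapping[str, int]) -> int:
    """Seconds between the largest and smallest current_time in the cluster."""
    if not node_times:
        raise ValueError("need at least one node sample")
    return max(node_times.values()) - min(node_times.values())


def clock_skew_alert(node_times: Mapping[str, int], limit_seconds: int = 10) -> bool:
    """True when the spread exceeds the 10-second limit given in the text."""
    return current_time_spread(node_times) > limit_seconds


# Hypothetical samples, in seconds of Aerospike epoch time.
samples = {"node-a": 500000000, "node-b": 500000004, "node-c": 500000012}
alert = clock_skew_alert(samples)  # spread is 12 seconds
```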

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
data_avail_pct
critical
Context
namespace
Prometheus Name
aerospike_namespace_data_avail_pct
Description

Measures the minimum contiguous storage-engine device, pmem, or memory storage file space across all such files in a namespace. The namespace is read-only if this value falls below stop-writes-avail-pct. It is important for all configured storage files in a namespace to have the same size, otherwise, data_avail_pct could be low even when a lot of space is available across other files.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Example: A namespace has 5 files of 96 MiB each, and each file holds 24 MiB of data spread across 6 write blocks (with an 8 MiB write-block size, so 12 blocks per file):

  • The data_used_pct is 25% (24 MiB of each 96 MiB file is used, leaving 75% free).
  • The data_avail_pct is 50%.
  • If the distribution is not perfectly uniform (which is usual), data_avail_pct represents the file that has the fewest free blocks.
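
The data_avail_pct figure in the example above can be reproduced with a short calculation. This is a sketch of the arithmetic only, assuming that availability per file is the share of write blocks with no data in them and that the result is truncated to an integer. The function names are hypothetical and the server's internal computation may differ.

```python
MIB = 1024 * 1024


def file_avail_pct(file_bytes: int, write_block_bytes: int, blocks_in_use: int) -> int:
    """Percentage of a file's write blocks that are completely free."""
    total_blocks = file_bytes // write_block_bytes
    return (total_blocks - blocks_in_use) * 100 // total_blocks


def namespace_avail_pct(files: list) -> int:
    """Minimum per-file availability, as described for data_avail_pct.

    files is a list of (file_bytes, write_block_bytes, blocks_in_use) tuples.
    """
    return min(file_avail_pct(*f) for f in files)


# Five 96 MiB files, 8 MiB write blocks, data spread across 6 blocks each.
uniform = [(96 * MIB, 8 * MIB, 6)] * 5
uniform_pct = namespace_avail_pct(uniform)  # 50

# Same namespace, but one file has data in 9 of its 12 blocks.
skewed = [(96 * MIB, 8 * MIB, 6)] * 4 + [(96 * MIB, 8 * MIB, 9)]
skewed_pct = namespace_avail_pct(skewed)  # 25
```

The second case shows why the least-free file determines the reported value even when the other files have plenty of free blocks.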

Warn your operations group about any of the following conditions:

  • If data_avail_pct drops below 20%, the defrag may not be able to keep up with the current load.
  • If data_avail_pct drops below 15%, this is a critical ALERT.
  • If data_avail_pct drops below 5%, this condition might result in stop_writes.
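
The thresholds listed above map naturally onto alert levels. The sketch below is one hypothetical way to encode them. The level names are invented for illustration and the cut-offs are taken directly from the list.

```python
def classify_data_avail_pct(pct: int) -> str:
    """Map a data_avail_pct reading onto the conditions listed above."""
    if pct < 5:
        return "stop_writes_risk"  # might result in stop_writes
    if pct < 15:
        return "critical"          # critical ALERT
    if pct < 20:
        return "warn"              # defrag may not be keeping up
    return "ok"


levels = [classify_data_avail_pct(p) for p in (50, 19, 14, 4)]
```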
data_compression_ratio
watch
Context
namespace
Prometheus Name
aerospike_namespace_data_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size). data_compression_ratio is not included if the compression configuration parameter is set to none.
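
To make the two readings in the description concrete, the following sketch converts a ratio into the "1:N" form and the percentage reduction. The helper name is hypothetical, and the input check assumes the reported ratio lies in the range (0, 1].

```python
def describe_compression(ratio: float) -> str:
    """Render a compressed/uncompressed size ratio in human-readable form."""
    if not 0.0 < ratio <= 1.0:
        # Assumption: a valid ratio is greater than 0 and at most 1.
        raise ValueError("ratio must be in (0, 1]")
    reduction_pct = (1.0 - ratio) * 100.0
    return f"1:{1.0 / ratio:.0f} ({reduction_pct:.0f}% reduction)"


no_compression = describe_compression(1.0)
ten_to_one = describe_compression(0.1)
```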

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average calculated based on the most recently written records. Read records do not factor into the ratio. Records that don’t try to compress are not included in the moving average. If the written data changes over time, then the compression ratio changes with it. In case of a sudden change in data, the indicated compression ratio may lag. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.

data_total_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_data_total_bytes
Description

Total storage allocated for the namespace, regardless of storage-engine.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
data_used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_data_used_bytes
Description

Amount of storage in use for the namespace, regardless of storage-engine. The total storage allocated is reported by data_total_bytes.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
data_used_pct
watch
Context
namespace
Prometheus Name
aerospike_namespace_data_used_pct
Description

Percentage of used storage capacity for this namespace. Calculated as data_used_bytes * 100 / data_total_bytes. Evictions will be triggered when this percentage crosses the configured evict-used-pct.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
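
The formula in the description can be reproduced from the two byte metrics. This is a rough sketch: the function name is hypothetical, the sample byte counts are made up, and truncating to a whole number is an assumption based on the integer data type listed for this metric.

```python
def data_used_pct(data_used_bytes: int, data_total_bytes: int) -> int:
    """Compute data_used_bytes * 100 / data_total_bytes as a whole percentage.

    Assumption: the result is truncated to an integer.
    """
    if data_total_bytes <= 0:
        raise ValueError("data_total_bytes must be positive")
    return data_used_bytes * 100 // data_total_bytes


GIB = 2**30
# Hypothetical namespace: 300 GiB used out of 1024 GiB allocated.
print(data_used_pct(300 * GIB, 1024 * GIB))
```
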
dead_partitions
critical
Context
namespace
Prometheus Name
aerospike_namespace_dead_partitions
Description

Number of dead partitions for this namespace when using strong-consistency. This is the number of partitions that are unavailable when all roster nodes are present. Making them available again requires the revive command. Revived nodes restore availability only when all nodes are trusted.

Monitoring

If dead_partitions is not zero, raise a critical ALERT. If you are certain that there are no potential data inconsistencies or if data inconsistencies are acceptable, consider issuing revive and recluster commands.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
deleted_last_bin
optional
Context
namespace
Prometheus Name
aerospike_namespace_deleted_last_bin
Description

Number of objects deleted because their last bin was deleted.

Introduced
3.9.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
device_available_pct
critical
Context
namespace
Prometheus Name
aerospike_namespace_device_available_pct
Description

Measures the minimum contiguous disk space across all devices in a namespace. The namespace will be read only (stop writes) if this value falls below min-avail-pct. It is important for all configured devices in a namespace to have the same size; otherwise, device_available_pct could be low even when a lot of space is available across other devices.

Monitoring
Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Not to be confused with device_free_pct which represents the amount of free space across all devices in a namespace and does not take account of the fragmentation. Here is an example to represent the difference between device_free_pct and device_available_pct. Assume 5 devices of 100MiB each for a given namespace, where each device has 20MiB of data that are spread across 5 write-blocks (where each write-block is 8MiB):
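
The scenario above can be worked through with a simplified model. The `Device` class and both helper functions are hypothetical, and the model assumes that a write-block holding any live data counts as fully occupied for the available figure; the server's actual calculation may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Device:
    size_mib: int        # total device size
    data_mib: int        # live record data stored on the device
    blocks_in_use: int   # write-blocks that hold at least some live data
    block_size_mib: int  # size of one write-block


def free_pct(devices):
    """Free space across all devices, ignoring fragmentation."""
    total = sum(d.size_mib for d in devices)
    used = sum(d.data_mib for d in devices)
    return (total - used) * 100 // total


def available_pct(devices):
    """Space not held by in-use write-blocks, taken from the worst device."""
    return min(
        (d.size_mib - d.blocks_in_use * d.block_size_mib) * 100 // d.size_mib
        for d in devices
    )


# 5 devices of 100MiB, each with 20MiB of data spread across 5 write-blocks of 8MiB.
devices = [Device(100, 20, 5, 8) for _ in range(5)]
print(free_pct(devices))       # 80
print(available_pct(devices))  # 60
```

In this model 80% of the space is free, but only 60% is available for new write-blocks, because the 20MiB of data on each device occupies 40MiB worth of write-blocks.
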

device_compression_ratio
watch
Context
namespace
Prometheus Name
aerospike_namespace_device_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio. 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size). device_compression_ratio will not be included if compression is set to none.

Introduced
4.5.0.1
Removed
7.0
Measurement type
moving average
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average. It is calculated based on the most recently written records. Read records do not factor into the ratio. Records that don’t try to compress are not included in the moving average. If the written data changes over time then the compression ratio will change with it. In case of a sudden change in data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.

device_free_pct
watch
Context
namespace
Prometheus Name
aerospike_namespace_device_free_pct
Description

Percentage of disk capacity free for this namespace. This is the amount of free storage across all devices in the namespace. Evictions will be triggered when the used percentage across all devices (which is represented by 100 - device_free_pct) crosses the configured high-water-disk-pct.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Not to be confused with device_available_pct which represents the amount of free contiguous space on the device that has the least contiguous free space across the namespace. Here is an example to represent the difference between device_free_pct and device_available_pct. Assume 5 devices of 100MB each for a given namespace, where each device has 25MB of data that are spread across 50 write blocks (let’s assume a 1MB write-block-size):
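
As a rough sketch of the eviction condition from the description, the used percentage can be derived from device_free_pct and compared with high-water-disk-pct. The function names and the 50% setting are hypothetical, and treating "crosses" as strictly greater than is an assumption.

```python
def used_pct(device_free_pct: int) -> int:
    """Used percentage across all devices, derived from device_free_pct."""
    return 100 - device_free_pct


def crosses_high_water(device_free_pct: int, high_water_disk_pct: int) -> bool:
    """True when the used percentage is above the configured high-water-disk-pct.

    Assumption: "crosses" is modeled as strictly greater than.
    """
    return used_pct(device_free_pct) > high_water_disk_pct


# 5 devices of 100MB, each holding 25MB of data: 375MB of 500MB is free.
device_free_pct = (5 * 100 - 5 * 25) * 100 // (5 * 100)
print(device_free_pct)                          # 75
print(crosses_high_water(device_free_pct, 50))  # False (hypothetical 50% setting)
print(crosses_high_water(20, 50))               # True: 80% used is above 50%
```
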

device_total_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_device_total_bytes
Description

Total bytes of disk space allocated to this namespace on this node.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
device_used_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_device_used_bytes
Description

Total bytes of disk space used by this namespace on this node.

Monitoring

Trending device_used_bytes provides operations insight into how disk usage changes over time for this namespace.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
dup_res_ask
optional
Context
namespace
Prometheus Name
aerospike_namespace_dup_res_ask
Description

Number of duplicate resolution requests made by the node to other individual nodes.

Introduced
5.5
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
dup_res_respond_no_read
optional
Context
namespace
Prometheus Name
aerospike_namespace_dup_res_respond_no_read
Description

Number of duplicate resolution requests handled by the node without reading the record.

Introduced
5.5
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
dup_res_respond_read
optional
Context
namespace
Prometheus Name
aerospike_namespace_dup_res_respond_read
Description

Number of duplicate resolution requests handled by the node where the record was read.

Introduced
5.5
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
effective_active_rack
optional
Context
namespace
Prometheus Name
aerospike_namespace_effective_active_rack
Description

The effective active-rack for the namespace. The configured active rack owns all of the master partition copies.

For strong consistency-enabled namespaces, this is the roster’s current active rack. Otherwise, it is the configured active-rack.

Introduced
7.2.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
effective_is_quiesced
optional
Context
namespace
Prometheus Name
aerospike_namespace_effective_is_quiesced
Description

Reports ‘true’ when the namespace has rebalanced after previously receiving a quiesce info request.

Introduced
4.3.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
effective_prefer_uniform_balance
optional
Context
namespace
Prometheus Name
aerospike_namespace_effective_prefer_uniform_balance
Description

Applies only to Enterprise Edition. Value can be true or false. If Aerospike applied the uniform balance algorithm for the current cluster state, the value returned is true. If any node having this namespace isn’t configured with prefer-uniform-balance true, the value returned is false and the uniform balance algorithm is disabled for this namespace on all participating nodes.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
effective_replication_factor
optional
Context
namespace
Prometheus Name
aerospike_namespace_effective_replication_factor
Description

The effective replication factor for the namespace, included with the namespace info command metrics.

The effective replication factor is less than the replication-factor if the cluster size is smaller than the RF, in which case the effective replication factor would match the cluster size.

In Database 5.7 and earlier, if the paxos-single-replica-limit size is reached, the effective replication factor is 1.

The effective replication factor is 0 for a node that has been orphaned by the cluster. For example, if a node tries to join a cluster but that node is unable to communicate with every other node in the cluster, the principal node rejects the request and the node marks itself as an orphan.

Introduced
3.15.1.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

For AP namespaces in Database 7.1 and earlier, the effective replication factor drops when a node is shut down or crashes, and the remaining nodes are fewer than the RF.
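
The rules in the description can be condensed into a short sketch. The function and its parameters are hypothetical, and it deliberately ignores the paxos-single-replica-limit case that applies to Database 5.7 and earlier.

```python
def effective_replication_factor(
    replication_factor: int, cluster_size: int, orphaned: bool = False
) -> int:
    """Illustrative model of the effective replication factor.

    - An orphaned node reports 0.
    - Otherwise the value is capped by the cluster size.
    """
    if orphaned:
        return 0
    return min(replication_factor, cluster_size)


print(effective_replication_factor(2, 5))                 # 2
print(effective_replication_factor(3, 2))                 # 2
print(effective_replication_factor(2, 5, orphaned=True))  # 0
```
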

evict_ttl
optional
Context
namespace
Prometheus Name
aerospike_namespace_evict_ttl
Description

The current eviction depth, or the highest TTL of records that have been evicted, in seconds.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
evict_void_time
optional
Context
namespace
Prometheus Name
aerospike_namespace_evict_void_time
Description

The current eviction depth, expressed as a void time in seconds since 1 January 2010 UTC.

Introduced
4.5.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
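
Because evict_void_time counts seconds from 1 January 2010 UTC, it can be converted to a calendar timestamp by adding it to that epoch. The sketch below uses only the Python standard library; the constant and function names are hypothetical, and the sample value is made up.

```python
from datetime import datetime, timedelta, timezone

# Epoch stated in the evict_void_time description: 1 January 2010 UTC.
VOID_TIME_EPOCH = datetime(2010, 1, 1, tzinfo=timezone.utc)


def void_time_to_datetime(void_time_seconds: int) -> datetime:
    """Convert a void time (seconds since 1 January 2010 UTC) to a datetime."""
    return VOID_TIME_EPOCH + timedelta(seconds=void_time_seconds)


# Hypothetical reading: exactly one day after the epoch.
print(void_time_to_datetime(86400).isoformat())  # 2010-01-02T00:00:00+00:00
```
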
evicted_objects
watch
Context
namespace
Prometheus Name
aerospike_namespace_evicted_objects
Description

Number of objects evicted from this namespace on this node since the server started.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
fail_client_lost_conflict
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_client_lost_conflict
Description

Number of non-XDR write commands that failed because some bin’s last-update-time is greater than the write command’s time. Error code 28 is returned. This can happen only when the XDR bin convergence feature is enabled, due to either of the following:

  • a clock skew across DCs causing XDR write commands to write bins with a future timestamp compared to local time.

  • a race condition between an incoming XDR write command and a local client write command.

See fail_xdr_lost_conflict and cluster_max_compatibility_id.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
fail_generation
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_generation
Description

Number of read/write commands that failed the generation check.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
fail_key_busy
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_key_busy
Description

Number of read/write commands that failed on ‘hot keys’, meaning the number of commands already queued for the same record, waiting in the rw-hash or rw_in_progress, was higher than transaction-pending-limit. For reads, this can happen only when duplicate resolution is necessary.

Monitoring

If the application is not expected to have hot keys and fail_key_busy rate of change exceeds expectations, this condition might indicate a problem with the application.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Detail level logging for the rw context will log transactions (digest) triggering this error. Read transactions would only fail if they had to go through the rw-hash (for example, if duplicate resolution is in effect).
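
Since fail_key_busy is a counter, the rate of change mentioned under Monitoring has to be derived from two samples. The following is a minimal sketch with a hypothetical helper and made-up sample values; it assumes the counter restarts from zero when the server restarts.

```python
def counter_rate(prev_value: int, prev_ts: float, curr_value: int, curr_ts: float) -> float:
    """Per-second rate of change between two samples of a counter."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: a decrease means the counter was reset (for example by a
        # server restart), so only the new value is counted.
        delta = curr_value
    return delta / elapsed


# Two hypothetical fail_key_busy samples taken 60 seconds apart.
print(counter_rate(1200, 0.0, 1500, 60.0))  # 5.0
```

The resulting rate can then be compared against whatever baseline the application is expected to produce.
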

fail_mrt_blocked
enterprise, watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_mrt_blocked
Description

Number of transactions or read/write commands blocked by an ongoing transaction.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, ns
fail_mrt_version_mismatch
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_mrt_version_mismatch
Description

Number of version mismatches - usually in verify reads, but also individual commands (reads/writes/deletes/UDFs) where version checks occur if the record had previously been read in the transaction.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, ns
fail_record_too_big
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_record_too_big
Description

Number of write commands that failed because a record was larger than max-record-size. Counts only client write failures on the master side.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Detail level logging for the rw context will log transactions (digest) triggering this error (originating from client side master writes). Enabling detail level logging for the drv_ssd context will log all attempts at writing records that are too big, including replica-writes, immigration (migrations) writes and applying duplicate resolution winners. See “How do I change the write-block-size configuration?” for more information.

fail_xdr_forbidden
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_xdr_forbidden
Description

Number of read/write commands that failed due to configuration restriction. Error code 22 is returned. This counts any of the traffic rejected due to either of the following:

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
fail_xdr_key_busy
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_xdr_key_busy
Description

Number of XDR key-busy errors (code 32) that have occurred. This error is raised if either of the following occurs:

Introduced
7.2
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
fail_xdr_lost_conflict
watch
Context
namespace
Prometheus Name
aerospike_namespace_fail_xdr_lost_conflict
Description

Number of XDR write commands that did not succeed in updating all the attempted bins. Only a subset of bin updates might have failed or all the bin updates might have failed. This can happen only when the XDR bin convergence feature is enabled. If a conflicting write happens on the same record across two or more data centers, the bin with the earlier last update time will lose during XDR shipping. An XDR retry due to a timeout, where a record that has already been successfully updated at a destination is received again, would fail and this metric will be updated. In other retry scenarios, such as key busy or device busy, the remote record will not be updated. Only a timeout-based retry can lead to this situation. See fail_client_lost_conflict.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_delete_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_delete_error
Description

Number of batch-index delete subtransactions proxied from another node that failed with an error.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_delete_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_delete_filtered_out
Description

Number of batch-index delete subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_delete_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_delete_not_found
Description

Number of batch-index delete subtransactions proxied from another node that resulted in not found.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_delete_success
Description

Number of records successfully deleted by batch-index subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_delete_timeout
Description

Number of batch-index delete subtransactions proxied from another node that timed out.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_lang_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_lang_delete_success
Description

Number of successful batch-index UDF delete subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_lang_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_lang_error
Description

Number of language (Lua) batch-index errors for UDF subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_lang_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_lang_read_success
Description

Number of successful batch-index UDF read subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_lang_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_lang_write_success
Description

Number of successful batch-index UDF write subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_read_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_read_error
Description

Number of batch-index read subtransactions proxied from another node that failed with an error.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_read_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_read_filtered_out
Description

Number of batch-index read subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_read_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_read_not_found
Description

Number of batch-index read subtransactions proxied from another node that resulted in not found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_read_success
Description

Number of records successfully read by batch-index subtransactions proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_read_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_read_timeout
Description

Number of batch-index read subtransactions proxied from another node that timed out.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_tsvc_error
Description

Number of batch-index read subtransactions proxied from another node that failed with an error in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this will include transactions against unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_tsvc_timeout
Description

Number of batch-index read subtransactions proxied from another node that timed out in the transaction service, before attempting to handle the transaction.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_udf_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_udf_complete
Description

Number of completed batch-index UDF subtransactions proxied from another node for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_udf_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_udf_error
Description

Number of failed batch-index UDF subtransactions proxied from another node for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_udf_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_udf_filtered_out
Description

Number of batch-index UDF subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_udf_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_udf_timeout
Description

Number of batch-index UDF subtransactions proxied from another node that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: from_proxy_batch_sub_lang_delete_success, from_proxy_batch_sub_lang_error, from_proxy_batch_sub_lang_read_success, from_proxy_batch_sub_lang_write_success.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_write_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_write_error
Description

Number of batch-index write subtransactions proxied from another node that failed with an error.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_write_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_write_filtered_out
Description

Number of batch-index write subtransactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_write_success
Description

Number of records successfully written by batch-index subtransactions proxied from another node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_batch_sub_write_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_batch_sub_write_timeout
Description

Number of batch-index write subtransactions proxied from another node that timed out.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_delete_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_delete_error
Description

Number of errors for delete transactions proxied from another node. This includes xdr_from_proxy_delete_error.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_delete_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_delete_filtered_out
Description

Number of delete transactions proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_delete_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_delete_not_found
Description

Number of delete transactions proxied from another node that resulted in not found. This includes xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_delete_success
Description

Number of successful delete transactions proxied from another node. This includes xdr_from_proxy_delete_success.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_delete_timeout
Description

Number of timeouts for delete transactions proxied from another node. This includes xdr_from_proxy_delete_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_lang_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_lang_delete_success
Description

Number of successful UDF delete transactions proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_lang_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_lang_error
Description

Number of language (Lua) errors for UDF transactions proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_lang_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_lang_read_success
Description

Number of successful UDF read commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_lang_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_lang_write_success
Description

Number of successful UDF write commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_read_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_read_error
Description

Number of errors for read commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_read_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_read_filtered_out
Description

Number of read commands proxied from another node that did not happen because they were filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_read_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_read_not_found
Description

Number of read commands proxied from another node that resulted in not found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_read_success
Description

Number of successful read commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_read_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_read_timeout
Description

Number of timeouts for read commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
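
The from_proxy_read_* counters above can be combined into a per-outcome breakdown. This is only a sketch: the helper is hypothetical, the sample values are made up, and it assumes the five outcomes (success, error, timeout, not_found, filtered_out) together cover all proxied reads that reached record handling.

```python
PROXY_READ_OUTCOMES = ("success", "error", "timeout", "not_found", "filtered_out")


def proxied_read_breakdown(stats: dict) -> dict:
    """Return each outcome's share of the summed from_proxy_read_* counters."""
    counts = {
        outcome: int(stats.get(f"from_proxy_read_{outcome}", 0))
        for outcome in PROXY_READ_OUTCOMES
    }
    total = sum(counts.values())
    if total == 0:
        return {outcome: 0.0 for outcome in PROXY_READ_OUTCOMES}
    return {outcome: counts[outcome] / total for outcome in PROXY_READ_OUTCOMES}


# Hypothetical counter values for one namespace on one node.
sample = {
    "from_proxy_read_success": 900,
    "from_proxy_read_error": 20,
    "from_proxy_read_timeout": 30,
    "from_proxy_read_not_found": 50,
}
for outcome, share in proxied_read_breakdown(sample).items():
    print(f"{outcome}: {share:.1%}")
```

Because these are counters accumulated over time, the same breakdown applied to the difference between two samples describes only the interval between them.
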
from_proxy_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_tsvc_error
Description

Number of commands proxied from another node that failed in the transaction service, before attempting to handle the commands. For example, protocol errors or security permission mismatch. In strong-consistency enabled namespaces, this will include commands against unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
from_proxy_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_tsvc_timeout
Description

Number of commands proxied from another node that timed out while in the transaction service, before attempting to handle the commands. At this stage the command has not yet been identified as a read or a write, but the namespace is known. There could be congestion in the internal transaction queue, or the timeout set by the client could be too aggressive.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_udf_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_udf_complete
Description

Number of successful UDF commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_udf_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_udf_error
Description

Number of errors for UDF commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_udf_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_udf_filtered_out
Description

Number of UDF commands proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_udf_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_udf_timeout
Description

Number of timeouts for UDF commands proxied from another node.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_write_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_write_error
Description

Number of errors for write commands proxied from another node. This includes xdr_from_proxy_write_error.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_write_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_write_filtered_out
Description

Number of write commands proxied from another node that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_write_success
Description

Number of successful write commands proxied from another node. This includes xdr_from_proxy_write_success.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
from_proxy_write_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_from_proxy_write_timeout
Description

Number of timeouts for write commands proxied from another node. This includes xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
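As a rough illustration of how the from_proxy_write_* counters can be read together, the sketch below computes each outcome's share of proxied writes between two samples. The function name and the sample dictionaries are hypothetical; only the metric names come from this reference, and treating these four counters as the full set of outcomes is an assumption.

```python
from typing import Dict

# Metric names as listed in this reference.
WRITE_COUNTERS = (
    "from_proxy_write_success",
    "from_proxy_write_error",
    "from_proxy_write_timeout",
    "from_proxy_write_filtered_out",
)


def proxied_write_breakdown(prev: Dict[str, int], curr: Dict[str, int]) -> Dict[str, float]:
    """Return each outcome's share of proxied writes between two samples."""
    deltas = {}
    for name in WRITE_COUNTERS:
        diff = curr.get(name, 0) - prev.get(name, 0)
        # Counters only increase; a negative difference is treated here as a
        # counter reset (for example after a node restart).
        deltas[name] = diff if diff >= 0 else curr.get(name, 0)
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in WRITE_COUNTERS}
    return {name: deltas[name] / total for name in WRITE_COUNTERS}
```

A rising share for from_proxy_write_timeout or from_proxy_write_error between samples is the kind of change this breakdown is meant to surface.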
geo_region_query_cells
Context
namespace
Prometheus Name
aerospike_namespace_geo_region_query_cells
Description

Number of cell coverings for queried regions.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
geo_region_query_falsepos
Context
namespace
Prometheus Name
aerospike_namespace_geo_region_query_falsepos
Description

Number of points outside the region. Total query result points is geo_region_query_points + geo_region_query_falsepos.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
geo_region_query_points
Context
namespace
Prometheus Name
aerospike_namespace_geo_region_query_points
Description

Number of points within the region. Total query result points is geo_region_query_points + geo_region_query_falsepos.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
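The relationship between geo_region_query_points and geo_region_query_falsepos can be turned into a false-positive ratio. The following is a minimal sketch; the function name is hypothetical and the inputs are assumed to be values read from these two metrics.

```python
def geo_false_positive_ratio(points: int, falsepos: int) -> float:
    """Fraction of candidate points that fell outside the query region."""
    # Total query result points = points inside the region + false positives.
    total = points + falsepos
    if total == 0:
        return 0.0
    return falsepos / total
```

A high ratio suggests that the cell coverings return many candidates outside the region.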
geo_region_query_reqs
Context
namespace
Prometheus Name
aerospike_namespace_geo_region_query_reqs
Description

Number of geo queries on the system since the node started.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
hwm_breached
critical
Context
namespace
Prometheus Name
aerospike_namespace_hwm_breached
Description

If true, Aerospike has breached ‘high-water-[disk|memory]-pct’ for this namespace.

Monitoring

If hwm_breached is true, alert your operations group that memory or disk resources are strained. This condition might indicate the need to increase cluster capacity.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
boolean
Labels
cluster_namejobserviceinstancelongitudelatitudens
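The monitoring advice for hwm_breached could be automated by scanning exporter output. The sketch below parses Prometheus text exposition lines for aerospike_namespace_hwm_breached and returns the namespaces that report a breach. It assumes the exporter emits the boolean as 1 for true and 0 for false and that each sample carries an ns label; the function name is hypothetical.

```python
import re
from typing import List

# Matches one sample line of the metric, capturing its label set and value.
_SAMPLE = re.compile(
    r'^aerospike_namespace_hwm_breached\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
# Extracts the ns label; \b avoids matching labels that merely end in "ns".
_NS = re.compile(r'\bns="(?P<ns>[^"]*)"')


def namespaces_over_hwm(exposition_text: str) -> List[str]:
    """Return namespaces whose hwm_breached sample equals 1 (assumed true)."""
    breached = []
    for line in exposition_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if not match:
            continue  # comment lines and other metrics are skipped
        ns = _NS.search(match.group("labels"))
        if ns and float(match.group("value")) == 1.0:
            breached.append(ns.group("ns"))
    return breached
```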
index-type.mount[ix].age
optional
Context
namespace
Prometheus Name
aerospike_namespace_index-type.mount[ix].age
Description

Applies only to Enterprise Edition configured with index-type flash. Shows the percentage of device lifetime (total usage) claimed by the OEM for the underlying device. The value is -1 unless the underlying device is NVMe, and it may exceed 100. ‘ix’ is the device index. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for 2 files specified in the configuration.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_flash_alloc_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_index_flash_alloc_bytes
Description

Applies only to Enterprise Edition configured with index-type flash. Total bytes allocated on the mount(s) for the primary index used by this namespace on this node. This statistic represents entire 4KiB chunks which have at least one element in use. Also available in the log on the index-flash-usage ticker entry.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_flash_alloc_pct
warn
Context
namespace
Prometheus Name
aerospike_namespace_index_flash_alloc_pct
Description

Applies only to Enterprise Edition configured with index-type flash. Percentage of the mount(s) allocated for the primary index used by this namespace on this node. Prior to Database 7.0, calculated as (index_flash_alloc_bytes / index-type.mounts-size-limit) * 100. In Database 7.0 and later, calculated as (index_flash_alloc_bytes / index-type.mounts-budget) * 100. This statistic represents entire 4KiB chunks which have at least one element in use. Also available in the log on the index-flash-usage ticker entry.

Monitoring

If index_flash_alloc_pct gets close to or greater than 100%, alert operations to review the sizing of the namespace.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
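The percentage formula in the description can be checked by hand. The following is a minimal sketch of the Database 7.0 and later calculation, assuming both inputs are expressed in bytes; the function name is hypothetical, and how the server rounds the result to an integer is not specified here.

```python
def index_flash_alloc_pct(alloc_bytes: int, mounts_budget_bytes: int) -> float:
    """(index_flash_alloc_bytes / index-type.mounts-budget) * 100."""
    if mounts_budget_bytes <= 0:
        raise ValueError("mounts budget must be positive")
    return alloc_bytes / mounts_budget_bytes * 100
```

For example, 2 GiB allocated against an 8 GiB budget gives 25 percent. Prior to Database 7.0, the same arithmetic applies with index-type.mounts-size-limit as the denominator.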
index_flash_used_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_index_flash_used_bytes
Description

Applies only to Enterprise Edition configured with index-type flash. Total bytes in-use on the mount(s) for the primary index used by this namespace on this node. This is the same value memory_used_index_bytes would have if the index were not persisted.

Introduced
4.3
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_flash_used_pct
watch
Context
namespace
Prometheus Name
aerospike_namespace_index_flash_used_pct
Description

Applies only to Enterprise Edition configured with index-type flash. Percentage of the mount(s) in-use for the primary index used by this namespace on this node. Calculated as (index_flash_used_bytes / index-type.mounts-size-limit) * 100.

Introduced
4.3
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_mounts_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_index_mounts_used_pct
Description

Applies only to Enterprise Edition configured with index-type pmem or flash. Percentage of the mount(s) in-use for the primary index used by this namespace on this node.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_pmem_used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_index_pmem_used_bytes
Description

Applies only to Enterprise Edition configured with index-type pmem. Total bytes in-use on the mount(s) for the primary index used by this namespace on this node. This is the same value memory_used_index_bytes would have if the index were not persisted.

Introduced
4.5
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_pmem_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_index_pmem_used_pct
Description

Applies only to Enterprise Edition configured with index-type pmem. Percentage of the mount(s) in-use for the primary index used by this namespace on this node. Calculated as (index_pmem_used_bytes / index-type.mounts-size-limit) * 100.

Introduced
4.5
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
index_used_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_index_used_bytes
Description

Amount of memory occupied by the primary index for this namespace. Applies to all types of index storage (index-type).

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
indexes_memory_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_indexes_memory_used_pct
Description

Combined RAM indexes’ size as a percentage of indexes-memory-budget when indexes-memory-budget is configured nonzero.

Introduced
7.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
master_tombstones
watch
Context
namespace
Prometheus Name
aerospike_namespace_master_tombstones
Description

Number of tombstones on this node which are active masters.

Introduced
3.10
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
max-evicted-ttl
optional
Context
namespace
Prometheus Name
aerospike_namespace_max-evicted-ttl
Description

The highest record TTL that Aerospike has evicted from this namespace.

Introduced
-
Removed
Yes
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
max_void_time
optional
Context
namespace
Prometheus Name
aerospike_namespace_max_void_time
Description

Maximum record TTL ever inserted into this namespace.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_free_pct
critical
Context
namespace
Prometheus Name
aerospike_namespace_memory_free_pct
Description

Percentage of memory capacity free for this namespace.

Monitoring

If memory_free_pct approaches the configured value for high-water-memory-pct or stop-writes-pct, alert operations to investigate the cause. This condition might indicate a need to reduce the object count or increase capacity. It may also require further investigation into memory_used_sindex_bytes if secondary indexes are in use, into memory_used_set_index_bytes if set indexes are used, or into heap_efficiency_pct if data is stored in memory.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_used_bytes
warn
Context
namespace
Prometheus Name
aerospike_namespace_memory_used_bytes
Description

Total bytes of memory used by this namespace on this node. Used against the high-water-memory-pct and stop-writes-pct thresholds. It represents the sum of the following values:

  • memory_used_data_bytes
  • memory_used_index_bytes
  • memory_used_set_index_bytes (Database 5.6 and later)
  • memory_used_sindex_bytes

See heap_allocated_kbytes for the total amount of memory allocated on a node other than primary index shared memory in Enterprise Edition and, for Database 6.1 and later, secondary index shared memory in Enterprise Edition.

Monitoring

Trending memory_used_bytes provides operations insight into how memory usage changes over time for this namespace.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
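The sum described for memory_used_bytes can be reproduced from its component metrics. The sketch below assumes the statistics have already been collected into a mapping keyed by metric name; the function name is hypothetical. memory_used_set_index_bytes is treated as 0 when absent, since it exists only in Database 5.6 and later.

```python
from typing import Mapping

# Component metrics named in the description of memory_used_bytes.
COMPONENTS = (
    "memory_used_data_bytes",
    "memory_used_index_bytes",
    "memory_used_set_index_bytes",
    "memory_used_sindex_bytes",
)


def memory_used_bytes(stats: Mapping[str, int]) -> int:
    """Sum the per-component memory statistics for one namespace."""
    # Missing components (for example on older servers) count as 0.
    return sum(stats.get(name, 0) for name in COMPONENTS)
```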
memory_used_data_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_memory_used_data_bytes
Description

Amount of memory occupied by data. See memory_used_bytes for the total memory accounted for the namespace.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_used_index_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_memory_used_index_bytes
Description

Amount of memory occupied by the index for this namespace. Allocated in shared memory by default (index-type shmem) for the Enterprise Edition.
If your index is persisted, either in block storage (index-type flash) or in persistent memory (index-type pmem, Database 4.5 and later), refer instead to index_flash_used_bytes or index_pmem_used_bytes. For these persisted index configurations, the value of memory_used_index_bytes is 0.

See memory_used_bytes for the total memory accounted for the namespace.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_used_set_index_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_memory_used_set_index_bytes
Description

Amount of memory occupied by set indexes for this namespace on this node. See memory_used_bytes for the total memory accounted for the namespace.

Introduced
5.6
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_used_sindex_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_memory_used_sindex_bytes
Description

Amount of memory occupied by secondary indexes for this namespace on this node. See memory_used_bytes for the total memory accounted for the namespace.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_fresh_partitions
watch
Context
namespace
Prometheus Name
aerospike_namespace_migrate_fresh_partitions
Description

Number of partitions that are created fresh or empty because a number of nodes, greater than the replication factor, have left the cluster. Applies to AP and SC namespaces.

Introduced
7.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
migrate_record_receives
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_record_receives
Description

Number of record insert requests received by immigration.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_record_retransmits
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_record_retransmits
Description

Number of times emigration has retransmitted records.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
Detail

Retransmission statistics are collected in the retransmits ticker log line.

migrate_records_skipped
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_records_skipped
Description

Number of times emigration did not ship a record because the remote node was already up-to-date.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_records_transmitted
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_records_transmitted
Description

Number of records emigration has read and sent.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_records_unreadable
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_records_unreadable
Description

Number of records skipped during migration because they were unreadable when migrate-skip-unreadable is enabled.

Introduced
7.0.0.18, 7.1.0.9, 7.2.0.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_rx_instance_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_rx_instance_count
Description

Number of instance objects managing immigrations.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_rx_partitions_active
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_rx_partitions_active
Description

Number of partitions currently immigrating to this node. If migrate_rx_partitions_active is greater than 0 and cluster is not in maintenance, Operations needs to identify why migrations are running.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_rx_partitions_initial
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_rx_partitions_initial
Description

Total number of migrations this node will receive during the current migration cycle for this namespace.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_rx_partitions_remaining
watch
Context
namespace
Prometheus Name
aerospike_namespace_migrate_rx_partitions_remaining
Description

Number of migrations this node has not yet received during the current migration cycle for this namespace.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
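Taken together, migrate_rx_partitions_initial and migrate_rx_partitions_remaining give a progress figure for the current migration cycle. The following is a minimal sketch; the function name is hypothetical, and reporting 100 percent when nothing was scheduled is an assumption. The same arithmetic applies to the migrate_tx_partitions_initial and migrate_tx_partitions_remaining pair.

```python
def migration_progress_pct(initial: int, remaining: int) -> float:
    """Percentage of this cycle's scheduled partition migrations completed."""
    if initial <= 0:
        # Nothing was scheduled for this cycle (assumption: report as done).
        return 100.0
    completed = initial - remaining
    return completed / initial * 100
```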
migrate_signals_active
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_signals_active
Description

For finished partition migrations on this node, number of outstanding clean-up signals, sent to participating member nodes, waiting for clean-up acknowledgment. Signals are messages that are sent from a partition’s master node to all other nodes that currently have data for the partition. The signals notify all nodes that migrations have completed for this partition and that, if they aren’t a replica, they can now drop the partition.

Introduced
3.13.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_signals_remaining
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_signals_remaining
Description

For unfinished partition migrations on this node, number of clean-up signals to send to participating member nodes as migration completes. Signals are messages that are sent from a partition’s master node to all other nodes that currently have data for the partition. The signals notify all nodes that migrations have completed for this partition and that, if they aren’t a replica, they can now drop the partition.

Introduced
3.13.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_tx_instance_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_instance_count
Description

Number of instance objects managing emigrations.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_tx_partitions_active
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_partitions_active
Description

Number of partitions currently emigrating from this node. If migrate_tx_partitions_active is greater than 0 and cluster is not in maintenance, Operations needs to identify why migrations are running.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
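The guidance for migrate_rx_partitions_active and migrate_tx_partitions_active amounts to a simple check. A minimal sketch follows, assuming the caller knows whether the cluster is in a maintenance window; the function name and the in_maintenance flag are hypothetical.

```python
def migrations_need_attention(rx_active: int, tx_active: int, in_maintenance: bool) -> bool:
    """True when partitions are migrating outside of planned maintenance."""
    migrating = rx_active > 0 or tx_active > 0
    return migrating and not in_maintenance
```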
migrate_tx_partitions_imbalance
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_partitions_imbalance
Description

Number of partition migration failures which could lead to partitions being imbalanced. For each increment there will also be a warning logged.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_tx_partitions_initial
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_partitions_initial
Description

Total number of migrations this node will send during the current migration cycle for this namespace.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_tx_partitions_lead_remaining
optional
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_partitions_lead_remaining
Description

Number of initially scheduled emigrations which are not delayed by the migrate-fill-delay configuration. Lead migrations are typically delta-migrations addressing non-empty partition replica nodes. Delta-migrations generally consume far less storage IO.

Introduced
4.3.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
migrate_tx_partitions_remaining
watch
Context
namespace
Prometheus Name
aerospike_namespace_migrate_tx_partitions_remaining
Description

Number of migrations this node has not yet sent during the current migration cycle for this namespace.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
mrt_monitor_roll_back_error
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_back_error
Description

Subset of mrt_roll_back_error where monitor did the roll back.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_back_success
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_back_success
Description

Subset of mrt_roll_back_success where monitor did the roll back.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_back_timeout
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_back_timeout
Description

Subset of mrt_roll_back_timeout where monitor did the roll back.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_forward_error
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_forward_error
Description

Subset of mrt_roll_forward_error where monitor did the roll forward.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_forward_success
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_forward_success
Description

Subset of mrt_roll_forward_success where monitor did the roll forward.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_forward_timeout
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_forward_timeout
Description

Subset of mrt_roll_forward_timeout where monitor did the roll forward.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitor_roll_tombstone_creates
enterpriseoptional
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitor_roll_tombstone_creates
Description

Number of times monitor transaction rolls (forward or back) generate tombstones from nothing. This is rare but normal.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_monitors_active
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_monitors_active
Description

Number of transactions currently being driven by a monitor.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_provisionals
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_provisionals
Description

Number of provisional records in a transaction.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_back_error
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_back_error
Description

Number of roll back transactions that failed.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_back_success
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_back_success
Description

Number of roll back transactions that succeeded.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_back_timeout
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_back_timeout
Description

Number of roll back transactions that timed out.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_forward_error
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_forward_error
Description

Number of roll forward transactions that failed.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_forward_success
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_forward_success
Description

Number of roll forward transactions that succeeded.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_roll_forward_timeout
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_roll_forward_timeout
Description

Number of roll forward transactions that timed out.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
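Because each mrt_monitor_roll_* metric is described as a subset of the corresponding mrt_roll_* metric, the two can be split into monitor-driven and other rolls. The sketch below is illustrative only; the function name and the "other" label are hypothetical, and the result is clamped at 0 on the assumption that the two values may be sampled at slightly different moments.

```python
from typing import Dict


def split_rolls_by_driver(total: int, monitor: int) -> Dict[str, int]:
    """Split a roll metric into monitor-driven and all other rolls."""
    # monitor is a subset of total; clamp in case of sampling skew.
    other = max(total - monitor, 0)
    return {"monitor": monitor, "other": other}
```

For example, with mrt_roll_back_error at 10 and mrt_monitor_roll_back_error at 3, seven failed roll backs were not driven by a monitor.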
mrt_verify_read_error
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_verify_read_error
Description

Number of verify read commands that failed.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_verify_read_success
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_verify_read_success
Description

Number of verify read commands that succeeded.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
mrt_verify_read_timeout
enterprisewatch
Context
namespace
Prometheus Name
aerospike_namespace_mrt_verify_read_timeout
Description

Number of verify read commands that timed out.

Introduced
8.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancens
nodes_quiesced
optional
Context
namespace
Prometheus Name
aerospike_namespace_nodes_quiesced
Description

The number of nodes observed to be quiesced as of the most recent reclustering event. If a single node received the quiesce command, all nodes return 1 for this metric on the subsequent reclustering event. When the quiesced node is shut down, triggering a new reclustering event, this metric returns to 0.

Introduced
4.4
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
non_expirable_objects
optional
Context
namespace
Prometheus Name
aerospike_namespace_non_expirable_objects
Description

Number of records in this namespace with non-expirable TTLs (TTLs of value 0).

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
non_replica_objects
watch
Context
namespace
Prometheus Name
aerospike_namespace_non_replica_objects
Description

Number of records on this node which are neither master nor replicas. This number is non-zero during migration, representing additional versions or copies of records. These records sit beyond the replication factor and may be used for duplicate resolution during migrations. This is not true for quiesced nodes, which retain their partitions after migrations have completed.

Introduced
3.13
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
non_replica_tombstones
watch
Context
namespace
Prometheus Name
aerospike_namespace_non_replica_tombstones
Description

Number of tombstones on this node which are neither master nor replicas. This number is non-zero only during migration. This is not true for quiesced nodes, which retain their partitions after migrations have completed.

Introduced
3.13
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
nsup_cycle_deleted_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_nsup_cycle_deleted_pct
Description

Percent of records removed by NSUP in its last cycle.

Introduced
6.3
Removed
-
Measurement type
gauge
Data type
float
Labels
cluster_namejobserviceinstancelongitudelatitudens
Detail

nsup_cycle_deleted_pct is calculated when the NSUP (Namespace SUPervisor) cycle finishes (nsup-done is logged). It is calculated based on the total objects present at the beginning of the NSUP cycle and the number of objects that got deleted in that cycle (nsup_cycle_deleted_pct = (objects removed by NSUP in its last cycle * 100) / number of total objects when the NSUP cycle started [expirable + non expirable]).

The calculation was different in older versions (it was changed in versions 6.3.0.21, 6.4.0.15, 7.0.0.8 and 7.1.0.0). In those older versions, nsup_cycle_deleted_pct was calculated based on the total objects present after the NSUP cycle finished and the number of objects that got deleted in that cycle.

This led to two special cases in which the value was reported as 100:

  • When the number of objects is 0 after the NSUP cycle finishes (that is, all objects were deleted), or when the number of objects deleted in the cycle is greater than or equal to the number of objects left.
  • If NSUP is enabled for a namespace (that is, nsup-period is greater than 0) and there are 0 records in the namespace.
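The current calculation of nsup_cycle_deleted_pct described above can be sketched as follows. The function and parameter names are hypothetical, and returning 0 for an empty namespace is an assumption rather than documented server behavior.

```python
def nsup_cycle_deleted_pct(objects_removed: int, objects_at_cycle_start: int) -> float:
    """(objects removed by NSUP in its last cycle * 100) / objects at cycle start."""
    if objects_at_cycle_start == 0:
        # Assumption: nothing to delete, so report 0 rather than divide by zero.
        return 0.0
    # objects_at_cycle_start counts expirable and non-expirable objects.
    return objects_removed * 100 / objects_at_cycle_start
```

For example, a cycle that starts with 1000 objects and removes 250 of them yields 25 percent.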
nsup_cycle_duration
optional
Context
namespace
Prometheus Name
aerospike_namespace_nsup_cycle_duration
Description

Length of the last NSUP cycle in seconds.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
nsup_xdr_key_busy
watch
Context
namespace
Prometheus Name
aerospike_namespace_nsup_xdr_key_busy
Description

Number of NSUP deletes (expirations and evictions) that had to wait for a previous version to ship. This error is raised if either of the following occurs:

  • ship-versions-policy is all and the most recent update to the record has not yet successfully shipped to the destination.
  • ship-versions-policy is interval and XDR hasn’t successfully shipped at least one version of the record within the most recent ship-versions-interval (in seconds).
Introduced
7.2
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
objects
watch
Context
namespace
Prometheus Name
aerospike_namespace_objects
Description

Number of records in this namespace for this node. Includes non-replica. Does not include tombstones.

Monitoring

Trending objects provides operations insight into this namespace’s record fluctuations over time.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_tsvc_error
Description

Number of times a background query operate command failed to access a record, for example due to protocol or permission errors. Does not include timeouts. In strong-consistency enabled namespaces, this includes attempts to access records in unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_tsvc_timeout
Description

Number of records accessed by a background query operate command that timed out in the transaction service.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_write_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_write_error
Description

Number of records accessed by background query operate command write subtransactions that failed with an error. Does not include timeouts.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_write_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_write_filtered_out
Description

Number of records accessed by background query operate command write subtransactions for which the write did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_write_success
Description

Number of records successfully written by background query operate command write subtransactions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
ops_sub_write_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_ops_sub_write_timeout
Description

Number of records accessed by background query operate command write subtransactions that timed out.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pending_quiesce
optional
Context
namespace
Prometheus Name
aerospike_namespace_pending_quiesce
Description

Reports ‘true’ when the quiesce info command has been received by a node, or if stay-quiesced is true for the node. When true, the next clustering event will cause this node to quiesce. To trigger a clustering event, issue the recluster info command. To disable, issue the quiesce-undo info command.

Introduced
4.3.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_aggr_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_aggr_abort
Description

Number of primary index query aggregations that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_aggr_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_aggr_complete
Description

Number of primary index query aggregations that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_aggr_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_aggr_error
Description

Number of primary index query aggregations that failed.

Monitoring

Compare pi_query_aggr_error to pi_query_aggr_complete.

If ratio is higher than acceptable, alert operations to investigate.
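One way to make this comparison concrete is to compute, from two successive samples of the counters, the fraction of finished queries that failed. The sketch below is illustrative: the function name, the sample dictionaries, and the 1% threshold are assumptions, not values from this reference.

```python
def error_fraction(prev: dict, curr: dict, error_key: str, complete_key: str) -> float:
    """Fraction of queries that failed between two samples of cumulative counters."""
    errors = curr[error_key] - prev[error_key]
    completes = curr[complete_key] - prev[complete_key]
    finished = errors + completes
    # No finished queries in the interval: nothing to compare.
    return errors / finished if finished > 0 else 0.0


# Hypothetical samples taken a few minutes apart.
prev = {"pi_query_aggr_error": 10, "pi_query_aggr_complete": 990}
curr = {"pi_query_aggr_error": 15, "pi_query_aggr_complete": 1085}

ACCEPTABLE_FRACTION = 0.01  # assumed threshold; choose one that fits your workload

fraction = error_fraction(prev, curr, "pi_query_aggr_error", "pi_query_aggr_complete")
if fraction > ACCEPTABLE_FRACTION:
    print(f"alert: {fraction:.1%} of primary index query aggregations failed")
```

Working on the difference between samples rather than on the raw totals keeps old errors from masking a recent change. The same helper applies to the other error and complete counter pairs by passing different keys.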

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_long_basic_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_long_basic_abort
Description

Number of basic long primary index queries that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_long_basic_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_long_basic_complete
Description

Number of basic long primary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_long_basic_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_long_basic_error
Description

Number of basic long primary index queries that failed.

Monitoring

Compare pi_query_long_basic_error to pi_query_long_basic_complete.

If ratio is higher than acceptable, alert operations to investigate.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_ops_bg_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_ops_bg_abort
Description

Number of ops background primary index queries that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_ops_bg_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_ops_bg_complete
Description

Number of ops background primary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_ops_bg_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_ops_bg_error
Description

Number of ops background primary index queries that failed.

Monitoring

Compare pi_query_ops_bg_error to pi_query_ops_bg_complete.

If ratio is higher than acceptable, alert operations to investigate.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_short_basic_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_short_basic_complete
Description

Number of basic short primary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_short_basic_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_short_basic_error
Description

Number of basic short primary index queries that failed.

Monitoring

Compare pi_query_short_basic_error to pi_query_short_basic_complete.

If ratio is higher than acceptable, alert operations to investigate.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_short_basic_timeout
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_short_basic_timeout
Description

Number of basic short primary index queries that timed out. Short primary index queries are not monitored, so they cannot be aborted, but they can time out.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_udf_bg_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_udf_bg_abort
Description

Number of UDF background primary index queries that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_udf_bg_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_udf_bg_complete
Description

Number of UDF background primary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pi_query_udf_bg_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_pi_query_udf_bg_error
Description

Number of UDF background primary index queries that failed.

Monitoring

Compare pi_query_udf_bg_error to pi_query_udf_bg_complete.

If ratio is higher than acceptable, alert operations to investigate.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pmem_available_pct
critical
Context
namespace
Prometheus Name
aerospike_namespace_pmem_available_pct
Description

Measures the minimum contiguous pmem storage file space across all such files in a namespace. The namespace becomes read-only (stop writes) if this value falls below min-avail-pct. All configured pmem storage files in a namespace should have the same size; otherwise, pmem_available_pct could be low even when a lot of space is available across other files.

Monitoring

If pmem_available_pct drops below 20%, warn your operations group.

This condition might indicate that defrag is unable to keep up with the current load.

If pmem_available_pct drops below 15%, raise a critical alert.

If pmem_available_pct drops below 5%, usable PMem resources are critically low. This condition might result in stop_writes.
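The thresholds above could be encoded in a monitoring script along the following lines. This is a minimal sketch: the function name and the severity labels are hypothetical, and only the 20%, 15%, and 5% boundaries come from the text.

```python
def pmem_available_severity(pmem_available_pct: int) -> str:
    """Map a pmem_available_pct reading to a severity label (labels are illustrative)."""
    if pmem_available_pct < 5:
        # Usable PMem resources are critically low; stop_writes may follow.
        return "critically_low"
    if pmem_available_pct < 15:
        return "critical"
    if pmem_available_pct < 20:
        # Defrag may be unable to keep up with the current load.
        return "warning"
    return "ok"
```

Checking the lowest threshold first ensures that a reading such as 4 is reported at the most severe level, not as a plain warning.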

Introduced
4.8
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Not to be confused with pmem_free_pct which represents the amount of free space across all PMem storage files in a namespace and does not take account of the fragmentation.

Here is an example to represent the difference between pmem_free_pct and pmem_available_pct. Assume 5 files of 96MiB each for a given namespace, where each file has 24MiB of data that are spread across 6 write-blocks (with the 8MiB write-block-size):

  • The pmem_free_pct would be 75%.
  • The pmem_available_pct would be 50%.
  • If the distribution is not uniform (it usually is not perfectly uniform) the pmem_available_pct would represent the file that has the least free blocks.
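The arithmetic behind the example above can be reproduced with a short Python sketch. It is a simplified model, not the server's implementation: the function names are hypothetical, and pmem_available_pct is approximated as the share of wholly free write-blocks in the file that has the fewest of them.

```python
MIB = 1024 * 1024


def pmem_free_pct(file_size: int, used_bytes_per_file: list[int]) -> float:
    """Free space across all files, ignoring fragmentation."""
    total = file_size * len(used_bytes_per_file)
    return 100 * (total - sum(used_bytes_per_file)) / total


def pmem_available_pct(file_size: int, write_block_size: int,
                       used_blocks_per_file: list[int]) -> float:
    """Share of wholly free write-blocks in the file that has the fewest of them."""
    blocks_per_file = file_size // write_block_size
    return min(100 * (blocks_per_file - used) / blocks_per_file
               for used in used_blocks_per_file)


# The example above: 5 files of 96MiB, each holding 24MiB of data in 6 write-blocks of 8MiB.
free = pmem_free_pct(96 * MIB, [24 * MIB] * 5)
available = pmem_available_pct(96 * MIB, 8 * MIB, [6] * 5)

# Non-uniform case: one file has 9 write-blocks in use instead of 6.
skewed = pmem_available_pct(96 * MIB, 8 * MIB, [6, 6, 6, 6, 9])
```

Each 96MiB file holds 12 write-blocks of 8MiB. With 6 of them in use, half of the blocks are free even though only a quarter of the bytes are occupied, which is why the two metrics differ. In the non-uniform case the single fuller file determines the result.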

pmem_compression_ratio
watch
Context
namespace
Prometheus Name
aerospike_namespace_pmem_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio for PMem storage. 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size). pmem_compression_ratio is not included if the compression configuration parameter is set to none.

Introduced
4.8
Removed
7.0
Measurement type
moving average
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average, calculated based on the most recently written records. Read records do not factor into the ratio. If the written data changes over time then the compression ratio will change with it. In case of a sudden change in data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recently written 100,000 to 1,000,000 records.

pmem_free_pct
watch
Context
namespace
Prometheus Name
aerospike_namespace_pmem_free_pct
Description

Percentage of pmem storage capacity free for this namespace. This is the amount of free storage across all pmem storage files in the namespace. Evictions will be triggered when the used percentage across all storage files (which is represented by 100 - pmem_free_pct) crosses the configured high-water-disk-pct.

Introduced
4.8
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Not to be confused with pmem_available_pct which represents the amount of free contiguous space on the PMem storage file that has the least contiguous free space across the namespace.

Here is an example to represent the difference between pmem_free_pct and pmem_available_pct. Assume 5 files of 96MiB each for a given namespace, where each file has 24MiB of data that are spread across 6 write-blocks (with the 8MiB write-block size):

  • The pmem_free_pct would be 75%.
  • The pmem_available_pct would be 50%.
  • If the distribution is not uniform (it usually is not perfectly uniform) the pmem_available_pct would represent the file that has the least free blocks.

pmem_total_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_pmem_total_bytes
Description

Total bytes of pmem storage file space allocated to this namespace on this node.

Introduced
4.8
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
pmem_used_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_pmem_used_bytes
Description

Total bytes of pmem storage file space used by this namespace on this node.

Monitoring

Trending pmem_used_bytes provides operations insight into how pmem storage usage changes over time for this namespace.

Introduced
4.8
Removed
7.0
Measurement type
gauge
Labels
cluster_name, job, service, instance, longitude, latitude, ns
prole_objects
watch
Context
namespace
Prometheus Name
aerospike_namespace_prole_objects
Description

Number of records on this node which are proles (replicas). Does not include tombstones.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
prole_tombstones
watch
Context
namespace
Prometheus Name
aerospike_namespace_prole_tombstones
Description

Number of tombstones on this node which are proles (replicas).

Introduced
3.10
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_agg
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_agg
Description

Number of query aggregations attempted. Removed in Database 5.7. Use query_aggr_complete + query_aggr_error + query_aggr_abort instead.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_agg_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_agg_abort
Description

Number of query aggregations aborted by the user seen by this node. Renamed to query_aggr_abort in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_agg_avg_rec_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_agg_avg_rec_count
Description

Average number of records returned by the aggregation's underlying query. Renamed to query_aggr_avg_rec_count in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_agg_error
Context
namespace
Prometheus Name
aerospike_namespace_query_agg_error
Description

Number of query aggregation errors due to an internal error. Renamed to query_aggr_error in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_agg_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_agg_success
Description

Number of query aggregations completed. Renamed to query_aggr_complete in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_aggr_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_aggr_abort
Description

Number of query aggregations aborted by the user seen by this node. Removed in Database 6.0, use si_query_aggr_abort.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_aggr_avg_rec_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_aggr_avg_rec_count
Description

Average number of records returned by the aggregation's underlying query.

Introduced
5.7
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_aggr_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_aggr_complete
Description

Number of query aggregations completed. Removed in Database 6.0, use si_query_aggr_complete.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_aggr_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_aggr_error
Description

Number of query aggregation errors due to an internal error. Removed in Database 6.0, use si_query_aggr_error.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_basic_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_basic_abort
Description

Number of secondary index basic queries that were aborted by a user. Removed in Database 6.0, use si_query_long_basic_abort.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_basic_avg_rec_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_basic_avg_rec_count
Description

Average number of records returned by all secondary index basic queries.

Introduced
5.7
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_basic_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_basic_complete
Description

Number of secondary index basic queries which completed successfully.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_basic_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_basic_error
Description

Number of secondary index basic queries that returned an error. Removed in Database 6.0, use si_query_long_basic_error.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_fail
watch
Context
namespace
Prometheus Name
aerospike_namespace_query_fail
Description

Number of queries that failed due to an internal error. These are failures that are not part of a query lookup (see query_lookup_error), a query aggregation (see query_agg_error), or a query background UDF (see query_udf_bg_failure).

Introduced
3.9
Removed
6.0
Measurement type
counter
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_false_positives
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_false_positives
Description

Number of entries that were shortlisted from the secondary index but whose bin values do not match the query clause. This might happen when the bin value changes during query execution.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_long_queue_full
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_long_queue_full
Description

Number of queue full errors for long running queries.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_long_reqs
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_long_reqs
Description

Number of long running queries currently in process.

Introduced
3.9
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_lookup_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_lookup_abort
Description

Number of user aborted secondary index queries. Renamed to query_basic_abort in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_lookup_avg_rec_count
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_lookup_avg_rec_count
Description

Average number of records returned by all secondary index query look-ups. Renamed to query_basic_avg_rec_count in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_lookup_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_lookup_error
Description

Number of secondary index query look-up errors. Renamed to query_basic_error in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_lookup_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_lookup_success
Description

Number of secondary index look-ups which succeeded. Renamed to query_basic_complete in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_lookups
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_lookups
Description

Number of secondary index lookups attempted. Removed in Database 5.7. Use query_basic_complete + query_basic_error + query_basic_abort instead.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_ops_bg_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_ops_bg_abort
Description

Number of ops background queries that were aborted. Removed in Database 6.0, use si_query_ops_bg_abort.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_ops_bg_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_ops_bg_complete
Description

Number of ops background queries that completed. Removed in Database 6.0, use si_query_ops_bg_complete.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_ops_bg_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_ops_bg_error
Description

Number of ops background queries that returned an error. Removed in Database 6.0, use si_query_ops_bg_error.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_ops_bg_failure
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_ops_bg_failure
Description

Number of ops background queries that failed. Removed from Database 5.7 and later, use query_ops_bg_error + query_ops_bg_abort instead.

Introduced
4.7
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_ops_bg_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_ops_bg_success
Description

Number of ops background queries that completed. Renamed to query_ops_bg_complete in Database 5.7.

Introduced
4.7
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_proto_compression_ratio
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_proto_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio for protocol message data in query responses to the client. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).

Introduced
4.8
Removed
-
Measurement type
moving average
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
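To make the ratio easier to interpret, the following sketch converts a reported compression ratio into a size reduction percentage and an estimated compressed size. The helper names are hypothetical, and because the ratio is a moving average, the size estimate is approximate.

```python
def size_reduction_pct(compression_ratio: float) -> float:
    """Percentage by which compression shrinks the data (0.100 corresponds to about 90)."""
    return (1.0 - compression_ratio) * 100.0


def estimated_compressed_bytes(uncompressed_bytes: int, compression_ratio: float) -> float:
    """Approximate size on the wire for a payload of the given uncompressed size."""
    return uncompressed_bytes * compression_ratio
```

A ratio of 1.000 gives a reduction of 0, meaning no compression, and lower ratios indicate stronger compression.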

query_proto_uncompressed_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_proto_uncompressed_pct
Description

Measures the percentage of query responses to the client with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).

Introduced
4.8
Removed
-
Measurement type
gauge
Data type
instantaneous
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.

query_reqs
watch
Context
namespace
Prometheus Name
aerospike_namespace_query_reqs
Description

Number of query requests ever attempted on this node. Even very early failures would be counted here, as opposed to query_short_running and query_long_running which would increment a bit later.

Introduced
3.9
Removed
6.0
Measurement type
counter
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_short_queue_full
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_short_queue_full
Description

Number of queue full errors for short running queries.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_short_reqs
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_short_reqs
Description

Number of short running queries currently in process.

Introduced
3.9
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_udf_bg_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_udf_bg_abort
Description

Number of UDF background queries that were aborted. Removed in Database 6.0, use si_query_udf_bg_abort.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_udf_bg_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_udf_bg_complete
Description

Number of UDF background queries that completed. Removed in Database 6.0, use si_query_udf_bg_complete.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_udf_bg_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_udf_bg_error
Description

Number of UDF background queries that returned an error. Removed in Database 6.0, use si_query_udf_bg_error.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_udf_bg_failure
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_udf_bg_failure
Description

Number of UDF background queries that failed. Removed from Database 5.7 and later, use query_udf_bg_error + query_udf_bg_abort instead.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
query_udf_bg_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_query_udf_bg_success
Description

Number of UDF background queries that completed. Renamed to query_udf_bg_complete in Database 5.7.

Introduced
3.9
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
re_repl_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_re_repl_error
Description

Number of re-replication errors that were not timeouts. Re-replications happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
re_repl_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_re_repl_success
Description

Number of successful re-replications. Re-replications would happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
re_repl_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_re_repl_timeout
Description

Number of re-replications that ended in timeout. Re-replications would happen for namespaces operating under the strong-consistency mode when a record does not successfully replicate on the initial attempt. Starting with Database 6.3 this stat only counts timeouts that happened during the actual re-replication.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The transaction-ttl of a re-replication is 1 second by default (configurable through the transaction-max-ms configuration parameter).

re_repl_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_re_repl_tsvc_error
Description

Number of re-replication errors happening in the transaction queue which were not re_repl_tsvc_timeout (before the re-replication attempt). Re-replications occur for namespaces operating under strong-consistency mode when a record does not successfully replicate on the initial attempt.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
6.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
re_repl_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_re_repl_tsvc_timeout
Description

Number of re-replications that time out early in the internal transaction queue, while waiting to be picked up by a service thread. Re-replications occur for namespaces operating under strong-consistency mode when a record does not successfully replicate on the initial attempt.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
6.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
record_proto_compression_ratio
optional
Context
namespace
Prometheus Name
aerospike_namespace_record_proto_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio for protocol message data in single-record transaction client responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).

Introduced
4.8
Removed
-
Measurement type
gauge
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
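
As a worked illustration of how to read record_proto_compression_ratio, the following minimal Python sketch converts a reported ratio into a percentage size reduction. The function name size_reduction_pct is hypothetical and is not part of any Aerospike tool.

```python
def size_reduction_pct(compression_ratio: float) -> float:
    """Convert a compressed/uncompressed size ratio into a percentage reduction.

    A ratio of 1.000 means no compression; 0.100 means a 1:10 ratio.
    """
    if compression_ratio < 0.0:
        raise ValueError("compression ratio cannot be negative")
    return (1.0 - compression_ratio) * 100.0


# Values taken from the description above.
print(size_reduction_pct(1.000))  # no compression, so no reduction
print(size_reduction_pct(0.100))  # roughly a 90% reduction in size
```

Because the metric is a moving average, a value derived this way describes recent responses only, not the lifetime of the node.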

record_proto_uncompressed_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_record_proto_uncompressed_pct
Description

Measures the percentage of single-record transaction client responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).

Introduced
4.8
Removed
-
Measurement type
moving average
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.

retransmit_all_batch_sub_delete_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_delete_dup_res
Description

Number of retransmits that occurred during batch delete subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_delete_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_delete_repl_write
Description

Number of retransmits that occurred during batch delete subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_dup_res
Description

Obsolete as of Database 6.0. In case of a failure to replicate a write transaction across all replicas, the record will be left in the ‘un-replicated’ state, forcing a ‘re-replication’ transaction prior to any subsequent read or write transaction on the record.

Number of retransmits that occurred during batch subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Starting with Database 6.0 when batch-writes were introduced, “repl-write retransmits” for batch writes are counted as “dup-res retransmits” which are included in the metric retransmit_all_batch_sub_dup_res.

retransmit_all_batch_sub_read_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_read_dup_res
Description

Number of retransmits that occurred during batch read subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_read_repl_ping
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_read_repl_ping
Description

Number of retransmits that occurred during SC linearized read subtransactions within batched commands. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_udf_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_udf_dup_res
Description

Number of retransmits that occurred during batch UDF subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_udf_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_udf_repl_write
Description

Number of retransmits that occurred during batch UDF subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.

Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_write_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_write_dup_res
Description

Number of retransmits that occurred during batch write subtransactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_batch_sub_write_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_batch_sub_write_repl_write
Description

Number of retransmits that occurred during batch write (insert/update/upsert/replace) subtransactions that were being replica-written. Includes retransmits originating on the client as well as proxying nodes.

Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_delete_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_delete_dup_res
Description

Number of retransmits that occurred during delete transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_delete_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_delete_repl_write
Description

Number of retransmits that occurred during delete transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_read_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_read_dup_res
Description

Number of retransmits that occurred during read commands that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_read_repl_ping
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_read_repl_ping
Description

Number of retransmits that occurred during SC linearized reads. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_udf_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_udf_dup_res
Description

Number of retransmits that occurred during client initiated UDF transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_udf_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_udf_repl_write
Description

Number of retransmits that occurred during client initiated UDF transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_write_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_write_dup_res
Description

Number of retransmits that occurred during write transactions that were being duplicate-resolved. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_all_write_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_all_write_repl_write
Description

Number of retransmits that occurred during write transactions that were being replica written. Includes retransmits originating on the client as well as proxying nodes.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_nsup_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_nsup_repl_write
Description

Number of retransmits that occurred during NSUP initiated delete transactions that were being replica written.

Introduced
3.10.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_ops_sub_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_ops_sub_dup_res
Description

Number of retransmits that occurred during write subtransactions of background ops scan/query jobs that were being duplicate-resolved.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_ops_sub_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_ops_sub_repl_write
Description

Number of retransmits that occurred during write subtransactions of background ops scan/query jobs that were being replica written.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_udf_sub_dup_res
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_udf_sub_dup_res
Description

Number of retransmits that occurred during UDF subtransactions of scan/query background UDF jobs that were being duplicate-resolved.

Introduced
3.10.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

retransmit_udf_sub_repl_write
optional
Context
namespace
Prometheus Name
aerospike_namespace_retransmit_udf_sub_repl_write
Description

Number of retransmits that occurred during UDF subtransactions of scan/query background UDF jobs that were being replica written.

Introduced
3.10.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

Retransmission statistics are collected in the retransmits ticker log line.

scan_aggr_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_aggr_abort
Description

Number of scan aggregations that were aborted. Removed in Database 6.0, use pi_query_aggr_abort.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_aggr_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_aggr_complete
Description

Number of scan aggregations that completed. Removed in Database 6.0, use pi_query_aggr_complete.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_aggr_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_scan_aggr_error
Description

Number of scan aggregations that failed.

Monitoring

Compare scan_aggr_error to scan_aggr_complete.

If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_aggr_error.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
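
The comparison described under Monitoring for scan_aggr_error can be sketched as a simple ratio check. This is a minimal Python illustration only: the function names and the 5% default threshold are assumptions chosen for the example, since the acceptable ratio depends on the deployment.

```python
def error_ratio(errors: int, completes: int) -> float:
    """Return errors divided by completes for two counter readings."""
    if errors < 0 or completes < 0:
        raise ValueError("counter values cannot be negative")
    if completes == 0:
        # No completions yet: any error at all is treated as an unbounded ratio.
        return float("inf") if errors > 0 else 0.0
    return errors / completes


def should_alert(errors: int, completes: int, threshold: float = 0.05) -> bool:
    """True when the error ratio exceeds the (assumed) acceptable threshold."""
    return error_ratio(errors, completes) > threshold


# Example readings of scan_aggr_error and scan_aggr_complete.
print(should_alert(errors=10, completes=100))
print(should_alert(errors=1, completes=100))
```

The same check applies to the other error and complete counter pairs in this section, such as scan_basic_error with scan_basic_complete.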
scan_basic_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_basic_abort
Description

Number of basic scans that were aborted. Removed in Database 6.0, use pi_query_long_basic_abort.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_basic_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_basic_complete
Description

Number of basic scans that completed. Removed in Database 6.0, use pi_query_long_basic_complete.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_basic_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_scan_basic_error
Description

Number of basic scans that failed.

Monitoring

Compare scan_basic_error to scan_basic_complete.

If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_long_basic_error.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_ops_bg_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_ops_bg_abort
Description

Number of ops background scans that were aborted. Removed in Database 6.0, use pi_query_ops_bg_abort.

Introduced
4.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_ops_bg_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_ops_bg_complete
Description

Number of ops background scans that completed. Removed in Database 6.0, use pi_query_ops_bg_complete.

Introduced
4.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_ops_bg_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_scan_ops_bg_error
Description

Number of ops background scans that failed.

Monitoring

Compare scan_ops_bg_error to scan_ops_bg_complete.

If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_ops_bg_error.

Introduced
4.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_proto_compression_ratio
optional
Context
namespace
Prometheus Name
aerospike_namespace_scan_proto_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio for protocol message data in basic scan or aggregation scan client responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).

Introduced
4.8
Removed
6.0
Measurement type
moving average
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.

scan_proto_uncompressed_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_scan_proto_uncompressed_pct
Description

Measures the percentage of basic scan or aggregation scan client responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).

Introduced
4.8
Removed
6.0
Measurement type
gauge
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.

scan_udf_bg_abort
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_udf_bg_abort
Description

Number of UDF background scans that were aborted. Removed in Database 6.0, use pi_query_udf_bg_abort.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_udf_bg_complete
watch
Context
namespace
Prometheus Name
aerospike_namespace_scan_udf_bg_complete
Description

Number of UDF background scans that completed. Removed in Database 6.0, use pi_query_udf_bg_complete.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
scan_udf_bg_error
warn
Context
namespace
Prometheus Name
aerospike_namespace_scan_udf_bg_error
Description

Number of UDF background scans that failed.

Monitoring

Compare scan_udf_bg_error to scan_udf_bg_complete.

If ratio is higher than acceptable, alert operations to investigate. Removed in Database 6.0, use pi_query_udf_bg_error.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
set-evicted-objects
optional
Context
namespace
Prometheus Name
aerospike_namespace_set-evicted-objects
Description

Number of records evicted by a set.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
set_index_used_bytes
watch
Context
namespace
Prometheus Name
aerospike_namespace_set_index_used_bytes
Description

Amount of memory occupied by set indexes for this namespace on this node. See Finding total namespace memory for the total memory accounted for the namespace.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_aggr_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_aggr_abort
Description

Number of secondary index query aggregations aborted by the user, as seen by this node.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_aggr_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_aggr_complete
Description

Number of secondary index query aggregations completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_aggr_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_aggr_error
Description

Number of secondary index query aggregation errors due to an internal error.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_ops_bg_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_ops_bg_abort
Description

Number of ops background secondary index queries that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_ops_bg_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_ops_bg_complete
Description

Number of ops background secondary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_ops_bg_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_ops_bg_error
Description

Number of ops background secondary index queries that returned an error.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_udf_bg_abort
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_udf_bg_abort
Description

Number of UDF background secondary index queries that were aborted.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_udf_bg_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_udf_bg_complete
Description

Number of UDF background secondary index queries that completed.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
si_query_udf_bg_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_si_query_udf_bg_error
Description

Number of UDF background secondary index queries that returned an error.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex-type.mount[ix].age
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex-type.mount[ix].age
Description

Applies only to Enterprise Edition configured with sindex-type flash. Shows the percentage of lifetime (total usage) claimed by the OEM for the underlying device. The value is -1 unless the underlying device is NVMe, and it may exceed 100. ‘ix’ is the device index. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for 2 files specified in the configuration.

Introduced
6.4
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_flash_used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_flash_used_bytes
Description

Applies only to Enterprise Edition configured with sindex-type flash. Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node. This is the same value memory_used_sindex_bytes would have if the secondary indexes were not persisted.

Introduced
6.4
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_flash_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_flash_used_pct
Description

Applies only to Enterprise Edition configured with sindex-type flash. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_flash_used_bytes / sindex-type.mounts-size-limit) * 100.

Introduced
6.4
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_gc_cleaned
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_gc_cleaned
Description

Number of secondary index entries cleaned by sindex GC.

Introduced
5.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_mounts_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_mounts_used_pct
Description

Applies only to Enterprise Edition configured with sindex-type pmem or flash. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_used_bytes / sindex-type.mounts-budget) * 100

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail
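
The calculation given in the description of sindex_mounts_used_pct can be sketched in Python as follows. This is a minimal illustration, not the server's implementation: the function name is hypothetical, both inputs are assumed to be expressed in bytes, and the server reports the metric as an integer, so its rounding may differ from the floating-point result shown here.

```python
def sindex_mounts_used_pct(sindex_used_bytes: int, mounts_budget_bytes: int) -> float:
    """Compute (sindex_used_bytes / sindex-type.mounts-budget) * 100."""
    if mounts_budget_bytes <= 0:
        raise ValueError("mounts budget must be a positive number of bytes")
    if sindex_used_bytes < 0:
        raise ValueError("used bytes cannot be negative")
    return sindex_used_bytes / mounts_budget_bytes * 100


# Hypothetical readings: 512 MiB used out of a 2 GiB budget.
used = 512 * 1024 * 1024
budget = 2 * 1024 * 1024 * 1024
print(sindex_mounts_used_pct(used, budget))
```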
sindex_pmem_used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_pmem_used_bytes
Description

Applies only to Enterprise Edition configured with sindex-type pmem. Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node. This is the same value memory_used_sindex_bytes would have if the secondary indexes were not persisted.

Introduced
6.3
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_pmem_used_pct
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_pmem_used_pct
Description

Applies only to Enterprise Edition configured with sindex-type pmem. Percentage of the mount(s) in-use for the secondary indexes used by this namespace on this node. Calculated as (sindex_pmem_used_bytes / sindex-type.mounts-size-limit) * 100

Introduced
6.3
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
sindex_used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_sindex_used_bytes
Description

Total bytes in-use on the mount(s) for the secondary indexes used by this namespace on this node.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
smd_evict_void_time
optional
Context
namespace
Prometheus Name
aerospike_namespace_smd_evict_void_time
Description

The cluster-wide specified eviction depth, expressed as a void time in seconds since 1 January 2010 UTC. This is distributed to all nodes via SMD. This may be larger than evict_void_time — evict_void_time will eventually advance to this value.

Introduced
4.5.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
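
Because smd_evict_void_time is expressed in seconds since 1 January 2010 UTC rather than the Unix epoch, it usually needs converting before it can be read as a date. A minimal Python sketch of that conversion follows; the names AEROSPIKE_EPOCH and void_time_to_utc are hypothetical and chosen only for this example.

```python
from datetime import datetime, timedelta, timezone

# Reference point stated in the description: 1 January 2010 UTC.
AEROSPIKE_EPOCH = datetime(2010, 1, 1, tzinfo=timezone.utc)


def void_time_to_utc(void_time_seconds: int) -> datetime:
    """Convert a void time in seconds since 1 January 2010 UTC to a datetime."""
    if void_time_seconds < 0:
        raise ValueError("void time cannot be negative")
    return AEROSPIKE_EPOCH + timedelta(seconds=void_time_seconds)


# One day after the reference point.
print(void_time_to_utc(86400).isoformat())  # 2010-01-02T00:00:00+00:00
```

The same conversion applies to evict_void_time, which the description says is expressed on the same scale.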
stop_writes
critical
Context
namespace
Prometheus Name
aerospike_namespace_stop_writes
Description

If true, this namespace is currently not allowing client-originated writes. Migration writes and prole writes are still allowed. Error code 22 is returned if any one of the following are breached: Prior to Database 7.0:

Monitoring

If stop-writes is true, critical ALERT.

Until the cause is corrected, the system will reject all writes.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
storage-engine.device[ix].age
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_age
Description

Shows percentage of lifetime (total usage) claimed by OEM for underlying storage-engine.device[ix] (may exceed 100). Value will be -1 unless underlying device is NVMe. It is a measure of how much of the drive’s projected lifetime according to the manufacturer has been used at any point in time. When the SSD is brand new, its value will report ‘0’ and when its projected lifetime has been reached, it shows ‘100’, reporting that 100% of the projected lifetime has been used. When the value gets over 100%, the SSD has reached the lifetime specified by the OEM.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
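
The value ranges described for storage-engine.device[ix].age can be summarized in a small helper. This is a minimal Python sketch under the reading given in the description (-1 when the device is not NVMe, 0 for a new drive, 100 or more once the projected lifetime has been reached); the function name and the returned wording are hypothetical.

```python
def describe_device_age(age: int) -> str:
    """Translate a device age reading into a human-readable status."""
    if age == -1:
        return "unavailable (underlying device is not NVMe)"
    if age < 0:
        raise ValueError("unexpected negative age other than -1")
    if age < 100:
        return f"{age}% of projected lifetime used"
    # The value may exceed 100 once the OEM-specified lifetime has been reached.
    return f"projected lifetime reached ({age}%)"


for reading in (-1, 0, 42, 100, 117):
    print(reading, "->", describe_device_age(reading))
```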
storage-engine.device[ix].defrag_partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_defrag_partial_writes
Description

The number of wblocks partially flushed to storage-engine.device[ix] by defrag.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, ns
storage-engine.device[ix].defrag_q
warn
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_defrag_q
Description

Number of wblocks queued to be defragged on storage-engine.device[ix].

Monitoring

Measured per-device or per-file depending on the storage configuration.

If storage-engine.device[ix].defrag_q or storage-engine.file[ix].defrag_q continues to increase over time, alert operations to investigate.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
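
The Monitoring guidance for defrag_q, alerting when the queue continues to increase over time, can be approximated by checking a window of consecutive gauge samples. The following Python sketch is one simple interpretation, not a prescribed alert rule: the function name, the strict-increase criterion, and the default of five samples are all assumptions for illustration.

```python
from typing import Sequence


def is_steadily_increasing(samples: Sequence[int], min_samples: int = 5) -> bool:
    """True when every sample in the window is larger than the one before it.

    Returns False when fewer than min_samples readings are available,
    so that a short burst does not trigger an alert.
    """
    if len(samples) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Hypothetical defrag_q readings taken at a fixed interval.
print(is_steadily_increasing([10, 25, 40, 80, 150]))  # queue keeps growing
print(is_steadily_increasing([10, 25, 20, 80, 150]))  # queue drained once
```

A stricter or looser rule, such as comparing only the first and last samples of a longer window, may suit workloads where the queue fluctuates.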
storage-engine.device[ix].defrag_reads
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_defrag_reads
Description

The number of wblocks that have been sent to the defrag_q from storage-engine.device[ix]. Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].defrag_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_defrag_writes
Description

The number of wblocks defrag has written to storage-engine.device[ix].

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].free_wblocks
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_free_wblocks
Description

The number of wblocks (write blocks) free on storage-engine.device[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_partial_writes
Description

The number of wblocks partially flushed to storage-engine.device[ix] by writes.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.device[ix].read_errors
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_read_errors
Description

Number of read errors encountered on storage-engine.device[ix].

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.device[ix].shadow_write_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_shadow_write_q
Description

The number of wblocks queued to be written to the shadow device of storage-engine.device[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_used_bytes
Description

The number of bytes used for data on storage-engine.device[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].write_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_write_q
Description

The number of wblocks queued to be written to storage-engine.device[ix]. Includes blocks written by the defragmentation sub-system.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix].writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_device_writes
Description

Number of wblocks written to storage-engine.device[ix] since Aerospike started. Includes defragmentation writes.
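
Because this counter starts again from zero whenever Aerospike restarts, a rate derived from two samples should tolerate a reset. The sketch below is one possible approach; the function name is hypothetical, and treating a smaller current value as a restart is an assumption.

```python
def wblocks_per_second(prev: int, curr: int, interval_s: float) -> float:
    """Estimate the write rate between two samples of a `writes` counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # If the counter went backwards, assume the server restarted and the
    # counter began again from zero, so `curr` itself is the delta.
    delta = curr if curr < prev else curr - prev
    return delta / interval_s
```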

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.device[ix]
optional
Context
namespace
Prometheus Name
Label "device" and "device_index" in all aerospike_namespace_storage_engine_device_* metrics
Description

The raw device that is configured in device configuration in namespace context and storage-engine subcontext. ‘ix’ is the device index. The index value starts from 0. For example, storage-engine.device[0]=/dev/xvd1 and storage-engine.device[1]=/dev/xvc1 for 2 devices specified in the configuration.
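
Statistic names of this form can be split into the storage type, the index, and the metric suffix. This is a minimal sketch: the `StorageStat` type and `parse_storage_stat` function are hypothetical names, and the pattern is inferred only from the names shown on this page.

```python
import re
from typing import NamedTuple, Optional

# Matches e.g. "storage-engine.device[0]" and "storage-engine.file[1].used_bytes".
_STAT_RE = re.compile(r"^storage-engine\.(device|file|stripe)\[(\d+)\](?:\.(\w+))?$")


class StorageStat(NamedTuple):
    kind: str               # "device", "file", or "stripe"
    index: int              # the [ix] value, starting from 0
    metric: Optional[str]   # e.g. "used_bytes"; None for the bare entry


def parse_storage_stat(name: str) -> Optional[StorageStat]:
    """Return the parsed parts of a storage-engine statistic name, or None."""
    match = _STAT_RE.match(name)
    if match is None:
        return None
    kind, index, metric = match.groups()
    return StorageStat(kind, int(index), metric)
```

The bare entry (no suffix) is the one whose value is the device path, file path, or stripe key.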

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
storage-engine.file[ix].age
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_age
Description

Shows the percentage of lifetime (total usage) claimed by the OEM for the underlying device of storage-engine.file[ix]. The value is -1 unless the underlying device is NVMe, and it may exceed 100.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].defrag_partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_defrag_partial_writes
Description

The number of wblocks partially flushed to storage-engine.file[ix] by defrag.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.file[ix].defrag_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_defrag_q
Description

The number of wblocks queued to be defragged on storage-engine.file[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].defrag_reads
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_defrag_reads
Description

Number of wblocks that have been sent to the defrag_q from storage-engine.file[ix].

Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].defrag_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_defrag_writes
Description

The number of wblocks defrag has written to storage-engine.file[ix].

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].free_wblocks
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_free_wblocks
Description

The number of wblocks (write blocks) free on storage-engine.file[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_partial_writes
Description

The number of wblocks partially flushed to storage-engine.file[ix] by writes.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.file[ix].shadow_write_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_shadow_write_q
Description

The number of wblocks queued to be written to the shadow file of storage-engine.file[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_used_bytes
Description

Number of bytes used for data on storage-engine.file[ix].

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].write_q
warn
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_write_q
Description

Number of wblocks queued to be written to storage-engine.file[ix].

Monitoring

Measured per-device or per-file depending on the storage configuration.

If storage-engine.device[ix].write_q or storage-engine.file[ix].write_q is greater than 1, alert operations to investigate.
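
The threshold check above can be applied to a snapshot of namespace statistics. The sketch assumes the statistics have already been collected into a dictionary keyed by statistic name; how they are collected, and the function name `write_q_alerts`, are assumptions for illustration.

```python
from typing import Dict, List

_PREFIXES = ("storage-engine.device[", "storage-engine.file[")


def write_q_alerts(stats: Dict[str, int], threshold: int = 1) -> List[str]:
    """Return the per-device or per-file write_q statistics above `threshold`."""
    return sorted(
        name
        for name, value in stats.items()
        # ".write_q" does not match shadow_write_q, which ends in "_write_q".
        if name.startswith(_PREFIXES) and name.endswith(".write_q") and value > threshold
    )
```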

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix].writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_file_writes
Description

The number of wblocks written to storage-engine.file[ix] since Aerospike started. Includes defragmentation writes. When commit-to-device is set to true, client writes go to disk individually rather than at block level, so this counter accounts only for full blocks and therefore counts only blocks written through the defragmentation process.

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
storage-engine.file[ix]
optional
Context
namespace
Prometheus Name
Label "file" and "file_index" in all aerospike_namespace_storage_engine_file_* metrics
Description

The data file path that is configured in file configuration in namespace context and storage-engine subcontext. ‘ix’ is the file index. The index value starts from 0. For example, storage-engine.file[0]=/opt/aerospike/test0.dat and storage-engine.file[1]=/opt/aerospike/test2.dat for 2 files specified in the configuration.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
storage-engine.stripe[ix].age
enterpriseoptional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_age
Description

Shows the percentage of lifetime (total usage) claimed by the OEM for the respective storage-backed persistence device of storage-engine.stripe[ix]. The value is -1 unless the underlying device is NVMe, and it may exceed 100. See storage-engine.device[ix].age for details. This statistic is not available in the log ticker and is only applicable if storage-backed persistence exists.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].backing_write_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_backing_write_q
Description

The number of wblocks queued to be written to the respective storage-backed persistence of storage-engine.stripe[ix]. This statistic is available in the log ticker as write-q, and is only applicable if a storage-backed persistence exists.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragenamespacestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

Log ticker example with storage-backed persistence:

INFO (drv-mem): (drv_mem.c:3158) {bar} stripe-0.0xad001000: used-bytes 146499360 free-wblocks 492 write (18,0.2) defrag-q 0 defrag-read (1,0.0) defrag-write (0,0.0) write-q 0

Log ticker example without storage-backed persistence:

INFO (drv-mem): (drv_mem.c:3158) {test} stripe-2.0xad002002: used-bytes 887120 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-5.0xad002005: used-bytes 915280 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-1.0xad002001: used-bytes 900080 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-3.0xad002003: used-bytes 896720 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-0.0xad002000: used-bytes 909120 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-7.0xad002007: used-bytes 898960 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-6.0xad002006: used-bytes 897040 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
INFO (drv-mem): (drv_mem.c:3158) {test} stripe-4.0xad002004: used-bytes 895680 free-wblocks 62 write (0,0.0) defrag-q 0 defrag-read (0,0.0) defrag-write (0,0.0)
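
Ticker lines in the format shown above can be parsed with a regular expression. This is a sketch based only on the sample lines on this page; the function name is hypothetical, the numbers in parentheses are skipped rather than interpreted, and the real log format may vary between server versions.

```python
import re
from typing import Dict, Optional, Union

_TICKER_RE = re.compile(
    r"\{(?P<ns>[^}]+)\} (?P<stripe>stripe-\d+\.0x[0-9a-f]+): "
    r"used-bytes (?P<used_bytes>\d+) free-wblocks (?P<free_wblocks>\d+) "
    r"write \(\d+,[\d.]+\) "
    r"defrag-q (?P<defrag_q>\d+) "
    r"defrag-read \(\d+,[\d.]+\) "
    r"defrag-write \(\d+,[\d.]+\)"
    r"(?: write-q (?P<write_q>\d+))?"  # present only with storage-backed persistence
)


def parse_stripe_ticker(line: str) -> Optional[Dict[str, Union[str, int, None]]]:
    """Extract namespace, stripe key, and queue/usage numbers from a ticker line."""
    match = _TICKER_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "ns": fields["ns"],
        "stripe": fields["stripe"],
        "used_bytes": int(fields["used_bytes"]),
        "free_wblocks": int(fields["free_wblocks"]),
        "defrag_q": int(fields["defrag_q"]),
        # None when the line has no write-q field.
        "write_q": int(fields["write_q"]) if fields["write_q"] is not None else None,
    }
```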
storage-engine.stripe[ix].defrag_partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_defrag_partial_writes
Description

The number of wblocks partially flushed to storage-engine.stripe[ix] by defrag.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.stripe[ix].defrag_q
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_defrag_q
Description

The number of wblocks queued to be defragged on storage-engine.stripe[ix].

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstorage
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].defrag_reads
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_defrag_reads
Description

Number of wblocks that have been sent to the defrag_q from storage-engine.stripe[ix].

Blocks are selected for defragmentation when their usage falls below the configured defrag-lwm-pct.

Introduced
7.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].defrag_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_defrag_writes
Description

The number of wblocks defrag has written to storage-engine.stripe[ix].

Introduced
7.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].free_wblocks
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_free_wblocks
Description

Number of wblocks (write blocks) free on storage-engine.stripe[ix].

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].partial_writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_partial_writes
Description

The number of wblocks partially flushed to storage-engine.stripe[ix] by writes.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
storage-engine.stripe[ix].used_bytes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_used_bytes
Description

Number of bytes used for data on storage-engine.stripe[ix].

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix].writes
optional
Context
namespace
Prometheus Name
aerospike_namespace_storage_engine_stripe_writes
Description

The number of wblocks written to storage-engine.stripe[ix] since Aerospike started. Includes defragmentation writes. When commit-to-device is set to true, client writes go to disk individually rather than at block level, so this counter accounts only for full blocks and therefore counts only blocks written through the defragmentation process.

Introduced
7.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

storage-engine.stripe[ix]
optional
Context
namespace
Prometheus Name
Label "stripe" and "stripe_index" in all aerospike_namespace_storage_engine_stripe_* metrics
Description

A stripe is a shared memory segment. Each stripe has its own shared memory key, which is determined internally by the server. ‘ix’ is the stripe index. For example, if there are eight stripes, the index (ix) ranges from 0 to 7, so storage-engine.stripe[0]=stripe-0.0xad002000 and storage-engine.stripe[1]=stripe-1.0xad002001 identify two shared memory segments (stripes) and their keys. This statistic applies to namespaces configured with storage-engine memory.

Introduced
7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancensstoragestorage-engine
Detail

More information about stripe allocation can be found on the “Configure Namespace Storage” page, under Setup for in-memory with storage-backed persistence and Setup for in-memory without storage-backed persistence.

sub_objects
optional
Context
namespace
Prometheus Name
aerospike_namespace_sub_objects
Description

Number of LDT sub objects. Also aggregated at the service statistic level under the same name.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
tombstones
watch
Context
namespace
Prometheus Name
aerospike_namespace_tombstones
Description

Total number of tombstones in this namespace on this node.

Introduced
3.10
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
truncate_lut
optional
Context
namespace
Prometheus Name
aerospike_namespace_truncate_lut
Description

The most covering truncate_lut for this namespace. See truncate or truncate-namespace.

Introduced
3.12
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
truncated_records
optional
Context
namespace
Prometheus Name
aerospike_namespace_truncated_records
Description

The total number of records deleted by truncation for this namespace (includes set truncations). See truncate or truncate-namespace.

Introduced
3.12
Removed
6.3
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
truncating
optional
Context
namespace
Prometheus Name
aerospike_namespace_truncating
Description

Indicates when the namespace is in the process of being truncated.

Introduced
6.3
Removed
-
Measurement type
gauge
Data type
boolean
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_lang_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_lang_delete_success
Description

Number of successful UDF delete sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_lang_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_lang_error
Description

Number of UDF sub-transaction errors for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_lang_read_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_lang_read_success
Description

Number of successful UDF read sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_lang_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_lang_write_success
Description

Number of successful UDF write sub-transactions for scan/query background UDF jobs. See the udf_sub_udf_complete, udf_sub_udf_error, udf_sub_udf_filtered_out, udf_sub_udf_timeout statistics for the containing UDF operation statuses.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_tsvc_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_tsvc_error
Description

Number of UDF subtransactions that failed with an error in the transaction service, before attempting to handle the transaction for scan/query background UDF jobs. Examples include protocol errors or a security permission mismatch. Does not include timeouts. In strong-consistency enabled namespaces, this includes transactions against unavailable_partitions and dead_partitions.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_tsvc_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_tsvc_timeout
Description

Number of UDF subtransactions that timed out in the transaction service, before attempting to handle the transaction for scan/query background UDF jobs.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_udf_complete
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_udf_complete
Description

Number of completed UDF subtransactions for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_udf_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_udf_error
Description

Number of failed UDF subtransactions for scan/query background UDF jobs. Does not include timeouts. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_udf_filtered_out
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_udf_filtered_out
Description

Number of UDF subtransactions that did not happen because the record was filtered out with Filter Expressions.

Introduced
4.7
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
udf_sub_udf_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_udf_sub_udf_timeout
Description

Number of UDF subtransactions that timed out for scan/query background UDF jobs. See the following statistics for the underlying operation statuses: udf_sub_lang_delete_success, udf_sub_lang_error, udf_sub_lang_read_success, udf_sub_lang_write_success.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
unavailable_partitions
critical
Context
namespace
Prometheus Name
aerospike_namespace_unavailable_partitions
Description

Number of unavailable partitions for this namespace (when using strong-consistency). This is the number of partitions that are unavailable when roster nodes are missing. Will turn into dead_partitions if still unavailable when all roster nodes are present.

Monitoring

If unavailable_partitions is not zero, raise a critical alert.

Check for network issues and make sure the cluster forms properly.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
unreplicated_records
optional
Context
namespace
Prometheus Name
aerospike_namespace_unreplicated_records
Description

Number of unreplicated records in the namespace. Applicable only for namespaces operating under the strong-consistency mode.

Introduced
5.7
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
write-smoothing-period
optional
Context
namespace
Prometheus Name
aerospike_namespace_write-smoothing-period
Description

Removed

Introduced
-
Removed
Yes
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_bin_cemeteries
watch
Context
namespace
Prometheus Name
aerospike_namespace_xdr_bin_cemeteries
Description

Number of tombstones with bin tombstones. They are generated when bin convergence is enabled and a record is durably deleted.

Introduced
5.5
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_delete_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_delete_error
Description

Number of delete requests initiated by XDR that failed on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.
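
The total described above is a plain sum of the eight listed statistics. The sketch below assumes the namespace statistics for one node are available as a mapping from statistic name to value; the function name and the choice to treat a missing statistic as zero are assumptions.

```python
from typing import Mapping

XDR_DELETE_STATS = (
    "xdr_client_delete_success",
    "xdr_client_delete_error",
    "xdr_client_delete_timeout",
    "xdr_client_delete_not_found",
    "xdr_from_proxy_delete_success",
    "xdr_from_proxy_delete_error",
    "xdr_from_proxy_delete_timeout",
    "xdr_from_proxy_delete_not_found",
)


def total_xdr_deletes(ns_stats: Mapping[str, int]) -> int:
    """Sum the XDR client and from_proxy delete statistics for one namespace."""
    # Missing statistics are counted as zero (an assumption of this sketch).
    return sum(int(ns_stats.get(name, 0)) for name in XDR_DELETE_STATS)
```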

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_delete_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_delete_not_found
Description

Number of delete requests initiated by XDR that failed on the namespace on this node due to the record not being found. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_delete_success
Description

Number of delete requests initiated by XDR that succeeded on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_delete_timeout
Description

Number of delete requests initiated by XDR that timed out on the namespace on this node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_write_error
watch
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_write_error
Description

Number of write requests initiated by XDR that failed on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.
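
For writes, the same addition can be broken down by outcome before producing the overall total. This is a sketch under the assumption that one node's namespace statistics are available as a mapping; the function name and the returned dictionary shape are illustrative.

```python
from typing import Dict, Mapping


def xdr_write_totals(ns_stats: Mapping[str, int]) -> Dict[str, int]:
    """Combine XDR client and from_proxy write statistics per outcome."""
    totals: Dict[str, int] = {}
    for outcome in ("success", "error", "timeout"):
        client = int(ns_stats.get(f"xdr_client_write_{outcome}", 0))
        proxied = int(ns_stats.get(f"xdr_from_proxy_write_{outcome}", 0))
        totals[outcome] = client + proxied
    # Overall number of XDR initiated writes seen by this node.
    totals["total"] = sum(totals.values())
    return totals
```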

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_write_success
watch
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_write_success
Description

Number of write requests initiated by XDR that succeeded on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
xdr_client_write_timeout
watch
Context
namespace
Prometheus Name
aerospike_namespace_xdr_client_write_timeout
Description

Number of write requests initiated by XDR that timed out on the namespace on this node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_delete_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_delete_error
Description

Number of errors for XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_delete_not_found
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_delete_not_found
Description

Number of XDR delete commands proxied from another node that resulted in not found. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_delete_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_delete_success
Description

Number of successful XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_delete_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_delete_timeout
Description

Number of timeouts for XDR delete commands proxied from another node. For the total number of XDR initiated delete requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_delete_success, xdr_client_delete_error, xdr_client_delete_timeout, xdr_client_delete_not_found, xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, xdr_from_proxy_delete_not_found.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_write_error
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_write_error
Description

Number of errors for XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_write_success
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_write_success
Description

Number of successful XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
xdr_from_proxy_write_timeout
optional
Context
namespace
Prometheus Name
aerospike_namespace_xdr_from_proxy_write_timeout
Description

Number of timeouts for XDR write commands proxied from another node. For the total number of XDR initiated write requests against this namespace on this node (destination node), add up the relevant XDR client and from_proxy statistics: xdr_client_write_success, xdr_client_write_error, xdr_client_write_timeout, xdr_from_proxy_write_success, xdr_from_proxy_write_error, xdr_from_proxy_write_timeout.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
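The totals described in the XDR client and from_proxy entries above can be computed by summing the listed counters. The following is a minimal Python sketch, assuming the statistics for one namespace on one destination node have already been collected into a dictionary. The function name xdr_total and the sample values are hypothetical and are not part of Aerospike.

```python
from typing import Iterable, Mapping

# Counter names as listed in the descriptions above.
XDR_WRITE_COUNTERS = (
    "xdr_client_write_success",
    "xdr_client_write_error",
    "xdr_client_write_timeout",
    "xdr_from_proxy_write_success",
    "xdr_from_proxy_write_error",
    "xdr_from_proxy_write_timeout",
)
XDR_DELETE_COUNTERS = (
    "xdr_client_delete_success",
    "xdr_client_delete_error",
    "xdr_client_delete_timeout",
    "xdr_client_delete_not_found",
    "xdr_from_proxy_delete_success",
    "xdr_from_proxy_delete_error",
    "xdr_from_proxy_delete_timeout",
    "xdr_from_proxy_delete_not_found",
)


def xdr_total(stats: Mapping[str, int], counters: Iterable[str]) -> int:
    """Sum the given counters; a counter absent from stats counts as 0."""
    return sum(int(stats.get(name, 0)) for name in counters)


# Hypothetical sample values for one namespace on one destination node.
sample_stats = {
    "xdr_client_write_success": 900,
    "xdr_client_write_error": 5,
    "xdr_client_write_timeout": 3,
    "xdr_from_proxy_write_success": 40,
    "xdr_from_proxy_write_error": 1,
    "xdr_from_proxy_write_timeout": 1,
    "xdr_client_delete_success": 70,
    "xdr_client_delete_not_found": 4,
}

total_writes = xdr_total(sample_stats, XDR_WRITE_COUNTERS)
total_deletes = xdr_total(sample_stats, XDR_DELETE_COUNTERS)
print(total_writes, total_deletes)
```

Because all of these statistics are counters, a monitoring system would normally apply the same sum to the change between two samples rather than to the raw values.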
xdr_tombstones
enterprise, watch
Context
namespace
Prometheus Name
aerospike_namespace_xdr_tombstones
Description

Number of tombstones on this node which are created by XDR for non-durable client deletes. This includes both master and prole.

Introduced
5.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude, ns
Detail

For namespaces configured with XDR, non-durable delete transactions create XDR tombstones (not to be confused with the durable delete tombstones).

XDR tombstones are deleted after they have been shipped via XDR. The XDR tomb raider runs as specified in xdr-tomb-raider-period and uses xdr-tomb-raider-threads to reduce the index and delete XDR tombstones whose last update time (LUT) is older than the current global last ship time (GLST). The GLST is computed as the lowest value across the last ship time (LST) of all the partitions for the namespace: each node sends the LST for each partition it owns to the principal node, which then determines the lowest value and sends it back to all nodes in the cluster via the system metadata (SMD) fabric channel.
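The GLST rule can be illustrated with a short Python sketch. It is only a model of the comparison described above, not the server's implementation: the function names, the partition IDs, and the timestamp values are hypothetical, and timestamps are treated as plain integers.

```python
from typing import Dict, Iterable, List, Tuple


def global_last_ship_time(partition_lst: Dict[int, int]) -> int:
    """GLST is the lowest last ship time (LST) across all partitions."""
    if not partition_lst:
        raise ValueError("no partition LST values supplied")
    return min(partition_lst.values())


def removable_tombstones(
    tombstones: Iterable[Tuple[str, int]], glst: int
) -> List[str]:
    """Return the keys of tombstones whose LUT is older than the GLST."""
    return [key for key, lut in tombstones if lut < glst]


# Hypothetical LST per partition ID, as gathered by the principal node.
lst_by_partition = {0: 1000, 1: 950, 2: 1200}
glst = global_last_ship_time(lst_by_partition)

# Hypothetical (key, LUT) pairs for XDR tombstones on one node.
candidates = [("a", 900), ("b", 950), ("c", 1100)]
print(glst, removable_tombstones(candidates, glst))
```

Taking the minimum means a single partition that lags in shipping holds back tombstone removal for the whole namespace.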

Node_stats

batch_index_complete
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_complete
Description

Number of batch index requests completed.

Introduced
3.6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_created_buffers
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_created_buffers
Description

Number of 128KB response buffers created. Response buffers are created when there are no buffers left in the pool. If this number consistently increases and there is available memory, you should increase batch-max-unused-buffers.

Introduced
3.6.4
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_delay
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_delay
Description

Number of times a batch index response buffer has been delayed (WOULDBLOCK on the send). Batch index transactions that are abandoned entirely because they exceeded their overall allocated time after being delayed are counted under the batch_index_error statistic, and each is accompanied by a WARNING log message.

Introduced
4.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_destroyed_buffers
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_destroyed_buffers
Description

Number of 128KB response buffers destroyed. Response buffers are destroyed when there is no slot left to put the buffer back into the pool. The maximum response buffer pool size is batch-max-unused-buffers.

Introduced
3.6.4
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_error
warn
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_error
Description

Number of batch index requests that completed with an error. This happens, for example, when the client has timed out but the server is still attempting to send response buffers back. It also happens when the server abandons the transaction because delays (WOULDBLOCK on send) while sending response buffers back exceed twice the total timeout set by the client, or 30 seconds if the client set no timeout. This is accompanied by a WARNING log message. Starting with version 6.4, the server abandons the transaction once delays exceed the client timeout itself (a factor of 1 rather than 2). Each encountered delay is counted under the batch_index_delay statistic.

Monitoring

Compare batch_index_error to batch_index_complete. If the ratio is higher than acceptable, alert Operations to investigate.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
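The comparison suggested in the Monitoring note for batch_index_error can be sketched as follows. Because both statistics are counters, this example works on the change between two samples. The function name, the sample values, and the choice to return None when a counter has reset are assumptions made for illustration.

```python
from typing import Mapping, Optional


def batch_index_error_ratio(
    prev: Mapping[str, int], curr: Mapping[str, int]
) -> Optional[float]:
    """Errors per completed request between two samples of the counters.

    Returns None when nothing completed in the interval or when a counter
    went backwards (for example after a node restart).
    """
    d_err = curr["batch_index_error"] - prev["batch_index_error"]
    d_done = curr["batch_index_complete"] - prev["batch_index_complete"]
    if d_err < 0 or d_done <= 0:
        return None
    return d_err / d_done


# Hypothetical samples taken one scrape interval apart.
earlier = {"batch_index_error": 10, "batch_index_complete": 1000}
later = {"batch_index_error": 15, "batch_index_complete": 1500}
print(batch_index_error_ratio(earlier, later))
```

What counts as an acceptable ratio depends on the workload, so the alert threshold is left to the operator.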
batch_index_huge_buffers
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_huge_buffers
Description

Number of temporary response buffers created that exceeded 128KB. Huge buffers are created when a retrieved record is larger than 128KB. Huge records do not benefit from batching and can result in excessive memory thrashing on the server. The batch_index_created_buffers and batch_index_destroyed_buffers statistics include the huge buffers created and destroyed.

Introduced
3.6.4
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_initiate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_initiate
Description

Number of batch index requests received.

Introduced
3.6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
batch_index_proto_compression_ratio
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_proto_compression_ratio
Description

Measures the average compressed size to uncompressed size ratio for protocol message data in batch index responses. Thus 1.000 indicates no compression and 0.100 indicates a 1:10 compression ratio (90% reduction in size).

Introduced
4.8
Removed
-
Measurement type
moving average
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude
Detail

The compression ratio is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the compression ratio will change with it. In case of a sudden change in response data, the indicated compression ratio may lag behind a bit. As a rule of thumb, assume that the compression ratio covers the most recent 100,000 to 1,000,000 client responses.
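As a worked example of how to read batch_index_proto_compression_ratio, the ratio converts to a size reduction as (1 - ratio) * 100. The helper name below is hypothetical.

```python
def size_reduction_pct(compression_ratio: float) -> float:
    """Convert a compressed/uncompressed size ratio to a percent reduction."""
    return (1.0 - compression_ratio) * 100.0


# 1.000 means no compression; 0.100 means a 1:10 ratio.
for ratio in (1.000, 0.500, 0.100):
    print(ratio, size_reduction_pct(ratio))
```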

batch_index_proto_uncompressed_pct
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_proto_uncompressed_pct
Description

Measures the percentage of batch index responses with uncompressed protocol message data. Thus 0.000 indicates all responses with compressed data, and 100.000 indicates no responses with compressed data. For example, if protocol message data compression is not used, this metric will remain set to 0.000. If protocol message data compression is then turned on and all responses are compressed, this metric will remain set to 0.000. The only way this metric will ever be set to a value different than 0.000 is if compression is used, but some responses are not compressed (which happens when the uncompressed size is so small that the server does not try to compress, or when the compression fails).

Introduced
4.8
Removed
-
Measurement type
gauge
Data type
decimal
Labels
cluster_name, job, service, instance, longitude, latitude
Detail

The percentage is a moving average. It is calculated based on the most recent client responses. If the response message data changes over time then the percentage will change with it. In case of a sudden change in response data, the indicated percentage may lag behind a bit. As a rule of thumb, assume that the percentage covers the most recent 100,000 to 1,000,000 client responses.

batch_index_queue
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_queue
Description

Number of batch index requests (transactions count) processed and response buffer blocks used on each batch queue.
Format: Q1_REQUESTS:Q1_BUFFERS, Q2_REQUESTS:Q2_BUFFERS, ...

The buffer block counter is decremented on batch responses before the transaction count is decremented. Therefore, it is possible for a buffer slot to become available on the queue and for a new batch transaction to be counted before the previous batch transaction's count is decremented. It is also possible that multiple transactions have come in for a thread for which no response buffers have been created yet. Finally, batch_index_huge_buffers are counted as part of the buffer blocks used on each batch queue.

Introduced
3.6.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
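The batch_index_queue value uses the Q1_REQUESTS:Q1_BUFFERS, Q2_REQUESTS:Q2_BUFFERS, ... format given above. A minimal Python sketch for splitting such a value into per-queue pairs might look like this; the function name and the sample string are hypothetical, and the parser tolerates optional spaces after the commas.

```python
from typing import List, Tuple


def parse_batch_index_queue(value: str) -> List[Tuple[int, int]]:
    """Split 'REQ:BUF, REQ:BUF, ...' into (requests, buffers) per queue."""
    queues = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty segments such as a trailing comma
        requests, buffers = item.split(":")
        queues.append((int(requests), int(buffers)))
    return queues


# Hypothetical value for a node with three batch queues.
sample = "0:0, 2:5,1:3"
for index, (requests, buffers) in enumerate(parse_batch_index_queue(sample), 1):
    print(f"queue {index}: requests={requests} buffers={buffers}")
```

As noted in the description, the two numbers for a queue are not updated atomically, so a pair such as 0 requests with a nonzero buffer count is not necessarily an error.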
batch_index_timeout
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_timeout
Description

Number of batch index requests that timed out on the server before being processed. These are caused by a batch subtransaction that has timed out for this batch index transaction. The overall time allowed for a batch index transaction on the server is not bounded, except if a delay is encountered (WOULDBLOCK on send).

For Database 4.1 through 6.3, the overall batch index transaction max delay time is twice the total timeout set by the client, or 30 seconds if there is no timeout set by the client.

For Database 6.4 and later, the overall batch index transaction max delay time is the same as set by the client, or 30 seconds if there is no timeout set by the client.

Introduced
3.6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
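The version-dependent maximum delay described for batch_index_timeout can be written out as a small function. This is a sketch of the stated rule only: the function name is hypothetical, the server version is passed as a (major, minor) tuple, and a client timeout of None or 0 is assumed to mean that the client set no timeout.

```python
from typing import Optional, Tuple

DEFAULT_MAX_DELAY_MS = 30_000  # 30 seconds when the client sets no timeout


def batch_index_max_delay_ms(
    server_version: Tuple[int, int], client_timeout_ms: Optional[int]
) -> int:
    """Maximum overall delay before a delayed batch index transaction is abandoned."""
    if server_version < (4, 1):
        raise ValueError("the rule above only covers Database 4.1 and later")
    if not client_timeout_ms:
        return DEFAULT_MAX_DELAY_MS
    if server_version >= (6, 4):
        return client_timeout_ms  # same as the client timeout
    return 2 * client_timeout_ms  # 4.1 through 6.3: twice the client timeout


print(batch_index_max_delay_ms((6, 3), 1000))
print(batch_index_max_delay_ms((6, 4), 1000))
print(batch_index_max_delay_ms((7, 0), None))
```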
batch_index_unused_buffers
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_batch_index_unused_buffers
Description

Number of available 128KB response buffers currently in the buffer pool.

Introduced
3.6.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
client_connections
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_client_connections
Description

Number of active client connections to this node. Also available in the log on the fds proto ticker line.

Monitoring
  • If client_connections is below an expected low value, then this condition might indicate a problem with the network between clients and server.

  • If client_connections is greater than an expected high value, then this condition might indicate a problem with clients rapidly opening and closing sockets.

  • If client_connections is at or near proto_fd_max, then the server is either currently unable to accept new connections or might soon be unable to do so.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
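The three Monitoring conditions for client_connections can be expressed as a simple check. In this sketch the expected low and high values are supplied by the operator, and "at or near proto_fd_max" is interpreted as 90% of the limit by default. That fraction, the function name, and the finding labels are assumptions chosen for illustration.

```python
from typing import List


def check_client_connections(
    client_connections: int,
    proto_fd_max: int,
    expected_low: int,
    expected_high: int,
    near_limit_fraction: float = 0.9,
) -> List[str]:
    """Return labels for each Monitoring condition that currently applies."""
    findings = []
    if client_connections < expected_low:
        findings.append("low")  # possible network problem between clients and server
    if client_connections > expected_high:
        findings.append("high")  # possible rapid opening and closing of sockets
    if client_connections >= near_limit_fraction * proto_fd_max:
        findings.append("near_limit")  # new connections may be refused now or soon
    return findings


# Hypothetical values for one node.
print(check_client_connections(14000, 15000, expected_low=100, expected_high=5000))
```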
client_connections_closed
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_client_connections_closed
Description

Number of client connections that have been closed. One of client_connections_opened or client_connections_closed should be closely monitored or alerted against. Also available in the log on the fds proto ticker line.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
client_connections_opened
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_client_connections_opened
Description

Number of client connections created to this node since the node was started. One of client_connections_opened or client_connections_closed should be closely monitored or alerted against. Also available in the log on the fds proto ticker line.

Monitoring

If client_connections_opened changes unexpectedly without clients having been added or removed, or a significant change in workload having occurred, this condition might indicate a slowdown on a node or a connectivity issue on the node.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_clock_skew_ms
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_clock_skew_ms
Description

Current maximum clock skew in milliseconds between nodes in a cluster. Triggers clock_skew_stop_writes when it breaches the cluster_clock_skew_stop_writes_sec threshold. This threshold is normally 20 seconds for strong-consistency namespaces on any Aerospike version, or 40 seconds for AP namespaces where NSUP is enabled (nsup-period is not zero) in Database 4.5.1 or later.

Introduced
4.0.0.4
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_clock_skew_stop_writes_sec
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_clock_skew_stop_writes_sec
Description

The threshold at which any namespace that is set to strong-consistency stops accepting writes due to clock skew (cluster_clock_skew_ms).

This value is in seconds, not milliseconds.

Although this value shows as 0 for AP namespaces, starting with Database 4.5.1, these namespaces stop accepting writes if NSUP is enabled (nsup-period is not zero) and the clock skew exceeds 40 seconds.

Introduced
4.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
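Because cluster_clock_skew_ms is reported in milliseconds and cluster_clock_skew_stop_writes_sec in seconds, the two must be brought to the same unit before they are compared. The sketch below shows one way to do that. The function name and parameters are hypothetical, the AP branch assumes Database 4.5.1 or later, and the server's own clock_skew_stop_writes value remains the authoritative signal.

```python
AP_NSUP_SKEW_LIMIT_MS = 40_000  # 40 seconds for AP namespaces with NSUP enabled


def skew_would_stop_writes(
    cluster_clock_skew_ms: int,
    cluster_clock_skew_stop_writes_sec: int,
    strong_consistency: bool,
    nsup_enabled: bool = False,
) -> bool:
    """Estimate whether the current skew exceeds the documented threshold."""
    if strong_consistency:
        # Convert the threshold from seconds to milliseconds before comparing.
        return cluster_clock_skew_ms > cluster_clock_skew_stop_writes_sec * 1000
    # AP namespaces report a threshold of 0; the 40-second rule applies
    # only when NSUP is enabled (nsup-period is not zero).
    return nsup_enabled and cluster_clock_skew_ms > AP_NSUP_SKEW_LIMIT_MS


print(skew_would_stop_writes(25_000, 20, strong_consistency=True))
print(skew_would_stop_writes(45_000, 0, strong_consistency=False, nsup_enabled=True))
```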
cluster_generation
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_generation
Description

A 64-bit unsigned integer incremented on a node for every successful cluster partition rebalance or transition to orphan state. This is a node-local value and does not need to be the same across the cluster.

Introduced
4.3
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_integrity
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_integrity
Description

When false, indicates integrity issues within the cluster, meaning that some nodes are either faulty or dead. A node in the succession list is deemed faulty if the node is alive and it reports to be an orphan or is part of some other cluster. Another condition for a faulty node is that it is alive but has a clustering protocol identifier that does not match the rest of the cluster. When true, indicates that the cluster is in a whole and complete state (as far as the nodes that it sees and is able to connect to are concerned). Information about a cluster integrity fault is also logged to the server log file repeatedly.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_is_member
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_is_member
Description

When false, indicates that the node is not joined to a cluster; that is, it is an orphan. When true, indicates that the node is joined to a cluster.

Introduced
3.13.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_key
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_key
Description

Randomly generated 64-bit hexadecimal string used to name the last Paxos cluster state agreement.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_max_compatibility_id
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_max_compatibility_id
Description

Each node has a compatibility ID that is an integer based on the node’s database version. During upgrades, this value is used to determine software compatibility. cluster_max_compatibility_id indicates the cluster’s maximum software version. See cluster_min_compatibility_id.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_min_compatibility_id
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_min_compatibility_id
Description

Each node has a compatibility ID that is an integer based on the node’s database version. During upgrades, this value is used to determine software compatibility. cluster_min_compatibility_id indicates the cluster’s minimum software version. See cluster_max_compatibility_id.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_principal
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_principal
Description

Node ID of the current cluster principal. The value is ‘0’ on an orphan node.

Introduced
4.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
cluster_size
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_cluster_size
Description

Size of the cluster. Can be checked to make sure the size of the cluster is the expected one after adding or removing a node. Check across all nodes in a cluster.

Monitoring

If cluster_size does not equal the expected cluster size and the cluster is not undergoing maintenance, your operations group needs to investigate.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
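Checking cluster_size across all nodes, as the cluster_size entry recommends, can be sketched as follows. The example assumes the per-node values have already been collected into a dictionary keyed by node name. The function name, node names, and values are hypothetical.

```python
from typing import Dict, List


def nodes_with_unexpected_size(
    reported_sizes: Dict[str, int], expected_size: int
) -> List[str]:
    """Return the nodes whose cluster_size differs from the expected size."""
    return sorted(
        node for node, size in reported_sizes.items() if size != expected_size
    )


# Hypothetical cluster_size values reported by each node of a 3-node cluster.
reported = {"node-a": 3, "node-b": 3, "node-c": 1}
mismatched = nodes_with_unexpected_size(reported, expected_size=3)
if mismatched:
    print("investigate unless maintenance is in progress:", mismatched)
```

Evaluating every node matters because a node that has split from the cluster reports a different size than the nodes that remain together.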
demarshal_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_demarshal_error
Description

Number of errors during the demarshal step.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_batch_sub_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_batch_sub_error
Description

Number of errors early in the transaction for batch subtransactions. For example, bad/unknown namespace name or security authentication errors.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
7.2
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_client_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_client_error
Description

Number of errors early in the transaction for direct client requests. These include transactions hitting the proto-fd-max limit and transactions with a bad or unknown namespace name or security authentication errors. These also include cases where partitions are unavailable in AP mode, when clients attempt transactions against an orphan node.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_from_proxy_batch_sub_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_from_proxy_batch_sub_error
Description

Number of early errors for batch subtransactions proxied from another node. For example, bad or unknown namespace name or security authentication errors.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_from_proxy_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_from_proxy_error
Description

Number of early errors for commands, other than batch subtransactions, proxied from another node. For example, bad or unknown namespace name or security authentication errors.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_ops_sub_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_ops_sub_error
Description

Number of errors early in an internal ops subtransaction (records accessed by a background query operate command). For example, bad or unknown namespace name or security authentication errors.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
4.7
Removed
7.2
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
early_tsvc_udf_sub_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_early_tsvc_udf_sub_error
Description

Number of errors early in the transaction for UDF subtransactions. For example, bad or unknown namespace name or security authentication errors.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
3.9
Removed
7.2
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
entries_per_bval
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_entries_per_bval
Description

Ratio of entries to unique bvals (bin values) for a given secondary index on the node. The value is an integer (rounded to the nearest integer) and is calculated using hyperloglog estimates for unique bvals. The stat is generated by a background process. A value of 0 means the stat is not yet generated. The process runs at startup, every hour thereafter, and when a secondary index is created and populated.

Monitoring

This stat appears in the response to the ‘sindex-stat’ info command to retrieve statistics for a specified namespace and index. For example, asinfo -v 'sindex-stat:ns=namespace1;indexname=index21'.

Introduced
6.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
entries_per_rec
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_entries_per_rec
Description

Ratio of entries to unique records for a given secondary index on the node. This value will always be 1 if it is not a list or map secondary index. The value is an integer (rounded to the nearest integer) and is calculated using hyperloglog estimates for unique recs. The stat is generated by a background process. A value of 0 means the stat is not yet generated. The process runs at startup, every hour thereafter, and when a secondary index is created and populated.

Monitoring

This stat appears in the response to the ‘sindex-stat’ info command to retrieve statistics for a specified namespace and index. For example, asinfo -v 'sindex-stat:ns=namespace1;indexname=index21'.

Introduced
6.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
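Once the sindex-stat response has been retrieved with the asinfo command shown in the entries above, entries_per_bval and entries_per_rec can be extracted from it. The sketch below assumes the response is a single line of semicolon-separated key=value pairs. The sample string is made up for illustration, and the function names are hypothetical.

```python
from typing import Dict, Optional


def parse_info_pairs(response: str) -> Dict[str, str]:
    """Parse 'key=value;key=value' text into a dictionary of strings."""
    pairs = {}
    for item in response.strip().split(";"):
        if "=" not in item:
            continue  # skip empty or malformed segments
        key, value = item.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


def sindex_ratio(stats: Dict[str, str], name: str) -> Optional[int]:
    """Return the ratio, or None when it is 0 (not yet generated) or absent."""
    value = int(stats.get(name, "0"))
    return value if value != 0 else None


# Made-up response text; a real response contains many more fields.
sample_response = "entries=1200;entries_per_bval=3;entries_per_rec=0"
stats = parse_info_pairs(sample_response)
print(sindex_ratio(stats, "entries_per_bval"))
print(sindex_ratio(stats, "entries_per_rec"))
```

Treating 0 as "not yet generated" follows the descriptions above, since the background process may not have run for a newly created index.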
err_storage_defrag_fd_get
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_err_storage_defrag_fd_get
Description

Removed

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
err_sync_copy_null_node
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_err_sync_copy_null_node
Description

Number of errors during cluster state exchange because of missing general node information.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_bulk_recv_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_bulk_recv_rate
Description

Rate of traffic (bytes/sec) received by the fabric bulk channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_bulk_send_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_bulk_send_rate
Description

Rate of traffic (bytes/sec) sent by the fabric bulk channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_connections
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_connections
Description

Number of active fabric connections to this node. Also available in the log on the fds proto ticker line.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_connections_closed
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_connections_closed
Description

Number of fabric connections that have been closed. Also available in the log on the fds proto ticker line.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_connections_opened
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_connections_opened
Description

Number of fabric connections created to this node since the node was started. Also available in the log on the fds proto ticker line.

Monitoring

If fabric_connections_opened changes unexpectedly, raise an alert, because this condition indicates a connectivity problem with a node or a cluster change.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_ctrl_recv_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_ctrl_recv_rate
Description

Rate of traffic (bytes/sec) received by the fabric ctrl channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_ctrl_send_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_ctrl_send_rate
Description

Rate of traffic (bytes/sec) sent by the fabric ctrl channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_name, job, service, instance, longitude, latitude
fabric_meta_recv_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_meta_recv_rate
Description

Rate of traffic (bytes/sec) received by the fabric meta channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
fabric_meta_send_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_meta_send_rate
Description

Rate of traffic (bytes/sec) sent by the fabric meta channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
fabric_rw_recv_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_rw_recv_rate
Description

Rate of traffic (bytes/sec) received by the fabric rw channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
fabric_rw_send_rate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_fabric_rw_send_rate
Description

Rate of traffic (bytes/sec) sent by the fabric rw channel during the last ticker-interval (every 10 seconds by default).

Introduced
3.11.1.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
failed_best_practices
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_failed_best_practices
Description

Indicates true if any of the best practices checked at server startup were violated; otherwise indicates false. Each failed best practice logs a unique warning message, and the list of failed best practices can be queried using the best-practices info command.

Introduced
5.7
Removed
-
Measurement type
gauge
Data type
boolean
Labels
cluster_namejobserviceinstancelongitudelatitude
heap_active_kbytes
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_heap_active_kbytes
Description

The amount of memory in in-use pages, in KiB. An in-use page is a page that has some allocated memory (either partial or full).

Introduced
3.10.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heap_allocated_kbytes
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_heap_allocated_kbytes
Description

The amount of memory, in KiB, allocated by the asd daemon. This covers all memory usage except the shared memory parts (the primary index in the Enterprise Edition). The heap_allocated_kbytes / heap_active_kbytes ratio (Database 6.0 and later) or the heap_allocated_kbytes / heap_mapped_kbytes ratio (prior to 6.0), also reported as heap_efficiency_pct, provides a picture of the fragmentation of the heap.

Introduced
3.10.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heap_efficiency_pct
warn
Context
node_stats
Prometheus Name
aerospike_node_stats_heap_efficiency_pct
Description

Provides an indication of the jemalloc heap fragmentation. This represents the heap_allocated_kbytes / heap_active_kbytes ratio. A lower number indicates a higher fragmentation rate.

Monitoring

If heap_efficiency_pct goes below 60% or 50% (depending on configuration), advise your operations group to investigate.

Introduced
3.10.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
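The ratio described for heap_efficiency_pct can be reproduced from the two underlying gauges. This is a minimal sketch assuming the Database 6.0 and later definition (allocated over active). The function names are hypothetical, and truncating to an integer is an assumption, since the text does not say how the server rounds.

```python
def heap_efficiency_pct(allocated_kbytes: int, active_kbytes: int) -> int:
    """Return heap_allocated_kbytes / heap_active_kbytes as an integer percentage.

    Assumption: the result is truncated rather than rounded.
    """
    if active_kbytes <= 0:
        raise ValueError("active_kbytes must be positive")
    return (100 * allocated_kbytes) // active_kbytes


def heap_needs_attention(efficiency_pct: int, threshold_pct: int = 60) -> bool:
    """True when efficiency falls below the chosen threshold (60 or 50 in the text)."""
    return efficiency_pct < threshold_pct
```

A lower percentage means more fragmentation, so the check fires when the value drops below the threshold rather than when it rises above it.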
heap_mapped_kbytes
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_heap_mapped_kbytes
Description

Amount of memory in mapped pages, in KiB; that is, the amount of memory that jemalloc (JEM) received from the Linux kernel. Should be a multiple of 4, because the typical page size is 4 KiB (4096 bytes).

Introduced
3.10.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heap_site_count
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_heap_site_count
Description

Number of distinct sites in the server code (specific locations in server functions) that have allocated heap memory designated for tracking as governed by the debug-allocations setting from the time when the server was started. The heap_site_count is only nonzero when debug-allocations is set to a value other than none. The heap_site_count value can only increase.

Introduced
3.14.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heartbeat_connections
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_heartbeat_connections
Description

Number of active heartbeat connections to this node. Also available in the log on the fds proto ticker line.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heartbeat_connections_closed
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_heartbeat_connections_closed
Description

Number of heartbeat connections that have been closed. Also available in the log on the fds proto ticker line.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heartbeat_connections_opened
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_heartbeat_connections_opened
Description

Number of heartbeat connections created to this node since the node was started. Also available in the log on the fds proto ticker line.

Monitoring

If heartbeat_connections_opened changes unexpectedly, raise an alert. This condition would indicate either a connectivity problem with a node or a cluster change.

Introduced
5.6
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heartbeat_received_foreign
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_heartbeat_received_foreign
Description

Total number of heartbeats received from remote nodes.

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
heartbeat_received_self
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_heartbeat_received_self
Description

Total number of multicast heartbeats from this node received by this node. Will be 0 for mesh.

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
info_complete
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_info_complete
Description

Number of info requests completed.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
info_queue
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_info_queue
Description

Number of info requests pending in info queue.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
info_timeout
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_info_timeout
Description

Tracks total timed-out info transactions. Related to info-max-ms.

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
long_queries_active
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_long_queries_active
Description

Number of queries currently active (formerly queries_active or scans_active). The long_queries_active stat is shared by both primary index (PI) queries and secondary index (SI) queries. Only long queries are monitored.

Introduced
6.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
migrate_allowed
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_migrate_allowed
Description

Indicates whether migrations are allowed on a node: true when allowed, false when not. When there is a change in a cluster, this statistic's value changes to false until the rebalance is completed across all namespaces. The rebalance is the step that determines all partition migrations that need to be scheduled. It is not the migrations themselves but the process that precedes them. A migrate_allowed value of true indicates that all migration-related statistics have been set and can be used programmatically (for example, migrate_partitions_remaining to check whether migrations are ongoing).

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
migrate_partitions_remaining
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_migrate_partitions_remaining
Description

This is the number of partitions remaining to migrate (in either direction). When migrate_allowed is true, this is the stat which will accurately determine if migrations are complete for a single node across all namespaces. There could be a short period after a reclustering event when this statistic shows 0 but the migrations have not started yet. During such time, migrate_allowed would return false.

Introduced
3.8.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
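The relationship between migrate_allowed and migrate_partitions_remaining can be expressed as a single check. This is a minimal sketch, assuming the two statistics have already been collected into a dictionary of strings. The dictionary shape and the function name are hypothetical and do not represent a client API.

```python
def migrations_complete(stats: dict) -> bool:
    """Return True only when migrations are allowed and none remain.

    A remaining count of 0 is not trusted while migrate_allowed is false,
    because the rebalance may not have scheduled migrations yet.
    """
    allowed = str(stats.get("migrate_allowed", "false")).lower() == "true"
    remaining = int(stats.get("migrate_partitions_remaining", 0))
    return allowed and remaining == 0
```

Checking both values guards against the short window after a reclustering event in which migrate_partitions_remaining reads 0 before migrations have started.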
objects
watch
Context
node_stats
Prometheus Name
aerospike_node_stats_objects
Description

Total number of replicated objects on this node. Includes master and replica objects.

Monitoring

Trending objects provides operations insight into object fluctuations over time.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
paxos_principal
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_paxos_principal
Description

Identifier of the node that this node believes to be the Paxos principal.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
process_cpu_pct
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_process_cpu_pct
Description

Percentage of CPU usage by the asd process.

Introduced
4.7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
proxy_in_progress
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_proxy_in_progress
Description

Number of proxies in progress. Also called proxy hash. The command's TTL (client-set timeout or transaction-max-ms) is checked every 5 ms (Database 6.0 and later) while waiting in the proxy-hash.

Introduced
3.3.21
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
queries_active
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_queries_active
Description

Number of queries currently active (formerly scans_active). The queries_active stat is shared by both primary index (PI) queries and secondary index (SI) queries. Only long queries are monitored. Removed in Database 6.1, use long_queries_active.

Introduced
6.0
Removed
6.1
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
query_bad_records
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_query_bad_records
Description

Number of false positive entries in secondary index queries.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
query_long_running
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_query_long_running
Description

Number of long-running queries ever attempted in the system (queries that selected more records than query_threshold).

Introduced
-
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
query_short_running
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_query_short_running
Description

Number of short-running queries ever attempted in the system (queries that selected fewer records than query_threshold).

Introduced
-
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
query_tracked
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_query_tracked
Description

Number of queries tracked by the system, that is, queries that ran longer than the query untracked_time (default 1 second).

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
read_touch_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_error
Description

Number of read touch errors which were not timeouts.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
read_touch_skip
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_skip
Description

Number of touches abandoned upon finding that another write (including an earlier touch) has taken place or is taking place, removing the need to proceed with the touch.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
read_touch_success
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_success
Description

Number of successful read touches.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
read_touch_timeout
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_timeout
Description

Number of touches that ended in timeout.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
read_touch_tsvc_error
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_tsvc_error
Description

Number of read touch subtransactions that failed with an error in the internal transaction queue. Does not include timeouts.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
read_touch_tsvc_timeout
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_read_touch_tsvc_timeout
Description

Number of read touches that time out early in the internal transaction queue, while waiting to be picked up by a service thread.

The transaction service (tsvc) subsystem implements the execution of read/write commands, including transactions, queries, and info commands. tsvc errors happen before records are accessed for reads or writes. They’re counted separately from tsvc timeouts.

Introduced
7.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancens
reaped_fds
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_reaped_fds
Description

Number of idle client connections closed.

Monitoring

If reaped_fds is growing more rapidly than normal, it may indicate that clients are opening and closing sockets too rapidly, which points to a potential application issue.

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
rw_err_dup_write_cluster_key
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_rw_err_dup_write_cluster_key
Description

Removed

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
rw_err_dup_write_internal
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_rw_err_dup_write_internal
Description

Removed

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
rw_in_progress
warn
Context
node_stats
Prometheus Name
aerospike_node_stats_rw_in_progress
Description

Number of rw transactions in progress. Also called rw hash. This tracks transactions parked on the rw hash while processing on other nodes (all write replicas, read duplicate resolutions). The transaction's TTL (client-set timeout or transaction-max-ms) is checked every 5 ms in Database 6.0 and later while waiting in the rw-hash.

Monitoring

Depends on expected workload.

If rw_in_progress is higher than expected, or if it deviates more than is acceptable from the established baseline over time, alert operations to investigate the cause. This may indicate a slowdown on a particular node or overloading on the fabric.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
Detail

While a transaction is parked in the rw-hash, other transactions for the same record are queued (those queued transactions are not counted in this metric). Once a transaction completes, queued transactions for the same record are restarted, as tracked in the xxxx-restart benchmark histograms (such as write-restart). At that point, the first transaction to be processed takes the rw-hash slot and the others wait for the next round. Transactions that need to be serialized (such as writes for the same record, a read transaction in strong consistency mode while a write transaction is in progress, or any transaction requiring duplicate resolution) do not proceed until they get their slot in the rw-hash.
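The monitoring guidance for rw_in_progress depends on a workload-specific baseline. This is a minimal sketch of one way to compare the current gauge value against recent history. The function name, the `max_ratio` and `floor` parameters, and the use of a simple mean as the baseline are all assumptions chosen for illustration, not a recommended rule.

```python
from statistics import mean


def deviates_from_baseline(history, current, max_ratio=2.0, floor=0):
    """Return True when `current` exceeds the baseline by more than `max_ratio`.

    `history` holds earlier rw_in_progress samples for the same node. The
    baseline is their mean (an assumption), and `floor` suppresses alerts
    while the absolute value is still small.
    """
    if not history:
        return False  # no baseline established yet
    limit = max(mean(history) * max_ratio, floor)
    return current > limit
```

The acceptable ratio and floor would need to be tuned per cluster, since the text notes that the expected value depends on the workload.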

scans_active
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_scans_active
Description

Number of scans currently active. Removed in Database 6.0, use queries_active.

Introduced
3.6.0
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_garbage_cleaned
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_garbage_cleaned
Description

Sum of secondary index garbage entries cleaned by sindex GC. Moved to namespace level as sindex_gc_cleaned in Database 5.7.

Introduced
3.3.10
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_garbage_found
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_garbage_found
Description

Sum of secondary index garbage entries found by sindex GC.

Introduced
3.3.10
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_list_creation_time
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_list_creation_time
Description

Sum of time spent in finding secondary index garbage entries by sindex GC (millisecond).

Introduced
3.3.10
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_list_deletion_time
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_list_deletion_time
Description

Sum of time spent in cleaning sindex garbage entries by sindex GC (millisecond).

Introduced
3.3.10
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_objects_validated
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_objects_validated
Description

Number of secondary index entries processed by sindex GC.

Introduced
3.3.10
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_gc_retries
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_gc_retries
Description

Number of retries when sindex GC cannot get sprigs lock. Replaced sindex_gc_locktimedout.

Introduced
4.2
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sindex_ucgarbage_found
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sindex_ucgarbage_found
Description

Number of un-cleanable garbage entries in the sindexes encountered through queries.

Introduced
3.3.3
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_err_ack_rw_trans_reenqueue
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_err_ack_rw_trans_reenqueue
Description

Number of Read/Write trans re-enqueued because of cluster key mismatch.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_partition_transaction_queue_count
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_partition_transaction_queue_count
Description

Removed/unused

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_prole_retry
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_prole_retry
Description

Number of times a prole write was retried as a result of a cluster key mismatch.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_regular_processed
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_regular_processed
Description

Number of successful transactions that passed the cluster key test.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_trans_to_proxy_retry
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_trans_to_proxy_retry
Description

Number of times a proxy was redirected.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_cluster_key_transaction_reenqueue
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_cluster_key_transaction_reenqueue
Description

Removed/unused

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_evicted_set_objects
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_evicted_set_objects
Description

Number of objects evicted from a Set due to set limits defined in Aerospike configuration.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_single_bin_records
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_single_bin_records
Description

Removed: Number of single bin records.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_slow_trans_queue_batch_pop
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_slow_trans_queue_batch_pop
Description

Number of times we moved a batch of trans from slow queue to fast queue.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_slow_trans_queue_pop
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_slow_trans_queue_pop
Description

Number of trans that were moved from slow queue to fast queue.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
stat_slow_trans_queue_push
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_stat_slow_trans_queue_push
Description

Number of trans that we pushed onto the slow queue.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
storage_defrag_wait
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_storage_defrag_wait
Description

Number of times the defrag waited (called sleep).

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
sub_objects
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_sub_objects
Description

Number of LDT sub objects. Aggregated over the sub_objects stat at the namespace level.

Introduced
3.9
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_free_mem_kbytes
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_system_free_mem_kbytes
Description

Amount of free system memory in kilobytes. Includes buffers and caches, but not shared memory.

Monitoring

If system_free_mem_kbytes is abnormally low, it could indicate that the server is approaching the limits of the available RAM. Operations should investigate and potentially add nodes or increase per-node RAM.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_free_mem_pct
critical
Context
node_stats
Prometheus Name
aerospike_node_stats_system_free_mem_pct
Description

Percentage of free system memory.

Monitoring

If system_free_mem_pct is abnormally low, it could indicate that the server is approaching the limits of the available RAM. Operations should investigate and potentially add nodes or increase per-node RAM.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_kernel_cpu_pct
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_system_kernel_cpu_pct
Description

Percentage of CPU usage by processes running in kernel mode.

Introduced
4.7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_thp_mem_kbytes
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_system_thp_mem_kbytes
Description

Amount of memory in use by the Transparent Huge Page mechanism, in kilobytes.

Introduced
5.7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_total_cpu_pct
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_system_total_cpu_pct
Description

Percentage of CPU usage by all running processes. Equal to system_user_cpu_pct + system_kernel_cpu_pct.

Introduced
4.7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
system_user_cpu_pct
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_system_user_cpu_pct
Description

Percentage of CPU usage by processes running in user mode.

Introduced
4.7.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
threads_detached
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_threads_detached
Description

Number of detached server threads currently running.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
threads_joinable
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_threads_joinable
Description

Number of joinable server threads currently running.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
threads_pool_active
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_threads_pool_active
Description

Number of currently active threads in the server thread pool.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
threads_pool_total
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_threads_pool_total
Description

Total number of threads in the server thread pool.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
time_since_rebalance
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_time_since_rebalance
Description

Number of seconds since the last reclustering event, triggered either by the recluster info command or by a cluster disruption (such as a node being added or removed, or a network disruption).

Introduced
4.3.1
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
tree_gc_queue
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_tree_gc_queue
Description

Number of trees queued up and ready to be completely removed (partition drops). Corresponds to the tree-gc-q entry in the log ticker.

Introduced
3.10
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
tscan_aborted
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_tscan_aborted
Description

Number of scans that were aborted. Removed as of 3.6.0.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
tscan_initiate
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_tscan_initiate
Description

Number of new scan requests initiated. Removed as of 3.6.0.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
tscan_pending
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_tscan_pending
Description

Number of scan requests pending. Removed as of 3.6.0.

Introduced
-
Removed
Yes
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
tscan_succeeded
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_tscan_succeeded
Description

Number of scan requests that have successfully finished. Removed as of 3.6.0.

Introduced
-
Removed
Yes
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
uptime
optional
Context
node_stats
Prometheus Name
aerospike_node_stats_uptime
Description

Time in seconds since last server restart.

Monitoring

If uptime is below 300 and the cluster is not undergoing maintenance, this node restarted unexpectedly within the last 5 minutes. Advise operations to investigate.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
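The uptime rule above combines the gauge with knowledge of planned maintenance. This is a minimal sketch of that rule. The function name and the `in_maintenance` flag are hypothetical, and how a monitoring system learns about a maintenance window is outside what the text describes.

```python
def restarted_unexpectedly(uptime_seconds: int, in_maintenance: bool,
                           window_seconds: int = 300) -> bool:
    """True when the node restarted within the window outside of maintenance.

    300 seconds matches the 5-minute threshold given in the text.
    """
    return uptime_seconds < window_seconds and not in_maintenance
```

Passing the maintenance state explicitly keeps routine rolling restarts from raising the same alert as an unplanned restart.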

Sets

device_data_bytes
optional
Context
sets
Prometheus Name
aerospike_sets_device_data_bytes
Description

Device storage used by this set in bytes, for the data part (does not include the index part). Value will be 0 if data is not stored on device. For size used in memory, see memory_data_bytes.

Introduced
5.2
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
memory_data_bytes
optional
Context
sets
Prometheus Name
aerospike_sets_memory_data_bytes
Description

Memory used by this set in bytes, for the data part (does not include the index part). Value will be 0 if data is not stored in memory. For size used on disk, see device_data_bytes (available in Database 5.2 and later), or the set-level object size histogram.

Introduced
3.9
Removed
7.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
ns
optional
Context
sets
Prometheus Name
aerospike_sets_ns
Description

Namespace name this set belongs to.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
objects
watch
Context
sets
Prometheus Name
aerospike_sets_objects
Description

Total number of objects (master and all replicas) in this set on this node. This is updated in real time and is not dependent on the nsup-period or nsup-hist-period configurations.

Introduced
3.9
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
set
optional
Context
sets
Prometheus Name
aerospike_sets_set
Description

Name of this set.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
tombstones
watch
Context
sets
Prometheus Name
aerospike_sets_tombstones
Description

Total number of tombstones (master and all replicas) in this set on this node.

Introduced
3.10
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
truncate_lut
optional
Context
sets
Prometheus Name
aerospike_sets_truncate_lut
Description

The most covering truncate_lut for this set. See truncate or truncate-namespace.

Introduced
3.12
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudensset
truncating
optional
Context
sets
Prometheus Name
aerospike_sets_truncating
Description

Indicates when the set is in the process of being truncated.

Introduced
6.3
Removed
-
Measurement type
gauge
Data type
boolean
Labels
cluster_namejobserviceinstancelongitudelatitudensset

Sindex

delete_error
optional
Context
sindex
Prometheus Name
aerospike_sindex_delete_error
Description

Number of errors while processing a delete transaction for this secondary index.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
delete_success
optional
Context
sindex
Prometheus Name
aerospike_sindex_delete_success
Description

Number of successful delete transactions processed for this secondary index.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
entries
optional
Context
sindex
Prometheus Name
aerospike_sindex_entries
Description

Number of secondary index entries for this secondary index. This is the number of records that have been indexed by this secondary index.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
ibtr_memory_used
optional
Context
sindex
Prometheus Name
aerospike_sindex_ibtr_memory_used
Description

Amount of memory, in bytes, the secondary index is consuming for the keys, as opposed to nbtr_memory_used, which is the amount of memory the secondary index is consuming for the entries. The total is reported by si_accounted_memory.

Introduced
-
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
keys
optional
Context
sindex
Prometheus Name
aerospike_sindex_keys
Description

Number of secondary keys for this secondary index.

Introduced
-
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
load_pct
optional
Context
sindex
Prometheus Name
aerospike_sindex_load_pct
Description

Progress, as a percentage, of the creation of the secondary index.

Introduced
-
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
load_time
optional
Context
sindex
Prometheus Name
aerospike_sindex_load_time
Description

Time it took for the secondary index to be fully created.

Introduced
6.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
loadtime
optional
Context
sindex
Prometheus Name
aerospike_sindex_loadtime
Description

Time it took for the secondary index to be fully created.

Introduced
-
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
memory_used
optional
Context
sindex
Prometheus Name
aerospike_sindex_memory_used
Description

Amount of memory, in bytes, consumed by the secondary index. Renamed to used_bytes in Database 6.3. Do not use memory_used in Database 6.3 and later.

Introduced
6.0
Removed
6.3
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
nbtr_memory_used
optional
Context
sindex
Prometheus Name
aerospike_sindex_nbtr_memory_used
Description

Amount of memory, in bytes, the secondary index is consuming for the entries, as opposed to ibtr_memory_used, which is the amount of memory the secondary index is consuming for the keys. The total is reported by si_accounted_memory.

Introduced
-
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_agg
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_agg
Description

Number of query aggregations attempted for this secondary index on this node.

Introduced
-
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_agg_avg_rec_count
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_agg_avg_rec_count
Description

Average number of records returned by the aggregations underlying queries against this secondary index.

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_agg_avg_record_size
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_agg_avg_record_size
Description

Average size of the records returned by the aggregations underlying queries against this secondary index.

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_avg_rec_count
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_avg_rec_count
Description

Average number of records returned by all queries against this secondary index (combines query_agg_avg_rec_count and query_lookup_avg_rec_count).

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_avg_record_size
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_avg_record_size
Description

Average size of the records returned by all queries against this secondary index (combines query_agg_avg_record_size and query_lookup_avg_record_size).

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_basic_abort
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_basic_abort
Description

Number of basic queries aborted for this secondary index. Removed in Database 6.0; use si_query_long_basic_abort instead.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_basic_avg_rec_count
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_basic_avg_rec_count
Description

Average number of records returned by the lookup queries against this secondary index.

Introduced
5.7
Removed
6.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_basic_complete
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_basic_complete
Description

Number of basic queries completed for this secondary index. Removed in Database 6.0; use si_query_long_basic_complete instead.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_basic_error
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_basic_error
Description

Number of basic queries that returned an error for this secondary index. Removed in Database 6.0; use si_query_long_basic_error instead.

Introduced
5.7
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_lookup_avg_rec_count
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_lookup_avg_rec_count
Description

Average number of records returned by the lookup queries against this secondary index. Renamed to query_basic_avg_rec_count in Database 5.7.

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_lookup_avg_record_size
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_lookup_avg_record_size
Description

Average size of the records returned by the lookup queries against this secondary index.

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
query_lookups
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_lookups
Description

Number of lookup queries ever attempted for this secondary index on this node. Removed in Database 5.7. Use query_basic_complete + query_basic_error + query_basic_abort instead.

Introduced
-
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
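
The replacement for query_lookups noted above is a plain sum of three counters. This is a minimal Python sketch, assuming the per-index statistics have already been collected into a dictionary of integers; the function name and the dictionary shape are hypothetical.

```python
from typing import Dict


def lookup_queries_attempted(stats: Dict[str, int]) -> int:
    """Approximate the removed query_lookups counter from its replacements."""
    parts = ("query_basic_complete", "query_basic_error", "query_basic_abort")
    # Missing counters are treated as 0 (an assumption of this sketch).
    return sum(stats.get(name, 0) for name in parts)
```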
query_reqs
optional
Context
sindex
Prometheus Name
aerospike_sindex_query_reqs
Description

Number of query requests ever attempted for this secondary index on this node (combines query_lookups and query_agg).

Introduced
-
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_accounted_memory
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_accounted_memory
Description

Amount of memory, in bytes, the secondary index is consuming: the sum of ibtr_memory_used and nbtr_memory_used. Removed in Database 5.7.

Introduced
-
Removed
5.7
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_long_basic_abort
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_long_basic_abort
Description

Number of basic long secondary index queries aborted for this secondary index.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_long_basic_complete
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_long_basic_complete
Description

Number of basic long secondary index queries completed for this secondary index.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_long_basic_error
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_long_basic_error
Description

Number of basic long secondary index queries that returned an error for this secondary index.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_short_basic_complete
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_short_basic_complete
Description

Number of basic short secondary index queries completed for this secondary index.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_short_basic_error
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_short_basic_error
Description

Number of basic short secondary index queries that returned an error for this secondary index.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
si_query_short_basic_timeout
optional
Context
sindex
Prometheus Name
aerospike_sindex_si_query_short_basic_timeout
Description

Short queries are not monitored, so they cannot be aborted. They might time out, which is reflected in this statistic.

Introduced
6.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
stat_gc_recs
optional
Context
sindex
Prometheus Name
aerospike_sindex_stat_gc_recs
Description

Number of records that have been garbage collected out of the secondary index memory. See sindex-gc-period and sindex-gc-max-rate configuration parameters for tuning the secondary index garbage collection.

Introduced
-
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
stat_gc_time
optional
Context
sindex
Prometheus Name
aerospike_sindex_stat_gc_time
Description

Amount of time spent processing garbage collection for the secondary index. See sindex-gc-period and sindex-gc-max-rate configuration parameters for tuning the secondary index garbage collection.

Introduced
-
Removed
5.7
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
used_bytes
optional
Context
sindex
Prometheus Name
aerospike_sindex_used_bytes
Description

Amount of memory, in bytes, consumed by the secondary index.

NOTE: Renamed from memory_used in Database 6.3.

Introduced
6.3
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitude
write_error
optional
Context
sindex
Prometheus Name
aerospike_sindex_write_error
Description

Number of errors while processing a write transaction for this secondary index.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens
write_success
optional
Context
sindex
Prometheus Name
aerospike_sindex_write_success
Description

Number of successful write transactions processed for this secondary index.

Introduced
3.9
Removed
6.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudens

Users

conns_in_use
Context
users
Prometheus Name
aerospike_users_conns_in_use
Description

Number of client connections for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled, per node user metrics are available from the security protocol.

limitless_read_scan_query
Context
users
Prometheus Name
aerospike_users_limitless_read_scan_query
Description

Limitless read query requests per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
moving average
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.

limitless_write_scan_query
Context
users
Prometheus Name
aerospike_users_limitless_write_scan_query
Description

Limitless write query requests per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.

read_scan_query_rps
Context
users
Prometheus Name
aerospike_users_read_scan_query_rps
Description

Read query requests per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. See Enable access control for more information about these metrics.

read_single_record_tps
Context
users
Prometheus Name
aerospike_users_read_single_record_tps
Description

Read transactions per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.

write_scan_query_rps
Context
users
Prometheus Name
aerospike_users_write_scan_query_rps
Description

Write query requests per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.

write_single_record_tps
Context
users
Prometheus Name
aerospike_users_write_single_record_tps
Description

Write transactions per second for a given user.

Monitoring

To see metrics from asadm use the command:

show users statistics

If you are using the Aerospike Prometheus Exporter these metrics are shown in the Users View.

Introduced
5.6
Removed
-
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudeuser
Detail

When security is enabled and enable-quotas is true, per node user metrics are available from the security protocol. For more information, see Enable access control.

Xdr

abandoned
warn
Context
xdr
Prometheus Name
aerospike_xdr_abandoned
Description

Number of records abandoned because of permanent failure at the destination. The destination configuration must be changed for these records to be successfully shipped.

Monitoring

If abandoned is consistently higher than expected, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
active_failed_node_sessions
optional
Context
xdr
Prometheus Name
aerospike_xdr_active_failed_node_sessions
Description

Number of active failed node sessions pending. A failed node session keeps track of nodes in the local cluster that have left the cluster and need other nodes to ship on their behalf until they rejoin.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
active_link_down_sessions
Context
xdr
Prometheus Name
aerospike_xdr_active_link_down_sessions
Description

Number of active link down sessions pending. A link down session keeps track of destination clusters that are not reachable for a given time window.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
bytes_shipped
optional
Context
xdr
Prometheus Name
aerospike_xdr_bytes_shipped
Description

Number of bytes shipped for a namespace to a DC by XDR.

Monitoring

Use the asinfo command get-stats to report these metrics.

Introduced
6.1
Removed
-
Measurement type
counter
Data type
decimal
Labels
cluster_namejobserviceinstancelongitudelatitudedc
compression_ratio
optional
Context
xdr
Prometheus Name
aerospike_xdr_compression_ratio
Description

Running average compression ratio. Example: asinfo -h localhost -l -v get-stats:context=xdr;dc=aerospike_b;namespace=test

Introduced
5.0.0
Removed
-
Measurement type
moving average
Data type
decimal
Labels
cluster_namejobserviceinstancelongitudelatitudedc
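
The get-stats example above returns its statistics as text. This is a minimal Python sketch of reading compression_ratio out of such a response, assuming the response is a list of name=value pairs separated by semicolons (or by newlines when asinfo is run with -l); the function name and the sample values are hypothetical. When the command is run from a shell, the -v argument generally needs quoting because it contains semicolons.

```python
from typing import Dict


def parse_info_response(text: str) -> Dict[str, str]:
    """Parse 'name=value' pairs separated by ';' or newlines into a dict."""
    stats = {}
    for pair in text.replace("\n", ";").split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, sep, value = pair.partition("=")
        if sep:  # skip anything that is not a name=value pair
            stats[name] = value
    return stats


sample = "lag=0;in_queue=12;compression_ratio=1.000"  # made-up values
ratio = float(parse_info_response(sample)["compression_ratio"])
```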
dc_as_open_conn
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_as_open_conn
Description

Number of open connections to the Aerospike DC. If the DC accepts pipeline writes, there will be 64 connections per destination node. Replaced dc_open_conn starting with Database 4.4.

Introduced
4.4
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_as_size
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_as_size
Description

The cluster size of the destination Aerospike DC. Replaced dc_size starting with Database 4.4.

Introduced
4.4
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_http_good_locations
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_http_good_locations
Description

Number of URLs that are considered healthy and being used by the change notification system. Part of the change notification.

Introduced
4.4
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_http_locations
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_http_locations
Description

Number of URLs configured for the HTTP destination. Part of the change notification.

Introduced
4.4
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_attempt
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_attempt
Description

Number of records for which shipping was attempted, whether the attempt resulted in success or error. See dc_ship_success for successfully shipped records.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_bytes
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_bytes
Description

Number of bytes shipped for this DC.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_delete_success
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_delete_success
Description

Number of delete transactions that have been successfully shipped. This is the per DC statistic for xdr_ship_delete_success.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_destination_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_destination_error
Description

Number of errors from the remote cluster(s) while shipping records for this DC. Errors include out-of-space, key-busy, etc. This is the per DC statistic for xdr_ship_destination_error.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_idle_avg
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_idle_avg
Description

Average number of ms of sleep for each record being shipped. 0.000 if there is no throttling. Throttling will occur if the set throughput limit (xdr-max-ship-throughput) has been reached or in case of unexpected slowdown at the destination cluster. This is part of the rsas entry in the logs (xdr context).

Introduced
3.9
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_idle_avg_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_idle_avg_pct
Description

Representation in percent of total time spent for dc_ship_idle_avg. This is part of the rsas entry in the logs (xdr context).

Introduced
3.9
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_inflight_objects
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_inflight_objects
Description

Number of records that are inflight (which have been shipped but for which a response from the remote DC has not yet been received).

Introduced
3.9
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_latency_avg
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_latency_avg
Description

Moving average of shipping latency for the specific DC.

Introduced
3.9
Removed
5.0.0
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_source_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_source_error
Description

Number of client layer errors while shipping records for this DC. Errors include timeout, bad network fd, etc. This is the per DC statistic for xdr_ship_source_error.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_ship_success
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_ship_success
Description

Number of records that have been successfully shipped. This is the per DC statistic for xdr_ship_success.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dc_state
optional
Context
xdr
Prometheus Name
aerospike_xdr_dc_state
Description

State of the DC. The possible states are CLUSTER_INACTIVE, CLUSTER_UP, CLUSTER_DOWN, and CLUSTER_WINDOW_SHIP.
- The CLUSTER_INACTIVE state is for a DC that has not been seeded (configured) in the XDR stanza and serves as a placeholder for future dynamic seeding.
- The CLUSTER_UP state is the normal state for a DC that is able to receive records from an XDR client and is not currently being shipped records from a previous window during which it was down (that would be the CLUSTER_WINDOW_SHIP state).
- A cluster is in the CLUSTER_DOWN state when the source (XDR client) cannot connect to it for over 30 seconds. This prevents the entries in the digestlog from being reclaimed. The XDR client periodically tries to reconnect and, upon succeeding, spawns a window shipper to catch up on the entries in the digestlog that were missed. The DC-specific lag (dc_timelag) increases in this state but is not accounted for in the overall XDR timelag (xdr_timelag).
- The cluster state switches to CLUSTER_WINDOW_SHIP when the cluster can be reconnected to after being in the CLUSTER_DOWN state. The DC-specific lag (dc_timelag) is then accounted for in the overall XDR timelag (xdr_timelag).

Introduced
3.8.1
Removed
5.0.0
Measurement type
gauge
Data type
string
Labels
cluster_namejobserviceinstancelongitudelatitudedc
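
The four dc_state values described above can be modeled as a small enumeration when post-processing collected statistics. This is a minimal Python sketch; the class and helper names are hypothetical, and the helpers encode only what the description states explicitly.

```python
from enum import Enum


class DcState(Enum):
    """The four dc_state values listed in the description."""
    CLUSTER_INACTIVE = "CLUSTER_INACTIVE"        # DC not seeded in the XDR stanza
    CLUSTER_UP = "CLUSTER_UP"                    # normal shipping
    CLUSTER_DOWN = "CLUSTER_DOWN"                # unreachable for over 30 seconds
    CLUSTER_WINDOW_SHIP = "CLUSTER_WINDOW_SHIP"  # reconnected, catching up


def blocks_digestlog_reclaim(state: DcState) -> bool:
    """CLUSTER_DOWN prevents digest log entries from being reclaimed."""
    return state is DcState.CLUSTER_DOWN


def is_catching_up(state: DcState) -> bool:
    """CLUSTER_WINDOW_SHIP means a window shipper is replaying missed entries."""
    return state is DcState.CLUSTER_WINDOW_SHIP
```

Constructing the enum from the raw string, as in `DcState("CLUSTER_DOWN")`, raises `ValueError` for any value outside the four listed states.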
dc_timelag
Context
xdr
Prometheus Name
aerospike_xdr_dc_timelag
Description

Time lag for this specific DC. See xdr_timelag for details of how this is calculated.

Monitoring

If dc_timelag is consistently greater than a few seconds, it may indicate network connectivity issues or errors writing at a destination cluster.

Introduced
3.8.1
Removed
5
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
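
One way to read "consistently greater than a few seconds" is to require several consecutive samples above a threshold. This is a minimal Python sketch under that assumption; the function name, the 5-second threshold, and the three-sample window are hypothetical choices, not values from the Aerospike documentation.

```python
from typing import Sequence


def lag_consistently_high(
    samples: Sequence[int],
    threshold_seconds: int = 5,  # hypothetical stand-in for "a few seconds"
    min_samples: int = 3,        # hypothetical number of consecutive samples
) -> bool:
    """Return True when the most recent samples all exceed the threshold."""
    recent = list(samples)[-min_samples:]
    # Too few samples is treated as "not yet consistent".
    if len(recent) < min_samples:
        return False
    return all(value > threshold_seconds for value in recent)
```

Requiring consecutive samples keeps a single short spike in dc_timelag from triggering an alert.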
dlog_free_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_free_pct
Description

Percentage of the digest log free and available for use.

Introduced
3.9
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_logged
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_logged
Description

Number of records logged into digest log.

Monitoring

Trending stat_recs_logged allows operations insight into how many records are being enqueued for shipment over time.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_overwritten_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_overwritten_error
Description

Number of digest log entries that got overwritten.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_processed_link_down
Context
xdr
Prometheus Name
aerospike_xdr_dlog_processed_link_down
Description

Number of link down sessions that were processed.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_processed_main
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_processed_main
Description

Number of records processed on the local Aerospike server.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_processed_replica
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_processed_replica
Description

Number of records processed for a node in the cluster that is not the local node.

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
dlog_relogged
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_relogged
Description

Number of records relogged by this node into the digest log due to temporary issues when attempting to ship. A relogged digest log entry would be caused by one of three potential conditions:
- An issue with the local client when attempting to ship (tracked by xdr_ship_source_error).
- An issue with the network or the destination cluster itself (tracked by xdr_ship_destination_error).
- An issue when reading the record on the local node (tracked by xdr_read_error), but those would actually end up relogged on the node now owning the record (see relogged_outgoing).

Introduced
3.9
Removed
5.0.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
Detail

The XDR component typically processes only master record’s digest log entries on a given node (the exception being during failed node processing, when a node on the source cluster has failed). When relogging such master record’s dlog entry, the corresponding prole copy would also be relogged on the respective node holding the replicas. This would increment the relogged_outgoing statistic on the current node and the relogged_incoming on the receiving node. It is therefore expected to see the dlog_relogged and relogged_outgoing statistics matching for clusters that are stable (no migrations).

The relogs happening due to master partition ownership changes (migrations) are also tracked through relogged_incoming and relogged_outgoing.

Permanent errors are not relogged but produce a WARNING log message at the destination cluster. Examples include an invalid namespace, a record that is too big because write-block-size differs between source and destination, and authentication or permission errors.

Some Permanent Errors: AEROSPIKE_ERR_RECORD_TOO_BIG, AEROSPIKE_ERR_REQUEST_INVALID, AEROSPIKE_ERR_ALWAYS_FORBIDDEN.
Some Transient Errors: AEROSPIKE_ERR_SERVER, AEROSPIKE_ERR_CLUSTER_CHANGE, AEROSPIKE_ERR_SERVER_FULL, AEROSPIKE_ERR_CLUSTER, AEROSPIKE_ERR_RECORD_BUSY, AEROSPIKE_ERR_DEVICE_OVERLOAD, AEROSPIKE_ERR_FAIL_FORBIDDEN.

See the C client errors for the exhaustive list.
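
The split between permanent and transient errors above determines whether a record is relogged. This is a minimal Python sketch of that classification, using only the error names listed above; the function name is hypothetical, and because the lists are not exhaustive, unknown names are rejected rather than guessed.

```python
# Error names copied from the lists above; neither list is exhaustive.
PERMANENT_ERRORS = frozenset({
    "AEROSPIKE_ERR_RECORD_TOO_BIG",
    "AEROSPIKE_ERR_REQUEST_INVALID",
    "AEROSPIKE_ERR_ALWAYS_FORBIDDEN",
})
TRANSIENT_ERRORS = frozenset({
    "AEROSPIKE_ERR_SERVER",
    "AEROSPIKE_ERR_CLUSTER_CHANGE",
    "AEROSPIKE_ERR_SERVER_FULL",
    "AEROSPIKE_ERR_CLUSTER",
    "AEROSPIKE_ERR_RECORD_BUSY",
    "AEROSPIKE_ERR_DEVICE_OVERLOAD",
    "AEROSPIKE_ERR_FAIL_FORBIDDEN",
})


def is_relogged(error_name: str) -> bool:
    """Transient errors are relogged; permanent errors are not."""
    if error_name in TRANSIENT_ERRORS:
        return True
    if error_name in PERMANENT_ERRORS:
        return False
    raise ValueError(f"unclassified error: {error_name}")
```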

dlog_used_objects
optional
Context
xdr
Prometheus Name
aerospike_xdr_dlog_used_objects
Description

Total number of record slots used in the digest log.

Introduced
3.9
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
filtered_out
watch
Context
xdr
Prometheus Name
aerospike_xdr_filtered_out
Description

Number of local records that are skipped after having been read but before actual shipment. Such records might be skipped because of the configured shipping rules. For example, if the rules exclude all bins of a record, the record is skipped.

This counter does not include records not submitted to the XDR queue, such as a record that is not eligible for shipping because its set is disabled.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
global_lastshiptime
optional
Context
xdr
Prometheus Name
aerospike_xdr_global_lastshiptime
Description

Minimum last ship time, in milliseconds (epoch), for XDR across the cluster. Specifies up to what point slots in the digest log can be reclaimed, by tracking the oldest last ship time across all nodes in the cluster.

Introduced
3.10
Removed
5.0.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
hot_keys
watch
Context
xdr
Prometheus Name
aerospike_xdr_hot_keys
Description

Number of times a record write is skipped from processing because that record is already pending processing. This value also includes the number of records skipped for replica partitions.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
hotkey_fetch
optional
Context
xdr
Prometheus Name
aerospike_xdr_hotkey_fetch
Description

If there are hot keys in the system (the same record updated very frequently), XDR optimizes by not shipping all the updates. This stat represents the number of record digests that are actually shipped because their cache entries expired and were dirty. Interpret in conjunction with xdr_hotkey_skip. The timeout of the cache entries is controlled by xdr-hotkey-time-ms.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
hotkey_skip
optional
Context
xdr
Prometheus Name
aerospike_xdr_hotkey_skip
Description

Replaces noship_recs_dup_intrabatch and noship_recs_genmismatch. If there are hot keys in the system (the same record updated very frequently), XDR optimizes by not shipping all the updates. This stat represents the number of record digests that are skipped because an entry already exists in the reader’s thread cache (meaning a version of this record was just shipped). Interpret in conjunction with xdr_hotkey_fetch. The timeout of the cache entries is controlled by xdr-hotkey-time-ms.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
in_progress
watch
Context
xdr
Prometheus Name
aerospike_xdr_in_progress
Description

Number of records that are pending completion. Records can be in different stages, such as local read, network send, or pending acknowledgment. If a record is being retried (see retry_conn_reset, retry_dest, and retry_no_node), it is not considered complete and repeats the cycle.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
in_queue
watch
Context
xdr
Prometheus Name
aerospike_xdr_in_queue
Description

Number of records in the in-memory transaction queue still to be processed. These are the records that have been written into the XDR transaction queue but have not yet been picked up for further processing by XDR.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
lag
critical
Context
xdr
Prometheus Name
aerospike_xdr_lag
Description

Lag in seconds between the destination and the source datacenters. This indicates how far behind the source is in shipping records or, in other words, how long records have been waiting at the source before being shipped to that DC.
In more detail, the lag is the difference between the last update time of the records being shipped (called ‘last ship time’ or LST) and the current time. The LST is internally maintained per partition and aggregated at the namespace level (minimum across all partitions). The lag can seem unsettled (step function) while recoveries are in progress (see the recoveries_pending statistic). This is because the recovery for a partition can take a while and the LST is updated only on completion of a recovery pass (as opposed to per record). A recovery pass is considered complete only after the batch of records for a given partition is completely and successfully shipped (no elements left in the retry queue).

Monitoring

If lag is consistently greater than a few seconds, this condition might indicate network connectivity issues or errors writing at a destination cluster.
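The lag calculation described above (current time minus the minimum LST across partitions) can be sketched as follows. This is an illustration only, not the server's implementation; the function name, the use of epoch seconds, and the zero result for an empty input are assumptions.

```python
import time
from typing import Iterable, Optional


def xdr_lag_seconds(partition_lst: Iterable[float],
                    now: Optional[float] = None) -> int:
    """Sketch: lag = current time - minimum last ship time (LST).

    partition_lst holds hypothetical per-partition LST values in epoch
    seconds. The result is in whole seconds and never negative.
    """
    lst_values = list(partition_lst)
    if not lst_values:
        return 0  # assumption: nothing tracked means nothing is lagging
    if now is None:
        now = time.time()
    namespace_lst = min(lst_values)  # aggregated as the minimum across partitions
    return max(0, int(now - namespace_lst))
```

Because the minimum is used, a single partition whose LST has not advanced (for example, during a long recovery pass) holds the reported lag up until that pass completes, which is consistent with the step-function behavior noted above.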

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
lap_us
warn
Context
xdr
Prometheus Name
aerospike_xdr_lap_us
Description

Time in microseconds (μsecs) taken to process records across partitions in one lap (processing cycle). This is diagnostic information. A higher number indicates that the source is slow to process records.

Available only at the dc level, not namespace level. Example: asinfo -h localhost -l -v get-stats:context=xdr;dc=aerospike_b

Monitoring

If lap_us is consistently higher than expected, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
latency_ms
warn
Context
xdr
Prometheus Name
aerospike_xdr_latency_ms
Description

Average network latency for successfully shipped records. This value does not include timed-out shipment attempts or any other errors. Updated every log ticker interval (10 seconds by default).

Available only at the dc level, not namespace level. Example: asinfo -h localhost -l -v get-stats:context=xdr;dc=aerospike_b

Monitoring

Depending on configuration, latency_ms should be within the latency of the link between the DCs.

If latency_ms increases beyond the expectations based on the distance (or known link latency) between clusters, alert operations to investigate.
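The check above could be automated roughly as follows. This sketch assumes that the get-stats response is a series of name=value pairs separated by semicolons or newlines; verify that against the actual output of your server version. The function names and the tolerance parameter are hypothetical.

```python
import re
from typing import Dict


def parse_stats(raw: str) -> Dict[str, str]:
    """Parse 'name=value' pairs separated by ';' or newlines (assumed format)."""
    stats: Dict[str, str] = {}
    for pair in re.split(r"[;\n]", raw):
        pair = pair.strip()
        if "=" in pair:
            name, value = pair.split("=", 1)
            stats[name.strip()] = value.strip()
    return stats


def latency_exceeds_link(raw: str, link_latency_ms: float,
                         tolerance_ms: float = 0.0) -> bool:
    """True if latency_ms is above the known link latency plus a tolerance."""
    stats = parse_stats(raw)
    return float(stats["latency_ms"]) > link_latency_ms + tolerance_ms
```

The link latency has to come from your own measurements of the connection between the DCs; the sketch only compares the reported value against it.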

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
moving average
Labels
cluster_namejobserviceinstancelongitudelatitudedc
local_recs_migration_retry
optional
Context
xdr
Prometheus Name
aerospike_xdr_local_recs_migration_retry
Description

Number of records missing in a batch call, generally a result of migrations, but can also be caused by expiration and eviction.

Introduced
3.2.7
Removed
6.4
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
nodes
watch
Context
xdr
Prometheus Name
aerospike_xdr_nodes
Description

Number of nodes in the destination DC as seen by XDR. There may be some delay for the remote changes to be reflected in this stat, especially on node departure, as XDR gives some grace period before removing a node.

Introduced
5.3.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
not_found
watch
Context
xdr
Prometheus Name
aerospike_xdr_not_found
Description

Number of local records not found by XDR when attempting to read them. Such records might have been expired, evicted, or deleted.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
queue_overflow_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_queue_overflow_error
Description

Number of XDR queue overflow errors. Typically happens when there is no physical space available on the storage holding the digest log, or when writes arrive at such a rate that elements cannot be written to the digest log fast enough. The number of entries this queue can hold is 1 million.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_active_avg_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_active_avg_pct
Description

This statistic reflects how busy the XDR read threads are. It is the average percentage of total time that the XDR read threads spend actually processing something, as opposed to waiting for a new digest log entry to arrive on their queues from the dlogreader / failed node shippers / window shippers.

Introduced
3.9
Removed
5.0
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_error
Description

Number of read requests initiated by XDR that failed. Those are rare, but if present, would typically be caused by reservation failures (node lost master and/or prole ownership of the partition the record belonged to during migrations). This will cause the record’s digest log entry to be relogged to the node now owning the partition (tracked under relogged_outgoing). Other rare cases would be for example when running out of memory or failure to access the storage layer. For the total number of XDR initiated read requests, sum up the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_idle_avg_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_idle_avg_pct
Description

This is a sister statistic to xdr_read_active_avg_pct. It represents the average percentage of total time that the XDR read threads wait for a new digest log entry to arrive on their queues from the dlogreader / failed node shippers / window shippers.

Introduced
3.9
Removed
5.0
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_latency_avg
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_latency_avg
Description

Moving average latency in milliseconds for XDR to read a record.

Introduced
3.9
Removed
5.0
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_notfound
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_notfound
Description

Number of read requests initiated by XDR for which the record was not found. These do not get relogged. This would typically happen if a record is updated and then deleted, but a lag caused the entry for the record update to be processed after the record had been deleted. For the total number of XDR initiated read requests, sum the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_reqq_used
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_reqq_used
Description

How many digest log entries are currently in the XDR read threads’ queues. Each XDR read thread has an associated in-memory queue with a capacity of 1,000 log entries. See also the related statistic xdr_read_reqq_used_pct. When the dlogreader / failed node shipper / window shipper cannot write to a queue because the queue is full, it blocks until there is space in the queue again.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_reqq_used_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_reqq_used_pct
Description

Sister statistic to xdr_read_reqq_used to represent how full in percent the XDR read request queues are.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_respq_used
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_respq_used
Description

How many entries are being used in the XDR read response queues. Those queues are used to hand back records after they have been locally fetched. Those queues are similar to the queues referred to in the xdr_read_reqq_used stat except for the fact that they are not bounded. The throttling would happen at the XDR read request queues.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_success
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_success
Description

Number of read requests initiated by XDR that succeeded. For the total number of XDR initiated read requests, sum up the xdr_read_success, xdr_read_notfound and xdr_read_error statistics.
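The sum described above can be sketched as a small helper. It assumes the three statistics have already been collected into a mapping of name to value; the function name is hypothetical, and a missing statistic is counted as 0.

```python
from typing import Mapping

# The three statistics named above that together cover every XDR read request.
READ_STATS = ("xdr_read_success", "xdr_read_notfound", "xdr_read_error")


def total_xdr_reads(stats: Mapping[str, int]) -> int:
    """Total XDR-initiated read requests = success + notfound + error."""
    return sum(int(stats.get(name, 0)) for name in READ_STATS)
```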

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_txnq_used
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_txnq_used
Description

Number of XDR read commands that are in flight in the local transaction queue. XDR limits to 10,000 the number of outstanding XDR read requests. The requests are placed in an internal transaction queue. See xdr_read_txnq_used_pct for the percent used in this queue.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
read_txnq_used_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_read_txnq_used_pct
Description

Percent used of the XDR read commands that are in flight (out of a maximum allowed of 10,000) in the transaction queue. It is an internal transaction queue. See xdr_read_txnq_used for the number of XDR issued reads that are in flight.
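As a worked example of the relationship between the two statistics, the sketch below derives the percentage from the in-flight count and the 10,000 limit stated above. How the server rounds the reported integer is not specified here, so the sketch returns an unrounded float; the names are hypothetical.

```python
XDR_READ_TXNQ_LIMIT = 10_000  # maximum outstanding XDR read requests, per the text above


def read_txnq_used_pct(used: int, limit: int = XDR_READ_TXNQ_LIMIT) -> float:
    """Percent of the transaction queue limit taken by in-flight XDR reads."""
    if used < 0 or limit <= 0:
        raise ValueError("used must be >= 0 and limit must be > 0")
    return 100.0 * used / limit
```

For instance, 2,500 in-flight reads correspond to 25 percent of the limit.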

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
recoveries
warn
Context
xdr
Prometheus Name
aerospike_xdr_recoveries
Description

Number of partitions that are recovered by reducing the primary index of that partition. Recovery is done when the in-memory transaction queue of the partition is full or when necessary records are not present in the in-memory transaction queue.

See also recoveries_pending.

Monitoring

If recoveries is consistently increasing, alert operations to investigate.
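One possible reading of "consistently increasing" is that the counter rose in each of the last few polling intervals. The sketch below encodes that reading; the function name and the default of three consecutive increases are assumptions to tune for your environment, not a recommendation from the source.

```python
from typing import Sequence


def consistently_increasing(samples: Sequence[int], min_increases: int = 3) -> bool:
    """True if each of the last `min_increases` deltas between samples is positive.

    samples are successive readings of a counter such as recoveries,
    oldest first.
    """
    if len(samples) < min_increases + 1:
        return False  # not enough history to judge
    recent = samples[-(min_increases + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

A single plateau within the window resets the condition, so an isolated burst of recoveries does not trigger the alert.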

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
recoveries_pending
warn
Context
xdr
Prometheus Name
aerospike_xdr_recoveries_pending
Description

Number of recoveries currently pending.

If recoveries_pending is zero, there are no recoveries in progress. Non-zero indicates the number of recoveries in progress.

Monitoring

If recoveries_pending is unexpectedly increasing, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
relogged_incoming
optional
Context
xdr
Prometheus Name
aerospike_xdr_relogged_incoming
Description

Number of records relogged into this node’s digest log by another node. This typically happens during the following situations:

  • migrations at the source cluster: when there are outstanding digest log entries and partition ownership changes before they are processed, a node that no longer owns the master or prole copy of the partition a record belongs to relogs the entry, and the node that now owns the master copy of the partition receives it as an incoming relogged digest log entry.

  • when a node relogs record’s digest log entries to itself (dlog_relogged), it will also relog those for the node owning the prole counterpart.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
Detail

The sending node will then have its relogged_outgoing statistic incremented.

relogged_outgoing
optional
Context
xdr
Prometheus Name
aerospike_xdr_relogged_outgoing
Description

Number of records relogged to another node’s digest log. This typically happens during the following situations:
- migrations at the source cluster, when there are outstanding digest log entries for which the local node does not own either master or prole partition for the record anymore (xdr_read_error)
- when a node relogs record’s digest log entries to itself (dlog_relogged), it will also relog those for the node owning the prole counterpart.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
Detail

The receiving node will then have its relogged_incoming statistic incremented.

retry_conn_reset
warn
Context
xdr
Prometheus Name
aerospike_xdr_retry_conn_reset
Description

Number of records whose shipment is retried due to a reset of the connection to the remote datacenter. A connection can be reset due to timeouts (10s), network problems, or destination node restarts.

This statistic can increase in bursts. Because of the XDR pipeline, there can be many records that are retried when a connection is reset.

Monitoring

If retry_conn_reset is consistently higher than expected, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
retry_dest
warn
Context
xdr
Prometheus Name
aerospike_xdr_retry_dest
Description

Number of records retried due to a temporary error returned by destination node. The destination node has responded with a specific error code; therefore, such errors are not related to the network. Such errors include key busy and device overload.

Monitoring

If retry_dest is consistently higher than expected, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
retry_no_node
warn
Context
xdr
Prometheus Name
aerospike_xdr_retry_no_node
Description

Number of records retried because XDR cannot determine which destination node is the master.

This typically happens when XDR does not discover the full cluster of the destination, perhaps due to firewall settings. In such a case, the master for all partitions cannot be known. The other possibility is that the entire namespace is not present on the destination cluster.

Monitoring

If retry_no_node is consistently higher than expected, alert operations to investigate.

Introduced
5.1
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_bytes
watch
Context
xdr
Prometheus Name
aerospike_xdr_ship_bytes
Description

Estimated number of bytes XDR has shipped to remote clusters.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_compression_avg_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_compression_avg_pct
Description

Used to determine how beneficial compression is (higher is better).

Introduced
3.9
Removed
5.0
Measurement type
moving average
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_delete_success
Context
xdr
Prometheus Name
aerospike_xdr_ship_delete_success
Description

Number of delete operations that were successfully shipped.

Introduced
3.9
Removed
5.0
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_destination_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_destination_error
Description

Number of errors from the remote cluster(s) while shipping records. Errors include timeout, out-of-space, key-busy, etc. These are typically relogged, except in the case of a permanent error (tracked under xdr_ship_destination_permanent_error; for example, records too big or some bad namespace configuration), in which case they trigger a WARNING log message at the destination. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. These do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_destination_permanent_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_destination_permanent_error
Description

Number of permanent errors from the remote cluster(s) while shipping records. Example errors include records too big or some bad namespace configuration, in which case they trigger a WARNING log message at the destination and will not be relogged. These do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error. For all errors while shipping to a destination, see xdr_ship_destination_error.

Introduced
4.4.0.4
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_fullrecord
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_fullrecord
Description

Number of records that did not take advantage of bin level shipping (see xdr-ship-bins).

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_inflight_objects
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_inflight_objects
Description

Number of objects that are inflight (which have been shipped but for which a response from the remote DC has not yet been received).

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_latency_avg
Context
xdr
Prometheus Name
aerospike_xdr_ship_latency_avg
Description

Moving average latency in milliseconds to ship a record to remote Aerospike clusters. This is computed by dividing time into 1 second intervals.

Monitoring

Depending on configuration, xdr_ship_latency_avg should be within the latency of the link between the DCs.

If xdr_ship_latency_avg increases beyond the expectations based on the distance (or known link latency) between clusters, alert operations to investigate.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
Detail

The average is calculated over each 1-second interval separately and then fed into the exponential moving average. The exponential moving average is therefore a moving average of independent 1-second averages. This prevents time intervals with a much higher volume of transactions from carrying a heavier weight than time intervals with far fewer transactions.
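The two-step calculation described above can be sketched as follows: first average the samples inside each 1-second bucket, then feed the per-bucket averages into an exponential moving average. The smoothing factor `alpha` and the function name are assumptions; the source does not state the weighting the server uses.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def ema_of_second_averages(samples: Iterable[Tuple[float, float]],
                           alpha: float = 0.2) -> Optional[float]:
    """samples are (timestamp_seconds, latency_ms) pairs.

    Step 1: average the latencies that fall into each 1-second interval.
    Step 2: run an exponential moving average over those per-second
    averages, so a busy second weighs the same as a quiet one.
    Returns None when there are no samples.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, latency in samples:
        buckets[math.floor(ts)].append(latency)

    ema: Optional[float] = None
    for second in sorted(buckets):
        bucket_avg = sum(buckets[second]) / len(buckets[second])
        ema = bucket_avg if ema is None else alpha * bucket_avg + (1 - alpha) * ema
    return ema
```

With three 10 ms samples in one second and a single 20 ms sample in the next, both seconds contribute equally to the result, whereas a plain average over all four samples would be pulled toward the busier second.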

ship_outstanding_objects
Context
xdr
Prometheus Name
aerospike_xdr_ship_outstanding_objects
Description

Number of outstanding records not yet processed. This only applies to the main thread and will not account for digest log entries pending window shipper or failed node processing. It represents the difference between the write pointer position and the read pointer position. It also does not account for entries pending in the queue prior to being flushed to the digest log, which can go up to 100 entries or 500ms if not full by that time (configurable through xdr-digestlog-iowait-ms).

Monitoring

Trending xdr_ship_outstanding_objects allows operations insight into how the XDR record transmit queue size changes over time.

Introduced
3.9
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_source_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_source_error
Description

Number of client layer errors while shipping records. Errors include connection errors, bad network fd, etc. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. These do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
ship_success
optional
Context
xdr
Prometheus Name
aerospike_xdr_ship_success
Description

Number of records successfully shipped to remote Aerospike clusters (across all configured datacenters, meaning one record successfully shipped to 3 different datacenters increments this counter by 3). Includes xdr_ship_delete_success. For the total number of records XDR attempted to ship, sum up xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error. These do not count errors while attempting to read the record locally, but only errors after a record to be shipped has been passed to XDR’s underlying C client. For errors reading records locally, see xdr_read_error.
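To make the ship-attempt total concrete, the sketch below sums the three statistics named above and derives an error ratio from them. The error ratio is an illustrative addition rather than a metric the server reports, and the function names are hypothetical.

```python
from typing import Mapping


def ship_attempts(stats: Mapping[str, int]) -> int:
    """Total ship attempts = success + source errors + destination errors."""
    return (int(stats.get("xdr_ship_success", 0))
            + int(stats.get("xdr_ship_source_error", 0))
            + int(stats.get("xdr_ship_destination_error", 0)))


def ship_error_ratio(stats: Mapping[str, int]) -> float:
    """Fraction of ship attempts that ended in a source or destination error."""
    total = ship_attempts(stats)
    if total == 0:
        return 0.0  # no attempts yet, so no errors to report
    errors = (int(stats.get("xdr_ship_source_error", 0))
              + int(stats.get("xdr_ship_destination_error", 0)))
    return errors / total
```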

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
stat_pipe_reads_diginfo
optional
Context
xdr
Prometheus Name
aerospike_xdr_stat_pipe_reads_diginfo
Description

Number of digest information read from the named pipe.

Introduced
3.2.7
Removed
6.4
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
success
warn
Context
xdr
Prometheus Name
aerospike_xdr_success
Description

Number of records successfully shipped to remote datacenters.

Monitoring

If success is consistently lower than expected, alert operations to investigate.

Introduced
5.0.0
Removed
-
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
throughput
watch
Context
xdr
Prometheus Name
aerospike_xdr_throughput
Description

Number of records successfully shipped per second. Updated every log ticker interval (10 secs by default).
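A comparable per-second figure can be derived externally from two readings of the success counter taken one ticker interval apart. This is a sketch of that derivation and not necessarily how the server computes throughput; the function name and the handling of a counter reset are assumptions.

```python
def shipped_per_second(prev_success: int, curr_success: int,
                       interval_s: float = 10.0) -> float:
    """Records shipped per second between two readings of the success counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_success - prev_success
    if delta < 0:
        # Assumption: a lower reading means the counter restarted from zero
        # (for example, after a node restart), so count from zero.
        delta = curr_success
    return delta / interval_s
```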

Introduced
5.0.0
Removed
-
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
timelag
optional
Context
xdr
Prometheus Name
aerospike_xdr_timelag
Description

Time in seconds between the moment the latest shipped record was first written at the source and the moment XDR attempted to ship it to the destination cluster. This is equivalent to the time its digestlog entry waited in the digestlog before being processed. Each record written at the source is timestamped as it gets written into the XDR digestlog.

Monitoring

[Removed in 5.0] If xdr_timelag is consistently greater than a few seconds, this condition might indicate network connectivity issues or errors writing at a destination cluster.

The knowledge base article on FAQ - What are the causes of XDR throttling might be helpful.

Introduced
3.8.1
Removed
5.0
Measurement type
gauge
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
Detail

With multiple destination DCs, this represents the maximum time lag across all the remote DCs that are not in the CLUSTER_INACTIVE or CLUSTER_DOWN states (see dc_state). Under normal operations, though, the timelag for each DC that is in the CLUSTER_UP state will be the same, given that XDR ships records in lock-step. The timelag at each DC would be different when a DC is in the CLUSTER_DOWN or in the CLUSTER_WINDOW_SHIP state. This does not represent the time it will take for XDR to ‘catch up’, nor does it necessarily relate to the number of outstanding digests in the digest log still to be processed. For per DC time lag, see dc_timelag.
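The aggregation described above (maximum per-DC time lag, ignoring DCs in the CLUSTER_INACTIVE or CLUSTER_DOWN states) can be sketched as follows. The input shape, a mapping of DC name to a (dc_state, dc_timelag) pair, and the function name are assumptions made for illustration.

```python
from typing import Mapping, Optional, Tuple

# DCs in these states are left out of the overall figure, per the text above.
EXCLUDED_STATES = frozenset({"CLUSTER_INACTIVE", "CLUSTER_DOWN"})


def overall_timelag(dcs: Mapping[str, Tuple[str, int]]) -> Optional[int]:
    """Maximum dc_timelag across DCs that are not inactive or down.

    Returns None when no DC qualifies.
    """
    lags = [lag for state, lag in dcs.values() if state not in EXCLUDED_STATES]
    return max(lags) if lags else None
```

In this sketch, a DC in the CLUSTER_WINDOW_SHIP state still counts toward the maximum, while a DC that is down does not, however large its own time lag.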

uncompressed_pct
optional
Context
xdr
Prometheus Name
aerospike_xdr_uncompressed_pct
Description

Running average percentage of records not compressed because they are below the compression threshold (100) or failed to be compressed at all. See also related parameter enable-compression.

Introduced
5.0.0
Removed
-
Measurement type
moving average
Data type
decimal
Labels
cluster_namejobserviceinstancelongitudelatitudedc
uninitialized_destination_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_uninitialized_destination_error
Description

Number of records in the digest log not shipped because the destination cluster has not been initialized for a DC that is configured for a namespace. This should not happen. Those errors are not counted as xdr_ship_*_error.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc
unknown_namespace_error
optional
Context
xdr
Prometheus Name
aerospike_xdr_unknown_namespace_error
Description

Number of records in the digest log not shipped because they belong to an unknown namespace, on the source cluster. One situation where this would happen is if a namespace is removed (or the order of namespaces is changed in the configuration) while there are some entries in the digest log not processed yet. This should not happen in most cases. Those errors are not counted as xdr_ship_*_error.

Introduced
3.9
Removed
5.0
Measurement type
counter
Data type
integer
Labels
cluster_namejobserviceinstancelongitudelatitudedc