Release notes for previous Monitoring Stack releases

Monitoring stack 3.0.0

November 9, 2023  |  Download

  • Aerospike Monitoring Stack version 3.0.0 is a major upgrade with improved dashboards designed to be forward and backward compatible with Aerospike versions.
    • Establishes a consistent design pattern with status displayed at the top and detailed time ranges displayed below.
    • All dashboards and alerts are forward compatible to 7.x versions and backwards compatible to 5.x versions.
    • Removed unused and deprecated dashboards (alerts, exporter, and jobs).

Breaking Changes

  • Alert severity is now a string: critical, error, warn, or info. The earlier number-based severity is deprecated.
  • New alerts related to 7.0 metrics, connectors, and bug fixes use string-type severity only.
  • Removed the 3 deprecated dashboards: alerts, exporter, and jobs.

New Features

  • Add DynaTrace to the OTEL Examples. [OM-116]
    • Added support documentation and example otel-collector configurations on integrating Aerospike metrics with DynaTrace.
  • Node View - Handle 7.0 metric changes. [OM-127]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Data, index, and memory metrics are shown as minimum, average, and maximum, making anomalies across namespaces easy to identify.
  • Namespace View - Handle 7.0 metric changes. [OM-128]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Data, index, and memory metrics are shown as minimum, average, and maximum, making anomalies across all nodes easy to identify.
  • Unique Data View - Handle 7.0 metric changes. [OM-129]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Displays usage across clusters and historical usage by each cluster.
    • Data is displayed in three layers: 1. all-clusters, 2. single-cluster, 3. by namespace in each cluster.
  • Update Alert Rule to Handle 7.0 metric changes. [OM-130]
    • Enhanced alerts to use 7.x metrics and marked previous alerts with the “pre7x” prefix.
    • Alerts added, modified, or renamed:
      • Modified - NamespaceDataCloseToStopWrites, LowDataAvailWarning, LowDataAvailCritical.
      • Added - HighDataUseNamespaceWarning, HighDataUseNamespaceCritical.
      • Renamed - pre7x_NamespaceSetQuotaWarning, pre7x_NamespaceSetQuotaAlertCritical.
  • Set Index - Handle 7.0 metric changes. [OM-133]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Data metrics are shown as minimum, average, and maximum, making anomalies across all nodes easy to identify.
  • All Flash - Handle 7.0 metrics. [OM-134]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
  • Rolling Restart Dashboard - Handle 7.0 metric changes. [OM-135]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Data and memory panels now show both top-k and bottom-k values, highlighting both over-utilization and under-utilization.
  • Cluster view - Handle 7.0 metric changes. [OM-136]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
  • Handle 7.0 - Multi cluster view. [OM-139]
    • Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
    • Topology diagram now shows dashboards and connectors using different diagrams.
  • Remove deprecated job and alerts (old) dashboards. [OM-144]
    • Removed the deprecated jobs, exporter, and alerts dashboards. The alerts dashboard was replaced by the new Alerts View dashboard in a previous release.
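
Several of the dashboards above summarize a metric as minimum, average, and maximum, and the Rolling Restart dashboard shows top-k and bottom-k values. The following is a minimal Python sketch of those two summaries, not the dashboards' actual queries. The node names and percentages are hypothetical sample data.

```python
from statistics import mean


def summarize(values):
    """Return (minimum, average, maximum) for a non-empty mapping of name -> value."""
    nums = list(values.values())
    return min(nums), mean(nums), max(nums)


def top_and_bottom_k(values, k):
    """Return the k highest and k lowest entries as lists of (name, value); k must be >= 1."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:k]
    bottom = ranked[-k:][::-1]  # lowest value first
    return top, bottom


# Hypothetical data-used percentages per node (illustrative values only).
data_used_pct = {"node-a": 42.0, "node-b": 45.0, "node-c": 91.0, "node-d": 12.0}

low, avg, high = summarize(data_used_pct)
top, bottom = top_and_bottom_k(data_used_pct, 2)
print(low, avg, high)
print(top, bottom)
```

An outlier such as the hypothetical node-c stands out in the maximum and in the top-k list even when the average looks unremarkable, which is the reason for showing all three values rather than a single aggregate.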

Bug Fixes

  • Standardized alert severity colors and fixed a bug so that the info alert count now displays correctly. [OM-140]
  • AMS - Change ordering of memory free pct graph on Rolling Restart dashboard. [OM-114]
  • Avoid average function in namespace dashboard. [OM-74]
  • Monitoring dashboard “Namespace” does not show namespace level values. [OM-105]
  • Improve Dashboard Queries and Linting. [OM-109]

Full changelog


Monitoring stack 2.8.0

September 20, 2023  |  Download

  • The v2.8.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
  • This release includes 1 major feature - Connector Dashboard, Alerts and topology.
  • Aerospike Monitoring Stack version 2.8.0 adds 2 dashboards, alerts, and bug fixes:
    • 2 dashboards to monitor connectors and connector JVM metrics.
    • Enhanced alerts to cover various aspects of Connector key metric thresholds and JVM health.

New Features

  • Create predefined Prometheus alert rules for Connectors. [OM-64]
    • This release includes 6 alerts covering the mandatory functional and process/health aspects of the Connectors.
      • Key alerts covered are connector-status, connector-request-lag, connector-request-errors, jvm heap, jvm cpu, and jvm gc.
  • Connectors alerts & Dashboards [OM-56]
    • Connector view dashboard, which helps monitor 6 connectors.
      • Connectors supported are - xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search and jms-outbound.
      • Key metrics covered are - request lag, request error, success, skipped, connections, xdr record byte size, etc…
  • Create a dashboard for a Connector(s) [OM-107]
    • Connector JVM view dashboard, which helps monitor the JVM health of 6 Connectors.
      • Connectors supported are - xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search and jms-outbound.
      • Key metrics covered are - uptime, cpu, memory, threads, files, classes and buffers.
  • Multi-cluster view dashboard is enhanced to display Aerospike Server topology using the cluster-name and xdr dc configurations.

Bug Fixes

  • Avoid duplicate defrag metric values on the namespace dashboard. [OM-122]
  • Namespace view dashboard - average objects per sprig stat. [OM-113]
  • Add high-water mark breached to the Rolling Restart dashboard. [OM-120]

Full changelog


Monitoring stack 2.7.0

August 28, 2023  |  Download

  • The v2.7.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
  • This release includes 2 major features - Enhanced Alerts and All Flash use-case dashboard.
  • Aerospike Monitoring Stack version 2.7.0 adds new dashboard and bug fixes:
    • All Flash dashboard, covering key metrics to monitor when working with flash storage for both the index and sindex.
    • Enhanced alerts covering various aspects of server metrics. This release covers alerts on namespaces, XDR, latencies, best-practice checks, node-exporter, etc…
  • Aerospike Prometheus exporter 1.13.0 or greater must be used to get the Aerospike 6.4 metrics.

New Features

  • Add new XDR bytes-shipped metrics to dashboards. [OM-104]
    • Displays bytes-shipped as both a stat and a time series, which helps monitor replication progress.
  • Observability & Management Alerts - Enhance / enrich prometheus alerts from ACMS. [OM-98]
    • This release includes 40 alerts covering various Aerospike Server metrics. Some key areas are:
      • Namespaces, latencies, data replication (XDR), sets, node-exporter, flash, best-practice checks, etc…
  • Use-case Dashboard: all-flash. [OM-93]
    • This release introduces a new use-case dashboard focused on key metrics and alerts related to flash usage.
      • Key metrics include average objects per sprig, index pressure, primary index flash, and secondary index flash.
  • Use-case Dashboard Organization & Naming. [OM-48]
    • Added brief descriptions on each dashboard and updated tags to identify each dashboard easily.
  • Observability dashboard unit tests. [OM-111]
    • Created a framework to test our dashboards automatically, including panels, expressions / queries, layout, and expression results.
  • Add user stat related alerts. [OM-103]
    • Added user stat specific alerts covering connections, connection churn etc…
  • Add warning for best practice failures. [OM-101]
    • Alerts if best practices are not followed when setting up the Aerospike server. The server sends this flag after a series of checks.
  • Add warning for node-exporter not being present. [OM-102]
    • As a precursor to integrating node-exporter metrics into the Aerospike Monitoring Stack, this alert raises a warning in the Alerts View dashboard if node-exporter is not configured.

Full changelog


Monitoring stack 2.6.1

August 3, 2023  |  Download

  • The v2.6.1 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
  • Aerospike Monitoring Stack version 2.6.1 adds bug fixes.
  • Aerospike Prometheus exporter 1.12.0 or greater must be used to get the Aerospike 6.3 metrics.
  • Deprecated
    • Existing Alerts dashboard is deprecated and will be removed in future releases.
    • Existing Jobs dashboard is deprecated and will be removed in future releases.

Bug Fixes

  • Issues in Multi-cluster view dashboard [OM-100]
    • Corrected label and unit in XDR panel.
    • Corrected links from XDR and Latencies to respective dashboards (instead of cluster-view).
    • Added an alert-severity based filter.
  • Issues in Alerts view
    • Panel colors are corrected according to the severity types.
  • Issues in Unique Data view
    • Fixed unique data bytes not displaying correctly when custom labels are enabled in the configuration.
    • Added historical time-series for unique data-bytes data point.

Full changelog


Monitoring stack 2.6.0

July 12, 2023  |  Download

  • The v2.6.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
  • This release eliminates instances of hard coded values for variables. As a result, the user needs to ensure that the Aerospike Prometheus data source is selected as a default in order for dashboard data to populate correctly.
  • Aerospike Monitoring Stack version 2.6.0 adds new dashboards and bug fixes:
    • Rolling Restarts dashboard, covering key metrics to monitor during specific use cases.
    • Alerts View dashboard, adopting more meaningful alert severity levels.
  • Aerospike Prometheus exporter 1.12.0 or greater must be used to get the Aerospike 6.3 metrics.
  • Deprecated
    • Existing Alerts dashboard is deprecated and will be removed in future releases.
    • Existing Jobs dashboard is deprecated and will be removed in future releases.

New Features

  • Added the Rolling Restarts dashboard, which groups data into stats, errors, and resources. It curates key metrics to monitor during specific use cases such as node restarts, software upgrades, and investigations. Resource utilization is displayed for the TopK major consumers at the service and namespace level. [OM-79]
  • Added the new Alerts view dashboard, which visualizes alerts by severity, both as counts and as individual alerts. The newly adopted alert levels, in decreasing order, are critical, error, warn, and info. This dashboard replaces the existing Alerts dashboard. [OM-85]
  • All Aerospike dashboards and panel visualizations are modified according to the Grafana 9.x version. [OM-82]
  • Improved and reorganized Aerospike Monitoring stack examples: [OM-49]
    • Reorganized the docker compose file into the relevant folder.
    • Added examples of how to use AeroLab, which can spin up Aerospike clusters for Proof of Concept (POC) needs.
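
The Alerts view item above orders alert levels as critical, error, warn, and info, and shows alerts both as counts and individually. A minimal Python sketch of that grouping follows. The alert names and their severities are hypothetical, and the dashboard's real implementation is a set of Grafana panels, not this code.

```python
# Severity levels in decreasing order, as listed in the release note.
SEVERITY_ORDER = ["critical", "error", "warn", "info"]


def count_by_severity(alerts):
    """Count alerts per severity level; an unknown level raises KeyError."""
    counts = {level: 0 for level in SEVERITY_ORDER}
    for alert in alerts:
        counts[alert["severity"]] += 1
    return counts


def sort_by_severity(alerts):
    """Return alerts ordered from most to least severe (stable within a level)."""
    return sorted(alerts, key=lambda alert: SEVERITY_ORDER.index(alert["severity"]))


# Hypothetical firing alerts (names and severities are illustrative only).
firing = [
    {"name": "ExampleLowSpace", "severity": "warn"},
    {"name": "ExampleNodeDown", "severity": "critical"},
    {"name": "ExampleHighLag", "severity": "warn"},
    {"name": "ExampleConfigNote", "severity": "info"},
]

print(count_by_severity(firing))
print([alert["name"] for alert in sort_by_severity(firing)])
```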

Bug Fixes

  • Includes bug fixes related to queries and visualizations: [OM-82]
    • All queries now include a proper regex pattern to honor single-value or multi-value template variable selections.
    • All time series are adjusted to use range vectors.
    • All dashboards have standardized template variables in the same order.
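
To illustrate the first two fixes above: a regex label matcher (`=~`) accepts one or several selected values, and a range vector (`[5m]`) is what `rate()` operates on in PromQL. The Python sketch below only builds such a query string. The `cluster_name` label, its values, and the 5m window are assumptions chosen for illustration and are not taken from the dashboards.

```python
def label_matcher(label, selected):
    """Build a PromQL regex matcher that honors one or many selected values.

    Assumes the values contain no regex metacharacters or quotes.
    """
    return f'{label}=~"{"|".join(selected)}"'


def rate_query(metric, matchers, window="5m"):
    """Wrap a metric selector in rate() over a range vector."""
    selector = ",".join(matchers)
    return f"rate({metric}{{{selector}}}[{window}])"


single = label_matcher("cluster_name", ["east"])
multi = label_matcher("cluster_name", ["east", "west"])

print(rate_query("aerospike_xdr_bytes_shipped", [single]))
print(rate_query("aerospike_xdr_bytes_shipped", [multi]))
```

The same matcher form works for a single value and for several, which is why a regex matcher is preferable to an equality matcher (`=`) when a template variable allows multi-select.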

Full changelog


Monitoring stack 2.5.0

June 19, 2023  |  Download

  • Aerospike Monitoring Stack version 2.5.0 adds the new Multi Cluster view dashboard, Otel integration examples, and bug fixes.

New Features

  • Added the new Multi cluster view dashboard. This visualizes multiple clusters across regions and data centers with a focus on health. This dashboard consists of 4 panels. [OM-45]
    • Geomap panel - displays multiple cluster view.
    • Cluster panel - displays key metrics like size, alerts, XDR lag, Read & Write latencies.
    • Node panel - uses the Polystat plugin and displays nodes in Green or Red indicating the health.
    • Namespace panel - displays namespaces in Green or Red indicating the health.
    • Key metrics used in this dashboard:
      • aerospike_node_up
      • aerospike_namespace_objects
      • aerospike_node_stats_cluster_size
      • aerospike_xdr_lag
      • aerospike_latencies_write_ms_bucket
      • aerospike_latencies_read_ms_bucket
  • Added new examples on how to integrate Aerospike prometheus exporter with the Otel collector and export metrics to a partner solution. [OM-60]
    • Partner integration examples are provided for NewRelic, Datadog and Cloudwatch.
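
The Node panel described above colors nodes Green or Red based on health, and `aerospike_node_up` is one of the listed metrics. As a rough sketch of that idea, the Python below reads Prometheus text-format samples and maps each node to a color. The `service` and `cluster_name` labels, the sample values, and the rule "1 means up" are assumptions for illustration. The dashboard itself uses the Polystat plugin, not this code.

```python
import re

# Matches lines such as: metric_name{label="value",...} 1
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def node_health(exposition_text, node_label="service"):
    """Map each node to "green" or "red" from aerospike_node_up samples.

    node_label is a hypothetical label identifying the node.
    """
    health = {}
    for line in exposition_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group("name") != "aerospike_node_up":
            continue  # skip comments, blank lines, and other metrics
        labels = dict(LABEL.findall(match.group("labels")))
        is_up = float(match.group("value")) == 1.0
        health[labels.get(node_label, "unknown")] = "green" if is_up else "red"
    return health


# Hypothetical scrape output (illustrative values only).
sample = """
# TYPE aerospike_node_up gauge
aerospike_node_up{cluster_name="demo",service="10.0.0.1:3000"} 1
aerospike_node_up{cluster_name="demo",service="10.0.0.2:3000"} 0
aerospike_namespace_objects{cluster_name="demo",ns="test",service="10.0.0.1:3000"} 1200
"""

print(node_health(sample))
```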

Bug Fixes

  • In the Namespace dashboard, the Defrag row hides anomalies as a result of aggregation. [OM-76]
    • Removed the Defrag row and its aggregation. The defrag panels moved to the namespace row, where they display defrag metrics for each namespace.

Full changelog


Monitoring stack 2.4.0

May 16, 2023  |  Download

  • Aerospike Monitoring Stack version 2.4.0 adds support for metrics introduced in Aerospike 6.3.

New Features

Bug Fixes

Full changelog


Monitoring stack 2.3.1

April 19, 2023  |  Download

Bug Fixes

  • Issues in Set view, Unique data view, Sindex view, Namespace view and Node view: [OM-37]
    • Fixed issue in “Set view” dashboard to remove hardcoded datasource.
    • Re-exported Set view, Unique data view, Sindex view, Namespace view and Node view dashboards with right configurations so they are suitable to be made available in Grafana Cloud.

Full changelog


Monitoring stack 2.3.0

April 3, 2023  |  Download

  • Aerospike Monitoring Stack version 2.3.0 adds support for metrics introduced in Aerospike 6.3.

New Features

  • Added 6.3 metrics:
    • Adds aerospike_sindex_used_bytes secondary index metric.
    • Adds aerospike_namespace_nsup_cycle_deleted_pct NSUP metric.
    • Adds aerospike_sets_stop_writes_size set level configuration.
  • Updated the memory used panel in the secondary index dashboard to use either aerospike_sindex_used_bytes or aerospike_sindex_memory_used, because aerospike_sindex_memory_used is deprecated in Aerospike 6.3.
  • Added nsup metrics panel to Namespace view dashboard.
  • Added set level quotas panel to Namespace view dashboard.
  • Added a new dashboard displaying set level metrics.
  • Added a new dashboard displaying unique data usage.
  • Added 4 new prometheus alerts:
    • NamespaceSupervisorFallingBehind when NSUP is falling behind; it can also display how long the most recent NSUP cycle lasted.
    • NamespaceFreeMemoryCloseToStopWrites when the memory of one of your Aerospike nodes is close to the stop writes limit configured for a namespace.
    • NamespaceSetQuotaWarning when one of your Aerospike nodes is at 80% of the quota you have configured on a set.
    • NamespaceSetQuotaAlert when one of your Aerospike nodes is at 99% of the quota you have configured on a set.
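
The two set-quota alerts above fire at 80% and 99% of the configured quota. The Python sketch below shows that threshold logic in isolation. The function name, the byte-based inputs, and the use of "greater than or equal" are assumptions. The shipped alerts are Prometheus rules, whose exact expressions are not reproduced here.

```python
def set_quota_alert(used_bytes, quota_bytes, warn_at=0.80, critical_at=0.99):
    """Return the name of the alert that would apply, or None.

    A quota of 0 or less is treated as "no quota configured" (an assumption).
    """
    if quota_bytes <= 0:
        return None
    usage = used_bytes / quota_bytes
    if usage >= critical_at:
        return "NamespaceSetQuotaAlert"
    if usage >= warn_at:
        return "NamespaceSetQuotaWarning"
    return None


# Illustrative values only.
print(set_quota_alert(850, 1000))
print(set_quota_alert(995, 1000))
print(set_quota_alert(100, 1000))
```

Checking the higher threshold first ensures a set at 99% reports the more severe alert and not the warning.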

Full changelog


Monitoring stack 2.2.0

August 26, 2022  |  Download

  • Aerospike Monitoring Stack version 2.2.0 adds support for metrics introduced in Aerospike 6.1.

New Features

  • Add server 6.1 metrics. [TOOLS-2087]
    • Adds aerospike_xdr_bytes_shipped.
    • Adds aerospike_sindex_entries_per_bval.
    • Adds aerospike_sindex_entries_per_rec.
  • Replace latency panels with heat map and percentiles. [TOOLS-2132]

Full changelog


Monitoring stack 2.1.0

July 19, 2022  |  Download

  • Aerospike Monitoring Stack version 2.1.0 adds support for the batch-index latency metrics aerospike_latencies_batch_index_us_bucket and aerospike_latencies_batch_index_us_count.

New Features

  • Add batch-index latency panels. [TOOLS-2069]

Full changelog


Monitoring stack 2.0.0

June 10, 2022  |  Download

  • Aerospike Monitoring Stack version 2.0.0 adds support for many new Aerospike 6.0 metrics in the Grafana dashboards, like the following:
    • Primary index queries.
    • Secondary Index queries.
    • Batch sub-transactions (non-proxied).
    • Add overall reads/writes (client_read/write_success + batch_sub_read/write_success) to cluster, node, and namespace dashboards.
    • New job information such as job type.
    • si-query and pi-query latencies.
    • Add memory_used stats to the SIndex dashboard and remove the many SIndex metrics dropped in Aerospike Server version 6.0.
    • Remove any mention of scans.
    • Other miscellaneous changes. See pull request 33 for more details.
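
The overall reads/writes item in the list above is defined as a sum of two counters: client_read/write_success plus batch_sub_read/write_success. A minimal Python sketch of that arithmetic follows. The dictionary of stats and its values are hypothetical. The dashboards compute this in their queries, not in Python.

```python
def overall_success(stats, operation):
    """Sum client and batch-sub success counts for "read" or "write".

    Missing counters are treated as 0 (an assumption).
    """
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    client = stats.get(f"client_{operation}_success", 0)
    batch_sub = stats.get(f"batch_sub_{operation}_success", 0)
    return client + batch_sub


# Hypothetical namespace statistics (illustrative values only).
namespace_stats = {
    "client_read_success": 1200,
    "batch_sub_read_success": 300,
    "client_write_success": 450,
}

print(overall_success(namespace_stats, "read"))
print(overall_success(namespace_stats, "write"))
```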

New Features

  • Display Aerospike 6 metrics. [TOOLS-2044]

Full changelog


Monitoring stack 1.4.0

March 14, 2022  |  Download

New Features

  • Add Jobs View and Secondary Index View dashboards [TOOLS-1956]
    • Add support for per-job scan and query statistics [TOOLS-1946]
    • Add support for secondary index statistics [TOOLS-1947]

Full changelog


Monitoring stack 1.3.2

September 7, 2021  |  Download

Improvements

  • Add new metrics introduced in Aerospike 5.7. [TOOLS-1785]

Full changelog


Monitoring stack 1.3.1

June 15, 2021  |  Download

Improvements

  • Adds “Exporters View” dashboard to track status of all Aerospike Prometheus Exporter targets.

Bug Fixes

  • Fixes incorrect status of the exporters and Aerospike nodes in the “Node View” dashboard. [TOOLS-1721]

Full changelog


Monitoring stack 1.3.0

June 4, 2021  |  Download

New Features

  • Add support for user statistics introduced in Aerospike Server version 5.6. [TOOLS-1715]

Improvements

Bug Fixes

  • Fixed 90th percentile latency computation in Latency View dashboard to not use rate(). Thanks to @ashangit for the contribution.
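
For context on percentile computation from histogram buckets, the Python sketch below estimates a quantile by linear interpolation inside the bucket that contains the target rank, which is the general approach of PromQL's `histogram_quantile()`. It assumes cumulative bucket counts with a final +Inf bucket and does not reproduce the dashboard's actual query. The bucket bounds and counts are hypothetical.

```python
import math


def bucket_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with math.inf as the last upper bound.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the open-ended bucket: report the last finite bound.
                return prev_bound
            # Interpolate linearly within this bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical read-latency buckets in milliseconds (illustrative values only).
latency_buckets = [(1, 50), (2, 80), (4, 95), (8, 100), (math.inf, 100)]

p90 = bucket_quantile(0.9, latency_buckets)
print(round(p90, 3))
```

With these sample buckets the 90th percentile rank is 90, which falls in the (2, 4] bucket, so the estimate lies between 2 ms and 4 ms.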

Full changelog


Monitoring stack 1.2.1

January 27, 2021  |  Download

Improvements

  • Added DC nodes metric to XDR dashboard.

Full changelog


Monitoring stack 1.2.0

November 16, 2020  |  Download

New Features

  • Migrate dashboards to Grafana 7. [TOOLS-1589]

Improvements

  • Make datasource configurable through a dashboard variable. Thanks to realmgic (Zohar) for the contribution. [TOOLS-1591]
  • Alert when ‘close to’ stop writes, when node is proxying and when XDR lag is above a threshold. [TOOLS-1588]
  • Add Prometheus’ docker swarm service discovery config to the example. [TOOLS-1590]

Bug Fixes

  • Fix units for “Failure rate” panel in Namespace view. [TOOLS-1592]

Full changelog


Monitoring stack 1.1.1

August 31, 2020  |  Download

Improvements

  • Use latency time unit in queries to support Aerospike’s microsecond histograms. Add variable for latency time unit to Latency View and Node Overview dashboards.
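
The latency metric names elsewhere in these notes embed the time unit, for example aerospike_latencies_write_ms_bucket and aerospike_latencies_batch_index_us_bucket. A minimal Python sketch of how a unit variable could select the metric name is shown below. The function and its validation are hypothetical. In the dashboards, a Grafana variable plays this role.

```python
def latency_bucket_metric(operation, unit="ms"):
    """Build a latency bucket metric name for the given operation and time unit."""
    if unit not in ("ms", "us"):
        raise ValueError("unit must be 'ms' or 'us'")
    return f"aerospike_latencies_{operation}_{unit}_bucket"


print(latency_bucket_metric("write"))
print(latency_bucket_metric("batch_index", unit="us"))
```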

Bug Fixes

  • Refresh variables on time range change.

Full changelog


Monitoring stack 1.1.0

July 27, 2020  |  Download

New Features

  • Add description info to each dashboard panel.
  • Add clock_skew_stop_writes to Namespace View and Cluster View dashboards.
  • Add dashboard support for the new latency metrics change in Aerospike Prometheus Exporter v1.1.0.
  • Show primary index usage for namespaces using index-type flash or pmem.

Improvements

  • Remove aerospike_node_info metric as per Aerospike Prometheus Exporter v1.1.0.
  • Increase default Grafana refresh rate to 1m.

Bug Fixes

  • Fix primary index usage panel to show values in MiB/GiB.

Full changelog


Monitoring stack 1.0.0

July 27, 2020  |  Download

  • Initial release.