Troubleshoot smarter with Aerospike logs and metrics

Discover how Aerospike's logs and metrics provide deep operational visibility. Learn to troubleshoot faster and optimize real-time systems with Prometheus, Loki, and Splunk.

April 30, 2025 | 5 min read
Taylor Graham
Director Product Management

Modern operations teams don’t have time for guesswork. In high-performance, distributed systems, logs and metrics aren’t just telemetry—they’re how you see what’s happening and why.

In this guide, we’ll show how Aerospike delivers deep operational visibility through native logs, real-time metrics, and seamless integration with tools like Prometheus, Loki, and Splunk. These best practices enable users to troubleshoot faster, optimize smarter, and keep real-time systems healthy at scale. We’ll also highlight new capabilities that help teams fine-tune visibility without sacrificing performance.

We’ll cover what Aerospike provides for both logs and metrics: what’s built into the database, how to configure and interpret logs, and how to extend visibility with external tools. We’ll conclude with a hands-on setup walkthrough you can try yourself, or you can skip ahead to the "Try it yourself" section.

How Aerospike handles logs, metrics, and integrations

Aerospike emits two key types of telemetry data:

  • Logs: Native, file-based log messages from the database engine, categorized by severity (INFO, DEBUG, etc.) and source component.

  • Metrics: Aerospike Prometheus Exporter (APE) exposes metrics at a /metrics HTTP endpoint, designed to be scraped by monitoring systems like Prometheus.
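To make the /metrics endpoint concrete, here is a minimal Python sketch of reading it without a full Prometheus setup. It assumes the exporter returns the standard Prometheus text exposition format with no trailing timestamps. The metric names in SAMPLE are hypothetical placeholders, not actual Aerospike metric names, and the helper names are ours.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    Assumes each sample line is "<series> <value>" with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces stay intact.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url):
    """Fetch and parse a /metrics endpoint (the URL is supplied by the caller)."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical sample payload; real exporter output uses different metric names.
SAMPLE = """\
# HELP example_write_latency_ms Hypothetical metric for illustration
# TYPE example_write_latency_ms gauge
example_write_latency_ms{ns="test"} 1.5
example_client_connections 42
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

Against a running exporter you could call fetch_metrics("http://localhost:9145/metrics"). The host and port there are assumptions, so substitute whatever address your exporter is configured to listen on.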

External integrations include:

  • Prometheus: A metrics collection system commonly paired with Grafana. It provides real-time dashboards, alerts, and historical trends.

Figure 1: Prometheus dashboard
  • Loki or Splunk: External log aggregation tools that ingest logs from many systems, including Aerospike, and make them searchable, filterable, and easier to analyze over time.

Figure 2: Loki dashboard

These tools are designed to complement Aerospike’s built-in telemetry. Prometheus can consume Aerospike metrics, and tools like Loki can ingest Aerospike logs as part of a broader observability stack.

What logging and metrics contribute

Logs and metrics serve different but complementary roles: logs capture individual events in detail, and metrics track numeric trends over time.

The power of using logging and metrics in unison

While logs and metrics serve different roles, using them together gives you a clearer operational picture.

Example: Imagine you run a real-time e-commerce platform that personalizes content based on recent user behavior. One evening, Prometheus triggers an alert: write latencies are spiking in one of your Aerospike clusters. This could delay personalized content delivery, which directly affects conversion rates.

You pivot to Loki, where you find a burst of defragmentation activity logged by the SSD subsystem. Now you have a root cause and a fix. Without both telemetry signals working together, you’d be guessing.

Logs tell you what happened. Metrics tell you when and how often. Used together, they tell you why.
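The pivot from an alert to the relevant logs can be sketched in a few lines of Python. This is an illustration, not how Loki or Prometheus work internally. It assumes log lines begin with the GMT timestamp format used in the log example later in this post, that the alert time is a naive datetime in GMT, and that month names parse under the default English locale. The alert time, the helper name lines_near, and the two "(example)" lines are hypothetical placeholders, not real Aerospike output.

```python
from datetime import datetime, timedelta

# Format of the GMT timestamp that precedes " GMT: " in each log line.
TS_FORMAT = "%b %d %Y %H:%M:%S"


def lines_near(log_lines, alert_time, window=timedelta(minutes=5), component=None):
    """Return log lines within `window` of `alert_time`, optionally for one component."""
    hits = []
    for line in log_lines:
        stamp, sep, rest = line.partition(" GMT: ")
        if not sep:
            continue  # no GMT timestamp prefix, so skip the line
        try:
            when = datetime.strptime(stamp, TS_FORMAT)
        except ValueError:
            continue  # prefix is not a timestamp in the expected format
        if abs(when - alert_time) > window:
            continue
        if component is not None and f"({component})" not in rest:
            continue
        hits.append(line)
    return hits


LOGS = [
    # Log line taken from this post.
    "Mar 13 2025 13:20:41 GMT: INFO (drv_ssd): (drv_ssd.c:2035) {test} "
    "/opt/aerospike/data/test.dat: used-bytes 192 free-wblocks 4087 write-q 0 "
    "write (0,0.0) defrag-q 0 defrag-read (1,0.0) defrag-write (0,0.0)",
    # Placeholder lines, invented only to exercise the filter.
    "Mar 13 2025 12:00:00 GMT: INFO (example): placeholder line outside the window",
    "Mar 13 2025 13:21:10 GMT: INFO (example): placeholder line from another component",
]

# Hypothetical alert time, as if reported by a monitoring system.
ALERT_TIME = datetime(2025, 3, 13, 13, 22, 0)

if __name__ == "__main__":
    for line in lines_near(LOGS, ALERT_TIME, component="drv_ssd"):
        print(line)
```

In practice a log aggregation tool runs this kind of time-and-component filter for you. The sketch only shows why a shared timeline is what lets a metric alert lead you to the log lines that explain it.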

Managing logs for real-world operations

Aerospike provides a robust native logging system that helps users track database activity, diagnose issues, and ensure operational stability. Log lines are structured, carry component-level detail, and are categorized by both event level and severity, making it easier to filter and analyze relevant information.

Log event levels (configured via the context field in aerospike.conf):

  • INFO: Routine operational messages (e.g., startup, configuration changes)

  • DEBUG: Useful for troubleshooting specific issues during development or investigation

  • DETAIL: Records a large volume of granular data, including keys of records being read and written. Use with caution in production due to performance and storage overhead

Log severity levels (appear in the log output itself):

  • INFO: General operational status

  • WARNING: Potential issues that may require attention

  • ERROR: Operational problems that may affect stability

  • CRITICAL: Severe issues needing immediate action

The log levels are configured via the logging stanza in aerospike.conf. For example:

# Normal logging
logging {
  console {
    context any info
  }
}

# Debug example
logging {
  console {
    context any debug
  }
}

Explore our Configure log streams documentation for more logging setup and configuration information. 

Aerospike now also provides a dashboard alert if a cluster is left in DEBUG or DETAIL mode, a common cause of unnecessary load and disk use. Leaving verbose logging on in production environments can significantly degrade performance and flood logs with unnecessary data. 

While both DEBUG and DETAIL can be extremely useful during investigation or development, enable them selectively and temporarily. The goal is to collect just enough detail to diagnose an issue, then promptly return to a less verbose level such as INFO.
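One lightweight guard against leaving verbose logging on is to scan a configuration file before deploying it. The Python sketch below is a minimal example under stated assumptions: it looks only for `context <name> <level>` lines like the ones in the logging stanza above, it does not verify that the line sits inside a logging stanza, and it cannot see log levels changed at runtime. The function name verbose_contexts is ours, not part of any Aerospike tool.

```python
import re

# Levels this post recommends enabling only temporarily.
VERBOSE_LEVELS = {"debug", "detail"}

# Matches lines such as "context any debug", with an optional trailing "#" comment.
CONTEXT_RE = re.compile(r"^\s*context\s+(\S+)\s+(\S+)\s*(?:#.*)?$")


def verbose_contexts(conf_text):
    """Return (context, level) pairs whose level is debug or detail."""
    found = []
    for line in conf_text.splitlines():
        match = CONTEXT_RE.match(line)
        if match and match.group(2).lower() in VERBOSE_LEVELS:
            found.append((match.group(1), match.group(2).lower()))
    return found


# The two logging stanzas shown in this post.
NORMAL_CONF = """\
logging {
  console {
    context any info
  }
}
"""

DEBUG_CONF = """\
logging {
  console {
    context any debug
  }
}
"""

if __name__ == "__main__":
    print(verbose_contexts(NORMAL_CONF))
    print(verbose_contexts(DEBUG_CONF))
```

A check like this could run in a deployment pipeline and fail the build when the returned list is non-empty, which complements the dashboard alert rather than replacing it.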

Anatomy of a log line

Aerospike logs are highly structured to make parsing, searching, and alerting easier. Here’s an example log entry and how to read it:

2025-03-13 09:20:41.913 Mar 13 2025 13:20:41 GMT: INFO (drv_ssd): (drv_ssd.c:2035) {test} /opt/aerospike/data/test.dat: used-bytes 192 free-wblocks 4087 write-q 0 write (0,0.0) defrag-q 0 defrag-read (1,0.0) defrag-write (0,0.0)
  • Timestamp: Two versions (local and GMT)

  • Log level: INFO

  • Component: drv_ssd (Aerospike SSD driver)

  • Source file and line: drv_ssd.c:2035

  • Namespace: {test}

  • Data file: Path to SSD file

  • Stats: Used bytes, free write blocks, write and defrag queue depths, and write and defrag block counters (each pair is a cumulative total and a per-second rate)

This specific log shows a healthy SSD subsystem with no write backlog and minimal defrag activity.
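Because the line is this regular, the fields listed above can be pulled out with a short regular expression. The Python sketch below is one possible approach, not an official parser. It assumes the line starts at the GMT timestamp (the leading local timestamp from the example is dropped) and that every line follows the same layout as this one. The pattern and function names are ours.

```python
import re

# Layout: "<timestamp> GMT: <LEVEL> (<component>): (<file>:<line>) <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}) GMT: "
    r"(?P<level>[A-Z]+) "
    r"\((?P<component>[^)]+)\): "
    r"\((?P<source>[^:]+):(?P<line>\d+)\) "
    r"(?P<message>.*)$"
)
# Simple "name <integer>" pairs such as "free-wblocks 4087".
COUNTER_RE = re.compile(r"([a-z][a-z-]*) (\d+)(?:\s|$)")
# Namespace in braces, such as "{test}".
NAMESPACE_RE = re.compile(r"\{([^}]+)\}")


def parse_log_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    namespace = NAMESPACE_RE.search(fields["message"])
    fields["namespace"] = namespace.group(1) if namespace else None
    fields["counters"] = {
        name: int(value) for name, value in COUNTER_RE.findall(fields["message"])
    }
    return fields


# The example line from this post, starting at the GMT timestamp.
SAMPLE_LINE = (
    "Mar 13 2025 13:20:41 GMT: INFO (drv_ssd): (drv_ssd.c:2035) {test} "
    "/opt/aerospike/data/test.dat: used-bytes 192 free-wblocks 4087 write-q 0 "
    "write (0,0.0) defrag-q 0 defrag-read (1,0.0) defrag-write (0,0.0)"
)

if __name__ == "__main__":
    record = parse_log_line(SAMPLE_LINE)
    print(record["component"], record["namespace"], record["counters"])
```

The parenthesized pairs such as write (0,0.0) are deliberately left in the raw message here. Log aggregation tools typically offer their own pattern or regex parsers that do the same job at query time.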

How Aerospike logging and metrics work together to improve operations

Aerospike’s native logging provides deep visibility into what the database is doing at any given moment. It complements monitoring tools like Prometheus, which provide high-level insights and alerts. For most production teams, the ideal setup is to export Aerospike logs to a tool like Loki for detailed investigation, while using Prometheus for continuous monitoring.

By understanding how and when to use each telemetry type, teams can troubleshoot faster, maintain system health, and avoid outages caused by misconfigurations or missed warning signs.

Try it yourself

Spin up a full monitoring stack—Aerospike, Prometheus, Grafana, and Loki—using Docker Compose in minutes. Get started with the quick start lab.

Try it out and let us know what you think, especially if you catch an issue on day one or have additional requirements such as client-level logs or metrics. You can reach out through your Aerospike team.
