Understanding Metrics: Capacity vs. Utilization
Aerospike Cloud makes selected database metrics available to business stakeholders and developers. These metrics include information necessary to design, troubleshoot, and test applications against the database, including information about the health and status of the database.
Health and Status
You can view the state and status of each database in your Cloud account by clicking the Databases link in the left-side navigation of the Cloud console.
A healthy and available database means the database is online and able to execute database transactions. The following is a table of database states:
Icon color | State | Status | Meaning |
---|---|---|---|
Green | ONLINE | Running | All systems running normally |
Green | ONLINE | Updating | A zero-downtime update is in progress |
Red | OFFLINE | Stopped | Database has not yet been provisioned or was shut down |
Red | OFFLINE | Creating | Database is being created |
Red | OFFLINE | Configuring | Software/service configuration for a new database is in progress |
Red | OFFLINE | Stopping | A database is in the process of being stopped |
Red | UNKNOWN | Status not available | Unrecoverable error condition and/or the state is otherwise not available |
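
For scripts that need to interpret these states, the table above can be sketched as a simple lookup. This is a minimal illustration only: the names `DATABASE_STATES` and `is_available` are hypothetical, and the source does not describe how Aerospike Cloud exposes these values programmatically.

```python
# Lookup built from the table above: status -> (state, meaning).
# The dictionary and function names are hypothetical, not part of any Aerospike API.
DATABASE_STATES = {
    "Running": ("ONLINE", "All systems running normally"),
    "Updating": ("ONLINE", "A zero-downtime update is in progress"),
    "Stopped": ("OFFLINE", "Database has not yet been provisioned or was shut down"),
    "Creating": ("OFFLINE", "Database is being created"),
    "Configuring": ("OFFLINE", "Software/service configuration for a new database is in progress"),
    "Stopping": ("OFFLINE", "A database is in the process of being stopped"),
    "Status not available": ("UNKNOWN", "Unrecoverable error condition and/or the state is otherwise not available"),
}


def is_available(status: str) -> bool:
    """Return True only when the status maps to the ONLINE state."""
    # Any status missing from the table is treated as UNKNOWN (an assumption).
    state, _meaning = DATABASE_STATES.get(status, ("UNKNOWN", ""))
    return state == "ONLINE"


for status in ("Running", "Updating", "Stopping"):
    print(status, is_available(status))
```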
Utilization
The utilization charts are available in the Overview tab of each individual database in your Cloud account. To access the utilization charts, select Databases in the left-side navigation of the Cloud console, then select a database.
The Overview tab of each database contains charts with the following metrics:
- Read TPS (transactions per second)
- Write TPS
- Storage Bytes
- Storage Objects
Utilization represents used vs. available resources of the database. The three dimensions of utilization are throughput, data, and system load. If any of the three dimensions exceeds a defined threshold, the database must be scaled up, or the data or workload reduced.
Throughput utilization represents database operations per second vs. a maximum for a given workload. It consists of read and write transactions per second (TPS).
Read throughput is the combination of key-value reads and batch reads, and the maximum value is based on the maximum database capabilities benchmarked with a 1 KiB object.
Write throughput is the combination of key-value writes and batch writes, and the maximum value is based on the observed average object size.
Data utilization represents the number of objects and the size of the stored data. It includes storage overhead from data types, bin names, and so on, but does not include replicated data.
You will receive alerts if your workload exceeds defined thresholds in Aerospike Cloud.
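
The idea of utilization as used vs. available resources, checked against a threshold, can be sketched as follows. This is a rough illustration under stated assumptions: the `Dimension` class, the sample maximums, and the 80% threshold are all hypothetical, and the actual thresholds and maximums used by Aerospike Cloud are not given here.

```python
from dataclasses import dataclass


@dataclass
class Dimension:
    """One utilization dimension: a used amount against a maximum."""
    name: str
    used: float
    maximum: float

    def utilization(self) -> float:
        if self.maximum <= 0:
            raise ValueError("maximum must be positive")
        return self.used / self.maximum


def dimensions_over(dimensions, threshold):
    """Return the names of dimensions whose utilization exceeds the threshold."""
    return [d.name for d in dimensions if d.utilization() > threshold]


# Hypothetical sample values. Read throughput combines key-value and batch reads.
kv_reads, batch_reads = 40_000, 5_000
read = Dimension("read_throughput", kv_reads + batch_reads, 50_000)
storage = Dimension("storage_bytes", 200 * 1024**3, 500 * 1024**3)

# Assumed threshold of 80%; the real value is defined by the service.
flagged = dimensions_over([read, storage], threshold=0.8)
print(flagged)
```

In this sketch, any name returned by `dimensions_over` corresponds to a dimension that would call for scaling up the database or reducing the data or workload.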
Metrics
To access the Cloud Metrics charts, select the Metrics tab.
Transactions - Last 24 hours
Successes: The number of successes for Read, Write, BatchRead, BatchWrite, and Query operations can help developers validate the transactions reported by their applications.
Error Rates: The number of errors occurring per second (by type of error) is critical for developers to validate their applications.
Transaction Latency - Last hour: The latency metric is critical both for stakeholders, who need to understand latency in the context of business service level agreements (SLAs) for their applications, and for developers, who need to understand how their data model and usage patterns are performing. The charts report latency on read and write transactions at the 95th, 99th, and 99.9th percentiles.
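
To make the percentile figures concrete, the sketch below computes the 95th, 99th, and 99.9th percentiles from a list of latency samples using the nearest-rank method. The choice of method and the sample data are assumptions for illustration; the source does not state how Aerospike Cloud computes these values.

```python
import math
from fractions import Fraction


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Fraction keeps values such as 99.9 exact and avoids float rounding in the rank.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in microseconds: 1, 2, ..., 1000.
latencies = list(range(1, 1001))
for pct in (95, 99, 99.9):
    print(pct, percentile(latencies, pct))
```

With these samples, the 99.9th percentile is the value that all but 0.1% of transactions stay at or below, which is why tail percentiles are useful for SLA discussions.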