Management and Metrics API
The Management and Metrics API is enabled by default for all HTTP source (outbound) connectors, and is enabled for source connectors compatible with Aerospike 5.0+ and JMS Inbound 2.0+ when the manage subsection in the service section is configured in the connector's configuration file.
For an HTTP source connector, the service port and manage port are the same.
GET manage/rest/v1/metrics
Returns JSON of all the metrics of the connector.
The source connector exposes several metrics. We have only highlighted some of the important ones here.
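As a minimal sketch, the endpoint can be polled from Python with only the standard library. The base URL `http://localhost:8902` is borrowed from the curl example on this page and is an assumption about your deployment. The helper names (`metrics_url`, `parse_metrics`, `fetch_metrics`) are hypothetical, and the group names are the ones that appear in the example response.

```python
import json
from urllib.request import urlopen

# Assumed manage address, taken from the curl example on this page.
# Replace it with the host and manage port of your connector.
BASE_URL = "http://localhost:8902"


def metrics_url(base_url: str = BASE_URL) -> str:
    """Build the URL of the metrics endpoint."""
    return base_url.rstrip("/") + "/manage/rest/v1/metrics"


def parse_metrics(body: str) -> dict:
    """Decode the JSON body and make sure each metric group key exists."""
    metrics = json.loads(body)
    # Group names as they appear in the example response.
    for group in ("counters", "timers", "gauges", "histograms"):
        metrics.setdefault(group, {})
    return metrics


def fetch_metrics(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET the metrics endpoint and return the parsed response."""
    with urlopen(metrics_url(base_url), timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics()["counters"]` would then give the counter group as a dictionary keyed by metric name.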
Counters
A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
Name | Description | From Connector version |
---|---|---|
connections-active | Count of active XDR connections to the source (outbound) connector. | 4.0.0 |
connections-closed | Count of closed XDR connections to the source (outbound) connector. | 4.0.0 |
requests-errors | Count of requests that could not be written to the external Kafka system. Includes all errors: permanent (such as skip and parse errors) as well as retryable (such as timeouts and most other errors returned by the backend). | 3.0.0 |
requests-parse-error | Count of XDR records which could not be parsed. Parse errors are permanent and are not retried. There will be no traces of those failures on the backend system. | 3.0.0 |
requests-skipped | Count of XDR records which were skipped and not written to the external Kafka system. Skip errors are permanent and will not be retried. There will be no traces of those failures on the backend system. | 3.0.0 |
requests-success | Count of records written to the external Kafka system. | 3.0.0 |
requests-queued | Current count of records being executed by the outbound connector. | 4.0.0 |
kafka-producers-active | Count of active Kafka producers. | 4.0.0 |
kafka-producers-closed | Count of closed Kafka producers. | 4.0.0 |
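Because a cumulative counter only grows or resets to zero on restart, monitoring code usually works with the increase between two samples rather than the raw value. The sketch below shows one way to do that; the function names are hypothetical, and the input shape (`{"name": {"count": n}}`) follows the example response on this page.

```python
def counter_delta(previous: int, current: int) -> int:
    """Increase of a counter between two samples.

    A smaller current value is treated as a restart, in which case the
    current value itself is the increase since the reset.
    """
    return current if current < previous else current - previous


def counter_deltas(previous: dict, current: dict) -> dict:
    """Per-counter increase between two "counters" objects."""
    deltas = {}
    for name, entry in current.items():
        # A counter missing from the earlier sample is assumed to start at 0.
        before = previous.get(name, {}).get("count", 0)
        deltas[name] = counter_delta(before, entry["count"])
    return deltas
```

This approach suits cumulative metrics such as requests-success. Metrics described as a current count, such as requests-queued, can legitimately decrease and are better read directly.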
Timers
Timers time the processing of tasks. A timer metric aggregates timing durations and provides duration statistics plus throughput statistics. It provides the total count of the metric; the maximum, minimum, and mean time; times at various quantiles (99%, 95%, etc.); and the mean throughput along with one-minute, five-minute, and fifteen-minute moving average throughput values.
Name | Description | From Connector version |
---|---|---|
record-ack-queue | Time spent by a processed XDR record in the ack-queue waiting to be acknowledged to Aerospike XDR source. | 4.0.0 |
record-dispatch | Time spent dispatching an XDR record to the external Kafka system. | 4.0.0 |
record-parsing | Time spent parsing an XDR record. | 4.0.0 |
requests-total | Total time an XDR record has spent in the connector. | 3.0.0 |
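A common use of the timer statistics is to flag stages whose tail latency exceeds a budget. The following is a sketch under stated assumptions: the field names `p99` and `duration_units` are taken from the example response on this page, while the function name and the 1000 ms default limit are hypothetical.

```python
def slow_timers(timers: dict, quantile: str = "p99", limit_ms: float = 1000.0) -> dict:
    """Return the timers whose chosen quantile exceeds limit_ms."""
    slow = {}
    for name, stats in timers.items():
        # Only compare timers that report their durations in milliseconds.
        if stats.get("duration_units") != "milliseconds":
            continue
        value = stats.get(quantile)
        if value is not None and value > limit_ms:
            slow[name] = value
    return slow
```

Passing the `timers` object of a metrics response returns a dictionary of the offending timer names and their quantile values, or an empty dictionary when every stage is within the limit.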
Histograms
A histogram calculates the distribution of a value. It provides the total count of the metric, maximum value, minimum value, mean value, and values at various quantiles (99%, 95%, etc.).
Name | Description | From Connector version |
---|---|---|
record-payload-size | Size of records received by the connector, in bytes. | 4.0.0 |
Meters
A meter measures the rate of events in the source connector. It provides the total count of the metric, the mean value, and one-minute, five-minute, and fifteen-minute moving average values.
Name | Description | From Connector version |
---|---|---|
ordering-error | Count of records rejected to maintain the ordering guarantee for records having the same Aerospike key. XDR retries such records. | 5.0.0 |
Gauge
A gauge metric is an instantaneous reading of a particular value, such as a queue's depth.
Name | Description | From Connector version |
---|---|---|
requests-lag | The lag between the update of a record in Aerospike and its arrival at the source (outbound) connector, in milliseconds. Calculated as aerospike_record.lut - current_time. It is sampled every second from the XDR record being processed at that time, so the value is approximate rather than exact. | 4.0.0 |
Example response
{
  "counters": {
    "connections-active": {
      "count": 10
    },
    "connections-closed": {
      "count": 48
    },
    "requests-queued": {
      "count": 10
    },
    "requests-error": {
      "count": 0
    },
    "requests-parse-error": {
      "count": 0
    },
    "requests-skipped": {
      "count": 0
    },
    "requests-success": {
      "count": 498
    }
  },
  "timers": {
    "record-ack-queue": {
      "count": 498,
      "max": 2051.907832,
      "mean": 1163.2236708686287,
      "min": 0.16272399999999998,
      "p50": 1220.575885,
      "p75": 1923.971672,
      "p95": 1996.4796489999999,
      "p98": 2038.0708619999998,
      "p99": 2048.850305,
      "p999": 2051.907832,
      "stddev": 664.955034643945,
      "m15_rate": 0.4961578086196813,
      "m1_rate": 1.6175162336487592,
      "m5_rate": 1.196781946695958,
      "mean_rate": 4.064848866969468,
      "duration_units": "milliseconds",
      "rate_units": "calls/second"
    },
    "record-dispatch": {
      "count": 498,
      "max": 4553.8197629999995,
      "mean": 2715.295325838721,
      "min": 2500.492921,
      "p50": 2600.6392769999998,
      "p75": 2643.216744,
      "p95": 3476.529067,
      "p98": 3813.2396289999997,
      "p99": 3822.618782,
      "p999": 4553.8197629999995,
      "stddev": 330.19447171921576,
      "m15_rate": 0.49407536855720546,
      "m1_rate": 1.518561623864363,
      "m5_rate": 1.181774249073358,
      "mean_rate": 4.064825124292382,
      "duration_units": "milliseconds",
      "rate_units": "calls/second"
    },
    "record-parsing": {
      "count": 498,
      "max": 57.421721999999995,
      "mean": 1.4433150983267966,
      "min": 0.16609,
      "p50": 0.33832799999999996,
      "p75": 0.395851,
      "p95": 1.013417,
      "p98": 17.077586,
      "p99": 56.833085,
      "p999": 57.421721999999995,
      "stddev": 7.137755816474087,
      "m15_rate": 0.49377156430405933,
      "m1_rate": 1.5041252749568248,
      "m5_rate": 1.179584797165551,
      "mean_rate": 4.064802337516862,
      "duration_units": "milliseconds",
      "rate_units": "calls/second"
    },
    "requests-total": {
      "count": 498,
      "max": 4731.708243,
      "mean": 4033.1591903548174,
      "min": 2776.8367359999997,
      "p50": 4004.654642,
      "p75": 4680.3988739999995,
      "p95": 4727.156959,
      "p98": 4729.64192,
      "p99": 4730.605834,
      "p999": 4731.708243,
      "stddev": 603.790753932718,
      "m15_rate": 0.4961578086196813,
      "m1_rate": 1.6175162336487592,
      "m5_rate": 1.196781946695958,
      "mean_rate": 4.06477401807955,
      "duration_units": "milliseconds",
      "rate_units": "calls/second"
    }
  },
  "gauges": {
    "requests-lag": {
      "value": 6486
    }
  },
  "histograms": {
    "record-size": {
      "count": 506,
      "max": 100,
      "mean": 99.99999999999916,
      "min": 100,
      "p50": 100,
      "p75": 100,
      "p95": 100,
      "p98": 100,
      "p99": 100,
      "p999": 100,
      "stddev": 8.384404281969126e-13
    }
  }
}
GET manage/rest/v1/logging
Returns the logging configuration of the connect server.
Example response
{
  "file": "/var/log/aerospike-xdr-proxy/aerospike-xdr-proxy.log",
  "max-history": 30,
  "levels": {
    "io.netty": "info",
    "ROOT": "info",
    "org.reflections": "error",
    "org.xnio": "error",
    "org.jboss": "off",
    "com.aerospike.connect.outbound.server.manage.app.ManageExceptionMapper": "info",
    "org.glassfish": "info",
    "io.undertow": "info"
  },
  "rolling-file-pattern": "%d{yyyy-MM-dd}",
  "enable-console-logging": true,
  "log-pattern": "%date{yyyy-MM-dd HH:mm:ss.SSS z__TZ__} %-5level %logger{0} - %msg%n%ex",
  "ticker-interval": 1
}
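To audit which loggers are set to a given level, the `levels` object of the response above can be filtered. This is a small sketch; the function name is hypothetical, and it assumes the response has already been decoded into a dictionary.

```python
def loggers_at_level(logging_config: dict, level: str) -> list:
    """Sorted names of the loggers configured at the given level."""
    levels = logging_config.get("levels", {})
    wanted = level.lower()
    # Compare case-insensitively so "INFO" and "info" are treated alike.
    return sorted(name for name, value in levels.items() if value.lower() == wanted)
```

With the example response, asking for `"off"` would return only `org.jboss`.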
POST manage/rest/v1/logging/{loggerName}
Set the log level for a logger identified by name.
Example
curl -X POST --data "ERROR" -H 'Content-Type: text/plain' http://localhost:8902/manage/rest/v1/logging/io.netty
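The same request can be issued from Python with the standard library. This is a sketch that mirrors the curl command above: the level is sent as a plain-text body to the logger's URL. The helper names are hypothetical, and the base URL is the one used in the curl example.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Manage address from the curl example; adjust it for your connector.
BASE_URL = "http://localhost:8902"


def build_log_level_request(logger_name: str, level: str,
                            base_url: str = BASE_URL) -> Request:
    """Build the POST request that sets the level of one logger."""
    # Percent-encode the logger name so it is safe as a URL path segment.
    path = "/manage/rest/v1/logging/" + quote(logger_name, safe="")
    return Request(
        base_url.rstrip("/") + path,
        data=level.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def set_log_level(logger_name: str, level: str,
                  base_url: str = BASE_URL, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_log_level_request(logger_name, level, base_url)
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `set_log_level("io.netty", "ERROR")` sends the same request as the curl command.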
GET manage/rest/v1/metrics/errors
Returns the errors encountered within the current window of metric logging. The current window is the ticker-interval value set in the logging config section.
Example
{
  "counters": {
    "com.aerospike.client.AerospikeException$Connection - Error -8: Failed to connect to host(s): \n 192.168.1.131 32858 Error -8: java.net.SocketException: Connection reset\n": {
      "count": 2
    }
  }
}
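Because the counter names in this response are full error messages, often spanning several lines, it can help to rank them by count and keep only the first line of each message. The sketch below assumes the response shape shown in the example; the function name and the default limit are hypothetical.

```python
def top_errors(errors_response: dict, limit: int = 5) -> list:
    """Return (first line of message, count) pairs, highest count first."""
    counters = errors_response.get("counters", {})
    ranked = sorted(counters.items(),
                    key=lambda item: item[1]["count"],
                    reverse=True)
    summary = []
    for message, entry in ranked[:limit]:
        # Keep only the first line; fall back to "" for an empty message.
        first_line = (message.splitlines() or [""])[0].strip()
        summary.append((first_line, entry["count"]))
    return summary
```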