Aerospike Benchmark (asbench)
The Aerospike Benchmark tool is a C-based tool that measures the performance of an Aerospike cluster. It can mimic real-world workloads with configurable record structures, access patterns, and UDF calls.
You can use any port between 1024 and 65535 for Aerospike, as long as the port is not in use by an existing process.
Usage

The --help option of asbench gives an overview of all supported command line options. The -V or --version option prints the current version of asbench.

asbench --help
Connection options
Option | Default | Description |
---|---|---|
-h or --hosts HOST_1:TLSNAME_1:PORT_1[,...] | 127.0.0.1 | List of seed hosts. TLSNAME is only used when connecting with a secure TLS-enabled server. If the port is not specified, the default port is used. IPv6 addresses must be enclosed in square brackets. |
-p or --port PORT | 3000 | Default port on which to connect to Aerospike. |
-U or --user USER | - | User name. This is mandatory if security is enabled on the server. |
-P | - | User's password for Aerospike servers that require authentication. The system responds by asking you to enter the password. |
--auth MODE | INTERNAL | Set the authentication mode when user/password is defined. Replace MODE with one of the following: INTERNAL , EXTERNAL , EXTERNAL_INSECURE , or PKI . This mode must be set to EXTERNAL when using LDAP. |
-tls or --tls-enable | disabled | Use TLS/SSL sockets. |
--services-alternate | false | Connect to the cluster's alternate-access-address. Use this when the cluster nodes publish IP addresses through access-address that are not accessible over WAN, and publish WAN-accessible IP addresses through alternate-access-address. |
TLS options
Option | Default | Description |
---|---|---|
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file. |
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates. |
--tls-name=TLS_NAME | - | Default TLS name used to authenticate each TLS socket connection. This must match the cluster name. |
--tls-protocols=TLS_PROTOCOLS | -all +TLSv1.2 if the connection supports TLSv1.2. -all +TLSv1 if it doesn't. | TLS protocol selection criteria. Uses Apache's SSLProtocol format. |
--tls-cipher-suite=TLS_CIPHER_SUITE | - | TLS cipher selection criteria. Uses OpenSSL's Cipher List Format. |
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication, if the Aerospike cluster supports it. |
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected TLS keyfile. Replace TLS_KEYFILE_PASSWORD with one of the following: an environment variable env:VAR, the path to a file file:PATH, or a string PASSWORD. If --tls-keyfile-password is specified and no password is provided, the system responds by asking you to enter the password. |
--tls-certfile=TLS_CERTFILE | - | Path to the chain file for mutual authentication, if the Aerospike cluster supports it. |
--tls-cert-blacklist PATH | - | Path to a certificate blacklist file. The file contains one line per blacklisted certificate. Each line starts with the certificate serial number expressed in hexadecimal format. Serial numbers are only required to be unique per issuer. Each entry may optionally specify the issuer name of the certificate. Example: 86EC7A484 /C=US/ST=CA/O=Acme/OU=Eng/CN=TestChainCA |
--tls-crl-check | disabled | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in tls_capath. |
--tls-crl-check-all | disabled | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in tls_capath. |
--tls-log-session-info | disabled | Log TLS connected session info. |
--tls-login-only | disabled | Use TLS for node login only. |
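The certificate blacklist format described above (hex serial number, optional issuer name) can be illustrated with a small parser. This is a hedged sketch of the documented line format, not code from asbench itself:

```python
def parse_blacklist_line(line: str):
    """Parse one certificate-blacklist entry: a hexadecimal serial
    number, optionally followed by the certificate's issuer name."""
    parts = line.strip().split(None, 1)  # split on the first whitespace run
    serial = int(parts[0], 16)           # serial numbers are hexadecimal
    issuer = parts[1] if len(parts) > 1 else None
    return serial, issuer

print(parse_blacklist_line("86EC7A484 /C=US/ST=CA/O=Acme/OU=Eng/CN=TestChainCA"))
```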
The TLS name is only used when connecting with a secure TLS-enabled server.

The following example runs the default benchmark on a cluster of nodes 1.2.3.4 and 5.6.7.8 using:

- The default Aerospike port of 3000.
- TLS configured.
- Namespace TEST.
asbench --hosts 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace TEST --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem
Global options
Option | Default | Description |
---|---|---|
-z or --threads THREAD_COUNT | 16 | Number of threads used to perform synchronous read/write commands. Replace THREAD_COUNT with the number of threads. |
--compress | disabled | Enable binary data compression through the Aerospike client. Internally, this sets the compression policy to true. |
--socket-timeout MS | 30000 | Read/Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
--read-socket-timeout MS | 30000 | Read socket timeout in milliseconds. Replace MS with the number of milliseconds. |
--write-socket-timeout MS | 30000 | Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
-T or --timeout MS | 0 | Read/Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
--read-timeout MS | 0 | Read total timeout in milliseconds. Replace MS with the number of milliseconds. |
--write-timeout MS | 0 | Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
--max-retries RETRIES_COUNT | 1 | Maximum number of retries before aborting the current transaction. Replace RETRIES_COUNT with the number of retries. |
-d or --debug | disabled | Run benchmark in debug mode. |
-S or --shared | disabled | Use shared memory cluster tending. |
-C or --replica REPLICA_TYPE | master | Which replica to use for reads. Replace REPLICA_TYPE with one of the following: - master: Always use the node containing the master partition. - any: Distribute reads across master and proles in round-robin fashion. - sequence: Always try the master first. If the master fails, try proles in sequence. - preferRack: Always try a node on the same rack as the benchmark first. If there are no nodes on the same rack, fall back to sequence. This option requires you to set rack-id. |
--rack-id RACK_ID | - | Rack on which this instance of asbench resides. Required with replica policy preferRack. Replace RACK_ID with the rack's ID. |
-N or --read-mode-ap READ_MODE | one | Read mode for AP (availability) namespaces. Replace READ_MODE with either one or all . |
-B or --read-mode-sc SC_READ_MODE | session | Read mode for SC (strong consistency) namespaces. Replace SC_READ_MODE with one of the following: session , linearize , allowReplica , or allowUnavailable . |
-M or --commit-level LEVEL | all | Write commit guarantee level. Replace LEVEL with either all or master . |
-Y or --conn-pools-per-node POOLS_COUNT | 1 | Number of connection pools per node. Replace POOLS_COUNT with the number of connection pools. |
-D or --durable-delete | disabled | All transactions set the durable-delete flag, which indicates to the server that if a transaction results in a delete, it should generate a tombstone for the deleted record. |
--send-key | disabled | Enables the key policy AS_POLICY_KEY_SEND, which sends the key value in addition to the key digest. |
--sleep-between-retries | 0 | Enables sleep between retries if a transaction fails and the timeout was not exceeded. |
-c or --async-max-commands COMMAND_COUNT | 50 | Maximum number of concurrent asynchronous commands that are active at any time. Replace COMMAND_COUNT with the number of commands. |
-W or --event-loops THREAD_COUNT | 1 | Number of event loops (or selector threads) when running in asynchronous mode. Replace THREAD_COUNT with the number of threads. |
Namespace and record format options
Option | Default | Description |
---|---|---|
-n or --namespace NAMESPACE_NAME | test | Aerospike namespace to perform all operations under. Replace NAMESPACE_NAME with the name of the namespace. |
-s or --set SET_NAME | testset | Aerospike set to perform all operations in. Replace SET_NAME with the name of the set. |
-b or --bin BIN_NAME | testbin | Base name to use for bins. Replace BIN_NAME with the name you want to use. The first bin is named BIN_NAME, the second BIN_NAME_2, the third BIN_NAME_3, and so on. |
-K or --start-key KEY_STARTING_VALUE | 0 | Set the starting value of the working set of keys. Replace KEY_STARTING_VALUE with the starting value. If you are using an insert workload, start-key indicates the first value to write. Otherwise, start-key indicates the smallest value in the working set of keys. |
-k or --keys KEYS_COUNT | 1000000 | Set the number of keys the client is dealing with. Replace KEYS_COUNT with the number of keys. If you are using an insert workload, the client writes this number of keys, starting from value = start-key . Otherwise, the client reads and updates randomly across the values between start-key and start-key + num_keys . |
-o or --object-spec OBJECT_TYPE | I4 | Set the bin specifications. Replace OBJECT_TYPE with a comma-separated list of bin specifications. See object spec for more details. |
--compression-ratio RATIO | 1 | Sets the compression ratio for binary data. Replace RATIO with the desired ratio. This option causes the benchmark tool to generate binary data which is roughly compressed by this proportion. Note: this is only applied to B<n> binary data, not to any of the other types of record data. |
-e or --expiration-time | 0 | Set the TTL of all records written in write transactions. Available options are: - -1 : No TTL, never expire.- -2 : Do not modify the record TTL with this write transaction.- 0 : Adopt the default TTL value from the namespace.- >0 : TTL of the record in seconds. |
Object spec
The object spec is a flexible way to describe how to structure records being written to the database. The object spec is a comma-separated list of bin specs. Each bin spec is one of the following:
Variable Scalars:
Type | Format | Description |
---|---|---|
Boolean | b | A random boolean bin/value. |
Integer | I<n> | A random integer with the lower n bytes randomized (and the rest remaining 0). n can range from 1 to 8. Note: the nth byte is guaranteed to be nonzero, except in the case n=1. |
Double | D | A random double bin/value (8 bytes). |
String | S<n> | A random string of length n of either lowercase letters a-z or numbers 0-9 . |
Binary Data | B<n> | Random binary data of length n bytes.Note: if --compression-ratio is set, only the first ratio * n bytes are random. The rest are 0. |
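The semantics of the variable-scalar formats in the table above can be sketched as follows. This is a hypothetical illustration of the documented behavior, not asbench's actual generator:

```python
import random
import string

def gen_value(spec: str):
    """Generate one value per the documented variable-scalar formats:
    b (boolean), D (double), I<n> (integer), S<n> (string), B<n> (bytes)."""
    if spec == "b":
        return random.choice([True, False])
    if spec == "D":
        return random.random() * 1e6
    kind, n = spec[0], int(spec[1:])
    if kind == "I":
        # lower n bytes randomized; the nth byte is nonzero unless n == 1
        lo = 0 if n == 1 else 1 << (8 * (n - 1))
        return random.randrange(lo, 1 << (8 * n))
    if kind == "S":
        return "".join(random.choices(string.ascii_lowercase + string.digits, k=n))
    if kind == "B":
        return random.randbytes(n)
    raise ValueError(f"unknown bin spec: {spec}")

print(gen_value("S10"), gen_value("I4"))
```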
Constant Scalars:
Type | Format | Example | Notes |
---|---|---|---|
Const Boolean | true /T or false /F | true | The full-word forms are case-insensitive, but T and F must be capitalized. |
Const Integer | A decimal, hex (0x...), or octal (0...) number | 123 | |
Const Double | A decimal number with a . | 123.456 | Const doubles may optionally be followed by "f" or "F", but must always contain a decimal. |
Const String | A backslash-escaped string enclosed in double quotes | "this -> \" is a double quote\n" | Strings are backslash-escaped, but so are most terminals, so make sure to escape your backslashes with backslashes when writing object specs in a command line argument. Additionally, double quotes are often special characters, so escape those too. |
Collection Bins:
Type | Format | Notes |
---|---|---|
List | [BIN_SPEC,...] | A list of one or more bin specs separated by commas. |
Map | {SCALAR_BIN_SPEC:BIN_SPEC,...} | A list of one or more mappings from a scalar bin spec (anything but a List or Map) to a bin spec. These describe the key-value pairs that the Map contains. |
Multipliers

Multipliers are positive integer constants, followed by an asterisk (*), preceding a bin spec.
In the root-level object spec, multipliers indicate how many times to repeat a bin spec across separate bins. For example, the following object specs are equivalent:
I, I, I, I, S10, S10, S10 = 4*I, 3*S10
123, 123, 123, "string", "string" = 3*123, 2*"string"
In a list, multipliers indicate how many times to repeat a bin spec in the list. The following are equivalent:
[I, I, I, I, S10, S10, S10] = [4*I, 3*S10]
In a map, multipliers must precede variable scalar keys. Multipliers indicate how many unique key-value pairs of the given format to insert into the map. Multipliers may not precede const key bin specs or value bin specs in a key-value mapping. The following are equivalent:
{I:B10, I:B10, I:B10} = {3*I:B10}
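The root-level and list equivalences above can be checked with a small expander. This is a naive sketch of the multiplier shorthand, assuming a flat spec (it does not handle commas inside nested lists or maps):

```python
def expand_multipliers(spec: str) -> list[str]:
    """Expand COUNT*BIN_SPEC shorthand in a flat, root-level object spec."""
    out = []
    for part in spec.split(","):  # nested lists/maps would break this split
        part = part.strip()
        if "*" in part:
            count, bin_spec = part.split("*", 1)
            out.extend([bin_spec] * int(count))
        else:
            out.append(part)
    return out

print(expand_multipliers("4*I, 3*S10"))
# ['I', 'I', 'I', 'I', 'S10', 'S10', 'S10']
```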
Workloads

The benchmark tool uses workloads to interact with the Aerospike database. The workload types are:

- I: Linear Insert. Runs over the range of keys specified and inserts a record with each key.
- RU,READ_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update. Randomly picks keys, and either writes a record with that key or reads a record from the database with that key, with probability according to the given read percentage.
  - Replace READ_PCT with a number between 0 and 100. 0 means only do writes; 100 means only do reads.
  - Starting with asbench 1.5 (Tools 7.1), you may optionally provide READ_PCT_ALL_BINS and WRITE_PCT_ALL_BINS to indicate the percentage of reads and writes that operate on the entire record. Otherwise only the first bin is read. Default is 100.
- RR,READ_PCT[,READ_PCT_ALL_BINS[,REPLACE_PCT_ALL_BINS]]: Random Read/Replace. Same as RU, except it replaces records instead of updating them.
- RUF,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Function. Same as RU, except it may also perform an apply command on a random key with a given UDF function.
  - The percentage of operations that are function calls (UDFs) is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked at initialization.
- RUD,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Delete. Same as RU, except it may also perform deletes on a random key.
  - The percentage of operations that are deletes is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked at initialization.
- DB: Delete bins. Same as I, but deletes the record with the given key from the database.
  - In order for DB to delete entire records, it must delete every bin that the record contains. Since bin names are based on their position in the object spec, when running this workload, verify you are using the same object spec you used to generate the records being deleted.
  - If you only want to delete a subset of bins, use write-bins to select which bins to delete. DB deletes all bins by default.
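The percentage arithmetic above can be illustrated with a hypothetical operation picker. This is a sketch of the selection logic implied by the documented percentages, not asbench's implementation:

```python
import random

def choose_op(read_pct, write_pct=None):
    """Pick one operation per the workload percentages.
    RU: read with probability read_pct%, otherwise update.
    RUF: the remaining 100 - read_pct - write_pct percent are UDF calls."""
    r = random.uniform(0, 100)
    if write_pct is None:                 # RU workload
        return "read" if r < read_pct else "update"
    if r < read_pct:                      # RUF workload
        return "read"
    if r < read_pct + write_pct:
        return "update"
    return "udf"

# RUF,80,15 leaves 100 - 80 - 15 = 5% of operations as UDF calls
random.seed(1)
ops = [choose_op(80, 15) for _ in range(100_000)]
print(ops.count("udf") / len(ops))
```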
Workload options
Option | Default | Description |
---|---|---|
--read-bins | all bins | Specifies which bins from the object-spec to load from the database on read transactions. Must be given as a comma-separated list of increasing bin numbers, starting from 1. |
--write-bins | all bins | Specifies which bins from the object-spec to generate and store in the database on write transactions. Must be given as a comma-separated list of bin numbers, starting from 1. |
-R or --random | disabled | Use dynamically-generated random bin values for each write transaction instead of fixed values (one per thread) created at the beginning of the workload. |
-t or --duration SECONDS | 10 for RU,RUF workloads, 0 for I,DB workloads | Specifies the minimum amount of time the benchmark runs for. Replace SECONDS with the number of seconds. For random workloads with no finite amount of work needing to be done, this value must be above 0 for anything to happen. For workloads with a finite amount of work, like linear insertion/deletion, set this value to 0. |
-w or --workload WORKLOAD_TYPE | RU,50 | Desired workload. Replace WORKLOAD_TYPE with the workload type. |
--workload-stages PATH/TO/WORKLOAD_STAGES.yaml | disabled | Accepts a path to a workload stages YAML file, which should contain a list of workload stages to run through. See workload stages. |
-g or --throughput TPS | 0 | Throttle transactions per second to a maximum value. Replace TPS with transactions per second. If transactions per second is zero, throughput is not throttled. |
--batch-size SIZE | 1 | Enable all batch modes with a number of records to process in each batch call. Replace SIZE with the number of records. Batch mode is applied to the read, write, and delete transactions in I, RU, RR, RUF, and RUD workloads. If batch size is 1, batch mode is disabled. |
--batch-read-size SIZE | 1 | Enable batch read mode with a number of records to process in each batch read call. Replace SIZE with the number of records. Batch read mode is applied to the read transactions in RU, RR, RUF, and RUD workloads. If batch read size is 1, batch read mode is disabled. Batch read size takes precedence over batch size. |
--batch-write-size SIZE | 1 | Enable batch write mode with a number of records to process in each batch write call. Replace SIZE with the number of records. Batch write mode is applied to the write transactions in I, RU, RUF, and RUD workloads. If batch write size is 1, batch write mode is disabled. Batch write size takes precedence over batch size. |
--batch-delete-size SIZE | 1 | Enable batch delete mode with a number of records to process in each batch delete call. Replace SIZE with the number of records. Batch delete mode is applied to the delete transactions in RUD and DB workloads. If batch delete size is 1, batch delete mode is disabled. Batch delete size takes precedence over batch size. |
-a or --async | disabled | Enable asynchronous mode, which uses the asynchronous variant of every Aerospike C Client method for transactions. |
Workload stages
You can run multiple different workloads in sequence using the --workload-stages
option with a workload stage configuration file, which is in YAML format. The configuration file should only be a list of workload stages in the following format:
- stage: 1
  # required arguments
  workload: <workload type>
  # optional arguments
  duration: <seconds>
  tps: specified transactions per second, or 0 (default) for the maximum possible
  object-spec: Object spec for the stage. Otherwise, inherits from the previous
      stage, with the first stage inheriting the global object spec.
  key-start: Key start, otherwise inheriting from the global context
  key-end: Key end, otherwise inheriting from the global context
  read-bins: Which bins to read if the workload includes reads
  write-bins: Which bins to write to if the workload includes writes
  pause: max number of seconds to pause before the stage starts. Waits a random
      number of seconds between 1 and the pause.
  async: when true/yes, uses asynchronous commands for this stage. Default is false
  random: when true/yes, randomly generates new objects for each write. Default is false
  batch-size: specifies the batch size for all batch transactions for this stage. Default is 1
  batch-read-size: specifies the batch size of reads for this stage. Takes precedence over batch-size. Default is 1
  batch-write-size: specifies the batch size of writes for this stage. Takes precedence over batch-size. Default is 1
  batch-delete-size: specifies the batch size of deletes for this stage. Takes precedence over batch-size. Default is 1
- stage: 2
  ...
Each stage must begin with stage: STAGE_NUMBER, where STAGE_NUMBER is the position of the stage in the list. The stages must appear in order.
When arguments say they inherit from the global context, the value they inherit either comes from a command line argument, or is the default value if no command line argument for that value was given.
Below is an example workload stages file; call it workload.yml.
- stage: 1
  duration: 60
  workload: RU,80
  object-spec: I2,S12,[3*I1]
- stage: 2
  duration: 30
  workload: I
  object-spec: {5*S1:I1}
To use workload.yml with asbench, run the following:
asbench --workload-stages=workload.yml
Latency histograms
There are multiple ways to record latencies measured throughout a benchmark run. All latencies are recorded in microseconds.
Option | Default | Description |
---|---|---|
--output-file | stdout | Specifies an output file to write periodic latency data, which enables tracking of transaction latencies in microseconds in a histogram. Currently uses a default layout. The file is opened in append mode. |
-L or --latency | disabled | Enables the periodic HDR histogram summary of latency data. |
--percentiles P_1[,P_2[,P_3...]] | "50,90,99,99.9,99.99" | Specifies the latency percentiles to display in the periodic latency histogram. |
--output-period SECONDS | 1 | Specifies the period between successive snapshots of the periodic latency histogram. Replace SECONDS with the period in seconds. |
--hdr-hist PATH_TO_OUTPUT | disabled | Enables the cumulative HDR histogram and specifies the directory to dump the cumulative HDR histogram summary. |
Periodic latency histogram

Periodic latency data is stored in the file specified by --output-file and recorded in histograms with three ranges of fixed bucket sizes. The three ranges are not configurable. There is one histogram for reads, one for writes, and one for UDF calls.
The three ranges are:
- 100us to 4000us, bucket width 100us
- 4000us to 64000us, bucket width 1000us
- 64000us to 128000us, bucket width 4000us
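The three fixed ranges above map a latency to its bucket's lower bound. The following is a sketch of that mapping as documented, not asbench's own bucketing code:

```python
def bucket_lower_bound(latency_us: int) -> int:
    """Return the lower bound (in microseconds) of the fixed-width
    bucket that a latency falls into, or -1 if it is out of range.
    Ranges: 100-4000us (100us wide), 4000-64000us (1000us wide),
    64000-128000us (4000us wide)."""
    if latency_us < 100 or latency_us >= 128000:
        return -1  # outside the histogram's tracked range
    if latency_us < 4000:
        return (latency_us // 100) * 100
    if latency_us < 64000:
        return (latency_us // 1000) * 1000
    return (latency_us // 4000) * 4000

print(bucket_lower_bound(250), bucket_lower_bound(4500), bucket_lower_bound(70000))
```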
Format of the histogram output file:
HIST_NAME UTC_TIME, PERIOD_TIME, TOTAL_LATENCIES, BUCKET_1_LOWER_BOUND:BUCKET_1_LATENCIES, ...

- HIST_NAME: Name of the histogram, either read_hist, write_hist, or udf_hist.
- UTC_TIME: UTC time of the end of the interval.
- PERIOD_TIME: Length of the interval in seconds.
- TOTAL_LATENCIES: Total number of transaction latencies recorded in the interval.
- BUCKET_1_LOWER_BOUND: Lower bound of the first bucket. Each bucket with at least one recorded latency is displayed, in ascending order of lower bound.
- BUCKET_1_LATENCIES: Number of transaction latencies falling within the bucket's range.
HDR histogram
Transaction latencies can also be recorded in an HDR histogram. There is one HDR histogram for reads, one for writes, and one for UDF calls.
Use one of the following to enable HDR histograms:
- --latency: Displays select percentiles from the HDR histograms every output-period seconds.
  - The percentiles printed when --latency is enabled can be configured with --percentiles followed by a comma-separated list of percentiles. The list must be in ascending order, and no percentile can be less than 0 or greater than or equal to 100.
- --hdr-hist: Writes the full HDR histograms to the given directory in both a human-readable text format (.txt) and a binary encoding of the HDR histogram (.hdrhist).
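The validity rules for a --percentiles argument can be expressed as a small checker. This is a sketch of the documented constraints, not asbench's argument parser:

```python
def parse_percentiles(arg: str) -> list[float]:
    """Validate a comma-separated percentile list: ascending order,
    each value >= 0 and < 100."""
    ps = [float(p) for p in arg.split(",")]
    if ps != sorted(ps):
        raise ValueError("percentiles must be in ascending order")
    if any(p < 0 or p >= 100 for p in ps):
        raise ValueError("each percentile must satisfy 0 <= p < 100")
    return ps

print(parse_percentiles("50,90,99,99.9,99.99"))
# [50.0, 90.0, 99.0, 99.9, 99.99]
```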
UDFs

UDF calls are made in RUF (read/update/function) workloads, being the "function" part of that workload. A key is chosen at random from the range of keys given, and an Aerospike apply call is made on that key with the given UDF function (--udf-function-name) from the given UDF package (--udf-package-name). Optionally, --udf-function-values may be supplied, which takes an object spec used to generate the arguments for each call.

The UDF function arguments follow the same rules as the object spec used on records. They are randomly regenerated for every call only if --random is supplied as an argument.
Option | Default | Description |
---|---|---|
-upn or --udf-package-name PACKAGE_NAME | - | The package name for the UDF to be called. Replace PACKAGE_NAME with the package name. |
-ufn or --udf-function-name FUNCTION_NAME | - | Name of the UDF function in the package to be called. Replace FUNCTION_NAME with the function name. |
-ufv or --udf-function-values FUNCTION_VALUES | none | Arguments to be passed to the UDF when called, which are given as an object spec (see object spec). Replace FUNCTION_VALUES with the arguments. |