Aerospike Benchmark (asbench)
The Aerospike Benchmark tool is a C-based tool that measures the performance of an Aerospike cluster. It can mimic real-world workloads with configurable record structures, access patterns, and UDF calls.
You can use any port between 1024 and 65535 for Aerospike, as long as the port is not in use by an existing process.
Usage

The --help option of asbench gives an overview of all supported command line options. The -V or --version option prints the current version of asbench.

asbench --help
Connection options
Option | Default | Description |
---|---|---|
-h or --hosts HOST_1:TLSNAME_1:PORT_1[,...] | 127.0.0.1 | List of seed hosts. TLSNAME is only used when connecting with a secure TLS-enabled server. If the port is not specified, the default port is used. IPv6 addresses must be enclosed in square brackets. |
-p or --port PORT | 3000 | Default port on which to connect to Aerospike. |
-U or --user USER | - | User name. This is mandatory if security is enabled on the server. |
-P | - | User's password for Aerospike servers that require authentication. The system responds by asking you to enter the password. |
--auth MODE | INTERNAL | Set the authentication mode when user/password is defined. Replace MODE with one of the following: INTERNAL , EXTERNAL , EXTERNAL_INSECURE , or PKI . This mode must be set to EXTERNAL when using LDAP. |
-tls or --tls-enable | disabled | Use TLS/SSL sockets. |
--services-alternate | false | Connect to the cluster's alternate-access-address. Use this when the cluster nodes publish IP addresses through access-address that are not accessible over WAN, and publish WAN-accessible IP addresses through alternate-access-address. |
TLS options
Option | Default | Description |
---|---|---|
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file. |
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates. |
--tls-name=TLS_NAME | - | Default TLS name used to authenticate each TLS socket connection. This must match the cluster name. |
--tls-protocols=TLS_PROTOCOLS | -all +TLSv1.2 if the connection supports TLSv1.2. -all +TLSv1 if it doesn't. | TLS protocol selection criteria. Uses Apache's SSLProtocol format. |
--tls-cipher-suite=TLS_CIPHER_SUITE | - | TLS cipher selection criteria. Uses OpenSSL's Cipher List Format. |
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication, if the Aerospike cluster supports it. |
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected TLS keyfile. Replace TLS_KEYFILE_PASSWORD with one of the following: an environment variable env:VAR, the path to a file file:PATH, or a string PASSWORD. If --tls-keyfile-password is specified and no password is provided, the system responds by asking you to enter the password. |
--tls-certfile=TLS_CERTFILE | - | Path to the chain file for mutual authentication, if the Aerospike cluster supports it. |
--tls-cert-blacklist PATH | - | Path to a certificate blacklist file. The file contains one line per blacklisted certificate. Each line starts with the certificate serial number expressed in hexadecimal format. Serial numbers are only required to be unique per issuer. Each entry may optionally specify the issuer name of the certificate. Example: 86EC7A484 /C=US/ST=CA/O=Acme/OU=Eng/CN=TestChainCA |
--tls-crl-check | disabled | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in tls_capath. |
--tls-crl-check-all | disabled | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in tls_capath. |
--tls-log-session-info | disabled | Log TLS connected session info. |
--tls-login-only | disabled | Use TLS for node login only. |
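The certificate blacklist format described above (hex serial number, optional issuer name) can be illustrated with a small parser. This is a hedged sketch of the documented line format, not code from asbench itself:

```python
def parse_blacklist_line(line: str):
    """Parse one certificate-blacklist entry: a hexadecimal serial
    number, optionally followed by the certificate's issuer name."""
    parts = line.strip().split(None, 1)  # split on the first whitespace run
    serial = int(parts[0], 16)           # serial numbers are hexadecimal
    issuer = parts[1] if len(parts) > 1 else None
    return serial, issuer

print(parse_blacklist_line("86EC7A484 /C=US/ST=CA/O=Acme/OU=Eng/CN=TestChainCA"))
```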
The TLS name is only used when connecting with a secure TLS-enabled server.

The following example runs the default benchmark on a cluster of nodes 1.2.3.4 and 5.6.7.8 using:

- The default Aerospike port of 3000.
- TLS configured.
- Namespace TEST.
asbench --hosts 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace TEST --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem
Global options
Option | Default | Description |
---|---|---|
-z or --threads THREAD_COUNT | 16 | Number of threads used to perform synchronous read/write commands. Replace THREAD_COUNT with the number of threads. |
--compress | disabled | Enable binary data compression through the Aerospike client. Internally, this sets the compression policy to true. |
--socket-timeout MS | 30000 | Read/Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
--read-socket-timeout MS | 30000 | Read socket timeout in milliseconds. Replace MS with the number of milliseconds. |
--write-socket-timeout MS | 30000 | Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
-T or --timeout MS | 0 | Read/Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
--read-timeout MS | 0 | Read total timeout in milliseconds. Replace MS with the number of milliseconds. |
--write-timeout MS | 0 | Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
--max-retries RETRIES_COUNT | 1 | Maximum number of retries before aborting the current transaction. Replace RETRIES_COUNT with the number of retries. |
-d or --debug | disabled | Run benchmark in debug mode. |
-S or --shared | disabled | Use shared memory cluster tending. |
-C or --replica REPLICA_TYPE | master | Which replica to use for reads. Replace REPLICA_TYPE with one of the following: - master: Always use the node containing the master partition. - any: Distribute reads across master and proles in round-robin fashion. - sequence: Always try the master first. If the master fails, try proles in sequence. - preferRack: Always try a node on the same rack as the benchmark first. If there are no nodes on the same rack, fall back to sequence. This option requires you to set rack-id. |
--rack-id RACK_ID | - | Rack on which this instance of asbench resides. Required with replica policy preferRack. Replace RACK_ID with the rack's ID. |
-N or --read-mode-ap READ_MODE | one | Read mode for AP (availability) namespaces. Replace READ_MODE with either one or all . |
-B or --read-mode-sc SC_READ_MODE | session | Read mode for SC (strong consistency) namespaces. Replace SC_READ_MODE with one of the following: session , linearize , allowReplica , or allowUnavailable . |
-M or --commit-level LEVEL | all | Write commit guarantee level. Replace LEVEL with either all or master . |
-Y or --conn-pools-per-node POOLS_COUNT | 1 | Number of connection pools per node. Replace POOLS_COUNT with the number of connection pools. |
-D or --durable-delete | disabled | All transactions set the durable-delete flag, which indicates to the server that if a transaction results in a delete, it should generate a tombstone for the deleted record. |
--send-key | disabled | Enables the key policy AS_POLICY_KEY_SEND, which sends the key value in addition to the key digest. |
--sleep-between-retries | 0 | Enables sleep between retries if a transaction fails and the timeout was not exceeded. |
-c or --async-max-commands COMMAND_COUNT | 50 | Maximum number of concurrent asynchronous commands that are active at any time. Replace COMMAND_COUNT with the number of commands. |
-W or --event-loops THREAD_COUNT | 1 | Number of event loops (or selector threads) when running in asynchronous mode. Replace THREAD_COUNT with the number of threads. |
Namespace and record format options
Option | Default | Description |
---|---|---|
-n or --namespace NAMESPACE_NAME | test | Aerospike namespace to perform all operations under. Replace NAMESPACE_NAME with the name of the namespace. |
-s or --set SET_NAME | testset | Aerospike set to perform all operations in. Replace SET_NAME with the name of the set. |
-b or --bin BIN_NAME | testbin | Base name to use for bins. Replace BIN_NAME with the name you want to use. The first bin is named BIN_NAME, the second BIN_NAME_2, the third BIN_NAME_3, and so on. |
-K or --start-key KEY_STARTING_VALUE | 0 | Set the starting value of the working set of keys. Replace KEY_STARTING_VALUE with the starting value. If you are using an insert workload, start-key indicates the first value to write. Otherwise, start-key indicates the smallest value in the working set of keys. |
-k or --keys KEYS_COUNT | 1000000 | Set the number of keys the client is dealing with. Replace KEYS_COUNT with the number of keys. If you are using an insert workload, the client writes this number of keys, starting from value = start-key . Otherwise, the client reads and updates randomly across the values between start-key and start-key + num_keys . |
-o or --object-spec OBJECT_TYPE | I4 | Set the bin specifications. Replace OBJECT_TYPE with a comma-separated list of bin specifications. See object spec for more details. |
--compression-ratio RATIO | 1 | Sets the compression ratio for binary data. Replace RATIO with the desired ratio. This option causes the benchmark tool to generate binary data which is roughly compressed by this proportion. Note: this is only applied to B<n> binary data, not to any of the other types of record data. |
-e or --expiration-time | 0 | Set the TTL of all records written in write transactions. Available options are: - -1 : No TTL, never expire.- -2 : Do not modify the record TTL with this write transaction.- 0 : Adopt the default TTL value from the namespace.- >0 : TTL of the record in seconds. |
Object spec
The object spec is a flexible way to describe how to structure records being written to the database. The object spec is a comma-separated list of bin specs. Each bin spec is one of the following:
Variable Scalars:
Type | Format | Description |
---|---|---|
Boolean | b | A random boolean bin/value. |
Integer | I<n> | A random integer with the lower n bytes randomized (and the rest remaining 0). n can range from 1 to 8. Note: the nth byte is guaranteed to be nonzero, except in the case n=1. |
Double | D | A random double bin/value (8 bytes). |
String | S<n> | A random string of length n of either lowercase letters a-z or numbers 0-9 . |
Binary Data | B<n> | Random binary data of length n bytes.Note: if --compression-ratio is set, only the first ratio * n bytes are random. The rest are 0. |
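The semantics of the variable-scalar formats in the table above can be sketched as follows. This is a hypothetical illustration of the documented behavior, not asbench's actual generator:

```python
import random
import string

def gen_value(spec: str):
    """Generate one value per the documented variable-scalar formats:
    b (boolean), D (double), I<n> (integer), S<n> (string), B<n> (bytes)."""
    if spec == "b":
        return random.choice([True, False])
    if spec == "D":
        return random.random() * 1e6
    kind, n = spec[0], int(spec[1:])
    if kind == "I":
        # lower n bytes randomized; the nth byte is nonzero unless n == 1
        lo = 0 if n == 1 else 1 << (8 * (n - 1))
        return random.randrange(lo, 1 << (8 * n))
    if kind == "S":
        return "".join(random.choices(string.ascii_lowercase + string.digits, k=n))
    if kind == "B":
        return random.randbytes(n)
    raise ValueError(f"unknown bin spec: {spec}")

print(gen_value("S10"), gen_value("I4"))
```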
Constant Scalars:
Type | Format | Example | Notes |
---|---|---|---|
Const Boolean | true /T or false /F | true | The full-word forms are case-insensitive, but T and F must be capitalized. |
Const Integer | A decimal, hex (0x...), or octal (0...) number | 123 | |
Const Double | A decimal number with a . | 123.456 | Const doubles may optionally be followed by "f" or "F", but must always contain a decimal. |
Const String | A backslash-escaped string enclosed in double quotes | "this -> \" is a double quote\n" | Strings are backslash-escaped, but so are most terminals, so make sure to escape your backslashes with backslashes when writing object specs in a command line argument. Additionally, double quotes are often special characters, so escape those too. |
Collection Bins:
Type | Format | Notes |
---|---|---|
List | [BIN_SPEC,...] | A list of one or more bin specs separated by commas. |
Map | {SCALAR_BIN_SPEC:BIN_SPEC,...} | A list of one or more mappings from a scalar bin spec (anything but a List or Map) to a bin spec. These describe the key-value pairs that the Map contains. |
Multipliers

Multipliers are positive integer constants, followed by an asterisk (*), preceding a bin spec.
In the root-level object spec, multipliers indicate how many times to repeat a bin spec across separate bins. For example, the following object specs are equivalent:
I, I, I, I, S10, S10, S10 = 4*I, 3*S10
123, 123, 123, "string", "string" = 3*123, 2*"string"
In a list, multipliers indicate how many times to repeat a bin spec in the list. The following are equivalent:
[I, I, I, I, S10, S10, S10] = [4*I, 3*S10]
In a map, multipliers must precede variable scalar keys. Multipliers indicate how many unique key-value pairs of the given format to insert into the map. Multipliers may not precede const key bin specs or value bin specs in a key-value mapping. The following are equivalent:
{I:B10, I:B10, I:B10} = {3*I:B10}
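The root-level and list equivalences above can be checked with a small expander. This is a naive sketch of the multiplier shorthand, assuming a flat spec (it does not handle commas inside nested lists or maps):

```python
def expand_multipliers(spec: str) -> list[str]:
    """Expand COUNT*BIN_SPEC shorthand in a flat, root-level object spec."""
    out = []
    for part in spec.split(","):  # nested lists/maps would break this split
        part = part.strip()
        if "*" in part:
            count, bin_spec = part.split("*", 1)
            out.extend([bin_spec] * int(count))
        else:
            out.append(part)
    return out

print(expand_multipliers("4*I, 3*S10"))
# ['I', 'I', 'I', 'I', 'S10', 'S10', 'S10']
```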
Workloads

The benchmark tool uses workloads to interact with the Aerospike database. The workload types are:

- I: Linear Insert. Runs over the range of keys specified and inserts a record with each key.
- RU,READ_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update. Randomly picks keys, and either writes a record with that key or reads a record from the database with that key, with probability according to the given read percentage.
  - Replace READ_PCT with a number between 0 and 100. 0 means only do writes; 100 means only do reads.
  - Starting with asbench 1.5 (Tools 7.1), you may optionally provide READ_PCT_ALL_BINS and WRITE_PCT_ALL_BINS to indicate the percentage of reads and writes that operate on the entire record. Otherwise only the first bin is read. Default is 100.
- RR,READ_PCT[,READ_PCT_ALL_BINS[,REPLACE_PCT_ALL_BINS]]: Random Read/Replace. Same as RU, except it replaces records instead of updating them.
- RUF,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Function. Same as RU, except it may also perform an apply command on a random key with a given UDF function.
  - The percentage of operations that are function calls (UDFs) is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked at initialization.
- RUD,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Delete. Same as RU, except it may also perform deletes on a random key.
  - The percentage of operations that are deletes is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked at initialization.
- DB: Delete bins. Same as I, but deletes the record with the given key from the database.
  - In order for DB to delete entire records, it must delete every bin that the record contains. Since bin names are based on their position in the object spec, when running this workload, verify you are using the same object spec you used to generate the records being deleted.
  - If you only want to delete a subset of bins, use write-bins to select which bins to delete. DB deletes all bins by default.
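The percentage arithmetic above can be illustrated with a hypothetical operation picker. This is a sketch of the selection logic implied by the documented percentages, not asbench's implementation:

```python
import random

def choose_op(read_pct, write_pct=None):
    """Pick one operation per the workload percentages.
    RU: read with probability read_pct%, otherwise update.
    RUF: the remaining 100 - read_pct - write_pct percent are UDF calls."""
    r = random.uniform(0, 100)
    if write_pct is None:                 # RU workload
        return "read" if r < read_pct else "update"
    if r < read_pct:                      # RUF workload
        return "read"
    if r < read_pct + write_pct:
        return "update"
    return "udf"

# RUF,80,15 leaves 100 - 80 - 15 = 5% of operations as UDF calls
random.seed(1)
ops = [choose_op(80, 15) for _ in range(100_000)]
print(ops.count("udf") / len(ops))
```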
Workload options
Option | Default | Description |
---|---|---|
--read-bins | all bins | Specifies which bins from the object-spec to load from the database on read transactions. Must be given as a comma-separated list of increasing bin numbers, starting from 1. |
--write-bins | all bins | Specifies which bins from the object-spec to generate and store in the database on write transactions. Must be given as a comma-separated list of bin numbers, starting from 1. |
-R or --random | disabled | Use dynamically-generated random bin values for each write transaction instead of fixed values (one per thread) created at the beginning of the workload. |
-t or --duration SECONDS | 10 for RU,RUF workloads, 0 for I,DB workloads | Specifies the minimum amount of time the benchmark runs for. Replace SECONDS with the number of seconds. For random workloads with no finite amount of work needing to be done, this value must be above 0 for anything to happen. For workloads with a finite amount of work, like linear insertion/deletion, set this value to 0. |
-w or --workload WORKLOAD_TYPE | RU,50 | Desired workload. Replace WORKLOAD_TYPE with the workload type. |
--workload-stages PATH/TO/WORKLOAD_STAGES.yaml | disabled | Accepts a path to a workload stages YAML file, which should contain a list of workload stages to run through. See workload stages. |
-g or --throughput TPS | 0 | Throttle transactions per second to a maximum value. Replace TPS with transactions per second. If transactions per second is zero, throughput is not throttled. |
--batch-size SIZE | 1 | Enable all batch modes with a number of records to process in each batch call. Replace SIZE with the number of records. Batch mode is applied to the read, write, and delete transactions in I, RU, RR, RUF, and RUD workloads. If batch size is 1, batch mode is disabled. |
--batch-read-size SIZE | 1 | Enable batch read mode with a number of records to process in each batch read call. Replace SIZE with the number of records. Batch read mode is applied to the read transactions in RU, RR, RUF, and RUD workloads. If batch read size is 1, batch read mode is disabled. Batch read size takes precedence over batch size. |
--batch-write-size SIZE | 1 | Enable batch write mode with a number of records to process in each batch write call. Replace SIZE with the number of records. Batch write mode is applied to the write transactions in I, RU, RUF, and RUD workloads. If batch write size is 1, batch write mode is disabled. Batch write size takes precedence over batch size. |
--batch-delete-size SIZE | 1 | Enable batch delete mode with a number of records to process in each batch delete call. Replace SIZE with the number of records. Batch delete mode is applied to the delete transactions in RUD and DB workloads. If batch delete size is 1, batch delete mode is disabled. Batch delete size takes precedence over batch size. |
-a or --async | disabled | Enable asynchronous mode, which uses the asynchronous variant of every Aerospike C Client method for transactions. |
Workload stages
You can run multiple different workloads in sequence using the --workload-stages
option with a workload stage configuration file, which is in YAML format. The configuration file should only be a list of workload stages in the following format:
- stage: 1
  # required arguments
  workload: <workload type>
  # optional arguments
  duration: <seconds>
  tps: specified transactions per second, or 0 (default) for the maximum possible
  object-spec: Object spec for the stage. Otherwise, inherits from the previous
      stage, with the first stage inheriting the global object spec.
  key-start: Key start, otherwise inheriting from the global context
  key-end: Key end, otherwise inheriting from the global context
  read-bins: Which bins to read if the workload includes reads
  write-bins: Which bins to write to if the workload includes writes
  pause: max number of seconds to pause before the stage starts. Waits a random
      number of seconds between 1 and the pause.
  async: when true/yes, uses asynchronous commands for this stage. Default is false
  random: when true/yes, randomly generates new objects for each write. Default is false
  batch-size: specifies the batch size for all batch transactions for this stage. Default is 1
  batch-read-size: specifies the batch size of reads for this stage. Takes precedence over batch-size. Default is 1
  batch-write-size: specifies the batch size of writes for this stage. Takes precedence over batch-size. Default is 1
  batch-delete-size: specifies the batch size of deletes for this stage. Takes precedence over batch-size. Default is 1
- stage: 2
  ...
Each stage must begin with stage: STAGE_NUMBER, where STAGE_NUMBER is the position of the stage in the list. The stages must appear in order.
When arguments say they inherit from the global context, the value they inherit either comes from a command line argument, or is the default value if no command line argument for that value was given.
Below is an example workload stages file; call it workload.yml.
- stage: 1
  duration: 60
  workload: RU,80
  object-spec: I2,S12,[3*I1]
- stage: 2
  duration: 30
  workload: I
  object-spec: {5*S1:I1}
To use workload.yml with asbench, run the following:
asbench --workload-stages=workload.yml
Latency histograms
There are multiple ways to record latencies measured throughout a benchmark run. All latencies are recorded in microseconds.
Option | Default | Description |
---|---|---|
--output-file | stdout | Specifies an output file to write periodic latency data, which enables tracking of transaction latencies in microseconds in a histogram. Currently uses a default layout. The file is opened in append mode. |
-L or --latency | disabled | Enables the periodic HDR histogram summary of latency data. |
--percentiles P_1[,P_2[,P_3...]] | "50,90,99,99.9,99.99" | Specifies the latency percentiles to display in the periodic latency histogram. |
--output-period SECONDS | 1 | Specifies the period between successive snapshots of the periodic latency histogram. Replace SECONDS with the period in seconds. |
--hdr-hist PATH_TO_OUTPUT | disabled | Enables the cumulative HDR histogram and specifies the directory to dump the cumulative HDR histogram summary. |
Periodic latency histogram

Periodic latency data is stored in the file specified by --output-file and recorded in histograms with three ranges of fixed bucket sizes. The three ranges are not configurable. There is one histogram for reads, one for writes, and one for UDF calls.
The three ranges are:
- 100us to 4000us, bucket width 100us
- 4000us to 64000us, bucket width 1000us
- 64000us to 128000us, bucket width 4000us
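The three fixed ranges above map a latency to its bucket's lower bound. The following is a sketch of that mapping as documented, not asbench's own bucketing code:

```python
def bucket_lower_bound(latency_us: int) -> int:
    """Return the lower bound (in microseconds) of the fixed-width
    bucket that a latency falls into, or -1 if it is out of range.
    Ranges: 100-4000us (100us wide), 4000-64000us (1000us wide),
    64000-128000us (4000us wide)."""
    if latency_us < 100 or latency_us >= 128000:
        return -1  # outside the histogram's tracked range
    if latency_us < 4000:
        return (latency_us // 100) * 100
    if latency_us < 64000:
        return (latency_us // 1000) * 1000
    return (latency_us // 4000) * 4000

print(bucket_lower_bound(250), bucket_lower_bound(4500), bucket_lower_bound(70000))
```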
Format of the histogram output file:
HIST_NAME UTC_TIME, PERIOD_TIME, TOTAL_LATENCIES, BUCKET_1_LOWER_BOUND:BUCKET_1_LATENCIES, ...

- HIST_NAME: Name of the histogram, either read_hist, write_hist, or udf_hist.
- UTC_TIME: UTC time of the end of the interval.
- PERIOD_TIME: Length of the interval in seconds.
- TOTAL_LATENCIES: Total number of transaction latencies recorded in the interval.
- BUCKET_1_LOWER_BOUND: Lower bound of the first bucket. Each bucket with at least one recorded latency is displayed, in ascending order of lower bound.
- BUCKET_1_LATENCIES: Number of transaction latencies falling within the bucket's range.
HDR histogram
Transaction latencies can also be recorded in an HDR histogram. There is one HDR histogram for reads, one for writes, and one for UDF calls.
Use one of the following to enable HDR histograms:
- --latency: Displays select percentiles from the HDR histograms every output-period seconds.
  - The percentiles printed when --latency is enabled can be configured with --percentiles followed by a comma-separated list of percentiles. The list must be in ascending order, and no percentile can be less than 0 or greater than or equal to 100.
- --hdr-hist: Writes the full HDR histograms to the given directory in both a human-readable text format (.txt) and a binary encoding of the HDR histogram (.hdrhist).
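The validity rules for a --percentiles argument can be expressed as a small checker. This is a sketch of the documented constraints, not asbench's argument parser:

```python
def parse_percentiles(arg: str) -> list[float]:
    """Validate a comma-separated percentile list: ascending order,
    each value >= 0 and < 100."""
    ps = [float(p) for p in arg.split(",")]
    if ps != sorted(ps):
        raise ValueError("percentiles must be in ascending order")
    if any(p < 0 or p >= 100 for p in ps):
        raise ValueError("each percentile must satisfy 0 <= p < 100")
    return ps

print(parse_percentiles("50,90,99,99.9,99.99"))
# [50.0, 90.0, 99.0, 99.9, 99.99]
```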
UDFs

UDF calls are made in RUF (read/update/function) workloads, being the "function" part of that workload. A key is chosen at random from the range of keys given, and an Aerospike apply call is made on that key with the given UDF function (--udf-function-name) from the given UDF package (--udf-package-name). Optionally, --udf-function-values may be supplied, which takes an object spec used to generate the arguments for each call.

The UDF function arguments follow the same rules as the object spec used on records. They are randomly regenerated for every call only if --random is supplied as an argument.
Option | Default | Description |
---|---|---|
-upn or --udf-package-name PACKAGE_NAME | - | The package name for the UDF to be called. Replace PACKAGE_NAME with the package name. |
-ufn or --udf-function-name FUNCTION_NAME | - | Name of the UDF function in the package to be called. Replace FUNCTION_NAME with the function name. |
-ufv or --udf-function-values FUNCTION_VALUES | none | Arguments to be passed to the UDF when called, which are given as an object spec (see object spec). Replace FUNCTION_VALUES with the arguments. |