Aerospike is pleased to announce Aerospike 5.6, which is now available to customers. This is a developer-focused release offering major new functionality, including several features requested by the Aerospike user community.
As part of this release we are also announcing that Aerospike Connect for Spark now supports the Data Source API V2, and is a direct beneficiary of the innovation in Aerospike 5.6. See this blog for additional information.
Set Indexes
Aerospike 5.6 improves the performance of scanning sets that are small relative to the number of records in their namespace. This is accomplished by using the new enable-index configuration parameter, which defaults to false. This is a dynamic parameter that can be enabled or disabled on a per-set basis, either with the asinfo utility, or through the set sub-context clause of the server configuration file:
namespace test {
    ...
    set <set-name> {
        enable-index true
    }
}
When enabled, a separate index is created and maintained for the records in the specified set(s), allowing quick access when a scan on that set is initiated. A workaround previously used by developers was to create an extra bin with a secondary index. Such workarounds can now be removed in most cases.
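For the dynamic route, the request passed to asinfo is a plain string. The sketch below only assembles that string; it assumes the set-level parameter is changed through the set-config info command in the same way as other set sub-context parameters, and the namespace and set names are placeholders. Check the exact syntax against the configuration reference for your server version before relying on it.

```python
def set_index_info_command(namespace: str, set_name: str, enabled: bool) -> str:
    """Build the info-command string that toggles enable-index for one set.

    Assumption: the server accepts set-level parameters through the
    set-config info command with context=namespace.
    """
    value = "true" if enabled else "false"
    return (
        f"set-config:context=namespace;id={namespace};"
        f"set={set_name};enable-index={value}"
    )


# "test" and "demo" are placeholder namespace and set names.
command = set_index_info_command("test", "demo", True)
print(f'asinfo -v "{command}"')
```

Because the parameter is dynamic, the same helper with enabled=False would build the request that turns the set index off again.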
Along with this change, two existing set subcontext configuration parameters have been renamed:
Old Name | New Name |
|
|
|
|
Answering the question of whether to enable set indexes will require some experimentation from developers. Preliminary guidelines based on tests run against a namespace of 2 billion records show an overwhelming advantage for sets containing up to a million records. As the set size approaches that of the containing namespace the advantage of a separate index diminishes and eventually disappears. Other factors influencing set index performance include whether the data is in memory or on SSD, and the number of threads performing the scans.
Aerospike Expression Enhancements
Aerospike Expressions are a domain-specific language based on Polish notation that can reference and manipulate record bin data and metadata. They were introduced in 5.2 and initially supported filtering on single record operations. In 5.3 this was extended to allow Expressions to filter records shipped to XDR destinations.
Aerospike 5.6 introduces Operation Expressions. These are used in bin operations to compute a value from information within the record. The value can either be returned to the client with read expressions, or written to a specified bin with write expressions. This enables atomic, cross-bin operations which were previously only available through UDFs, avoiding the overhead of Lua. Operation Expressions are used the same way as bin-operations in write, read, background scan, and background query operations.
Aerospike 5.6 further extends the capabilities of Expressions through the addition of the following language elements:
Arithmetic operators: as_exp_add(), as_exp_sub(), as_exp_mul(), and as_exp_div().
Variable assignment: as_exp_let() defines a scope within which variables can be defined with as_exp_def() and referenced with as_exp_var().
Conditional evaluation: as_exp_cond() allows multiple condition-action tuples to be defined, followed by a default action. Each tuple is evaluated in order: if the condition is true, the action is taken and subsequent tuples are not evaluated. If none of the tuple conditions evaluate to true, an optional default action is taken.
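To make the semantics of these elements concrete, here is a toy evaluator for prefix-notation expressions. It is not the Aerospike client API: the tuple encoding, the operator names ("add", "let", "cond", and so on) and the evaluate() function are all invented for illustration, and arithmetic is limited to two operands. It only mirrors the behaviour described above: variables scoped by a let, and condition-action pairs tried in order with an optional default.

```python
import operator

# Hypothetical operator table; names echo the C macros but are not the real API.
BINARY_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,  # true division; the server applies its own typing rules
    "gt": operator.gt,
}


def evaluate(expr, bins, variables=None):
    """Evaluate a prefix-notation expression against a record's bins."""
    variables = variables or {}
    if not isinstance(expr, tuple):
        return expr  # a literal value
    op, *args = expr
    if op == "bin":
        return bins[args[0]]
    if op == "var":
        return variables[args[0]]
    if op in BINARY_OPS:
        left, right = (evaluate(arg, bins, variables) for arg in args)
        return BINARY_OPS[op](left, right)
    if op == "let":
        # ("let", ("def", name, value_expr), ..., body)
        *definitions, body = args
        scope = dict(variables)
        for _, name, value_expr in definitions:
            scope[name] = evaluate(value_expr, bins, scope)
        return evaluate(body, bins, scope)
    if op == "cond":
        # Condition-action pairs, optionally followed by one default action.
        pair_count = len(args) // 2
        for i in range(pair_count):
            condition, action = args[2 * i], args[2 * i + 1]
            if evaluate(condition, bins, variables):
                return evaluate(action, bins, variables)  # later pairs are skipped
        if len(args) % 2 == 1:
            return evaluate(args[-1], bins, variables)
        return None
    raise ValueError(f"unknown operator: {op}")


# Cross-bin computation: total = price * qty, with a tiered discount.
discounted_total = (
    "let",
    ("def", "total", ("mul", ("bin", "price"), ("bin", "qty"))),
    ("cond",
     ("gt", ("var", "total"), 100), ("sub", ("var", "total"), 10),
     ("gt", ("var", "total"), 50), ("sub", ("var", "total"), 5),
     ("var", "total")),
)

print(evaluate(discounted_total, {"price": 40, "qty": 3}))
```

In a real Operation Expression the computed value would be returned to the client or written to a bin; here it is simply returned by the function.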
Quotas
Aerospike 5.6 Enterprise Edition allows per-user quotas to be independently specified for read and write operations. This is new functionality to support policies that manage the allocation of server resources in scenarios when more than one application is running. The key improvement afforded by quotas is the ability to prevent applications from “hogging” resources and thereby adversely impacting other applications.
Quotas are enabled through the static enable-quotas configuration parameter within the security configuration context. This parameter by itself does not apply quotas: it merely allows them to be set. Note that although enablement is static, the actual per-user quotas are dynamic.
security {
    enable-quotas true
    tps-weight <2 ≤ w ≤ 20>
}
Read and write quotas are specified in terms of records per second (RPS). This is done via extensions to the existing Aerospike user permission scheme: quotas are assigned to roles that are assigned to users. Version 2.2 of the asadm utility will support setting quotas. All record accesses are counted towards the applicable quotas: updates, replaces, UDFs, background UDFs, reads, batch reads and scans. Operate commands count toward the appropriate quota, depending on whether the transaction goes down the read or write path.
Quotas can be made more or less sensitive to transient variations in RPS by setting the tps-weight configuration parameter. The allowable range is from 2 (highest sensitivity: RPS over the most recent second is given equal weight to the previous RPS value) to 20 (RPS over the most recent second gets 1/19th the weight of the previous RPS value). The default for tps-weight is 2.
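One way to read the weighting described above is as a running average in which the previous value carries w - 1 parts and the most recent second carries one part. The function below is a sketch of that reading only; the exact formula used by the server is an assumption inferred from the two endpoints given in the text, and smoothed_rps() is a hypothetical name.

```python
def smoothed_rps(previous: float, latest: float, tps_weight: int) -> float:
    """Blend the latest one-second RPS sample into the running value.

    Assumed model: previous value weighted (w - 1), latest sample weighted 1.
    w = 2 gives equal weights; w = 20 gives the latest sample 1/19th the weight.
    """
    if not 2 <= tps_weight <= 20:
        raise ValueError("tps-weight must be between 2 and 20")
    return (previous * (tps_weight - 1) + latest) / tps_weight


# A one-second burst from 100 RPS to 1000 RPS under both extremes.
print(smoothed_rps(100, 1000, 2))   # 550.0
print(smoothed_rps(100, 1000, 20))  # 145.0
```

Under this model a user with a quota of 500 RPS would trip it on the burst with tps-weight 2 but not with tps-weight 20, which is the sensitivity trade-off the parameter controls.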
If a quota is exceeded on a single-record operation, the operation fails. Scans and background UDF/Operate transactions are handled slightly differently. For those, the quota is specified through a client policy argument and checked up front. If the operation would put the user over quota, it is not attempted and an error is returned. During batch operations, sub-transactions are treated just like individual reads. Applications must check the return codes and implement their own rate-limiting policy, as there are no server mechanisms for throttling or retries.
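Since throttling is left to the application, one common client-side approach is a token bucket sized to the user's quota. The sketch below is generic Python and makes no Aerospike client calls; the TokenBucket class and its parameters are illustrative, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time


class TokenBucket:
    """Client-side limiter: allow at most rate_per_second records on average."""

    def __init__(self, rate_per_second, capacity=None, clock=time.monotonic):
        self.rate = float(rate_per_second)
        # By default, allow bursts of up to one second's worth of records.
        self.capacity = float(rate_per_second if capacity is None else capacity)
        self.tokens = self.capacity
        self.clock = clock
        self.last_refill = clock()

    def try_acquire(self, records=1):
        """Return True if `records` may be sent now, consuming tokens if so."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= records:
            self.tokens -= records
            return True
        return False


# Demonstration with a fake clock: 5 records/second, then one second passes.
fake_now = [0.0]
demo_bucket = TokenBucket(5, clock=lambda: fake_now[0])
first_second = [demo_bucket.try_acquire() for _ in range(6)]
fake_now[0] = 1.0
after_wait = demo_bucket.try_acquire()
print(first_second, after_wait)
```

An application would call try_acquire() before each read or write and back off when it returns False, while still checking return codes, because the server-side quota remains the authority.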
Minor Features
Aerospike 5.6 introduces several minor features, the most notable of which are described below. As always, refer to the 5.6 release notes for complete details and restrictions.
Boolean Bin Type
Aerospike 5.6 supports a new Boolean type for record bins. Previously Booleans were stored as integers constrained to the values {0, 1}. For computing storage requirements, a Boolean is represented internally as an 8-bit quantity.
New Statistics
New statistics have been added to display the actual storage used for indexes in Aerospike Database Enterprise Edition all-flash configurations:
index_flash_alloc_bytes: the number of bytes in the set of 4K blocks containing any part of an index (i.e. always a multiple of 4K)
index_flash_alloc_pct: the percentage of 4K blocks used for flash indexes out of the total blocks (in use or free) existing across the mount point(s) identified as containing indexes
These are similar to the existing statistics index_flash_used_bytes and index_flash_used_pct, the difference being that the new stats count entire 4K blocks that contain at least one element.
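The difference between the used and alloc figures can be illustrated with a small calculation. The layout below is made up: elements are given as (offset, size) byte ranges, and the 64-byte element size in the example is only an illustrative choice. The function counts the bytes occupied by elements and, separately, the bytes in every 4K block that holds at least part of one element.

```python
BLOCK_SIZE = 4096  # "4K" block, in bytes


def flash_index_usage(elements):
    """Return (used_bytes, alloc_bytes) for an iterable of (offset, size) ranges."""
    used_bytes = 0
    touched_blocks = set()
    for offset, size in elements:
        used_bytes += size
        first_block = offset // BLOCK_SIZE
        last_block = (offset + size - 1) // BLOCK_SIZE
        touched_blocks.update(range(first_block, last_block + 1))
    # Every touched block counts in full, so this is always a multiple of 4K.
    return used_bytes, len(touched_blocks) * BLOCK_SIZE


# Hypothetical layout: two elements share block 0, one sits alone in block 2.
layout = [(0, 64), (64, 64), (8192, 64)]
print(flash_index_usage(layout))  # (192, 8192)
```

A sparse index therefore shows a large gap between the two numbers, because a single element is enough to count its whole block as allocated.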
The new stats will also appear in the index-flash-usage log ticker line as shown in this example:
{namespace} index-flash-usage: used-bytes 320000 used-pct 23 \
alloc-bytes 16384000 alloc-pct 92
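For monitoring scripts, a ticker line of this shape can be turned into a dictionary with a few lines of Python. The parser below is a sketch based only on the example above: it assumes the fields are whitespace-separated name/value pairs with integer values, and it joins a backslash line continuation in case the line was wrapped for display. Real log lines carry a timestamp and context prefix, which the search tolerates.

```python
import re

# Matches "{namespace} index-flash-usage: <name value> ..." anywhere in a line.
TICKER_PATTERN = re.compile(
    r"\{(?P<namespace>[^}]+)\}\s+index-flash-usage:\s+(?P<fields>.*)"
)


def parse_index_flash_usage(line):
    """Return the ticker fields as a dict, or None if the line does not match."""
    line = line.replace("\\\n", " ")  # undo display wrapping, if any
    match = TICKER_PATTERN.search(line)
    if match is None:
        return None
    tokens = match.group("fields").split()
    stats = {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}
    stats["namespace"] = match.group("namespace")
    return stats


sample = (
    "{test} index-flash-usage: used-bytes 320000 used-pct 23 "
    "alloc-bytes 16384000 alloc-pct 92"
)
parsed = parse_index_flash_usage(sample)
print(parsed["alloc-bytes"] // 4096, "blocks allocated")
```

With the figures from the example, the alloc-bytes value corresponds to 4000 whole 4K blocks, far more than the used-bytes value alone would suggest.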
There are also new file descriptor statistics for client, heartbeat, and fabric connections. Within each class, the numbers of opened and closed connections are reported:
client_connections_opened
client_connections_closed
heartbeat_connections_opened
heartbeat_connections_closed
fabric_connections_opened
fabric_connections_closed
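Because these are paired counters, comparing two samples shows how many connections of each class were opened and closed in between. The helper below is a sketch with made-up sample values; it assumes the counters only increase while the node is running, and that the difference between opened and closed approximates the net change in open connections.

```python
CONNECTION_CLASSES = ("client", "heartbeat", "fabric")


def connection_churn(before, after):
    """Summarise opened/closed/net connections per class between two samples."""
    report = {}
    for kind in CONNECTION_CLASSES:
        opened = after[f"{kind}_connections_opened"] - before[f"{kind}_connections_opened"]
        closed = after[f"{kind}_connections_closed"] - before[f"{kind}_connections_closed"]
        report[kind] = {"opened": opened, "closed": closed, "net": opened - closed}
    return report


# Made-up statistics samples taken some time apart.
sample_before = {
    "client_connections_opened": 1200, "client_connections_closed": 1100,
    "heartbeat_connections_opened": 12, "heartbeat_connections_closed": 8,
    "fabric_connections_opened": 96, "fabric_connections_closed": 48,
}
sample_after = {
    "client_connections_opened": 1500, "client_connections_closed": 1450,
    "heartbeat_connections_opened": 12, "heartbeat_connections_closed": 8,
    "fabric_connections_opened": 100, "fabric_connections_closed": 52,
}
churn = connection_churn(sample_before, sample_after)
print(churn["client"])
```

A high opened and closed count with a small net change points to clients repeatedly reconnecting rather than to a growing connection pool.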
Security Enhancements
Aerospike Database 5.6 has added new configuration parameters to enable the logging of data operations performed by all users having a given role, or by a given user.
security {
    [log | syslog] {
        report-data-op-role <role-name>
        report-data-op-user <user-name>
    }
}
This allows an audit trail to be maintained in the event forensics are needed to trace questionable server activity to specific users or roles.
Another security enhancement is that the show users command in the Aerospike info utility now lists the number of per-user open connections.