Aerospike Database 5.6: Developer Cornucopia

Paul Jensen - Vice President, Engineering Operations

Aerospike is pleased to announce Aerospike 5.6, which is now available to customers.  This is a developer-focused release offering major new functionality, including several features requested by the Aerospike user community. 

As part of this release we are also announcing that Aerospike Connect for Spark now supports the Data Source API V2, and is a direct beneficiary of the innovation in Aerospike 5.6.  See this blog for additional information.

Set Indexes

Aerospike 5.6 improves the performance of scanning sets that are small relative to the number of records in their namespace.  This is controlled by the new enable-index configuration parameter, which defaults to false.  It is a dynamic parameter that can be enabled or disabled on a per-set basis, either with the asinfo utility or through the set sub-context of the server configuration file:

namespace test {
    ...
    set <set-name> {
        enable-index true
    }
}
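For the dynamic route through asinfo, the sketch below builds the info command string for one set. It assumes the usual set-config command form with a set= argument; the helper name and the namespace and set names are hypothetical, so check the configuration reference before relying on the exact syntax.

```python
def set_index_command(namespace: str, set_name: str, enabled: bool) -> str:
    """Build the info command that toggles enable-index for one set.

    Assumption: the server accepts the standard set-config form with a
    per-set argument, as it does for other set sub-context parameters.
    """
    value = "true" if enabled else "false"
    return (
        f"set-config:context=namespace;id={namespace};"
        f"set={set_name};enable-index={value}"
    )


# Hypothetical namespace and set names, for illustration only.
command = set_index_command("test", "small_set", True)
print(f'asinfo -v "{command}"')
```

The printed line is what would be run against a node. Whether the setting needs to be applied to each node separately is not covered here.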

When enabled, a separate index is created and maintained for the records in the specified set(s), allowing quick access when a scan on that set is initiated.  Developers previously worked around this by creating an extra bin with a secondary index.  Such workarounds can now be removed in most cases.

Along with this change, two existing set subcontext configuration parameters have been renamed:

Old Name                 New Name
set-disable-eviction     disable-eviction
set-stop-writes-count    stop-writes-count

Deciding whether to enable set indexes will require some experimentation from developers.  Preliminary guidelines, based on tests run against a namespace of 2 billion records, show an overwhelming advantage for sets containing up to a million records.  As the set size approaches that of the containing namespace, the advantage of a separate index diminishes and eventually disappears.  Other factors influencing set index performance include whether the data is in memory or on SSD, and the number of threads performing the scans.
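To see why the ratio of set size to namespace size matters, here is a toy model in Python. It is not how the server stores its indexes; it only counts how many records a scan has to visit with and without a per-set index. All names and sizes are made up for illustration.

```python
from collections import defaultdict


def scan_without_set_index(records, target_set):
    """Walk every record in the namespace, keeping those in target_set."""
    visited = 0
    matches = []
    for key, set_name in records:
        visited += 1
        if set_name == target_set:
            matches.append(key)
    return matches, visited


def build_set_index(records):
    """Toy per-set index: set name -> list of keys in that set."""
    index = defaultdict(list)
    for key, set_name in records:
        index[set_name].append(key)
    return index


def scan_with_set_index(set_index, target_set):
    """Visit only the records that belong to target_set."""
    matches = list(set_index.get(target_set, []))
    return matches, len(matches)


# Hypothetical namespace: 100,000 records, 100 of them in the set "small".
records = [(i, "small" if i % 1000 == 0 else "big") for i in range(100_000)]
set_index = build_set_index(records)

_, full_visited = scan_without_set_index(records, "small")
_, indexed_visited = scan_with_set_index(set_index, "small")
print(full_visited, indexed_visited)  # 100000 100

_, big_visited = scan_with_set_index(set_index, "big")
print(big_visited)  # 99900, nearly the whole namespace
```

For the small set the indexed scan touches a thousandth of the records. For the large set the two approaches visit almost the same number, which matches the observation that the advantage fades as a set approaches the size of its namespace.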

Aerospike Expression Enhancements

Aerospike Expressions are a domain-specific language based on Polish notation that can reference and manipulate record bin data and metadata.   They were introduced in 5.2 and initially supported filtering on single record operations.  In 5.3 this was extended to allow Expressions to filter records shipped to XDR destinations.  

Aerospike 5.6 introduces Operation Expressions.  These are used in bin operations to compute a value from information within the record. The value can either be returned to the client with read expressions, or written to a specified bin with write expressions.  This enables atomic, cross-bin operations which were previously only available through UDFs, avoiding the overhead of Lua.  Operation Expressions are used the same way as bin-operations in write, read, background scan, and background query operations.

Aerospike 5.6 further extends the capabilities of Expressions through the addition of the following language elements:

  • Arithmetic operators: as_exp_add(), as_exp_sub(), as_exp_mul(), and as_exp_div().
  • Variable assignment: as_exp_let() defines a scope within which variables can be defined with as_exp_def() and referenced with as_exp_var().
  • Conditional evaluation: as_exp_cond() allows multiple condition-action tuples to be defined, followed by a default action.  Each tuple is evaluated in order: if the condition is true, the action is taken and subsequent tuples are not evaluated.  If none of the tuple conditions evaluate to true, an optional default action is taken.
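The semantics of these elements can be illustrated with a small prefix-notation evaluator. This is a toy model written for this post, not the Aerospike client API: the tuple encoding, the operator names ("add", "let", "cond" and so on) and the record layout are all assumptions, the default action in "cond" is required here, and "div" uses Python floor division.

```python
import operator

ARITHMETIC = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.floordiv,  # toy choice: integer floor division
}
COMPARE = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def evaluate(expr, record, scope=None):
    """Evaluate a prefix-notation expression tuple against a record dict."""
    scope = scope or {}
    if not isinstance(expr, tuple):
        return expr  # literal value
    op, *args = expr
    if op == "bin":
        return record[args[0]]
    if op == "var":
        return scope[args[0]]
    if op in ARITHMETIC or op in COMPARE:
        left, right = (evaluate(arg, record, scope) for arg in args)
        return (ARITHMETIC.get(op) or COMPARE[op])(left, right)
    if op == "let":
        # ("let", ("def", name, expr), ..., body): variables are visible
        # to later definitions and to the body, but not outside the let.
        *definitions, body = args
        inner = dict(scope)
        for _, name, value in definitions:
            inner[name] = evaluate(value, record, inner)
        return evaluate(body, record, inner)
    if op == "cond":
        # ("cond", c1, a1, c2, a2, ..., default): first true condition wins.
        *pairs, default = args
        for condition, action in zip(pairs[0::2], pairs[1::2]):
            if evaluate(condition, record, scope):
                return evaluate(action, record, scope)
        return evaluate(default, record, scope)
    raise ValueError(f"unknown operator: {op}")


# Compute price * qty once, then apply the first discount tier that matches.
discounted_total = (
    "let",
    ("def", "total", ("mul", ("bin", "price"), ("bin", "qty"))),
    ("cond",
     ("gt", ("var", "total"), 500), ("sub", ("var", "total"), 50),
     ("gt", ("var", "total"), 200), ("sub", ("var", "total"), 10),
     ("var", "total")),
)

print(evaluate(discounted_total, {"price": 120, "qty": 3}))  # 350
```

With a total of 360 the first condition is false and the second is true, so the second action runs and the default is never evaluated.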

Quotas

Aerospike 5.6 Enterprise Edition allows per-user quotas to be independently specified for read and write operations.  This is new functionality to support policies that manage the allocation of server resources in scenarios where more than one application is running.  The key improvement afforded by quotas is the ability to prevent applications from “hogging” resources and thereby adversely impacting other applications.

Quotas are enabled through the static enable-quotas configuration parameter within the security configuration context.  This parameter by itself does not apply quotas: it merely allows them to be set.  Note that although enablement is static, the actual per-user quotas are dynamic.

security {
    enable-quotas true
    tps-weight <2 ≤ w ≤ 20>
}

Read and write quotas are specified in terms of records per second (RPS). This is done via extensions to the existing Aerospike user permission scheme: quotas are assigned to roles that are assigned to users.  Version 2.2 of the asadm utility will support setting quotas.   All record accesses are counted towards the applicable quotas: updates, replaces, UDFs, background UDFs, reads, batch reads and scans.  Operate commands count toward the appropriate quota, depending on whether the transaction goes down the read or write path.  

Quotas can be made more or less sensitive to transient variations in RPS by setting the tps-weight configuration parameter.  The allowable range is from 2 (highest sensitivity: the RPS over the most recent second is given the same weight as the previous RPS value) to 20 (the RPS over the most recent second gets 1/19th the weight of the previous RPS value).  The default for tps-weight is 2.
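One way to read the weights described above is as a weighted moving average in which the previous value counts (tps-weight - 1) times and the latest second counts once. The formula below is an assumption that reproduces the stated 1:1 and 19:1 ratios; it is not taken from the server source, and the function name is hypothetical.

```python
def smoothed_rps(previous: float, latest: float, tps_weight: int) -> float:
    """Blend the latest one-second RPS into the running value.

    Assumed formula: the previous value gets weight (tps_weight - 1)
    and the latest sample gets weight 1.
    """
    if not 2 <= tps_weight <= 20:
        raise ValueError("tps-weight must be between 2 and 20")
    return (previous * (tps_weight - 1) + latest) / tps_weight


# A steady 1000 RPS followed by a one-second burst of 3000 RPS.
for weight in (2, 20):
    print(weight, smoothed_rps(1000.0, 3000.0, weight))
# 2 2000.0
# 20 1100.0
```

Under this reading, a user with a quota of 1500 RPS would be pushed over by the burst at weight 2 but not at weight 20.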

If a quota is exceeded on a single-record operation, the operation fails.  Scans and background UDF/Operate transactions are handled slightly differently: for those, the quota is specified through a client policy argument and checked up front.  If the operation would put the user over quota, it is not attempted and an error is returned.  During batch operations, sub-transactions are treated just like individual reads.  Applications must check the return codes and implement their own rate-limiting policy, as there are no server mechanisms for throttling or retries.
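Since throttling is left to the application, one common client-side policy is to retry with exponential backoff when the quota error comes back. The sketch below shows the idea without depending on any client library: QuotaExceeded is a hypothetical stand-in for whatever error your client raises, and the attempt count and delays are arbitrary.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical stand-in for the client's quota-exceeded error."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      sleep=time.sleep):
    """Run operation, doubling the wait after each quota failure.

    Re-raises QuotaExceeded once max_attempts have been used up.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except QuotaExceeded:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Demo: a fake read that is rejected twice before succeeding.
calls = []


def flaky_read():
    calls.append(1)
    if len(calls) < 3:
        raise QuotaExceeded()
    return {"value": 42}


delays = []  # record the requested waits instead of actually sleeping
result = call_with_backoff(flaky_read, sleep=delays.append)
print(result, delays)
```

Passing the sleep function in makes the policy easy to test. A production version would probably add jitter and an upper bound on the delay.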

Minor Features

Aerospike 5.6 introduces several minor features, the most notable of which are described below.   As always, refer to the 5.6 release notes for complete details and restrictions.

Boolean Bin Type

Aerospike 5.6 supports a new Boolean type for record bins.  Previously Booleans were stored as integers constrained to the values {0, 1}.  For computing storage requirements, a Boolean is represented internally as an 8-bit quantity.

New Statistics

New statistics have been added to display the actual storage used for indexes in Aerospike Database Enterprise Edition all-flash configurations:

  • index_flash_alloc_bytes: the number of bytes in the set of 4K blocks containing any part of an index (i.e. always a multiple of 4K)
  • index_flash_alloc_pct: the percentage of 4K blocks used for flash indexes out of the total blocks (in use or free) existing across the mount point(s) identified as containing indexes

These are similar to the existing statistics index_flash_used_bytes and index_flash_used_pct.  The difference is that the new stats count entire 4K blocks that contain at least one index element.
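The gap between "used" and "alloc" is easiest to see with a few numbers. The sketch below takes a hypothetical list of used bytes per 4K block (the server does not report this breakdown; it is an input invented for the example) and derives both figures.

```python
BLOCK_SIZE = 4096  # 4K blocks, as described above


def flash_index_usage(used_bytes_per_block):
    """Return (used_bytes, alloc_bytes) for per-block used byte counts.

    A block counts in full toward alloc_bytes as soon as any part of it
    holds index data.
    """
    used = sum(used_bytes_per_block)
    touched_blocks = sum(1 for used_in_block in used_bytes_per_block
                         if used_in_block > 0)
    return used, touched_blocks * BLOCK_SIZE


# Five blocks: two empty, one full, two sparsely populated.
blocks = [64, 0, 4096, 128, 0]
used_bytes, alloc_bytes = flash_index_usage(blocks)
print(used_bytes, alloc_bytes)  # 4288 12288
```

Sparsely populated blocks are what make the alloc figures run well ahead of the used figures.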

The new stats will also appear in the index-flash-usage log ticker line as shown in this example:

    {namespace} index-flash-usage: used-bytes 320000 used-pct 23 \
        alloc-bytes 16384000 alloc-pct 92
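For anyone scraping logs, a minimal Python sketch for pulling these fields out of the ticker line follows. It assumes the fields appear on a single line in the order shown in the example (the backslash above is only a wrap for display) and ignores any timestamp or context prefix; the function name is hypothetical.

```python
import re

TICKER_PATTERN = re.compile(
    r"\{(?P<namespace>[^}]+)\} index-flash-usage: "
    r"used-bytes (?P<used_bytes>\d+) used-pct (?P<used_pct>\d+) "
    r"alloc-bytes (?P<alloc_bytes>\d+) alloc-pct (?P<alloc_pct>\d+)"
)


def parse_index_flash_usage(line):
    """Return the ticker fields as a dict, or None if the line differs."""
    match = TICKER_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    namespace = fields.pop("namespace")
    return {"namespace": namespace,
            **{name: int(value) for name, value in fields.items()}}


line = ("{test} index-flash-usage: used-bytes 320000 used-pct 23 "
        "alloc-bytes 16384000 alloc-pct 92")
stats = parse_index_flash_usage(line)
print(stats)
```

As a sanity check, the alloc-bytes value in the example is 4000 blocks of 4096 bytes, consistent with the statistic always being a multiple of 4K.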

There are also new file descriptor statistics for client, heartbeat, and fabric connections.  Within each class, the numbers of opened and closed connections are reported:

  • client_connections_opened
  • client_connections_closed
  • heartbeat_connections_opened
  • heartbeat_connections_closed
  • fabric_connections_opened
  • fabric_connections_closed
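If these counters are cumulative since the node started (an assumption, as is the idea of subtracting one from the other), the difference between each pair gives the number of connections currently open in that class. The sample values below are invented.

```python
CONNECTION_CLASSES = ("client", "heartbeat", "fabric")


def open_connections(stats):
    """Derive currently open connections per class from the new counters."""
    return {
        kind: stats[f"{kind}_connections_opened"]
              - stats[f"{kind}_connections_closed"]
        for kind in CONNECTION_CLASSES
    }


# Hypothetical statistics snapshot from one node.
sample = {
    "client_connections_opened": 1520,
    "client_connections_closed": 1400,
    "heartbeat_connections_opened": 12,
    "heartbeat_connections_closed": 8,
    "fabric_connections_opened": 96,
    "fabric_connections_closed": 48,
}
print(open_connections(sample))
```

Comparing two snapshots taken some time apart would similarly show connection churn, which can be a useful signal even when the open count looks stable.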

Security Enhancements

Aerospike Database 5.6 has added new configuration parameters to enable the logging of data operations performed by all users having a given role, or by a given user.

security {
    [log | syslog] {
        report-data-op-role <role-name>
        report-data-op-user <user-name>
    }
}

This allows an audit trail to be maintained in the event forensics are needed to trace questionable server activity to specific users or roles.

Another security enhancement is that the show users command in the Aerospike info utility now lists the number of per-user open connections.

About Author

    Paul Jensen is Vice President, Engineering Operations at Aerospike. He is a technology industry veteran with over 25 years of experience. Prior to Aerospike, he held positions at companies including TiVo, MovieLabs, Microsoft, and Liberate Technologies.