This notebook describes how expressions work in Aerospike: how they are
formed, their syntax, benefits, and how they are used in filters and
operations.
This notebook requires the Aerospike Database running locally with a Java kernel and the Aerospike Java Client. To create a Docker container that satisfies the requirements and holds a copy of the Aerospike notebooks, visit the Aerospike Notebooks Repo.
Introduction
In this notebook, we will see how expressions work in Aerospike and
the benefits they provide.
The expressions functionality has been enhanced in Aerospike Database
5.6. Expressions appear in two flavors in the client library: Filter
Expressions and Operation Expressions. Filter Expressions provide a
mechanism to select records for operations and replace Predicate
Expressions, which have been deprecated since the 5.2 release. Operation
Expressions enable new read and write capabilities as described later.
Expressions are also used on the server as XDR Filter Expressions to
specify which records are shipped to remote destinations.
We will describe at a high level how expressions are formed in Aerospike
and the capabilities they enable. After highlighting key syntax
patterns, we will show with specific code examples how expressions are
used.
The main topics in this notebook include:
scope of expressions
benefits
syntax
usage
coding examples
Prerequisites
This tutorial assumes familiarity with the following topics:
System.out.println("Initialized the client and connected to the cluster.");
final String Namespace = "test";
final String Set = "expressions";
// convenience function to truncate test data
void truncateTestData() {
    try {
        client.truncate(null, Namespace, Set, null);
    }
    catch (AerospikeException e) {
        // ignore
    }
}
Output:
Initialized the client and connected to the cluster.
Access Shell Commands
You may execute shell commands, including Aerospike tools like aql and
asadm, in the terminal tab throughout this tutorial. Open a terminal tab
by selecting File->Open from the notebook menu, and then New->Terminal.
Defining Expressions
An expression is a syntactic entity in a programming language that may
be evaluated to determine its value. (Wikipedia)
In other words, an expression evaluates to (or returns) a value. Some simple examples of an expression would be:
5
7 + 3
2 > 1
Expressions can have:
constants:
5, “horse”, [1, 2, 3]
variables:
var x = pow(b, c) + d
functions:
pow, mod, min
and operators:
==, +, or
Expressions are composable. In other words, complex expressions can be formed from simpler expressions. For example:
1 + min(2, a + 2) < sqrt(b)
An expression is not an assignment: An expression does not assign a
value to a variable, but simply evaluates to a value which may be used
in an assignment statement that assigns the value to a variable.
Expressions in Aerospike
This section provides a higher level view of the capabilities and
workings of expressions in Aerospike. The subsequent sections will drill
down into the details.
Evaluation Context
Expressions are evaluated on the server for filtering conditions, reading
and writing to bins, and configuring XDR replication. Therefore, an
expression only works on server data entities such as the metadata and
record data, and uses any constants that the client may provide. When
used from the client library, expressions are created on the client and
sent to the server in an API operation. Before sending, the client
object format of an expression is converted to a wire format using the
build operation.
Components and Scope
An expression is a combination of one or more constants, variables,
functions, and operators that the programming language interprets …
and computes to produce another value. (Wikipedia)
In Aerospike, expressions use bins and metadata as variables, metadata
and API functions, and values that are strongly typed as boolean,
integer, float, string, list, map, blob, GeoJSON, or HyperLogLog. A host
of arithmetic, logical, convenience, and API operations are available
for these data.
Please refer to the documentation for the list of supported components.
Immutability of Components
In Aerospike, an expression works on a transient copy; therefore,
evaluating an expression does not change the metadata or bins that are
used in the expression.
Use of Variables
A variable can be defined to represent a sub-expression for syntactic clarity and efficiency. A variable is first defined and initialized by assigning it to an expression, and then used as a substitute for the expression. In the example below, a variable myvar is defined and used in an expression myexpr:
myvar = (a + b) / min(a, b)
myexpr = myvar + 1 / myvar
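The variable pattern above can be traced in plain Java. This is only a sketch to make the arithmetic concrete, not Aerospike client code; the class name VariableSketch and the sample values are assumptions chosen for illustration.

```java
// Plain-Java sketch of the variable pattern above; not Aerospike API code.
final class VariableSketch {
    // myexpr = myvar + 1 / myvar, where myvar = (a + b) / min(a, b)
    static double myExpr(double a, double b) {
        double myvar = (a + b) / Math.min(a, b); // evaluated once, reused twice
        return myvar + 1 / myvar;
    }

    public static void main(String[] args) {
        // With a = 6 and b = 3: myvar = 9 / 3 = 3, so myexpr = 3 + 1/3
        System.out.println(myExpr(6, 3));
    }
}
```

Binding the sub-expression to a name means it is written and evaluated once, which is the clarity and efficiency benefit described above.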
Conditional Evaluation
An expression can be conditionally evaluated with an if-then-else like construct. For example:
The functionality of expressions is the same, although the context
determines their use. For example, Filter and XDR Filter Expressions are
boolean expressions, whereas Operation Expressions can evaluate to any
supported type.
Only Filter and Operation Expressions can be used in the client library
and therefore will be the focus of this tutorial. Please refer to the
documentation for the details of XDR Filter Expressions.
Benefits of Expressions
Here are some key benefits and capabilities that expressions enable:
Capabilities in expressions include:
variables for syntactic clarity and efficiency,
conditional evaluation,
access to metadata and bin data, and
access to powerful APIs and enhanced set of operators.
The enhanced filtering expressions allow records to be processed
more efficiently by avoiding the need for potentially more expensive
client or UDF based processing.
Reads and writes are now possible with Operation Expressions.
In reads, this can eliminate the need to bring large amounts of
data to the client, because the data to be fetched can be specified
more precisely.
In writes, a bin can be updated with the result of an expression,
which can eliminate the read before update: the read, the processing
for the update, and the update all happen on the server side in the
same request. This saves a round-trip and the transfer of potentially
large data. In a concurrent setting, it also avoids retries due to
conflicts (see the R-M-W pattern).
Multi-step operations that can build on each other’s results are now
possible through operation expressions.
Syntax Details
With a better understanding of their structure, it is easier to parse
Aerospike expressions.
Notation
Aerospike expressions use Polish Notation (aka prefix notation), which is the familiar form of a function call in most programming languages: fn(a, b). So the expression 5 + 3 in the Aerospike Java client would be:
Exp.add(
Exp.val(5),
Exp.val(3))
Note, the overloaded val method converts all supported types to a Value object, which provides an abstraction for all supported value types.
Composition
A complex expression can be composed using two or more sub-expressions. For example, with integer bins a and b, the expression (a - b) / (a + b) would be:
Exp.div(
Exp.sub(
Exp.intBin("a"),
Exp.intBin("b")),
Exp.add(
Exp.intBin("a"),
Exp.intBin("b")))
Note, there are corresponding access methods to access bin values for other supported types. Since a bin may hold any value type, an incorrect type access results in an error. A conditional type check may be used to prevent a run-time error.
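To see why nested prefix calls compose naturally, here is a toy model in plain Java. It is a sketch under stated assumptions, not the Aerospike Exp class: the PrefixSketch class, its Node interface, and the use of a Map to stand in for a record's integer bins are all hypothetical and exist only for this illustration.

```java
import java.util.Map;

// Toy model of composable prefix-notation expressions; NOT the Aerospike Exp class.
final class PrefixSketch {
    // A node evaluates to a long, given the record's integer bins.
    interface Node {
        long eval(Map<String, Long> bins);
    }

    static Node val(long v) { return bins -> v; }
    static Node intBin(String name) { return bins -> bins.get(name); }
    static Node add(Node l, Node r) { return bins -> l.eval(bins) + r.eval(bins); }
    static Node sub(Node l, Node r) { return bins -> l.eval(bins) - r.eval(bins); }
    static Node div(Node l, Node r) { return bins -> l.eval(bins) / r.eval(bins); }

    // (a - b) / (a + b), composed the same way as the Exp example above
    static Node ratio() {
        return div(
            sub(intBin("a"), intBin("b")),
            add(intBin("a"), intBin("b")));
    }

    public static void main(String[] args) {
        // a = 10, b = -5: (10 - (-5)) / (10 + (-5)) = 15 / 5
        System.out.println(ratio().eval(Map.of("a", 10L, "b", -5L))); // prints 3
    }
}
```

Each factory method returns a node that can be passed to another factory method, so building an expression produces a tree that is evaluated only when eval is called. In Aerospike the tree is built on the client and evaluated on the server.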
Variable Definition and Use
The let construct defines the scope of variables and the expression that uses them. The def construct defines a variable and assigns it to an expression. Another expression in the scope can use the variable as a substitute for the expression it defines. For example, in the expression 5 < (a + b) < 10 using a variable x for the sum of integer bins a and b:
Exp.let( // let defines the scope of variables for this expression
    Exp.def("x", // def defines a variable
        Exp.add( // and also assigns it to an expression
            Exp.intBin("a"),
            Exp.intBin("b"))),
    Exp.and( // the expression in let scope can use the variable
        Exp.lt(
            Exp.val(5),
            Exp.var("x")), // var to use the variable
        Exp.lt(
            Exp.var("x"),
            Exp.val(10))));
Note in the above example, the variable x avoids repetitive access to the bins a and b. Also, variables defined in let cannot be used beyond its scope.
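For reasoning about expected results, the let example is equivalent to the following plain Java. This is a client-side sketch only, not Aerospike API code; the LetSketch class and the sample inputs are assumptions for illustration.

```java
// Plain-Java equivalent of the let/def/var example; not Aerospike API code.
final class LetSketch {
    // 5 < (a + b) < 10, with the sum computed once and reused
    static boolean sumInRange(long a, long b) {
        long x = a + b;          // plays the role of Exp.def("x", ...)
        return 5 < x && x < 10;  // each use of x plays the role of Exp.var("x")
    }

    public static void main(String[] args) {
        System.out.println(sumInRange(3, 4)); // sum 7 lies strictly between 5 and 10
        System.out.println(sumInRange(2, 3)); // sum 5 is not greater than 5
        System.out.println(sumInRange(6, 4)); // sum 10 is not less than 10
    }
}
```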
Conditional Evaluation
The cond construct includes one or more pairs of bool exp, value exp followed by a default value:
bool exp1, value exp1, bool exp2, value exp2, ..., default-value
It evaluates like the if-then-else logic: the expression takes the value of the first value exp in the sequence whose corresponding bool exp evaluates to true. If all boolean conditions fail, then it evaluates to the last default-value.
So an expression to evaluate a simple risk value “high” or “normal” based on int bin age and bool bin comorbidities would be:
// if (age > 65 && comorbidities) {risk = "high";}
// else {risk = "normal";}
Exp.cond(
Exp.and(
Exp.gt(
Exp.intBin("age"), Exp.val(65)),
Exp.boolBin("comorbidities")),
Exp.val("high"),
Exp.val("normal"));
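The first-match rule of cond can be modeled in plain Java as follows. This is a sketch of the semantics described above, not Aerospike API code; the CondSketch class and its cond helper are hypothetical names introduced for illustration.

```java
import java.util.function.BooleanSupplier;

// Plain-Java model of first-match cond semantics; not Aerospike API code.
final class CondSketch {
    // Returns the value paired with the first condition that holds, else the default.
    static String cond(BooleanSupplier[] conditions, String[] values, String defaultValue) {
        for (int i = 0; i < conditions.length; i++) {
            if (conditions[i].getAsBoolean()) {
                return values[i];
            }
        }
        return defaultValue;
    }

    // The risk rule from the example: "high" if age > 65 and comorbidities, else "normal".
    static String risk(long age, boolean comorbidities) {
        return cond(
            new BooleanSupplier[] { () -> age > 65 && comorbidities },
            new String[] { "high" },
            "normal");
    }

    public static void main(String[] args) {
        System.out.println(risk(70, true));  // prints high
        System.out.println(risk(70, false)); // prints normal
        System.out.println(risk(40, true));  // prints normal
    }
}
```

Because the pairs are checked in order, a later condition is never consulted once an earlier one holds, so more specific conditions should come first.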
Useful Syntax Patterns
Here is a table that summarizes some useful expression syntax patterns.
An expression object is constructed on the client to be sent to the
server where it is evaluated and used.
An expression’s wire protocol representation is constructed with the build() function. A simple expression fname == "Frank" will be built thus:
Expression simpleExp = Exp.build(
Exp.eq(
Exp.stringBin("fname"),
Exp.val("Frank")));
Note that the wire protocol representation of an expression is of type
Expression, whereas a client object is of type Exp.
An expression can be used as a filter expression or an operation
expression, as described below.
Both filter and operation expressions can be used independently of each
other and also in the same API call.
Filter Expressions
Filter expressions are so named because they are used as a condition to select or discard a record. They always evaluate to a boolean value to indicate whether the record is selected (true) or filtered out (false). A filter expression, like the deprecated Predicate Expression, is sent to the server through the API’s policy object parameter.
QueryPolicy policy = new QueryPolicy(); // query() takes a QueryPolicy
policy.filterExp = Exp.build( // sent through filterExp attribute of policy
    Exp.eq(
        Exp.intBin("a"),
        Exp.val(11)));
...
client.query(policy, stmt); // policy is specified as a parameter in API calls
Operation Expressions
Operation expressions, as the name suggests, are used in an operation,
either to read from bins or to write to a bin. Specifically, they are
used in the read and write methods of ExpOperation.
The basic computational model of operate, where operation expressions
are used, remains the same: A series of read or write commands are
performed in a given sequence on a single record. What is new is that a
read command can be an expression involving zero or more bins. Also, a
write command can get the value from an expression (enabling, for
example, use of cross-bin data with conditional logic) instead of a
simple constant to update a bin.
A read with operation expression can also use an arbitrary name for the
“computed bin” similar to the “as” keyword in the SQL statement
SELECT expr AS bin.
The pattern for coding an Operation Expression is:
Define Expression to read or write the bins.
Use Expression object in ExpOperation.read or .write method
that returns an Operation.
Use “expression operations” in any API call that takes an operation
list.
This is illustrated below.
// operate expression with write
// 1. Define Expression to write the bin with.
// if (age > 65 && comorbidities) {risk = "high";}
// else {risk = "normal";}
Expression writeExp = Exp.build(
    Exp.cond(
        Exp.and(
            Exp.gt(
                Exp.intBin("age"), Exp.val(65)),
            Exp.boolBin("comorbidities")),
        Exp.val("high"),
        Exp.val("normal")));
// 2. Use Expression object in ExpOperation.write method.
Operation writeExpOp = ExpOperation.write("risk",
    writeExp, // evaluates bin value to update
    ExpWriteFlags.DEFAULT);
// operate expression with read
// 1. Define Expression to read bins.
// read "yes" if (risk == "high" or worktype == "frontline") else "no"
Expression readExp = Exp.build(
    Exp.cond(
        Exp.or(
            Exp.eq(Exp.stringBin("risk"), Exp.val("high")),
            Exp.eq(Exp.stringBin("worktype"), Exp.val("frontline"))),
        Exp.val("yes"),
        Exp.val("no")));
// 2. Use Expression object in ExpOperation.read method.
Operation readExpOp = ExpOperation.read("eligible", // named "computed bin"
    readExp, // evaluates value to return
    ExpReadFlags.DEFAULT);
// 3. Use "expression operations" in any API call that takes an operation list.
// (policy is a WritePolicy and key is a Key, both created elsewhere)
Record record = client.operate(policy, key, writeExpOp, readExpOp);
Code Examples
Below are code examples that illustrate the expression features
described above.
Filter Expressions
The following example illustrates the capabilities of filtering on
metadata and use of List APIs (neither is possible with the deprecated
predicate expressions).
In this illustrative example, the filter selects records that:
were recently updated (sinceUpdate < 2), and
have a list bin whose value range (max minus min) is greater than 1000.
The code has the following steps:
Populate the test data with 20 records, each with an integer bin “bin1”
holding a value from 1 to 20 and a list bin “bin2” holding 3 randomly
selected numbers in the range 1 to 2000.
Sleep for 2 seconds.
Touch the even-numbered records.
Run the query with the filter.
The results should contain only records with an even bin1 value and a
bin2 value range greater than 1000.
import java.util.ArrayList;
import java.util.Random;
import com.aerospike.client.AerospikeException;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.policy.WritePolicy;
import com.aerospike.client.policy.QueryPolicy;
import com.aerospike.client.exp.Exp;
import com.aerospike.client.exp.ListExp;
import com.aerospike.client.Operation;
import com.aerospike.client.task.ExecuteTask;
import com.aerospike.client.query.Statement;
import com.aerospike.client.query.RecordSet;
import com.aerospike.client.Record;
import com.aerospike.client.cdt.ListReturnType;
// start with a clean state
truncateTestData();
// 1. Populate the test data with 20 records with an integer bin "bin1" values 1-20
// and a list bin having 3 randomly selected numbers in the range 1 to 2000.
Results of filter expression query (all even records with bin2 max-min > 1000):
key=id-4 bins={bin1=4, bin2=[1748, 569, 473]}
key=id-10 bins={bin1=10, bin2=[153, 1437, 1302]}
key=id-18 bins={bin1=18, bin2=[333, 1676, 55]}
key=id-16 bins={bin1=16, bin2=[592, 220, 1888]}
You may view the state of the database and ensure correctness of the
output by running the following command in the terminal tab:
aql -c "select * from test.expressions"
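The list-range condition can be sanity-checked on the client against the sample output shown earlier in this section. The following is a plain-Java sketch, not Aerospike API code; the RangeCheckSketch class is a hypothetical helper, and the values are copied from the sample output (your run will produce different random values).

```java
import java.util.Collections;
import java.util.List;

// Plain-Java check of the list-range rule; not Aerospike API code.
final class RangeCheckSketch {
    // True when max - min of the list values exceeds the threshold.
    static boolean rangeExceeds(List<Integer> values, int threshold) {
        return Collections.max(values) - Collections.min(values) > threshold;
    }

    public static void main(String[] args) {
        // bin2 values copied from the sample output in this section
        List<List<Integer>> sample = List.of(
            List.of(1748, 569, 473),
            List.of(153, 1437, 1302),
            List.of(333, 1676, 55),
            List.of(592, 220, 1888));
        for (List<Integer> bin2 : sample) {
            System.out.println(bin2 + " -> " + rangeExceeds(bin2, 1000));
        }
    }
}
```

Every sample row satisfies the rule (the ranges are 1275, 1284, 1621, and 1668), which is consistent with the filter having been applied on the server.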
Operation Expressions
The following example illustrates these new capabilities, which were not
possible earlier:
expressions involving zero or more bins to write a bin
named “computed bins” that return the value of a specified
expression involving zero or more bins
conditional evaluation of expression
use of variables in an expression
The code has the following steps:
The test data is populated with three randomly generated test scores
ranging from 50 to 100 for student ids 1-20.
The data is updated by writing two additional bins: “classwork”, which
represents the teacher’s input (0-10) based on class participation,
and “grade”, which is computed by adding “classwork” to the average of
the test scores and applying this formula to the total: 50-70 -> C,
70-85 -> B, 85+ -> A.
A report is then generated for the id, grade, total score, and
min/max/average of test scores.
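A client-side reference for the grade rule can help when checking the report. The sketch below is plain Java, not Aerospike API code, and rests on assumptions that the text does not spell out: the average is a floating-point mean, and the band boundaries are read as total >= 85 for A, total >= 70 for B, and anything lower for C. The GradeSketch class is a hypothetical name.

```java
// Plain-Java reference for the grade rule; not Aerospike API code.
final class GradeSketch {
    // total = classwork + average of the test scores
    static double total(int classwork, int... scores) {
        double sum = 0;
        for (int s : scores) {
            sum += s;
        }
        return classwork + sum / scores.length;
    }

    // Assumed boundaries: total >= 85 -> A, total >= 70 -> B, otherwise C.
    static String grade(double total) {
        if (total >= 85) return "A";
        if (total >= 70) return "B";
        return "C";
    }

    public static void main(String[] args) {
        double t = total(5, 90, 80, 70); // average 80, plus classwork 5, gives 85
        System.out.println(t + " -> " + grade(t));
    }
}
```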
import com.aerospike.client.exp.Expression;
import com.aerospike.client.exp.ExpOperation;
import com.aerospike.client.exp.ExpReadFlags;
import com.aerospike.client.exp.ExpWriteFlags;
// start with a clean state
truncateTestData();
// 1. The test data is populated with three randomly generated test scores ranging from 50 to 100
You may view the state of the database and ensure correctness of the
output by running the following command in the terminal tab:
aql -c "select * from test.expressions"
Using Expression Operations vs R-M-W or UDFs
Aerospike developers have multiple ways to perform record-oriented
read-write logic.
Read record data to the client, modify, and write back (“R-M-W”).
Create a UDF for the logic and invoke it on the record.
Use expression operations in a multi-op request.
For read-write transactions, fetching the data to the client and writing
back is expensive and requires special care to ensure read-write
isolation. Lua UDFs can be difficult to implement, less flexible to
change, and can be slower. So it is generally beneficial to use
expression operations when possible.
Here is a suggested decision process:
Use expression operations. However, if expression operations cannot
be used because the task requires unsupported features such as
iterators and loops, then:
Use client-side Read-Modify-Write (R-M-W) with a version check if the
amount of data transfer and the possibility of conflict due to
concurrency are limited. Otherwise:
Use UDFs if the Lua server-side programming model and performance meet
the needs. Otherwise, fall back to client-side R-M-W.
Note, Aerospike provides many ways to implement a given data task on one
or multiple records. To determine the optimal way for a given task, one
should consider and evaluate the options available including the various
execution modes (synchronous, asynchronous, background, etc).
Usage Notes
Policy currently allows both the deprecated predExp and new
filterExp, but they are mutually exclusive. If both are specified,
only filterExp will be used and predExp will be ignored.
Errors during evaluation:
Errors such as a type mismatch or a missing bin can be guarded against using cond to avoid run-time evaluation errors.
Exp.cond(
Exp.eq( // check if the bin is of type int
Exp.binType("a"),
Exp.val(ParticleType.INTEGER)),
Exp.eq( // perform int comparison
Exp.intBin("a"),
Exp.val(1)),
Exp.val(false)); // default is false
Filter expressions treat the final unknown value as false,
whereas in operation expressions it results in an error.
If appropriate, evaluation failure can be ignored while
performing multiple operate operations by setting the flags
argument in ExpOperation.read or .write to
ExpReadFlags.EVAL_NO_FAIL or ExpWriteFlags.EVAL_NO_FAIL
respectively.
Constructs like loops and iterators over record bins or CDT elements
are not currently supported. General manipulation of data beyond
what is available in the APIs also is not supported.
Takeaways and Conclusion
The tutorial described expressions capabilities in Aerospike. It
explained the scope and syntax, and described the key components and
constructs. It provided code examples for how to work with expressions
in two client uses: filter expressions and operation expressions.
The enhanced capabilities in filtering expressions allow records to be
processed more efficiently by avoiding the need for more expensive
client or UDF based processing. New capabilities include access to
metadata, bin data, powerful APIs, as well as enhanced arithmetic and
other operators.
Operation expressions can eliminate the need to read before update by
allowing read, processing for update, and update to happen on the server
side in the same request. This saves a round-trip and transfer of
potentially large data.
Expressions provide powerful capabilities; evaluate and use them if they
are suitable and provide better performance for your use case over UDFs
and client-side processing.
Cleaning Up
Remove tutorial data and close connection.
truncateTestData();
client.close();
System.out.println("Removed tutorial data and closed server connection.");
Output:
Removed tutorial data and closed server connection.
Visit the Aerospike notebooks repo to run additional Aerospike
notebooks. To run a different notebook, download the notebook from the
repo to your local machine, and then click on File->Open in the
notebook menu, and select Upload.