Stream UDFs
Use Aerospike Stream UDFs to filter, transform, and aggregate query results in a distributed fashion. Stream UDFs can process a stream of data by defining a sequence of operations to perform. Stream UDF aggregations run in a distributed fashion to exploit the power of multiple machines.
Stream UDFs perform read-only operations on a collection of records. The Stream UDF system allows a MapReduce style of programming, which is used for common, highly parallel jobs such as word countโeach row is accessed and a list of words and their counts are emitted. The top results are calculated in a reduce phase. For simple aggregations where counters are simply incremented in a context instead of continually creating and destroying objects, Aerospike provides optimal implementation.
Operatorsโ
Aerospike Stream UDFs support these operators:
filter
โFilter data in the stream that satisfies a predicate.map
โTransform a piece of data.aggregate
โReduce partitions of data to a single value.reduce
โAllow parallel processing of each group of output data.
These operators are considered primitives and can build complex operations (see Developing Stream UDFs).
filterโ
Filter values in the stream using the predicate p
. p
tests each value in the stream and returns true if the value should be passed through or false to drop the value.
Example
filter(p: (a: Value) -> Boolean) -> Stream
Parameters
p
โThe predicate to apply to each value in the stream.
Returns
A stream of filtered values.
mapโ
Transforms a value to a different value using the identify function f
.
map(f: (a: Value) -> Value) -> Stream
Parameters
f
โThe identify function to apply to each value in the stream.
Returns
A stream of the transformed values.
reduceโ
Reduce the values in the stream to a single value and apply the associative binary operator op
to each value. The op
return value is the parameter a
for subsequent calls to op
.
reduce(op: (a: Value, b: Value) -> Value) -> Stream
Parameters
op
โThe associative binary operation to apply to each value in the stream.
Returns
A stream containing a single value.
aggregateโ
Takes the current value (parameter x
) and the next value in the input stream, and returns the aggregate value (parameter a
).
aggregate(x: Value, op: (a: Value, b: Value) -> Value) -> Stream
Parameters
x
โThe initial (neutral) value passed to the operatorop
.op
โThe identify function to apply to each value in the stream.
Returns
A stream containing a single value.
Examplesโ
See Stream UDF Examples.