Batch operations
Overview
This page describes Aerospike's batch feature: its advantages, considerations, and read, write, and delete operations. It also includes code examples for these operations.
What is a batch
A batch is a series of requests that are sent together to the database server. A batch groups multiple operations into one unit and passes it in a single network trip to each database node.
Batch advantages
- Batches combine multiple record operations, including updates, deletes, reads, and UDFs.
- Batching allows any key-value operation, such as mixing record-level gets and deletes, as well as bin-level transaction operations including increment, prepend/append, Map and List operations, and bitwise operations.
- When batching multiple updates, the number of connections between the client and the server is reduced.
- Batches optimize the use of network resources, such as packets and network sockets.
- By increasing throughput, batches can use network resources more efficiently in some cases.
- Batching is backward compatible with applications that use batch-read, batch-operate, and batch-exists.
- Batches store and retrieve multiple data points in one request.
Batch use cases
- Financial services (High performance computing)
- Digital Media (Rendering and transcoding)
- Internet of Things (Ingesting and analyzing IoT sensor data)
Batch workflow
The client groups the primary keys in the batch by cluster node and creates a sub-batch request for each node. Batch requests occur in series or in parallel depending on the batch policy. Parallel requests in synchronous mode require extra threads, which are created or taken from a thread pool.
Batch requests use a single network socket to each cluster node, which helps with parallelizing requests. Multiple keys use one network request, which is beneficial for a large number of small records, but not as beneficial when the number of records per node is small, or the data returned per record is very large. Batch requests sometimes increase the latency of some requests, but only because clients normally wait until all the keys are retrieved from the cluster nodes before returning control to the caller.
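The grouping step described above can be pictured with a small toy model. This is only a sketch under stated assumptions: it uses plain Java collections, the names `SubBatchSketch`, `nodeFor`, `groupByNode`, and `runInParallel` are hypothetical, and the hash-based node choice is a stand-in for the real partition map. It is not the Aerospike client's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: none of these names belong to the Aerospike client API.
final class SubBatchSketch {

    // Hypothetical stand-in for the partition map: pick a node from the key's hash.
    static int nodeFor(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Group the keys by owning node; each group becomes one sub-batch request.
    static Map<Integer, List<String>> groupByNode(List<String> keys, int nodeCount) {
        Map<Integer, List<String>> subBatches = new TreeMap<>();
        for (String key : keys) {
            subBatches.computeIfAbsent(nodeFor(key, nodeCount), n -> new ArrayList<>()).add(key);
        }
        return subBatches;
    }

    // Send every sub-batch on its own pool thread and block until all nodes respond.
    static List<String> runInParallel(Map<Integer, List<String>> subBatches) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subBatches.size()));
        try {
            List<Callable<List<String>>> requests = new ArrayList<>();
            for (Map.Entry<Integer, List<String>> subBatch : subBatches.entrySet()) {
                requests.add(() -> {
                    // Pretend the node returns one record per key in its sub-batch.
                    List<String> records = new ArrayList<>();
                    for (String key : subBatch.getValue()) {
                        records.add(key + "@node" + subBatch.getKey());
                    }
                    return records;
                });
            }
            List<String> all = new ArrayList<>();
            // invokeAll returns only after every sub-batch task has finished.
            for (Future<List<String>> response : pool.invokeAll(requests)) {
                all.addAll(response.get());
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> keys = List.of("user1", "user2", "user3", "user4", "user5", "user6");
        Map<Integer, List<String>> subBatches = groupByNode(keys, 3);
        System.out.println(subBatches);
        System.out.println(runInParallel(subBatches));
    }
}
```

The caller gets control back only after the slowest sub-batch completes, which is the latency effect the paragraph above describes.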
Some clients, such as the C client, deliver each record as soon as it arrives, allowing client applications to process data from the fastest server first. In the Java client, sync batch and async batch with RecordArrayListener wait until all responses arrive from the cluster nodes. Async batch calls the RecordSequenceListener to send back records one at a time as they are received.
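The difference between the two delivery styles can be sketched with standard Java concurrency utilities. This is a toy model, not the client's listener interfaces: `DeliverySketch`, `deliverAll`, and `deliverAsReceived` are hypothetical names, and each `Callable` stands in for one node's response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Toy model only: these methods imitate the two delivery styles, not the client's listeners.
final class DeliverySketch {

    // Array style: block until every node call finishes, then return results in request order.
    static List<String> deliverAll(List<Callable<String>> nodeCalls) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodeCalls.size()));
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> response : pool.invokeAll(nodeCalls)) {
                results.add(response.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // Sequence style: pass each result to the callback in completion order.
    static void deliverAsReceived(List<Callable<String>> nodeCalls, Consumer<String> onRecord)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodeCalls.size()));
        try {
            CompletionService<String> completed = new ExecutorCompletionService<>(pool);
            for (Callable<String> call : nodeCalls) {
                completed.submit(call);
            }
            for (int i = 0; i < nodeCalls.size(); i++) {
                // take() blocks until the next call has finished, whichever node it was.
                onRecord.accept(completed.take().get());
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> nodeCalls = new ArrayList<>();
        nodeCalls.add(() -> { Thread.sleep(50); return "record from slow node"; });
        nodeCalls.add(() -> "record from fast node");
        System.out.println(deliverAll(nodeCalls));
        deliverAsReceived(nodeCalls, System.out::println);
    }
}
```

With the sequence style, the application can start processing the fast node's record while the slow node is still working.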
Batch considerations
- Batches are not multi-record transactions.
- There is no guarantee for the order of execution across nodes. The operations in a sub-batch can be processed in order, a behavior controlled by a batch policy flag. Otherwise, the operations in a sub-batch execute in parallel on the node.
- There is no rollback processing for failed operations in a batch.
- You can configure the batch policy to stop or continue batch processing if an operation fails.
- Use the operate command to combine multiple changes to the same key in a single batch transaction.
- Starting with Database 6.0, you can combine any type of read and write operations against one or more keys.
- Batch transactions use more resources end-to-end (client and server) than single-record transactions unless the batches are very large. For smaller batch sizes, we recommend that you use single-record transactions.
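The lack of rollback and the stop-or-continue choice can be illustrated with a toy model. This sketch makes several assumptions: an in-memory map stands in for the database, operations run sequentially so the effect is easy to see (a real batch has no cross-node ordering guarantee), and `continueOnError` is a hypothetical flag standing in for the batch policy setting.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: an in-memory map stands in for the database.
final class NoRollbackSketch {

    // Increment each key in order. A missing key counts as a failed operation.
    // continueOnError is a hypothetical flag, not a real client policy field.
    static boolean[] incrementAll(Map<String, Integer> store, List<String> keys,
                                  boolean continueOnError) {
        boolean[] succeeded = new boolean[keys.size()];
        for (int i = 0; i < keys.size(); i++) {
            String key = keys.get(i);
            if (!store.containsKey(key)) {
                if (!continueOnError) {
                    break; // stop here; increments already applied are NOT undone
                }
                continue; // skip the failed operation and keep going
            }
            store.merge(key, 1, Integer::sum);
            succeeded[i] = true;
        }
        return succeeded;
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        store.put("a", 1);
        store.put("b", 1);
        boolean[] result = incrementAll(store, List.of("a", "missing", "b"), false);
        System.out.println(Arrays.toString(result) + " " + store);
    }
}
```

In the stop case, the first key keeps its new value even though the batch did not finish, so the application has to inspect per-record results and decide how to recover.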
Batch operations
Inlining batches
If inline is set to true through flags in the batch policy, the node executes every operation in its sub-batch, one after the other.
- Batch Policy allowInline controls whether to inline sub-batch operations where the keys are in an in-memory namespace. The default value is true.
- Batch Policy allowInlineSSD controls whether to inline sub-batch operations where the keys are in an SSD-based namespace. The default value is false.
- For batch transactions with smaller records, for example, 1KiB per record, inline processing is faster for in-memory namespaces.
- If a sub-batch is not inlined, its operations are split and executed by different threads.
- Inlining sub-batches does not tend to improve latency when the keys are in an SSD-based namespace. You should benchmark to compare performance.
- When a sub-batch is inlined, one thread executes the operations. The thread is not released until the sub-batch processing is complete. Large inlined batches may divert server resources toward batch operations over single-record operations.
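The inlining rules in this list can be summarized in a small toy model. It is a sketch under assumptions, not server code: `InlineSketch`, `isInlined`, and `execute` are hypothetical names, the flag parameters mirror the two policy flags and defaults described above, and a four-thread pool stands in for the server's threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model only: this imitates the dispatch decision, not the server's implementation.
final class InlineSketch {

    // In-memory namespaces follow allowInline; SSD-based namespaces follow allowInlineSSD.
    static boolean isInlined(boolean inMemoryNamespace, boolean allowInline,
                             boolean allowInlineSSD) {
        return inMemoryNamespace ? allowInline : allowInlineSSD;
    }

    // Run a sub-batch and return the operations in the order they actually executed.
    static List<String> execute(List<String> ops, boolean inline) throws Exception {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        if (inline) {
            for (String op : ops) {
                log.add(op); // one thread, strictly in request order
            }
            return new ArrayList<>(log);
        }
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String op : ops) {
                tasks.add(() -> log.add(op));
            }
            pool.invokeAll(tasks); // waits for all tasks; execution order is not guaranteed
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) throws Exception {
        // Defaults from the text: allowInline = true, allowInlineSSD = false.
        System.out.println("in-memory inlined: " + isInlined(true, true, false));
        System.out.println("SSD inlined: " + isInlined(false, true, false));
        System.out.println(execute(List.of("op1", "op2", "op3"), true));
    }
}
```

The inline path keeps one thread busy for the whole sub-batch, which is why very large inlined batches can crowd out single-record work.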
Filtering batch operations
You can attach a filter expression to any batch operation. The server applies the filter to each record in the batch to determine whether the operation should proceed.
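As a rough illustration of per-record filtering, the toy model below applies a predicate to each record before the operation runs. Everything here is an assumption made for the sketch: the in-memory map, the `FilterSketch` and `doubleWhere` names, and the status strings, which are illustrative labels rather than client result codes.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

// Toy model only: a predicate stands in for a filter expression.
final class FilterSketch {

    // Double the stored value for each key whose current value passes the filter.
    static Map<String, String> doubleWhere(Map<String, Integer> store, List<String> keys,
                                           IntPredicate filter) {
        Map<String, String> status = new LinkedHashMap<>();
        for (String key : keys) {
            Integer value = store.get(key);
            if (value == null) {
                status.put(key, "not found");
            } else if (!filter.test(value)) {
                status.put(key, "filtered out"); // the operation is skipped for this record
            } else {
                store.put(key, value * 2);
                status.put(key, "applied");
            }
        }
        return status;
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        store.put("a", 5);
        store.put("b", 50);
        System.out.println(doubleWhere(store, List.of("a", "b", "c"), v -> v > 10));
        System.out.println(store);
    }
}
```

The filter is evaluated independently for every record, so one batch can contain a mix of applied and skipped operations.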
Batch reads
- Batch get: Reads multiple records, or optionally a selection of bins from those records.
- Batch exists: Checks record metadata to verify whether the specified keys exist.
- Batch getHeader: Reads record metadata (expiration/TTL and generation) only. It does not return bin data.
- Batch operate: Executes a transaction of read operations against multiple records.
Batch writes
- Aerospike release 6.0 introduces batch writes in addition to batch reads. This includes updates, deletes, UDFs, and multi-operation transactions (operate) without limits on write operations.
- Batch writes allow write operations against any keys.
- Batch writes can process large numbers of updates in a single operation using fewer connections.
Multiple batch operations to the same key in a batch
- Unless you inline the batch request to be serviced by the same service thread, multiple batch operations on the same key, such as [K1, K3...K1, K5....K1, K2], are distributed to different service threads.
- Neither the client library nor the server can infer the intent of your operations on K1 and consolidate them into one batch transaction on K1.
- Each operation on K1 can potentially proceed out of order when distributed to different service threads.
To avoid this situation, either consolidate operations for the same key into one key entry in the batch, or inline the batch request.
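One way to consolidate on the application side is to merge repeated keys before building the batch. The sketch below is a toy model with hypothetical names (`ConsolidateSketch`, `consolidate`); operations are plain strings, and each resulting list is assumed to become the operation list of a single entry for that key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: strings stand in for keys and operations.
final class ConsolidateSketch {

    // Merge repeated keys into one entry per key, keeping each key's operations
    // in the order they were requested.
    static Map<String, List<String>> consolidate(List<Map.Entry<String, String>> requested) {
        Map<String, List<String>> perKey = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : requested) {
            perKey.computeIfAbsent(entry.getKey(), k -> new ArrayList<>()).add(entry.getValue());
        }
        return perKey;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> requested = List.of(
            Map.entry("K1", "put a=1"),
            Map.entry("K3", "get"),
            Map.entry("K1", "increment a"),
            Map.entry("K2", "get"));
        System.out.println(consolidate(requested));
    }
}
```

Because all operations for K1 now travel in one entry, their relative order no longer depends on which service thread picks them up.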
Code examples of batch operations
The following examples use the Java client to perform read, write, delete, exists, and mixed read/write operations in a batch.
The syntax differs per Aerospike client, based on the needs of the operating system.
Example: Batch read records
/**
 * Read records in one batch.
 */
private void batchReads (
    AerospikeClient client,
    Parameters params,
    String keyPrefix,
    String binName,
    int size
) throws Exception {
    // Batch gets into one call.
    Key[] keys = new Key[size];
    for (int i = 0; i < size; i++) {
        keys[i] = new Key(params.namespace, params.set, keyPrefix + (i + 1));
    }
    Record[] records = client.get(null, keys, binName);
    for (int i = 0; i < records.length; i++) {
        Key key = keys[i];
        Record record = records[i];
        Level level = Level.ERROR;
        Object value = null;
        if (record != null) {
            level = Level.INFO;
            value = record.getValue(binName);
        }
        console.write(level, "Record: ns=%s set=%s key=%s bin=%s value=%s",
            key.namespace, key.setName, key.userKey, binName, value);
    }
    if (records.length != size) {
        console.error("Record size mismatch. Expected %d. Received %d.", size, records.length);
    }
}
Example: Batch read operation
private void batchListOperate(AerospikeClient client, Parameters params) {
    console.info("batchListOperate");
    Key[] keys = new Key[RecordCount];
    for (int i = 0; i < RecordCount; i++) {
        keys[i] = new Key(params.namespace, params.set, KeyPrefix + (i + 1));
    }
    // Get size and last element of list bin for all records.
    Record[] records = client.get(null, keys,
        ListOperation.size(BinName3),
        ListOperation.getByIndex(BinName3, -1, ListReturnType.VALUE)
    );
    for (int i = 0; i < records.length; i++) {
        Record record = records[i];
        // A null entry means the record for that key was not found.
        if (record == null) {
            console.info("Result[%d]: not found", i);
            continue;
        }
        // Both operations target the same bin, so the bin holds a list of their results.
        List<?> results = record.getList(BinName3);
        long size = (Long)results.get(0);
        Object val = results.get(1);
        console.info("Result[%d]: %d,%s", i, size, val);
    }
}
Example: Batch read/write operations
/*
 * Perform list read/write operations in one batch.
 */
private void batchListWriteOperate(AerospikeClient client, Parameters params) {
    console.info("batchListWriteOperate");
    Key[] keys = new Key[RecordCount];
    for (int i = 0; i < RecordCount; i++) {
        keys[i] = new Key(params.namespace, params.set, KeyPrefix + (i + 1));
    }
    // Add integer to list and get size and last element of list bin for all records.
    BatchResults bresults = client.operate(null, null, keys,
        ListOperation.append(ListPolicy.Default, BinName3, Value.get(999)),
        ListOperation.size(BinName3),
        ListOperation.getByIndex(BinName3, -1, ListReturnType.VALUE)
    );
    for (int i = 0; i < bresults.records.length; i++) {
        BatchRecord br = bresults.records[i];
        Record rec = br.record;
        if (rec != null) {
            List<?> results = rec.getList(BinName3);
            long size = (Long)results.get(1);
            Object val = results.get(2);
            console.info("Result[%d]: %d,%s", i, size, val);
        }
        else {
            console.info("Result[%d]: error: %s", i, ResultCode.getResultString(br.resultCode));
        }
    }
}
Example: Batch exists operation
/*
 * Check existence of records in one batch.
 */
private void batchExists (
    AerospikeClient client,
    Parameters params,
    String keyPrefix,
    int size
) throws Exception {
    // Batch into one call.
    Key[] keys = new Key[size];
    for (int i = 0; i < size; i++) {
        keys[i] = new Key(params.namespace, params.set, keyPrefix + (i + 1));
    }
    boolean[] existsArray = client.exists(null, keys);
    for (int i = 0; i < existsArray.length; i++) {
        Key key = keys[i];
        boolean exists = existsArray[i];
        console.info("Record: ns=%s set=%s key=%s exists=%s",
            key.namespace, key.setName, key.userKey, exists);
    }
}
Example: Batch delete operation
public void batchDelete() {
    // Define keys
    Key[] keys = new Key[] {
        new Key(args.namespace, args.set, 10000),
        new Key(args.namespace, args.set, 10001)
    };
    // Ensure keys exist
    boolean[] exists = client.exists(null, keys);
    assertTrue(exists[0]);
    assertTrue(exists[1]);
    // Delete keys
    BatchResults br = client.delete(null, null, keys);
    assertTrue(br.status);
    // Ensure keys do not exist
    exists = client.exists(null, keys);
    assertFalse(exists[0]);
    assertFalse(exists[1]);
}
Example: Batch read/write mixed operation
/*
 * Read/Write records using varying operations in one batch.
 */
private void batchWriteOperateComplex(AerospikeClient client, Parameters params) {
    console.info("batchWriteOperateComplex");
    Expression wexp1 = Exp.build(Exp.add(Exp.intBin(BinName1), Exp.intBin(BinName2), Exp.val(1000)));
    Expression rexp1 = Exp.build(Exp.mul(Exp.intBin(BinName1), Exp.intBin(BinName2)));
    Expression rexp2 = Exp.build(Exp.add(Exp.intBin(BinName1), Exp.intBin(BinName2)));
    Expression rexp3 = Exp.build(Exp.sub(Exp.intBin(BinName1), Exp.intBin(BinName2)));
    Operation[] ops1 = Operation.array(
        Operation.put(new Bin(BinName4, 100)),
        ExpOperation.read(ResultName1, rexp1, ExpReadFlags.DEFAULT));
    Operation[] ops2 = Operation.array(ExpOperation.read(ResultName1, rexp1, ExpReadFlags.DEFAULT));
    Operation[] ops3 = Operation.array(ExpOperation.read(ResultName1, rexp2, ExpReadFlags.DEFAULT));
    Operation[] ops4 = Operation.array(
        ExpOperation.write(BinName1, wexp1, ExpWriteFlags.DEFAULT),
        ExpOperation.read(ResultName1, rexp3, ExpReadFlags.DEFAULT));
    Operation[] ops5 = Operation.array(
        ExpOperation.read(ResultName1, rexp2, ExpReadFlags.DEFAULT),
        ExpOperation.read(ResultName2, rexp3, ExpReadFlags.DEFAULT));
    List<BatchRecord> records = new ArrayList<BatchRecord>();
    records.add(new BatchWrite(new Key(params.namespace, params.set, KeyPrefix + 1), ops1));
    records.add(new BatchRead(new Key(params.namespace, params.set, KeyPrefix + 2), ops2));
    records.add(new BatchRead(new Key(params.namespace, params.set, KeyPrefix + 3), ops3));
    records.add(new BatchWrite(new Key(params.namespace, params.set, KeyPrefix + 4), ops4));
    records.add(new BatchRead(new Key(params.namespace, params.set, KeyPrefix + 5), ops5));
    records.add(new BatchDelete(new Key(params.namespace, params.set, KeyPrefix + 6)));
    // Execute batch.
    client.operate(null, records);
    // Show results.
    int i = 0;
    for (BatchRecord record : records) {
        Record rec = record.record;
        if (rec != null) {
            Object v1 = rec.getValue(ResultName1);
            Object v2 = rec.getValue(ResultName2);
            console.info("Result[%d]: %s, %s", i, v1, v2);
        }
        else {
            console.info("Result[%d]: error: %s", i, ResultCode.getResultString(record.resultCode));
        }
        i++;
    }
}
Log examples
Batch information is reported in the batch-sub ticker log line, which shows statistics on operations, including batches, for a namespace. The following example is for the namespace named test.
Ticker log line changes:
Example:
{test} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)
When the cluster size changes, you might also see proxy events included in a batch.
Example:
{test} from-proxy-batch-sub: tsvc (0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)
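For monitoring scripts, the counter groups in these lines can be pulled out with a regular expression. The following is a minimal sketch that assumes only the line shape shown in the examples above. It does not interpret what each position in a group means, and `TickerLineSketch`, `namespace`, and `counters` are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: parses the line shape shown above without assigning meaning to each counter.
final class TickerLineSketch {
    private static final Pattern NAMESPACE = Pattern.compile("^\\{([^}]+)\\}");
    private static final Pattern COUNTERS = Pattern.compile("(\\w+) \\(([\\d,]+)\\)");

    // Return the namespace inside the leading braces, or null if the line has none.
    static String namespace(String line) {
        Matcher m = NAMESPACE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Map each operation name (tsvc, read, write, ...) to its comma-separated counters.
    static Map<String, long[]> counters(String line) {
        Map<String, long[]> out = new LinkedHashMap<>();
        Matcher m = COUNTERS.matcher(line);
        while (m.find()) {
            String[] parts = m.group(2).split(",");
            long[] values = new long[parts.length];
            for (int i = 0; i < parts.length; i++) {
                values[i] = Long.parseLong(parts[i]);
            }
            out.put(m.group(1), values);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "{test} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) "
            + "write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)";
        System.out.println(namespace(line));
        System.out.println(Arrays.toString(counters(line).get("read")));
    }
}
```

Consult the server log reference for the meaning of each counter position before alerting on specific values.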
Batch-specific errors
Value | Error | Description |
---|---|---|
150 | AS_ERR_BATCH_DISABLED | Batch functionality has been disabled by configuring batch-index-threads to 0 |
152 | AS_ERR_BATCH_QUEUES_FULL | All batch queues are full. Controlled by the batch-max-buffers-per-queue configuration parameter |
Refer to Error Codes.
Known Issues or Limitations
Batch writes were not supported prior to Database 6.0.
Required Client Versions
For full compatibility, the following client versions are required:
- Java client 6.0.0 or later
- Go client 6.0.0 or later
- C client 6.0.0 or later
- C# client 5.0.0 or later
Client References
Refer to these topics for language-specific code examples: