Batch operations
Overview
This page describes Aerospike's batch feature: its advantages, considerations, and read, write, and delete operations. It also includes code examples for these operations.
What is a batch
A batch is a series of requests that are sent together to the database server. A batch groups multiple operations into one unit and passes it in a single network trip to each database node.
Batch advantages
- Batches combine multiple record operations, including updates, deletes, reads and UDFs.
- Batching allows any key-value operation, such as mixing record-level gets and deletes, as well as bin-level transaction operations including increment, prepend/append, Map and List operations, and bitwise operations.
- When batching multiple updates, the number of connections between the client and the server is reduced.
- Batches optimize the use of network resources, such as packets and network sockets.
- By using network resources more efficiently, batches can increase throughput in some cases.
- Batches are backward compatible with applications that use batch-read, batch-operate, and batch-exists.
- Stores and retrieves multiple data points.
Batch use cases
- Financial services (High performance computing)
- Digital Media (Rendering and transcoding)
- Internet of Things (Ingesting and analyzing IoT sensor data)
Batch workflow
The client groups the primary keys in the batch by cluster node and creates a sub-batch request for each node. Sub-batch requests are sent in series or in parallel, depending on the batch policy. Parallel requests in synchronous mode require extra threads, which are created or taken from a thread pool.
Batch requests use a single network socket to each cluster node, which helps with parallelizing requests. Multiple keys share one network request, which is beneficial for a large number of small records, but less beneficial when the number of records per node is small or the data returned per record is very large. Batching can increase the latency of some requests, but only because the client normally waits until all the keys are retrieved from the cluster nodes before it returns control to the caller.
Some clients, such as the C client, deliver each record as soon as it arrives, allowing client applications to process data from the fastest server first. In the Java client, sync batch and async batch with RecordArrayListener wait until all responses arrive from the cluster nodes. Async batch with RecordSequenceListener sends back records one at a time as they are received.
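The grouping step can be sketched in plain Java. This is not the Aerospike client's implementation: the `nodeFor` function is a hypothetical stand-in that assigns a key to a node by a simple hash, whereas a real client consults the cluster's partition map. The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: mimics how a client might split one batch into per-node sub-batches.
final class SubBatchGrouping {

    // Hypothetical node assignment; a real client looks up the partition owner instead.
    static int nodeFor(int keyId, int nodeCount) {
        return Math.floorMod(Integer.hashCode(keyId), nodeCount);
    }

    // Groups key ids by node, preserving the caller's key order inside each sub-batch.
    static Map<Integer, List<Integer>> groupByNode(List<Integer> keyIds, int nodeCount) {
        Map<Integer, List<Integer>> subBatches = new LinkedHashMap<>();
        for (int keyId : keyIds) {
            subBatches.computeIfAbsent(nodeFor(keyId, nodeCount), n -> new ArrayList<>()).add(keyId);
        }
        return subBatches;
    }

    public static void main(String[] args) {
        List<Integer> keyIds = List.of(5002, 5003, 5004, 5005, 5006, 5007, 5008);
        // Each map entry corresponds to one sub-batch request sent to one node.
        groupByNode(keyIds, 3).forEach((node, keys) ->
            System.out.printf("node %d receives sub-batch %s%n", node, keys));
    }
}
```

With few keys per node, each sub-batch carries little work per network request, which is why batching helps less in that case.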
Batch considerations
- Batches are not multi-record transactions.
- There is no guarantee for the order of execution across nodes. The operations in a sub-batch can be processed in order, a behavior controlled by a batch policy flag. Otherwise, the operations in a sub-batch execute in parallel on the node.
- There is no rollback processing for failed operations in a batch.
- You can configure the batch policy to stop or continue batch processing if an operation fails.
- Use the operate command to combine multiple changes to the same key in a single batch transaction.
- Starting with Database 6.0, you can combine any type of read and write operations against one or more keys.
- Batch transactions use more resources end-to-end (client and server) than single-record transactions unless the batches are really large. For small batch sizes, we recommend using single-record transactions instead.
Batch operations
Inlining batches
If inline is set to true through flags in the batch policy, the node executes every operation in its sub-batch, one after the other.
- Batch Policy allowInline controls whether to inline sub-batch operations where the keys are in an in-memory namespace. The default value is true.
- Batch Policy allowInlineSSD controls whether to inline sub-batch operations where the keys are in an SSD-based namespace. The default value is false.
- For batch transactions with smaller records, for example, 1KiB per record, using inline processing is faster for in-memory namespaces.
- If it is not inline, the sub-batch operations are split and executed by different threads.
- Inlining sub-batches does not tend to improve latency when the keys are in an SSD-based namespace. You should benchmark to compare performance.
- When a sub-batch is inlined, one thread executes the operations. The thread is not released until the sub-batch processing is complete. Large inlined batches may divert server resources toward batch operations over single-record operations.
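The difference between the two execution modes can be pictured with a small plain-Java simulation. It does not use the Aerospike client or model the server's internals; the class name, method names, and string "operations" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative simulation only: models the two ways a node can run a sub-batch.
final class InlineSimulation {

    // Inline: one thread runs every operation in order and stays busy until the sub-batch ends.
    static List<String> runInline(List<String> operations) {
        List<String> completed = new ArrayList<>();
        for (String op : operations) {
            completed.add(op);
        }
        return completed;
    }

    // Not inline: operations are handed to several worker threads, so completion order may vary.
    static List<String> runSplit(List<String> operations, int workers) {
        List<String> completed = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String op : operations) {
            pool.execute(() -> completed.add(op));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> subBatch = List.of("read K1", "write K2", "read K3", "delete K4");
        System.out.println("inline: " + runInline(subBatch));
        System.out.println("split:  " + runSplit(subBatch, 4));
    }
}
```

The inline path always reports operations in their submitted order, while the split path completes all of them but makes no promise about order.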
Filtering batch operations
You can attach a filter expression to any batch operation. The server applies the filter to each record in the batch to determine whether the operation should proceed.
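The per-record decision can be illustrated with a plain-Java simulation. It stands in for a server-side filter expression with a `Predicate`, and the `state` bin and sample values are assumptions chosen for illustration. It does not use the Aerospike expression API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative simulation only: shows a filter deciding, per record, whether an operation runs.
final class BatchFilterSimulation {

    // Returns the keys whose records passed the filter; the operation is skipped for the rest.
    static List<String> applyWithFilter(Map<String, Map<String, Object>> records,
                                        Predicate<Map<String, Object>> filter) {
        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : records.entrySet()) {
            if (filter.test(entry.getValue())) {
                processed.add(entry.getKey()); // the operation would proceed for this record
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Hypothetical records keyed by id, each with a single "state" bin.
        Map<String, Map<String, Object>> records = new LinkedHashMap<>();
        records.put("5002", Map.of("state", "WA"));
        records.put("5003", Map.of("state", "CA"));
        records.put("5004", Map.of("state", "NY"));
        Predicate<Map<String, Object>> inCalifornia = bins -> "CA".equals(bins.get("state"));
        System.out.println("operation applied to: " + applyWithFilter(records, inCalifornia));
    }
}
```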
Batch reads
- Batch get: Reads multiple records, or optionally a selection of bins from those records.
- Batch exists: Checks record metadata to verify whether the specified keys exist.
- Batch getHeader: Reads record metadata (expiration/TTL and generation) only. It does not return bin data.
- Batch operate: Executes a transaction of read operations against multiple records.
Batch writes
- Aerospike release 6.0 introduces batch writes in addition to batch reads. This includes updates, deletes, UDFs, and multi-operation transactions (operate) without limits on write operations.
- Batch writes allow write operations against any keys.
- Batch writes can process large numbers of updates in a single operation using fewer connections.
Multiple batch operations to the same key in a batch
- Unless you inline the batch request so that it is serviced by a single service thread, multiple batch operations on the same key, such as [K1, K3...K1, K5....K1, K2], are distributed to different service threads.
- Neither the client library nor the server can infer the intent of your operations on K1 and consolidate them into one batch transaction on K1.
- Each operation on K1 can potentially proceed out of order when distributed to different service threads.
To avoid this situation, either consolidate the operations for the same key into one key entry in the batch, or inline the batch request.
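One way to do the consolidation on the application side is to merge repeated keys before building the batch. The sketch below uses plain Java with hypothetical `KeyOp` entries (a key name plus a text description of the operation). It shows only the merging step and assumes nothing about how the resulting lists are handed to the client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: merges repeated keys so each key appears once in the batch.
final class ConsolidateByKey {

    // One requested operation against one key; both fields are plain strings for illustration.
    record KeyOp(String key, String operation) {}

    // Collects operations per key, keeping first-seen key order and per-key operation order.
    static Map<String, List<String>> consolidate(List<KeyOp> requested) {
        Map<String, List<String>> perKey = new LinkedHashMap<>();
        for (KeyOp op : requested) {
            perKey.computeIfAbsent(op.key(), k -> new ArrayList<>()).add(op.operation());
        }
        return perKey;
    }

    public static void main(String[] args) {
        List<KeyOp> requested = List.of(
            new KeyOp("K1", "increment views"),
            new KeyOp("K3", "read"),
            new KeyOp("K1", "append tag"),
            new KeyOp("K5", "read"),
            new KeyOp("K1", "read"));
        consolidate(requested).forEach((key, ops) ->
            System.out.printf("%s -> %s%n", key, ops));
    }
}
```

Each consolidated list would then become a single multi-operation entry for its key, so the operations on K1 run together in the order they were requested.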
Code examples of batch operations
The following examples use the Java client to perform read, write, delete, exist, and mixed read/write operations on a batch. The syntax differs per Aerospike client, based on the needs of the operating system.
Setup
AerospikeClient client = new AerospikeClient("localhost", 3000);
// Aerospike namespace, set, and key_ids to be used for the Aerospike keys
String namespace = "sandbox";
String set = "ufo";
String binName = "sighting";
List<Integer> keyIds = IntStream.range(5002, 5009).boxed().toList();
Key[] keys = keyIds.stream().map(id -> new Key(namespace, set, id)).toArray(Key[]::new);
String sightings =
"[{\"sighting\":{\"occurred\":20200912,\"reported\":20200916,\"posted\":20201105,\"report\":{\"city\":\"Kirkland\",\"duration\":\"~30 minutes\",\"shape\":[\"circle\"],\"state\":\"WA\",\"summary\":\"4 rotating orange lights in the Kingsgate area above the Safeway. Around 9pm the power went out in the Kingsgate area. Four lights were spotted rotating above the local Safeway and surrounding streets. They were rotating fast but staying relatively in the same spots. Also described as orange lights. About thirty minutes later they disappeared. The second they disappeared the power was restored. Later a station of police from Woodinville and Kirkland came to guard the street where it happened. They wouldn't let anyone go past the street, putting out search lights and flare signals so people couldn't drive past Safeway. The police also would not let people walk past to go home.\"},\"location\":\"\\\"{\\\"type\\\":\\\"Point\\\",\\\"coordinates\\\":[-122.1966441,47.69328259]}\\\"\"}},\n" +
"{\"sighting\":{\"occurred\":20200322,\"reported\":20200322,\"posted\":20200515,\"report\":{\"city\":\"Pismo Beach\",\"duration\":\"5 minutes\",\"shape\":[\"light\"],\"state\":\"CA\",\"summary\":\"About 20 solid, bright lights moving at the same altitude, heading and speed. Spaced perfectly apart flying over the ocean headed south.\"},\"location\":\"\\\"{\\\"type\\\":\\\"Point\\\",\\\"coordinates\\\":[-120.6595,35.1546]}\\\"\"}},\n" +
"{\"sighting\":{\"occurred\":20200530,\"reported\":20200531,\"posted\":20200625,\"report\":{\"city\":\"New York Staten Island\",\"duration\":\"2 minutes\",\"shape\":[\"disk\"],\"state\":\"NY\",\"summary\":\"Round shaped object observed over Staten Island NYC, while sitting in my back yard. My daughter also observed this object . Bright White shaped object moving fast from East to West . Observed over Graniteville, Staten Island towards the Elizabeth NJ area and appears to be fast. We then lost view of it due to the clouds.\"}}},\n" +
"{\"sighting\":{\"occurred\":20200402,\"reported\":20200403,\"posted\":20200625,\"report\":{\"city\":\"Phoenix\",\"duration\":\"2+ hours\",\"shape\":[\"sphere\"],\"state\":\"AZ\",\"summary\":\"Pulsating, sphericalm multi ringed, stationary object that I viewed for 2 plus hours. Looking up I saw a extremely bright star in the West (slightly North) sky. I watched it for about 10 minutes and kept my eye on the three faint stars (2 above and 1 to the right) to check if it was moving, thinking it could be a possible plane, although I knew that the constant light and brightness wouldn't be a plane. I decided to take a video with commentary to post on my FB page to friends. That is the first video. The second and third are video after I honed in on it and finally enlarged the sphere to the phones maximum capacity. These video's are the result. It was approximately two hours of observation without the object moving at all (I kept the reference stars that were near in sight). I stepped into the house for about a 1/2 hour and when I went back out, it was gone. While I was watching there were 2 different planes in the vicinity flying by. The light was extremely bright and never changed throughout my observation. To the necked eye it looked li! ke a very big star with no pulsing at all. The pulsing (and some of the people that have viewed the video said it looked like it was rotating also), was not apparent until it was enlarged.\"},\"location\":\"\\\"{\\\"type\\\":\\\"Point\\\",\\\"coordinates\\\":[-112.04946,33.53538055]}\\\"\"}},\n" +
"{\"sighting\":{\"occurred\":20200620,\"reported\":20200711,\"posted\":20200723,\"report\":{\"city\":\"Stevensville\",\"duration\":\"5 minutes+\",\"shape\":[\"circle\"],\"state\":\"PA\",\"summary\":\"light appearing from same "spot",traveling a str8 line across the sky,then dissappear into another "spot" of the sk small round objects appearing in the same spot of sky,one at a time and in the same exact spot.one would appear in this \\\"spot\\\" in the sky,then travel from left to right across the sky in a slow straight line.then completely dissappear into a similar \\\"spot\\\" in what appeared to be about a 10 mile length across the sky. clear summer night sky could see the stars perfect.3 other friends witnessed with me the same sightings. too far to pick up on cell phone footage.\"}}},\n" +
"{\"sighting\":{\"occurred\":20200819,\"reported\":20200820,\"posted\":20200827,\"report\":{\"city\":\"Abiquiu\",\"duration\":\"3 hours\",\"shape\":[\"light\"],\"state\":\"NM\",\"summary\":\"2 witnesses spotted several orange and white lights bobbing and flying erratically in the sky. Specifically 2 orange flying objects that appeared slightly larger than the stars around it were clearly not stationary stars as they were moving in all directions and return to a centra area. Cascading from the orange objects there were several smaller formations that seemed to be ejecting from the primary orange object.\"}}},\n" +
"{\"sighting\":{\"occurred\":20200418,\"reported\":20200418,\"posted\":20200625,\"report\":{\"city\":\"Cassopolis\",\"duration\":\"2 hours\",\"shape\":[\"light\"],\"state\":\"MI\",\"summary\":\"Bright object with two lights hovering above tree line One red and one white light together above the tree line with a bright center larger than a star but smaller than the moon. Every time I tried to take a picture the lights would go out and then come back on. I even tried to take a video but it moved and the lights went out. It just stayed there hovering for a very long time otherwise. It seemed as if it was far away and at least the size of a plane.\"},\"location\":\"\\\"{\\\"type\\\":\\\"Point\\\",\\\"coordinates\\\":[-86.0015,41.898]}\\\"\"}}]";
// Convert string to Java Map
List<Map<String, Object>> sightingMaps = new Gson().fromJson(sightings, new TypeToken<ArrayList<HashMap<String, Object>>>(){}.getType());
// Write records
for (int i = 0; i < keyIds.size(); i++) {
client.put(null, keys[i], new Bin(binName, Value.get(sightingMaps.get(i))));
}
// Batch read records
Record[] records = client.get(null, keys, binName);
for (Record record : records) {
System.out.printf("Record: %s%n", new Gson().toJson(record.getValue(binName)));
}
Batch read records
// Batch read records
Record[] records = client.get(null, keys, binName);
for (Record record : records) {
System.out.printf("Record: %s%n", new Gson().toJson(record.getValue(binName)));
}
Batch read operation
// Batch read operation
Record[] batchRecords = client.get(null, keys,
MapOperation.size(binName),
MapOperation.getByIndex(binName, -1, MapReturnType.VALUE)
);
// Display batch read records
for (int i = 0; i < batchRecords.length; i++) {
Record batchRecord = batchRecords[i];
// Two operations ran against the same bin, so the results come back as a list: [size, value]
List<?> results = batchRecord.getList(binName);
Map<?, ?> recordMap = (Map<?, ?>) results.get(1);
final int index = i;
recordMap.entrySet().stream().forEach(entry ->
System.out.printf("Batch record key id: %s %nrecord key: %s %nvalue: %s%n",
keyIds.get(index).intValue(), entry.getKey(), new Gson().toJson(entry.getValue())));
}
Batch read/write operations
// Execute batch read/write operation
BatchResults batchRecordsWithOperations = client.operate(null, null, keys,
MapOperation.putItems(MapPolicy.Default, binName, Map.of(Value.get("zip"), Value.get(92011)), CTX.mapKey(Value.get(binName)), CTX.mapKey(Value.get("report"))),
MapOperation.size(binName),
MapOperation.getByIndex(binName, -1, MapReturnType.VALUE)
);
// Display batch read/write operation results
for (int i = 0; i < batchRecordsWithOperations.records.length; i++) {
BatchRecord batchRecord = batchRecordsWithOperations.records[i];
Record record = batchRecord.record;
if (record != null) {
List<?> results = record.getList(binName);
long size = (long) results.get(1);
Map<?, ?> recordMap = (Map<?, ?>) results.get(2);
final int index = i;
recordMap.entrySet().stream().forEach(entry ->
System.out.printf("Batch record key id: %s %nrecord key: %s %nvalue: %s %nsize: %d%n",
keyIds.get(index).intValue(), entry.getKey(), new Gson().toJson(entry.getValue()), size));
}
}
Batch exists operation
// Execute batch exists operation
boolean[] keyExists = client.exists(null, keys);
// Display exists operation results
for (int i = 0; i < keyExists.length; i++) {
Key key = keys[i];
boolean exists = keyExists[i];
System.out.printf("Record with key: %s, in namespace: %s and set: %s exists: %s%n", key.userKey, key.namespace, key.setName, exists);
}
Batch delete operation
// Execute batch delete operation
BatchResults deleteResults = client.delete(null, null, keys);
// Display delete operation results
for (BatchRecord batchRecord : deleteResults.records) {
Key key = batchRecord.key;
String result = ResultCode.getResultString(batchRecord.resultCode);
System.out.printf("Record with key: %s, in namespace: %s and set: %s delete result: %s%n", key.userKey, key.namespace, key.setName, result);
}
Batch read/write mixed operation
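// Bin names used in this example (inferred from the comments below; indexes 5 and 6 are written by the expressions)
String[] binNames = {"sighting", "occurred", "reported", "posted", "report", "timeToPosted", "timeToReport"};
// Write records in the format
// 'occurred', 'reported', 'posted', 'report'
for (int i = 0; i < keyIds.size(); i++) {
Map<String, Object> sighting = (Map<String, Object>) sightingMaps.get(i).get(binNames[0]);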
double occured = (double)sighting.get(binNames[1]);
double reported = (double)sighting.get(binNames[2]);
double posted = (double)sighting.get(binNames[3]);
Map<String, Object> report = (Map<String, Object>)sighting.get(binNames[4]);
Bin[] bins = {
new Bin(binNames[1], Value.get(occured)),
new Bin(binNames[2], Value.get(reported)),
new Bin(binNames[3], Value.get(posted)),
new Bin(binNames[4], Value.get(report))
};
client.put(null, keys[i], bins);
}
// Batch read records
Record[] records = client.get(null, keys, binNames);
for (Record record : records) {
double occurred = (double)record.getValue(binNames[1]);
double reported = (double) record.getValue(binNames[2]);
double posted = (double) record.getValue(binNames[3]);
Map<String, Object> report = (Map<String, Object>) record.getValue(binNames[4]);
System.out.printf("occurred: %.0f, reported: %.0f, posted: %.0f, report: %s%n", occurred, reported, posted, report);
}
// Expressions to create bins for `timeToReport` and `timeToPosted`
Expression timeToReportWriteExp = Exp.build(Exp.sub(Exp.floatBin(binNames[2]), Exp.floatBin(binNames[1])));
Expression timeToPostedReadExp = Exp.build(Exp.sub(Exp.floatBin(binNames[3]), Exp.floatBin(binNames[1])));
// Operation to write to bin `timeToReport` and read from the bin `timeToReport`
Operation[] timeToReportOperation = Operation.array(
ExpOperation.write(binNames[6], timeToReportWriteExp, ExpWriteFlags.DEFAULT),
Operation.get(binNames[6])
);
// Operation to write to bin `timeToPosted` and read from the bin `timeToPosted`
Operation[] timeToPosted = Operation.array(
ExpOperation.write(binNames[5], timeToPostedReadExp, ExpWriteFlags.DEFAULT),
Operation.get(binNames[5])
);
// Operation to read sighting duration from Map using MapOperation
Operation[] duration = Operation.array(
MapOperation.getByKey(binNames[4], Value.get("duration"), MapReturnType.VALUE)
);
// Adding operations to batch operations to be executed
List<BatchRecord> batchOperations = new ArrayList<>();
for (Key key : keys) {
batchOperations.add(new BatchWrite(key, timeToReportOperation));
batchOperations.add(new BatchWrite(key, timeToPosted));
batchOperations.add(new BatchRead(key, duration));
}
// Executing batched operations
boolean batchResult = client.operate(null, batchOperations);
// Will be true if all operations/sub-commands are successful
if (batchResult) {
for (BatchRecord entry : batchOperations) {
Record record = entry.record;
if (entry.resultCode != ResultCode.OK) {
System.out.printf("Should not be able to get to this line since we are checking `batchResult` is true%n");
break;
}
// Show values for `timeToReport`, `timeToPosted`, and `duration`
if (record != null && record.bins != null) {
if (record.bins.containsKey(binNames[6])) {
List<?> timeToReportValue = record.getList(binNames[6]).stream().filter(Objects::nonNull).toList();
if (!timeToReportValue.isEmpty()) {
System.out.printf("Sighting time to report: %.0f%n", timeToReportValue.getFirst());
}
}
else if (record.bins.containsKey(binNames[5])) {
List<?> timeToPostedValue = record.getList(binNames[5]).stream().filter(Objects::nonNull).toList();
if (!timeToPostedValue.isEmpty()) {
System.out.printf("Sighting time to posted: %.0f%n", timeToPostedValue.getFirst());
}
}
else if (record.bins.containsKey(binNames[4])) {
String durationValue = (String) record.getValue(binNames[4]);
System.out.printf("Sighting duration: %s%n", durationValue);
}
}
}
}
Log examples
Batch information appears in the batch-sub ticker log lines, which report statistics on operations, including batches. The following examples are for a namespace called test.
Ticker log line changes:
Example:
{test} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)
When the cluster size changes, you might also see proxy events included in a batch.
Example:
{test} from-proxy-batch-sub: tsvc (0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)
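For monitoring, the counter groups in these ticker lines can be pulled apart with a regular expression. The following plain-Java sketch extracts each named group and its comma-separated values as they appear in the examples above. It deliberately does not assign meanings to the individual positions, since this page does not define them. The class name is an illustrative assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: extracts the counter groups from a batch-sub ticker line.
final class BatchSubTickerParser {

    // Matches "name (n,n,...)" groups such as "read (959,0,0,51,1)".
    private static final Pattern GROUP = Pattern.compile("([a-z]+) \\(([0-9,]+)\\)");

    // Returns each group name mapped to its values, in the order they appear in the line.
    static Map<String, List<Long>> parse(String line) {
        Map<String, List<Long>> counters = new LinkedHashMap<>();
        Matcher matcher = GROUP.matcher(line);
        while (matcher.find()) {
            List<Long> values = new ArrayList<>();
            for (String part : matcher.group(2).split(",")) {
                values.add(Long.parseLong(part));
            }
            counters.put(matcher.group(1), values);
        }
        return counters;
    }

    public static void main(String[] args) {
        String line = "{test} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) "
            + "write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)";
        parse(line).forEach((name, values) -> System.out.printf("%s -> %s%n", name, values));
    }
}
```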
Batch specific errors
| Value | Error | Description |
|---|---|---|
| 150 | AS_ERR_BATCH_DISABLED | Batch functionality has been disabled by configuring batch-index-threads to 0. |
| 152 | AS_ERR_BATCH_QUEUES_FULL | All batch queues are full. Controlled by the batch-max-buffers-per-queue configuration parameter. |
Refer to Error Codes.
Known Issues or Limitations
Batch writes were not supported prior to Database 6.0.
Required Client Versions
For full compatibility, the following clients are required:
- Java client 6.0.0 or later
- Go client 6.0.0 or later
- C client 6.0.0 or later
- C# client 5.0.0 or later
Client References
Refer to these topics for language-specific code examples: