Data model

Learn how Aerospike organizes data and how the Developer SDK maps these concepts to intuitive APIs.

Core concepts

Aerospike uses a hierarchical data model:

Cluster
└── Namespace (like a database)
    └── Set (like a table)
        └── Record (like a row)
            └── Bins (like columns)

| Concept | Analogous To | Description |
| --- | --- | --- |
| Namespace | Database | Top-level container, defines storage and replication |
| Set | Table | Logical grouping of records within a namespace |
| Record | Row | Individual data entry identified by a key |
| Bin | Column | Named field within a record |
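
One way to picture the hierarchy above is as nested maps, where a bin value is addressed by namespace, set, record key, and bin name. The sketch below is only an in-memory illustration in plain Java: `HierarchySketch`, `putBin`, and `getBin` are hypothetical names, not part of the Developer SDK, and this is not how the server stores data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory picture of the hierarchy; not SDK or server code.
final class HierarchySketch {
    // namespace -> set -> record key -> bin name -> value
    private final Map<String, Map<String, Map<String, Map<String, Object>>>> namespaces =
        new HashMap<>();

    // Writes one bin, creating the namespace, set, and record entries on first use.
    void putBin(String namespace, String set, String key, String bin, Object value) {
        namespaces
            .computeIfAbsent(namespace, n -> new HashMap<>())
            .computeIfAbsent(set, s -> new HashMap<>())
            .computeIfAbsent(key, k -> new HashMap<>())
            .put(bin, value);
    }

    // Reads one bin, or returns null if any level of the path is missing.
    Object getBin(String namespace, String set, String key, String bin) {
        return namespaces
            .getOrDefault(namespace, Map.of())
            .getOrDefault(set, Map.of())
            .getOrDefault(key, Map.of())
            .get(bin);
    }
}
```

Calling `putBin("app", "users", "user-1", "name", "Alice")` and then `getBin` with the same path returns the stored value, while any other path returns `null`.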

Namespaces

A namespace is the top-level data container in Aerospike. Each namespace has its own:

  • Storage configuration (memory, SSD, or hybrid)
  • Replication factor
  • TTL (time-to-live) defaults
  • Data retention policies

Common namespace patterns:

| Namespace | Typical Use |
| --- | --- |
| test | Development and testing |
| production | Production application data |
| cache | Ephemeral, memory-only data |
| archive | Long-term storage on SSD |

Sets

A set is a logical grouping of records within a namespace—similar to a table in relational databases. Unlike tables, sets:

  • Don’t require a schema definition
  • Can be created implicitly when you write your first record
  • Have no enforced structure—each record can have different bins
// Records in different sets within the same namespace
DataSet users = DataSet.of("app", "users");
DataSet orders = DataSet.of("app", "orders");
DataSet sessions = DataSet.of("app", "sessions");

Records

A record is a single data entry, identified by a unique key within its set. Each record contains:

  • Key: The unique identifier you provide
  • Digest: A 20-byte RIPEMD-160 hash of the set name and key (what Aerospike actually stores)
  • Bins: The data fields
  • Metadata: Generation count, TTL, last update time

Keys and digests

When you create a record with a key like "user-123", Aerospike:

  1. Hashes your key into a 20-byte digest
  2. Optionally stores the original key alongside the digest (controlled by the send_key setting on the active Behavior; the Java default is digest only, the Python default keeps the user key)
  3. Uses the digest for all lookups

// Your key: "user-123"
// Aerospike stores: digest (20-byte hash)
session.insert(users)
    .bins("name")
    .id("user-123").values("Alice")
    .execute();

// To also store the original key for later retrieval:
session.insert(users)
    .bins("name")
    .id("user-123").values("Alice")
    .sendKey() // Now Aerospike stores both digest AND "user-123"
    .execute();
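
To make the hashing step concrete, here is a minimal plain-Java sketch of turning a set name and user key into a fixed 20-byte value. Aerospike's own digest is RIPEMD-160, which the Java standard library does not provide, so SHA-1 is swapped in purely because it also produces 20 bytes. `DigestSketch` and `digest` are hypothetical names, and the bytes produced here will not match a real Aerospike digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration; not part of the Developer SDK.
final class DigestSketch {
    // Hashes set name + user key into a fixed 20-byte value.
    // SHA-1 is a stand-in here; a real Aerospike digest uses RIPEMD-160.
    static byte[] digest(String setName, String userKey) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(setName.getBytes(StandardCharsets.UTF_8));
            md.update(userKey.getBytes(StandardCharsets.UTF_8));
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on standard Java platforms.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] d = digest("users", "user-123");
        System.out.println("digest length: " + d.length);
    }
}
```

The point of the sketch is that the output length is fixed regardless of key length, and the same inputs always give the same digest, which is what lets lookups work from the digest alone.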

Key types supported:

| Type | Example | Notes |
| --- | --- | --- |
| String | "user-123" | Most common, up to 1KB |
| Integer | 123456 | 64-bit signed integer |
| Bytes | byte[] / bytes | Raw binary data |

Generation count

Every record has a generation number that increments on each update. Use it for optimistic concurrency control:

import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordStream;

// Read current generation
Record record;
try (RecordStream readStream = session.query(users.id("user-1")).execute()) {
    record = readStream.getFirstRecord();
}
int currentGen = record.getGeneration();

// Update only if generation matches (optimistic locking)
session.update(users.id("user-1"))
    .bin("balance").setTo(newBalance)
    .ensureGenerationIs(currentGen) // Fails if record was modified
    .execute();
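
The generation check can be modeled without a database. The sketch below is a hypothetical in-memory stand-in (`GenerationSketch` and its methods are illustrative names, not SDK APIs) showing why a write based on a stale generation is rejected. It assumes a new record starts at generation 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a generation check; not the SDK or the server.
final class GenerationSketch {
    // One stored value plus its generation counter.
    static final class Versioned {
        final long balance;
        final int generation;

        Versioned(long balance, int generation) {
            this.balance = balance;
            this.generation = generation;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();

    // Creates the record at generation 1 (an assumption of this model).
    synchronized void insert(String key, long balance) {
        store.put(key, new Versioned(balance, 1));
    }

    synchronized Versioned read(String key) {
        return store.get(key);
    }

    // Applies the write only if the caller saw the latest generation.
    synchronized boolean updateIfGeneration(String key, long newBalance, int expectedGen) {
        Versioned current = store.get(key);
        if (current == null || current.generation != expectedGen) {
            return false; // another writer got there first: re-read and retry
        }
        store.put(key, new Versioned(newBalance, current.generation + 1));
        return true;
    }
}
```

Two writers that both read generation 1 cannot both succeed: the first update bumps the generation to 2, so the second sees a mismatch and must re-read before retrying.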

Bins

Bins are the named fields within a record—similar to columns, but with key differences:

  • Schema-free: Different records in the same set can have different bins
  • Typed per-value: The same bin name can hold different types in different records
  • Max 15-byte name: Bin names are limited to 15 bytes (keep them short)
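
As a rough illustration of the schema-free and typed-per-value points, the sketch below models two records from the same set as plain Java maps. `SchemaFreeSketch`, the sample values, and the `email` bin are hypothetical and exist only for this example; this is not SDK code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plain-Java model of two records in one set; not SDK code.
final class SchemaFreeSketch {
    static Map<String, Object> alice() {
        Map<String, Object> bins = new LinkedHashMap<>();
        bins.put("name", "Alice");
        bins.put("age", 28L); // integer-valued bin
        return bins;
    }

    static Map<String, Object> bob() {
        Map<String, Object> bins = new LinkedHashMap<>();
        bins.put("name", "Bob");
        bins.put("age", "unknown");           // same bin name, string value
        bins.put("email", "bob@example.com"); // a bin the other record lacks
        return bins;
    }

    // Reads "age" defensively, since its type can differ between records.
    static String describeAge(Map<String, Object> bins) {
        Object age = bins.get("age");
        if (age instanceof Long) {
            return "integer:" + age;
        }
        if (age instanceof String) {
            return "string:" + age;
        }
        return "missing";
    }
}
```

Because nothing enforces a shape, application code that reads a bin shared across records usually needs a type check like the one in `describeAge`.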

Supported data types

| Type | Java | Python | Notes |
| --- | --- | --- | --- |
| String | String | str | UTF-8, up to 128KB |
| Integer | long | int | 64-bit signed |
| Double | double | float | 64-bit IEEE 754 |
| Boolean | boolean | bool | Stored as integer (0/1) |
| Bytes | byte[] | bytes | Raw binary, up to 128KB |
| List | List<?> | list | Ordered, mixed types |
| Map | Map<?, ?> | dict | Key-value pairs |
| GeoJSON | GeoJSON | GeoJSON | Geographic data |
| Null | null | None | Removes the bin |

Working with bins

import java.util.List;
import java.util.Map;
import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordStream;

// Different data types
session.insert(users)
    .bins("name", "age", "balance", "verified", "tags", "preferences")
    .id("user-1").values(
        "Alice",
        28,
        150.50,
        true,
        List.of("premium", "active"),
        Map.of(
            "theme", "dark",
            "notifications", true
        ))
    .execute();

// Reading typed values
Record record;
try (RecordStream stream = session.query(users.id("user-1")).execute()) {
    record = stream.getFirstRecord();
}
String name = record.getString("name");
long age = record.getLong("age");
double balance = record.getDouble("balance");
boolean verified = record.getBoolean("verified");
List<String> tags = record.getList("tags");
Map<String, Object> prefs = record.getMap("preferences");

Nested data

Lists and maps can contain other lists and maps, enabling complex document structures:

session.insert(users)
    .bins("profile")
    .id("user-1").values(Map.of(
        "name", "Alice Smith",
        "addresses", List.of(
            Map.of("type", "home", "city", "San Francisco"),
            Map.of("type", "work", "city", "Palo Alto")
        ),
        "scores", Map.of(
            "math", List.of(95, 87, 92),
            "science", List.of(88, 91, 89)
        )
    ))
    .execute();
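
Once a nested structure like this is in application memory, walking it is ordinary collection code. The sketch below works on plain `java.util` collections with the same shape as the `profile` bin above. `NestedDataSketch`, `cityFor`, and `average` are hypothetical helpers, not SDK methods, and values read back from the database may arrive with different numeric types (for example `Long` rather than `Integer`).

```java
import java.util.List;
import java.util.Map;

// Hypothetical helpers over plain java.util collections; not SDK code.
final class NestedDataSketch {
    // Same shape as the "profile" bin in the insert example.
    static Map<String, Object> profile() {
        return Map.of(
            "name", "Alice Smith",
            "addresses", List.of(
                Map.of("type", "home", "city", "San Francisco"),
                Map.of("type", "work", "city", "Palo Alto")),
            "scores", Map.of(
                "math", List.of(95, 87, 92),
                "science", List.of(88, 91, 89)));
    }

    // Returns the city of the first address with the given type, or null.
    @SuppressWarnings("unchecked")
    static String cityFor(Map<String, Object> profile, String type) {
        List<Map<String, String>> addresses =
            (List<Map<String, String>>) profile.get("addresses");
        for (Map<String, String> address : addresses) {
            if (type.equals(address.get("type"))) {
                return address.get("city");
            }
        }
        return null;
    }

    // Averages one subject's scores from the nested "scores" map.
    @SuppressWarnings("unchecked")
    static double average(Map<String, Object> profile, String subject) {
        Map<String, List<Integer>> scores =
            (Map<String, List<Integer>>) profile.get("scores");
        return scores.get(subject).stream()
            .mapToInt(Integer::intValue)
            .average()
            .orElse(0.0);
    }
}
```

The unchecked casts are the price of storing heterogeneous values under `Object`; in real code it is safer to validate the shape before casting.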

DataSet: the SDK abstraction

The DataSet class is the Developer SDK’s way of representing a namespace + set combination. It provides a clean API for identifying where records live:

// Create a DataSet reference
DataSet users = DataSet.of("app", "users");
// Use it to identify records
RecordId userId = users.id("user-123");
// All operations use the same pattern
session.insert(users).bins("name").id("user-123").values("Alice").execute();
session.query(userId).execute().close();
session.delete(userId).execute().close();
session.query(users).where("$.age > 21").execute().close();

DataSet vs RecordId

| Class | Represents | Used For |
| --- | --- | --- |
| DataSet | Namespace + Set | Queries, scans, set-level operations |
| RecordId | Namespace + Set + Key | Single-record operations (get, insert, update, delete) |

DataSet users = DataSet.of("app", "users"); // Set reference
RecordId alice = users.id("alice"); // Record reference
// Query the whole set (close the returned stream when done)
session.query(users).where("$.active == true").execute().close();
// Operate on a specific record
session.query(alice).execute().close();

Data modeling best practices

1. Design for access patterns

Unlike relational databases, Aerospike works best when you model data for how you’ll read it:

❌ Relational approach:
   users table + orders table + JOIN
✅ Aerospike approach:
   users set (with embedded recent_orders list)
   orders set (with denormalized user_name)
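
An embedded `recent_orders` list is usually kept bounded so the user record does not grow without limit. The sketch below shows one way to maintain such a list in application code before writing it back. `RecentOrdersSketch`, `prepend`, and the cap value are hypothetical choices for illustration, not SDK features.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a capped, newest-first list; not SDK code.
final class RecentOrdersSketch {
    // Returns a new list with orderId first, trimmed to at most maxSize entries.
    static List<String> prepend(List<String> recentOrders, String orderId, int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        List<String> updated = new ArrayList<>(maxSize);
        updated.add(orderId);
        for (String existing : recentOrders) {
            if (updated.size() >= maxSize) {
                break; // the oldest entries fall off the end
            }
            updated.add(existing);
        }
        return updated;
    }
}
```

With a cap of 2, prepending `"o-3"` to `["o-2", "o-1"]` yields `["o-3", "o-2"]`. The full order history still lives in the orders set; the embedded list only serves the common "show recent orders" read without a second lookup.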

2. Use sets for entity types

// Good: Different sets for different entities
DataSet users = DataSet.of("app", "users");
DataSet orders = DataSet.of("app", "orders");
DataSet products = DataSet.of("app", "products");

3. Keep bin names short

Bin names are stored with every record. Use concise names:

❌ "user_email_address" → 18 bytes per record
✅ "email" → 5 bytes per record
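
The arithmetic behind this advice is easy to check. The sketch below counts the UTF-8 bytes in each name and multiplies the difference by a record count, taking the statement above (names stored with every record) at face value. `BinNameCostSketch` and its methods are hypothetical names, and the record count is an arbitrary example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical back-of-the-envelope helper; not SDK code.
final class BinNameCostSketch {
    // Size of a bin name in bytes when encoded as UTF-8.
    static int nameBytes(String binName) {
        return binName.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes saved across recordCount records by using the shorter name.
    static long bytesSaved(String longName, String shortName, long recordCount) {
        return (long) (nameBytes(longName) - nameBytes(shortName)) * recordCount;
    }
}
```

For one million records, `bytesSaved("user_email_address", "email", 1_000_000L)` works out to 13,000,000 bytes, since the names differ by 13 bytes each.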

4. Consider key design

Choose keys that distribute data evenly and support your access patterns:

| Pattern | Example Key | Use Case |
| --- | --- | --- |
| Natural ID | "user-12345" | When you have a business identifier |
| UUID | "550e8400-e29b..." | When you need uniqueness without coordination |
| Composite | "user-123:order-456" | When combining entities |
| Time-based | "events:2026-01-20" | For time-series data |
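
Composite and time-based keys are just strings built with a consistent format. The sketch below shows two small builders that produce keys in the shapes listed in the table. `KeySketch`, `composite`, and `dailyBucket` are hypothetical helper names, and the `:` separator simply follows the examples above.

```java
import java.time.LocalDate;

// Hypothetical key-building helpers; not SDK code.
final class KeySketch {
    // Joins two entity identifiers, e.g. "user-123:order-456".
    static String composite(String userId, String orderId) {
        return userId + ":" + orderId;
    }

    // Builds a per-day key, e.g. "events:2026-01-20".
    // LocalDate.toString() uses the ISO-8601 yyyy-MM-dd form.
    static String dailyBucket(String prefix, LocalDate day) {
        return prefix + ":" + day;
    }
}
```

Centralizing key construction in helpers like these keeps the format identical between writers and readers, which matters because a lookup only succeeds on an exact key match.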

Next steps

Behaviors

Configure how operations execute—timeouts, retries, consistency.
