Data model

Learn how Aerospike organizes data and how the Developer SDK maps these concepts to intuitive APIs.

Core concepts

Aerospike uses a hierarchical data model:

Cluster
└── Namespace (like a database)
    └── Set (like a table)
        └── Record (like a row)
            └── Bins (like columns)

Concept	Analogous To	Description
Namespace	Database	Top-level container, defines storage and replication
Set	Table	Logical grouping of records within a namespace
Record	Row	Individual data entry identified by a key
Bin	Column	Named field within a record

Namespaces

A namespace is the top-level data container in Aerospike. Each namespace has its own:

Storage configuration (memory, SSD, or hybrid)
Replication factor
TTL (time-to-live) defaults
Data retention policies

Common namespace patterns:

Namespace	Typical Use
`test`	Development and testing
`production`	Production application data
`cache`	Ephemeral, memory-only data
`archive`	Long-term storage on SSD

Sets

A set is a logical grouping of records within a namespace—similar to a table in relational databases. Unlike tables, sets:

Don’t require a schema definition
Can be created implicitly when you write your first record
Have no enforced structure—each record can have different bins

Java

// Records in different sets within the same namespace
DataSet users = DataSet.of("app", "users");
DataSet orders = DataSet.of("app", "orders");
DataSet sessions = DataSet.of("app", "sessions");

Python

# Records in different sets within the same namespace
users = DataSet.of("app", "users")
orders = DataSet.of("app", "orders")
sessions = DataSet.of("app", "sessions")
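
To make the schema-free behavior of sets concrete, here is a minimal in-memory sketch. It is only an illustration of the namespace, set, record, bin hierarchy: the `store` dictionary and the `write` helper are hypothetical and are not part of the SDK or of how the server stores data.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the hierarchy:
# namespace -> set -> key -> bins. Not how the server stores data.
store = defaultdict(lambda: defaultdict(dict))


def write(namespace, set_name, key, bins):
    """Store bins under namespace/set/key; the set appears on first write."""
    store[namespace][set_name][key] = dict(bins)


write("app", "users", "user-1", {"name": "Alice", "age": 28})
write("app", "users", "user-2", {"name": "Bob", "email": "bob@example.com"})
write("app", "orders", "order-1", {"total": 42.5})

# Both sets now exist without any schema definition or explicit create step.
set_names = sorted(store["app"])

# Records in the same set carry different bins.
bins_user_1 = sorted(store["app"]["users"]["user-1"])
bins_user_2 = sorted(store["app"]["users"]["user-2"])
```

Here `set_names` contains both `orders` and `users`, and the two user records hold different bin names even though they live in the same set.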

Records

A record is a single data entry, identified by a unique key within its set. Each record contains:

Key: The unique identifier you provide
Digest: A 20-byte hash of the key (what Aerospike actually stores)
Bins: The data fields
Metadata: Generation count, TTL, last update time

Keys and digests

When you create a record with a key like "user-123", Aerospike:

Hashes your key into a 20-byte digest
Optionally stores the original key alongside the digest (controlled by the send_key setting on the active Behavior; the Java default is digest only, the Python default keeps the user key)
Uses the digest for all lookups

Java

// Your key: "user-123"
// Aerospike stores: digest (20-byte hash)
session.insert(users)
    .bins("name")
    .id("user-123").values("Alice")
    .execute();

// Alternatively, to also store the original key for later retrieval:
session.insert(users)
    .bins("name")
    .id("user-123").values("Alice")
    .sendKey()  // Now Aerospike stores both digest AND "user-123"
    .execute();

Python

from aerospike_sdk import Behavior
from aerospike_sdk.policy import Settings

# Whether to store the original user key alongside the digest is a Behavior
# setting (`send_key`), not a per-call builder method. Behavior.DEFAULT already
# enables send_key=True, so the original key is stored by default.

# Default session: original key is stored (digest + "user-123")
session_with_key = cluster.create_session(Behavior.DEFAULT)
await session_with_key.insert(key=users.id("user-123")).put({"name": "Alice"}).execute()

# Opt out: derive a behavior with send_key disabled (digest only)
digest_only = Behavior.DEFAULT.derive_with_changes(
    "DIGEST_ONLY",
    all=Settings(send_key=False),
)
session_digest_only = cluster.create_session(digest_only)
await session_digest_only.insert(key=users.id("user-123")).put(
    {"name": "Alice"}
).execute()
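
The key-to-digest step can be sketched in plain Python. This is an illustration only: SHA-1 is used here purely as a stand-in because it also produces 20 bytes, so the output will not match real Aerospike digests. The `sketch_digest` helper is hypothetical, and mixing the set name into the hash is an assumption that mirrors the statement that a key is unique within its set.

```python
import hashlib


def sketch_digest(set_name: str, user_key: str) -> bytes:
    """Return a 20-byte digest for a (set, key) pair.

    Assumption: SHA-1 is a stand-in chosen only because it yields 20 bytes.
    It is NOT the hash Aerospike uses, so these bytes will not match
    digests computed by a real client.
    """
    material = f"{set_name}\x00{user_key}".encode("utf-8")
    return hashlib.sha1(material).digest()


d1 = sketch_digest("users", "user-123")
d2 = sketch_digest("users", "user-123")
d3 = sketch_digest("orders", "user-123")
```

The same key always yields the same digest (`d1 == d2`), which is why lookups can rely on the digest alone, while the same key in a different set yields a different one (`d1 != d3`). Because a hash cannot be reversed, the original key is only recoverable later if it was stored alongside the digest.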

Key types supported:

Type	Example	Notes
String	`"user-123"`	Most common, up to 1KB
Integer	`12345`	64-bit signed integer
Bytes	`byte[]` / `bytes`	Raw binary data

Generation count

Every record has a generation number that increments on each update. Use it for optimistic concurrency control:

Java

import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordStream;

// Read current generation
Record record;
try (RecordStream readStream = session.query(users.id("user-1")).execute()) {
    record = readStream.getFirstRecord();
}
int currentGen = record.getGeneration();
double newBalance = record.getDouble("balance") + 100;  // recompute from current state

// Update only if generation matches (optimistic locking)
session.update(users.id("user-1"))
    .bin("balance").setTo(newBalance)
    .ensureGenerationIs(currentGen)  // Fails if record was modified
    .execute();

Python

# Read current generation
stream = await session.query(users.id("user-1")).execute()
row = await stream.first_or_raise()
record = row.record_or_raise()
stream.close()
current_gen = record.generation
new_balance = record.bins["balance"] + 100  # recompute from current state

# Update only if generation matches (optimistic locking)
await (
    session.update(users.id("user-1"))
    .bin("balance").set_to(new_balance)
    .ensure_generation_is(current_gen)
    .execute()
)
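
The generation check is easiest to see without a server. The following is a minimal in-memory sketch of the read, check, write cycle, including a retry loop for when the check fails. `GenerationMismatch`, `update_if_generation`, and `add_to_balance` are hypothetical names for illustration; the SDK raises its own error type when a generation check fails.

```python
class GenerationMismatch(Exception):
    """Raised when the stored generation differs from the expected one."""


# Hypothetical in-memory record: bins plus a generation counter.
record = {"bins": {"balance": 100}, "generation": 1}


def update_if_generation(rec, expected_gen, new_bins):
    """Apply new_bins only if nobody has written since expected_gen was read."""
    if rec["generation"] != expected_gen:
        raise GenerationMismatch(
            f"expected {expected_gen}, found {rec['generation']}"
        )
    rec["bins"].update(new_bins)
    rec["generation"] += 1  # every successful write bumps the generation


def add_to_balance(rec, amount, max_attempts=3):
    """Read-modify-write loop: re-read and retry when the check fails."""
    for _ in range(max_attempts):
        gen = rec["generation"]
        new_balance = rec["bins"]["balance"] + amount
        try:
            update_if_generation(rec, gen, {"balance": new_balance})
            return new_balance
        except GenerationMismatch:
            continue  # someone else wrote first; recompute from fresh state
    raise GenerationMismatch("gave up after retries")


stale_gen = record["generation"]  # reader A remembers generation 1
add_to_balance(record, 50)        # writer B updates first, generation becomes 2

try:
    update_if_generation(record, stale_gen, {"balance": 0})
    stale_write_rejected = False
except GenerationMismatch:
    stale_write_rejected = True
```

The stale write is rejected instead of silently overwriting writer B's update, so the balance stays at 150. In application code the usual response to a failed check is the loop in `add_to_balance`: re-read the record, recompute the new value, and try again.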

Bins

Bins are the named fields within a record—similar to columns, but with key differences:

Schema-free: Different records in the same set can have different bins
Typed per-value: The same bin name can hold different types in different records
Max 15-byte name: Bin names are limited to 15 bytes (keep them short)

Supported data types

Type	Java	Python	Notes
String	`String`	`str`	UTF-8, up to 128KB
Integer	`long`	`int`	64-bit signed
Double	`double`	`float`	64-bit IEEE 754
Boolean	`boolean`	`bool`	Stored as integer (0/1)
Bytes	`byte[]`	`bytes`	Raw binary, up to 128KB
List	`List<?>`	`list`	Ordered, mixed types
Map	`Map<?, ?>`	`dict`	Key-value pairs
GeoJSON	`GeoJSON`	`GeoJSON`	Geographic data
Null	`null`	`None`	Removes the bin

Java

import java.util.List;
import java.util.Map;

import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordStream;

// Different data types
session.insert(users)
    .bins("name", "age", "balance", "verified", "tags", "preferences")
    .id("user-1").values(
        "Alice",
        28,
        150.50,
        true,
        List.of("premium", "active"),
        Map.of(
            "theme", "dark",
            "notifications", true
        ))
    .execute();

// Reading typed values
Record record;
try (RecordStream stream = session.query(users.id("user-1")).execute()) {
    record = stream.getFirstRecord();
}
String name = record.getString("name");
long age = record.getLong("age");
double balance = record.getDouble("balance");
boolean verified = record.getBoolean("verified");
List<String> tags = record.getList("tags");
Map<String, Object> prefs = record.getMap("preferences");

Python

# Different data types
await session.insert(key=users.id("user-1")).put(
    {
        "name": "Alice",
        "age": 28,
        "balance": 150.50,
        "verified": True,
        "tags": ["premium", "active"],
        "preferences": {
            "theme": "dark",
            "notifications": True,
        },
    }
).execute()

# Reading typed values
stream = await session.query(users.id("user-1")).execute()
row = await stream.first_or_raise()
record = row.record_or_raise()
stream.close()
bins = record.bins
name = bins.get("name")
age = bins.get("age")
balance = bins.get("balance")
verified = bins.get("verified")
tags = bins.get("tags")
prefs = bins.get("preferences")
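
As a rough aid for reading the type table, the sketch below names the bin type each Python value would map to. `bin_type_name` is a hypothetical helper, not an SDK function. It follows the Python column of the table and leaves out GeoJSON, which needs the SDK's own wrapper class.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def bin_type_name(value):
    """Name the bin type from the table above that a Python value maps to."""
    if value is None:
        return "Null"
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return "Boolean"
    if isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            raise OverflowError("integer does not fit in a 64-bit signed value")
        return "Integer"
    if isinstance(value, float):
        return "Double"
    if isinstance(value, str):
        return "String"
    if isinstance(value, (bytes, bytearray)):
        return "Bytes"
    if isinstance(value, list):
        return "List"
    if isinstance(value, dict):
        return "Map"
    raise TypeError(f"no bin type for {type(value).__name__}")


record_bins = {
    "name": "Alice",
    "age": 28,
    "balance": 150.50,
    "verified": True,
    "tags": ["premium", "active"],
    "preferences": {"theme": "dark", "notifications": True},
    "nickname": None,
}
types = {name: bin_type_name(value) for name, value in record_bins.items()}
```

The order of the checks matters in Python: `True` is also an `int`, so testing for `bool` first keeps `verified` from being reported as an Integer.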

Nested data

Lists and maps can contain other lists and maps, enabling complex document structures:

Java

import java.util.List;
import java.util.Map;

session.insert(users)
    .bins("profile")
    .id("user-1").values(Map.of(
        "name", "Alice Smith",
        "addresses", List.of(
            Map.of("type", "home", "city", "San Francisco"),
            Map.of("type", "work", "city", "Palo Alto")
        ),
        "scores", Map.of(
            "math", List.of(95, 87, 92),
            "science", List.of(88, 91, 89)
        )
    ))
    .execute();

Python

await session.insert(key=users.id("user-1")).put(
    {
        "profile": {
            "name": "Alice Smith",
            "addresses": [
                {"type": "home", "city": "San Francisco"},
                {"type": "work", "city": "Palo Alto"},
            ],
            "scores": {
                "math": [95, 87, 92],
                "science": [88, 91, 89],
            },
        }
    }
).execute()
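
Once a nested bin has been read back, it is an ordinary Python structure of dicts and lists. The sketch below walks such a structure by path on the client side. `get_path` is a hypothetical helper for illustration. It does not use any server-side list or map operations.

```python
def get_path(value, path, default=None):
    """Walk nested dicts and lists by a sequence of map keys and list indexes."""
    for step in path:
        try:
            value = value[step]
        except (KeyError, IndexError, TypeError):
            return default  # missing key, index out of range, or wrong type
    return value


# Same shape as the "profile" bin written above.
profile = {
    "name": "Alice Smith",
    "addresses": [
        {"type": "home", "city": "San Francisco"},
        {"type": "work", "city": "Palo Alto"},
    ],
    "scores": {
        "math": [95, 87, 92],
        "science": [88, 91, 89],
    },
}

work_city = get_path(profile, ["addresses", 1, "city"])
first_math = get_path(profile, ["scores", "math", 0])
missing = get_path(profile, ["scores", "history", 0])
```

Returning a default for a missing step is a design choice that suits schema-free records, where one record may have a nested field that another lacks.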

DataSet: the SDK abstraction

The DataSet class is the Developer SDK’s way of representing a namespace + set combination. It provides a clean API for identifying where records live:

Java

// Create a DataSet reference
DataSet users = DataSet.of("app", "users");

// Use it to identify records
RecordId userId = users.id("user-123");

// All operations use the same pattern
session.insert(users).bins("name").id("user-123").values("Alice").execute();
session.query(userId).execute().close();
session.delete(userId).execute().close();
session.query(users).where("$.age > 21").execute().close();

Python

# Create a DataSet reference
users = DataSet.of("app", "users")

# Use it to identify records
user_id = users.id("user-123")

# All operations use the same pattern
await session.insert(key=user_id).put({"name": "Alice"}).execute()
(await session.query(user_id).execute()).close()
(await session.delete(key=user_id).execute()).close()
(await session.query(users).where("$.age > 21").execute()).close()

DataSet vs RecordId

Class	Represents	Used For
`DataSet`	Namespace + Set	Queries, scans, set-level operations
`RecordId`	Namespace + Set + Key	Single-record operations (get, insert, update, delete)

Java

DataSet users = DataSet.of("app", "users");  // Set reference
RecordId alice = users.id("alice");           // Record reference

// Query the whole set (close the stream when done)
session.query(users).where("$.active == true").execute().close();

// Operate on a specific record
session.query(alice).execute().close();

Python

users = DataSet.of("app", "users")  # Set reference
alice = users.id("alice")           # Record reference

# Query the whole set
(await session.query(users).where("$.active == true").execute()).close()

# Operate on a specific record
(await session.query(alice).execute()).close()
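
The relationship between the two classes can be modeled with two small value types. This is a sketch of the idea only: `SetRef` and `RecordRef` are hypothetical stand-ins, not the SDK's `DataSet` and `RecordId` implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordRef:
    """Hypothetical stand-in for RecordId: namespace + set + key."""
    namespace: str
    set_name: str
    key: object


@dataclass(frozen=True)
class SetRef:
    """Hypothetical stand-in for DataSet: namespace + set."""
    namespace: str
    set_name: str

    def id(self, key) -> RecordRef:
        """Narrow the set reference down to a single record."""
        return RecordRef(self.namespace, self.set_name, key)


users = SetRef("app", "users")
alice = users.id("alice")
```

A record reference carries everything the set reference has plus the key, which is why set-level operations take the former and single-record operations take the latter.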

Data modeling best practices

1. Design for access patterns

Unlike relational databases, Aerospike works best when you model data for how you’ll read it:

❌ Relational approach:
   users table + orders table + JOIN

✅ Aerospike approach:
   users set (with embedded recent_orders list)
   orders set (with denormalized user_name)
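
A minimal sketch of the denormalized approach, using plain dictionaries in place of records. The `place_order` helper, the `MAX_RECENT_ORDERS` cap, and the bin names are assumptions made for illustration.

```python
MAX_RECENT_ORDERS = 3  # hypothetical cap so the user record stays small


def place_order(user, order_id, total):
    """Build the order record and embed a summary in the user record."""
    order = {
        "id": order_id,
        "user_id": user["id"],
        "user_name": user["name"],  # denormalized copy: no join needed on read
        "total": total,
    }
    summary = {"id": order_id, "total": total}
    # Newest first, trimmed to the cap.
    recent = [summary] + user.get("recent_orders", [])
    user["recent_orders"] = recent[:MAX_RECENT_ORDERS]
    return order


user = {"id": "user-1", "name": "Alice"}
orders = [place_order(user, f"o{n}", 10.0 * n) for n in range(1, 5)]
recent_ids = [o["id"] for o in user["recent_orders"]]
```

Reading the user record now answers "what did this user order recently?" in a single lookup. The trade-off is that the copied `user_name` must be updated in the order records if the user's name changes.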

2. Use sets for entity types

Java
Python

// Good: Different sets for different entities
DataSet users = DataSet.of("app", "users");
DataSet orders = DataSet.of("app", "orders");
DataSet products = DataSet.of("app", "products");

# Good: Different sets for different entities
users = DataSet.of("app", "users")
orders = DataSet.of("app", "orders")
products = DataSet.of("app", "products")

3. Keep bin names short

Bin names are stored with every record. Use concise names:

❌ "user_email_addr"        → 15 bytes per record
✅ "email"                  → 5 bytes per record
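
The cost of long names adds up across records. The sketch below gives a rough estimate that counts only the bytes of the names themselves and ignores any other per-bin overhead. `bin_name_bytes` and `name_overhead` are hypothetical helpers.

```python
def bin_name_bytes(name):
    """Bytes a bin name occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))


def name_overhead(bin_names, record_count):
    """Rough total bytes spent on bin names across record_count records."""
    return sum(bin_name_bytes(name) for name in bin_names) * record_count


records = 1_000_000
long_cost = name_overhead(["email_address"], records)
short_cost = name_overhead(["email"], records)
saved = long_cost - short_cost
```

For one million records, shortening this single bin name from 13 bytes to 5 saves about 8 MB.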

4. Consider key design

Choose keys that distribute data evenly and support your access patterns:

Pattern	Example Key	Use Case
Natural ID	`"user-12345"`	When you have a business identifier
UUID	`"550e8400-e29b..."`	When you need uniqueness without coordination
Composite	`"user-123:order-456"`	When combining entities
Time-based	`"events:2026-01-20"`	For time-series data
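
The composite and time-based patterns in the table can be produced with small helpers. This is a sketch: `composite_key` and `daily_key` are hypothetical names, and the `:` separator simply follows the examples above.

```python
from datetime import date


def composite_key(*parts, sep=":"):
    """Join entity identifiers into one key, e.g. user + order."""
    return sep.join(str(part) for part in parts)


def daily_key(prefix, day):
    """Build a per-day key for time-series data using an ISO 8601 date."""
    return f"{prefix}:{day.isoformat()}"


order_key = composite_key("user-123", "order-456")
events_key = daily_key("events", date(2026, 1, 20))
```

Building keys in one place keeps the format consistent, which matters because a record can only be fetched directly if the reader reconstructs exactly the same key.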

Next steps

Behaviors: Configure how operations execute, including timeouts, retries, and consistency.
Create Records: Start writing data with insert and upsert.
Query with AEL: Search and filter your data.
Error Handling: Handle errors gracefully.