Set up Aerospike
Prerequisites
- Docker: For downloading and running Aerospike Database. This tutorial assumes you’ll use Docker Desktop.
- Apache Spark 3.5.3 with Scala 2.13: For ingesting data into Aerospike.
- JDK 17: For running Spark.
- Python 3.11: For running the Jupyter notebook.
- Jupyter: You’ll follow the tutorial in a provided Jupyter notebook, which you’ll run from a Python 3.11 virtual environment in a later step.
On macOS, install Docker, JDK 17, and Python 3.11 with Homebrew:
    brew install --cask docker
    brew install openjdk@17 python@3.11
Install Spark 3.5.3 with the Scala 2.13 build directly from the Apache archive:
    mkdir -p ~/spark
    curl -L https://archive.apache.org/dist/spark/spark-3.5.3/spark-3.5.3-bin-hadoop3-scala2.13.tgz \
      | tar -xz -C ~/spark
This tutorial uses Spark 3.5, Scala 2.13, and JDK 17 for compatibility with the Aerospike Spark connector.
Do not use Homebrew’s apache-spark formula for this tutorial unless it resolves to Spark 3.5.x with Scala 2.13.
Before starting JupyterLab, set the Java and Spark environment variables in the same terminal session:
    export JAVA_HOME="$(brew --prefix openjdk@17)/libexec/openjdk.jdk/Contents/Home"
    export SPARK_HOME="$HOME/spark/spark-3.5.3-bin-hadoop3-scala2.13"
    export PATH="$JAVA_HOME/bin:$SPARK_HOME/bin:$PATH"
Run spark-submit --version to check your Spark and Scala versions.
Run java -version to check your JDK version.
Run python3.11 --version to confirm Python 3.11 is available for the notebook.
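The three version checks above can also be scripted. The following is a minimal Python sketch, not part of the tutorial's notebook: the helper names and the `CHECKS` table are hypothetical, and the marker strings are assumptions about what each tool prints (for example, that `java -version` reports `version "17...`). Because some tools print version information to stderr, the sketch reads both output streams.

```python
import shutil
import subprocess


def version_output(command):
    """Run a version command and return stdout plus stderr, or None if the tool is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def has_expected_version(output, markers):
    """Return True when the tool was found and every expected marker appears in its output."""
    return output is not None and all(marker in output for marker in markers)


# Hypothetical check table: label -> (command, substrings assumed to appear in its output).
CHECKS = {
    "JDK 17": (["java", "-version"], ['version "17']),
    "Spark 3.5.3 with Scala 2.13": (["spark-submit", "--version"], ["3.5.3", "2.13"]),
    "Python 3.11": (["python3.11", "--version"], ["Python 3.11"]),
}


def report():
    """Print one line per prerequisite, using the same commands as the manual checks."""
    for label, (command, markers) in CHECKS.items():
        ok = has_expected_version(version_output(command), markers)
        print(f"{label}: {'ok' if ok else 'missing or unexpected version'}")
```

Call `report()` from the same terminal session in which you exported `JAVA_HOME`, `SPARK_HOME`, and `PATH`, so the script sees the same tools you will use for the notebook.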
Set up Aerospike with Docker
Use Docker to run Aerospike. The Enterprise Edition Docker image starts a single-node database, which is sufficient for this tutorial.
- Launch Docker Desktop so the Docker daemon is running.
- Start Aerospike.
    docker run -d --name aerospike -p 3000-3002:3000-3002 \
      aerospike/aerospike-server-enterprise:latest
  Expected output:
    97969889c3e206be751d83e0170df692064ed254d82c39269a5b1c9c8c898acd
  Docker prints a long container ID. Your ID will be different.
- Check that the container is running:
    docker ps --filter name=aerospike
  Expected output:
    CONTAINER ID   IMAGE                                          COMMAND                  CREATED         STATUS         PORTS                              NAMES
    97969889c3e2   aerospike/aerospike-server-enterprise:latest   "/usr/bin/as-tini-st…"   6 seconds ago   Up 4 seconds   0.0.0.0:3000-3002->3000-3002/tcp   aerospike
  Confirm STATUS is Up and NAMES includes aerospike.
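Beyond `docker ps`, you may want to confirm that the published port accepts connections from the host. The sketch below is a minimal Python check that assumes the container is reachable at `127.0.0.1` on port 3000, the first port in the published range. The function name `port_is_open` is hypothetical. A successful TCP connection only shows that something is listening on the port, not that Aerospike is ready to serve requests.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("127.0.0.1", 3000)` should return `True` shortly after the container starts. If it returns `False`, check the STATUS and PORTS columns in the `docker ps` output again.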
Namespaces and sets
Aerospike organizes data into namespaces and sets:
- Namespace: A top-level container similar to a database. Configured at Aerospike Database startup, namespaces define storage, replication, and retention policies.
- Set: A logical grouping within a namespace, similar to a table. Sets are created automatically when you write records.
The default Docker image creates an in-memory namespace called test.
This tutorial uses that default namespace.
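To make the namespace and set relationship concrete, here is a minimal sketch that writes one record and reads it back. It assumes the Aerospike Python client is installed (`pip install aerospike`), which this page does not list as a prerequisite, and that the container from the previous section is listening on `127.0.0.1:3000`. The set name `demo`, the key `user1`, and the bin `name` are hypothetical values chosen for illustration.

```python
def make_key(namespace, set_name, user_key):
    """Build a record key tuple: (namespace, set, user key)."""
    return (namespace, set_name, user_key)


def write_and_read(host="127.0.0.1", port=3000):
    """Write one record to the test namespace and return the bins read back."""
    # Imported here so the helper above works without the client installed.
    import aerospike  # assumption: installed with `pip install aerospike`

    client = aerospike.client({"hosts": [(host, port)]})
    client.connect()  # kept for older client versions; assumed harmless if already connected
    try:
        # "demo" is a hypothetical set; it does not need to be created beforehand.
        key = make_key("test", "demo", "user1")
        client.put(key, {"name": "Ada"})
        _, _, bins = client.get(key)  # get() returns (key, metadata, bins)
        return bins
    finally:
        client.close()
```

The sketch never creates the `demo` set explicitly. The first write to it is enough, while the `test` namespace must already exist in the server configuration.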