Quickstart: Aerospike Graph
This quickstart walks you through:
- Setting up Aerospike Graph locally
- Loading a sample dataset
- Running Gremlin queries with the Gremlin Console
Prerequisites
Install the following dependencies before proceeding:
- Docker
- Java SE 17 for the Gremlin Console
- Python 3.10 for the sample app
If you manage multiple Java versions, temporarily change the PATH system variable so the Java SE 17 bin directory is first on the PATH.
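To confirm which Java the shell will pick up, a small check along the following lines may help. This is a minimal sketch, not part of the Aerospike tooling: the helper name java_major_version is hypothetical, and it assumes the usual `java -version` banner format, in which the version appears in double quotes.

```python
import re
import shutil
import subprocess
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy releases report "1.8.0_292"; the real major is the second field.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


if __name__ == "__main__":
    java = shutil.which("java")  # first `java` executable found on PATH
    if java is None:
        print("No java executable found on PATH")
    else:
        # `java -version` writes its banner to stderr.
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        print(java, "-> major version", java_major_version(result.stderr + result.stdout))
```

If the reported major version is not 17, adjust the PATH as described above and run the check again.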
Start Aerospike Graph in Docker
Using the preconfigured Aerospike Graph GitHub repository, we’ll quickly get up and running with a one-node Aerospike Graph Service (AGS) instance and a one-node Aerospike Database cluster.
- Clone the Aerospike Graph repository and navigate to the new directory.
    git clone https://github.com/aerospike/aerospike-graph.git
    cd aerospike-graph
- Start the Docker Compose stack.
    docker compose up -d
  Example response
    ✔ Network asgraph_net Created
    ✔ Container aerospike-db Healthy
    ✔ Container asgraph-zipkin Healthy
    ✔ Container aerospike-graph-service Started
- Verify that Aerospike Graph Service, Zipkin (a query tracing service), and Aerospike Database are running.
    docker ps
  Example response
    9d5b0cfab45f aerospike/aerospike-graph-service:latest "scripts/gremlin-ser…"...
    6ca1415981fe openzipkin/zipkin "start-zipkin"...
    248a05d9e903 aerospike/aerospike-server-enterprise:7.1 "/usr/bin/as-tini-st…"...
  The output should show three running Docker processes: Aerospike Graph Service, Zipkin, and Aerospike Database.
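The verification step can also be scripted. The sketch below assumes the container names shown in the `docker compose up -d` output (aerospike-db, asgraph-zipkin, aerospike-graph-service); the helper name missing_containers is hypothetical. It relies on `docker ps --format "{{.Names}}"`, which prints one running container name per line.

```python
import subprocess

# Container names taken from the `docker compose up -d` output in this quickstart.
EXPECTED_CONTAINERS = ("aerospike-db", "asgraph-zipkin", "aerospike-graph-service")


def missing_containers(names_output: str, expected=EXPECTED_CONTAINERS) -> list:
    """Return the expected container names absent from `docker ps` name output."""
    running = {line.strip() for line in names_output.splitlines() if line.strip()}
    return [name for name in expected if name not in running]


if __name__ == "__main__":
    # `--format "{{.Names}}"` prints one running container name per line.
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    missing = missing_containers(result.stdout)
    print("All three containers are running" if not missing else f"Not running: {missing}")
```

If your Compose file names the containers differently, change EXPECTED_CONTAINERS to match.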
Fetch the air-routes dataset
The Air Routes sample dataset by Kelvin Lawrence is a well-known graph dataset featuring airlines, airports, and the routes between them. Its real-world graph structure provides an excellent environment for exploring core Gremlin query patterns.
- Create an air-routes directory at the root of the repository with edges and vertices subdirectories.
    mkdir -p air-routes/edges air-routes/vertices
- Download the Air Routes CSV files into the new directories.
    curl -L -o air-routes/edges/air-routes-latest-edges.csv https://raw.githubusercontent.com/krlawrence/graph/refs/heads/master/sample-data/air-routes-latest-edges.csv
    curl -L -o air-routes/vertices/air-routes-latest-nodes.csv https://raw.githubusercontent.com/krlawrence/graph/refs/heads/master/sample-data/air-routes-latest-nodes.csv
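Before loading, it can be worth confirming that both downloads actually landed where the later steps expect them. The following is a minimal sketch using the file paths from the curl commands; the helper name dataset_problems is hypothetical, and the check only looks at presence and size, not at the CSV contents.

```python
from pathlib import Path

# Relative paths used by the curl commands in this section.
DATASET_FILES = (
    "air-routes/edges/air-routes-latest-edges.csv",
    "air-routes/vertices/air-routes-latest-nodes.csv",
)


def dataset_problems(repo_root: Path, files=DATASET_FILES) -> list:
    """Return a message for each dataset file that is missing or empty."""
    problems = []
    for relative in files:
        path = repo_root / relative
        if not path.is_file():
            problems.append(f"missing: {relative}")
        elif path.stat().st_size == 0:
            problems.append(f"empty: {relative}")
    return problems


if __name__ == "__main__":
    # Run from the root of the aerospike-graph repository.
    issues = dataset_problems(Path.cwd())
    print("\n".join(issues) if issues else "Both CSV files are present and non-empty")
```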
Install the Gremlin Console
Gremlin Console is a command-line interface for issuing Gremlin traversals against your running graph service.
- Download and extract the Gremlin Console (version 3.8.0 in this example) from the Apache TinkerPop downloads page.
    curl -O https://dlcdn.apache.org/tinkerpop/3.8.0/apache-tinkerpop-gremlin-console-3.8.0-bin.zip && unzip apache-tinkerpop-gremlin-console-3.8.0-bin.zip && rm apache-tinkerpop-gremlin-console-3.8.0-bin.zip
- Start the console from the extracted directory.
    apache-tinkerpop-gremlin-console-3.8.0/bin/gremlin.sh
- Connect to the Aerospike Graph Service instance.
    g = traversal().withRemote(DriverRemoteConnection.using("localhost", 8182, "g"));
  Example response
    graphtraversalsource[emptygraph[empty], standard]
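If the connection step fails, a quick way to narrow the problem down is to check whether anything is listening on the port at all. The sketch below uses only the Python standard library; the helper name port_is_open is hypothetical, and the host and port are the ones from the connection command. A successful TCP connection shows only that some process accepts connections on that port, not that it is a healthy Gremlin server.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host and port match the DriverRemoteConnection call in this section.
    if port_is_open("localhost", 8182):
        print("Port 8182 is accepting connections")
    else:
        print("Nothing is listening on localhost:8182; check `docker ps`")
```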
Bulk load the dataset
Aerospike Graph provides a bulk loader call step that ingests vertex and edge CSV files directly from mounted storage.
- Kick off the bulk loader from the Gremlin Console.
    g.with("evaluationTimeout", 30000) \
    .call("aerospike.graphloader.admin.bulk-load.load") \
    .with("aerospike.graphloader.vertices", "/data/air-routes/vertices") \
    .with("aerospike.graphloader.edges", "/data/air-routes/edges").next()
  Example response
    ==>Bulk load started successfully. Use the g.call("aerospike.graphloader.admin.bulk-load.status") command to get the status of the job.
- Poll the job until it completes.
    g.call("aerospike.graphloader.admin.bulk-load.status").next()
  Example response
    ==>duplicate-vertex-ids=0
    ==>bad-edges=0
    ==>step=done
    ==>bad-entries=0
    ==>complete=true
    ==>status=success
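The polling step can be sketched as a loop that repeats the status call until complete is true. The code below is an illustration under assumptions, not an Aerospike API: parse_status, is_finished, and wait_until_finished are hypothetical helpers, and fetch_status stands in for however you obtain the status (for example, a Gremlin client call). The key names come from the example response in this section; whether the values arrive as booleans or strings is an assumption, so the check accepts both.

```python
import time


def parse_status(console_output: str) -> dict:
    """Turn '==>key=value' console lines into a dict of strings."""
    status = {}
    for line in console_output.splitlines():
        line = line.strip()
        if line.startswith("==>") and "=" in line[3:]:
            key, value = line[3:].split("=", 1)
            status[key] = value
    return status


def is_finished(status: dict) -> bool:
    """True once the job reports complete=true (boolean or string form)."""
    return str(status.get("complete", "")).lower() == "true"


def wait_until_finished(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until the job completes or timeout seconds pass.

    fetch_status is a hypothetical callable supplied by the caller that
    returns the current status as a dict.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if is_finished(status):
            return status
        if waited >= timeout:
            raise TimeoutError(f"bulk load still running after {timeout} seconds")
        sleep(interval)
        waited += interval
```

A finished job is not necessarily a successful one, so check the status field of the returned dict (success in the example response) before querying the graph.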
Explore the graph
Use the Gremlin Console to experiment with Gremlin queries in AGS, explore vertices, traverse edges, and gain hands-on experience with basic graph traversal patterns to better understand the loaded data model.
- Find the airport with code DFW.
    g.V().has('code','DFW')
  Example response
    v[8]
- Count the number of airports in the graph.
    g.V().hasLabel('airport').count()
  Example response
    3504
- Count the flights departing from SFO.
    g.V().has('code','SFO').outE().count()
  Example response
    157
- List the destination cities of routes longer than 4000 miles.
    g.E().has("dist", P.gt(4000L)).inV().values("city").dedup()
  Example response
    Fortaleza
    Cayo Largo del Sur
    Varadero
    Holguin
    Puerto Plata
- Find U.S. airports reached from London Heathrow (LHR).
    g.V().has('country','US').where(in('route').has('code','LHR')).values('code').toList()
  Example response
    IAH
    CHS
    ORD
    EWR
    BWI
    PIT
- Count the unique airports reachable from SFO in exactly two hops, both in total and within the U.S.
    g.V().has("code", "SFO").out().out().dedup().fold().project("totalAirportCountFromSFO", "USAirportCountFromSFO").by(__.unfold().count()).by(__.unfold().has("country", "US").count())
  Example response
    [totalAirportCountFromSFO:1904,USAirportCountFromSFO:455]
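To make the semantics of the last query concrete, the sketch below reproduces the out().out().dedup() pattern on a tiny in-memory route map. Everything here is illustrative: the ROUTES and COUNTRY tables are a hypothetical toy graph, not Air Routes data, and two_hop_destinations is a hypothetical helper. The example shows that the traversal follows exactly two outgoing edges, and that the starting airport itself appears in the result when a round trip exists.

```python
def two_hop_destinations(routes: dict, start: str) -> set:
    """Airports reached by following exactly two outgoing routes from start.

    Mirrors out().out().dedup(): the set removes duplicates, and the start
    airport itself is included whenever a round trip exists.
    """
    reached = set()
    for first_stop in routes.get(start, ()):
        for second_stop in routes.get(first_stop, ()):
            reached.add(second_stop)
    return reached


# Hypothetical toy route map (not taken from the Air Routes data).
ROUTES = {
    "SFO": ["LAX", "JFK"],
    "LAX": ["SFO", "LHR"],
    "JFK": ["LHR", "CDG"],
}
COUNTRY = {"SFO": "US", "LAX": "US", "JFK": "US", "LHR": "UK", "CDG": "FR"}

if __name__ == "__main__":
    reached = two_hop_destinations(ROUTES, "SFO")
    us_only = {code for code in reached if COUNTRY[code] == "US"}
    print({"total": len(reached), "us": len(us_only)})
```

On this toy graph the two-hop set from SFO is SFO, LHR, and CDG: LHR is reached by two different paths but counted once, and SFO is included through the SFO to LAX to SFO round trip, so the total is 3 and the U.S. count is 1.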
Finish up
Exit the Gremlin Console and stop the Docker services when you are finished.
- Exit the Gremlin Console.
    :exit
- Navigate back to the aerospike-graph root directory.
- Shut down the three running Docker processes.
    docker compose down
Done!
You have now set up Aerospike Graph locally, loaded the Air Routes dataset, and queried it with Gremlin. Next, try visualizing your data with G.V() and learn more about the Gremlin IDE on the Visualization with G.V() page.