Basic Usage
Overview
There are several options for connecting to and interacting with Aerospike Graph. Some possibilities are:
- The Gremlin Console, an interactive command-line terminal for sending queries and receiving responses.
- A client application. This page provides code samples for Python and Java client code.
Working with graph data
You can use a client application or the Gremlin Console to query an existing data set, or add new edges and vertices to a data set. The examples below demonstrate adding new vertices one at a time to a database. To bulk load new data into a Graph database, use the Graph bulk loader.
Kelvin Lawrence has compiled and made public a data set containing information about airlines, airports around the world, and routes between them, designed for use with a graph database. The data set is large enough to be interesting and useful, but small enough to be practical for testing and experimentation purposes. To run the following examples, download one of the .graphml data files here.
To use the .graphml file, you must bind it to a Docker volume. When you start the Aerospike Graph Service (AGS) Docker image, use the -v option to bind the local directory that contains the .graphml file to a directory in the Docker container. For example, if you download the data file air-routes-small.graphml to the directory /home/user/data, start the AGS Docker image with the -v option:
docker run -p 8182:8182 \
-e aerospike.client.namespace="test" \
-e aerospike.client.host="aerospike-devel-cluster-host1:3000,aerospike-devel-cluster-host2:3000" \
-v /home/user/data/:/opt/air-routes/ \
aerospike/aerospike-graph-service
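The same command can be assembled programmatically, which makes the mapping between the host directory and the container directory explicit. The following is a minimal sketch and not part of any Aerospike tooling: the function name build_ags_command is hypothetical, and the host names and paths are the placeholder values from the command above.

```python
import shlex


def build_ags_command(host_data_dir, container_dir="/opt/air-routes/"):
    """Return the `docker run` argument list for AGS with a bind-mounted data directory.

    Hypothetical helper for illustration only; the environment values are the
    placeholders used in the example command above.
    """
    return [
        "docker", "run",
        "-p", "8182:8182",
        "-e", "aerospike.client.namespace=test",
        "-e", "aerospike.client.host="
              "aerospike-devel-cluster-host1:3000,aerospike-devel-cluster-host2:3000",
        # -v HOST_DIR:CONTAINER_DIR makes the local files visible inside the container.
        "-v", f"{host_data_dir}:{container_dir}",
        "aerospike/aerospike-graph-service",
    ]


command = build_ags_command("/home/user/data/")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) if you prefer to start the container from a script. The path on the right-hand side of the -v value is the one to use in the io() step when loading the data file.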
You can use the Gremlin Console to load the air-routes data set into Aerospike Graph with the following command:
g.with("evaluationTimeout", 24L * 60L * 60L * 1000L).io("/opt/air-routes/air-routes-small.graphml").with(IO.reader, IO.graphml).read()
The air-routes data set may take a few minutes to load, depending on your host hardware and network configuration.
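The evaluationTimeout value in the load command is expressed in milliseconds, so the product 24L * 60L * 60L * 1000L gives the load up to 24 hours to finish. As a quick check of that arithmetic, sketched in Python (the variable name is illustrative):

```python
from datetime import timedelta

# evaluationTimeout is given in milliseconds; the load command allows 24 hours.
evaluation_timeout_ms = int(timedelta(hours=24).total_seconds() * 1000)

# Same value as the product written out in the Gremlin Console command.
assert evaluation_timeout_ms == 24 * 60 * 60 * 1000
print(evaluation_timeout_ms)  # 86400000
```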
Examples
- Gremlin console
- Client application
- Jupyter notebook
Download the latest version of the Gremlin Console from the Apache website.
Note: You must have a Java runtime to use the Gremlin Console.
After unzipping the package and navigating to the application folder, start the console with the following command:
./bin/gremlin.sh
Connect via the Aerospike Graph Service (AGS) Docker image. See the instructions if you haven't yet started an AGS Docker image.
g = traversal().withRemote(DriverRemoteConnection.using("GREMLIN_SERVER_IP_ADDRESS", 8182, "g"));
Replace GREMLIN_SERVER_IP_ADDRESS with the accessible IP address of your AGS Docker image.
Expected output:
graphtraversalsource[emptygraph[empty], standard]
Add a new vertex with the addV function:
g.addV('foo').property('company','aerospike').property('scale','unlimited')
Expected output:
v[-1]
Read back your newly created vertex:
g.V().has('company','aerospike')
Expected output:
v[-1]
Sample Gremlin queries with the air-routes data set
Find the airport with code "DFW":
g.V().has('code','DFW')
Find the number of airports in this graph:
g.V().hasLabel('airport').count()
Find the number of flights going out of the airport with code "SFO":
g.V().has('code','SFO').outE().count()
Get all the cities with flights that are > 4000 miles:
g.E().has("dist", P.gt(4000L)).inV().values("city").dedup()
Find all the airports in the USA you can fly to from London Heathrow (LHR):
g.V().has('code','LHR').out('route').has('country','US').values('code')
Count the unique airports worldwide, and the unique airports in the US, that you can reach from SFO in exactly two hops:
g.V().has("code", "SFO").out().out().dedup().fold().project("totalAirportCountFromSFO", "USAirportCountFromSFO").by(__.unfold().count()).by(__.unfold().has("country", "US").count())
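To make the semantics of the two-hop query concrete, the sketch below reproduces its logic in plain Python on a tiny, invented route table. The data is made up for illustration and is not taken from the air-routes data set; the function name two_hop_counts is likewise hypothetical.

```python
# Toy data, invented for illustration; it is NOT taken from the air-routes data set.
routes = {
    "SFO": ["LHR", "DFW"],
    "LHR": ["SFO", "CDG"],
    "DFW": ["SFO", "YYZ"],
}
country = {"SFO": "US", "DFW": "US", "LHR": "UK", "CDG": "FR", "YYZ": "CA"}


def two_hop_counts(start):
    """Mimic out().out().dedup() followed by the two count projections."""
    reachable = set()  # dedup(): each airport is counted once
    for first_hop in routes.get(start, []):           # first out()
        for second_hop in routes.get(first_hop, []):  # second out()
            reachable.add(second_hop)
    return {
        "totalAirportCountFromSFO": len(reachable),
        "USAirportCountFromSFO": sum(1 for code in reachable if country[code] == "US"),
    }


result = two_hop_counts("SFO")
print(result)  # {'totalAirportCountFromSFO': 3, 'USAirportCountFromSFO': 1}
```

In this toy graph the starting airport itself appears in the result, because flying out and straight back counts as two hops. The Gremlin query as written behaves the same way, so the counts it returns may include SFO.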
To get performance metrics for a query, append .profile() to the end of the command.
Additional resources
- The Gremlin Console is based on Groovy, so queries entered in the console use Groovy syntax.
- Kelvin Lawrence has written a thorough Gremlin manual.
Python
Install Gremlin-Python, a Gremlin client library for Python:
pip3 install gremlinpython
Gremlin-Python implements many of the functions found in Gremlin. The following example code establishes a connection with a remote Gremlin server, creates a new vertex, and reads it back. Additional queries use the air-routes data set.
Note: Before running the following example Python application, load the air-routes data set, as described in the Working with graph data section. If you run the application with an empty database, it returns empty values.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.traversal import IO
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
if __name__ == '__main__':
    # Create GraphTraversalSource to remote server.
    g = traversal().with_remote(DriverRemoteConnection('ws://localhost:8182/gremlin', 'g'))
    # Add a new vertex.
    g.add_v('foo').property('company','aerospike').property('scale','unlimited').iterate()
    # Read back the new vertex.
    result = g.V().has('company','aerospike').element_map().to_list()
    print(result)
    # Sample queries with the air-routes data set. To use these queries, download
    # the data set and load it with the following:
    # g.with_("evaluationTimeout", 24 * 60 * 60 * 1000).\
    #     io(PATH_TO_DATA_FILE).\
    #     with_(IO.reader, IO.graphml).\
    #     read().iterate()
    # Find the airport with code "DFW":
    result = g.V().has('code','DFW').element_map().next()
    print(result, '\n')
    # Find the number of airports in this graph:
    result = g.V().has_label('airport').count().next()
    print(result, '\n')
    # Find the number of flights going out of the airport with code "SFO":
    result = g.V().has('code','SFO').out_e().count().next()
    print(result, '\n')
    # Find all the airports in the USA you can fly to from London Heathrow (LHR):
    result = g.V().has('code','LHR').out('route').has('country','US').values('code').to_list()
    print(result, '\n')
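The Python example above leaves the connection open when the script ends, whereas the Java example below closes its Cluster explicitly. One possible way to get the same cleanup in Python is sketched here with the standard library's contextlib.closing. The helper name run_with_connection and the FakeConnection stand-in are hypothetical and exist only so that the sketch runs without a server; the helper assumes only that the connection object has a close() method, as DriverRemoteConnection does.

```python
from contextlib import closing


def run_with_connection(make_connection, work):
    """Open a connection, run work(connection), and always close the connection.

    make_connection is any zero-argument callable that returns an object with a
    close() method, for example:
        lambda: DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')
    """
    with closing(make_connection()) as connection:
        return work(connection)


class FakeConnection:
    """Stand-in used only to demonstrate the helper without a running server."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeConnection()
outcome = run_with_connection(lambda: fake, lambda connection: "queries ran")
print(outcome, fake.closed)  # queries ran True
```

With Gremlin-Python, the work callable would build the traversal source from the connection, for example lambda connection: traversal().with_remote(connection).V().has_label('airport').count().next(). The connection is closed even if a query raises an exception.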
Java
package com.aerospike.firefly.benchmark;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import java.util.List;
import java.util.Map;
import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;
public class FireflySample {
    // Use the command: docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' CONTAINER_ID
    // to get the HOST IP address below. 172.17.0.3 is typical for Linux when Aerospike is running in Docker already
    // but there may be some variance.
    private static final String HOST = "172.17.0.3";
    private static final int PORT = 8182;
    private static final Cluster.Builder BUILDER = Cluster.build().addContactPoint(HOST).port(PORT).enableSsl(false);
    public static void main(final String[] args) {
        System.out.println("Creating the Cluster.");
        final Cluster cluster = BUILDER.create();
        System.out.println("Creating the GraphTraversalSource.");
        final GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster));
        // Add a new vertex.
        g.addV("foo").property("company", "aerospike").property("scale", "unlimited").iterate();
        // Read the new vertex.
        final Vertex readVertex = g.V().has("company", "aerospike").next();
        System.out.println(readVertex);
        // Find the number of airports in this graph:
        final long airportCount = g.V().hasLabel("airport").count().next();
        System.out.printf("Airport count: %d%n", airportCount);
        // Find the number of flights going out of the airport with code "SFO":
        final long flightCount = g.V().has("code", "SFO").outE().count().next();
        System.out.printf("Flight count: %d%n", flightCount);
        // Count the airports reachable from SFO in two hops, worldwide and in the US.
        // We can use .next() to terminate this traversal because it only returns 1 item.
        final Map<String, Object> sfoTwoHop = g.V().
            has("code", "SFO").
            out().out().dedup().fold().
            project("totalAirportCountFromSFO", "USAirportCountFromSFO").
            by(__.unfold().count()).
            by(__.unfold().has("country", "US").count()).
            next();
        System.out.println(sfoTwoHop);
        // Find all the airports in the USA you can fly to from London Heathrow (LHR).
        // toList() collects every result, then each airport code is printed once.
        final List<Object> lhrDestinations = g.V().
            has("code", "LHR").
            out("route").
            has("country", "US").
            values("code").
            toList();
        lhrDestinations.forEach(System.out::println);
        cluster.close();
    }
}
Jupyter notebook
Jupyter Notebooks are a community standard for communicating and performing interactive computing. You can create a notebook with Aerospike Graph to provide a sandbox for experimentation purposes.
To try out a notebook with Aerospike Graph functionality, visit the Graph Jupyter notebook. The notebook uses the air routes data set.