Aerospike Graph and AI: Query your data in natural language
Learn how MCP lowers the entry barrier to Aerospike Graph with AI-driven queries, data exploration, and schema refinement.
Aerospike Graph brings high performance and scale to graph workloads, but getting started can be challenging if you’re new to graph databases. Recent research highlights how standardized protocols like the Model Context Protocol (MCP) can bridge AI with external systems, lowering integration barriers and enabling more context-aware applications. MCP helps jump-start your journey with Aerospike Graph by making it easier to explore data, run queries, and refine schemas without a steep learning curve.

How Model Context Protocol works
Anthropic released the Model Context Protocol in late 2024. It is a standardized, open-source framework that enables LLM agents to interact directly with external systems, like Aerospike Graph, using defined tools, resources, and prompts.

The MCP framework consists of three primary components:
MCP host: An application that runs LLMs and includes an integrated MCP client, such as Claude Desktop or Cursor, which can connect to and communicate with MCP servers.
MCP client: A component within the MCP host that actively consumes MCP-exposed capabilities on behalf of the LLM, enabling the AI to call functions, access data, and utilize external resources.
MCP server: An external process that exposes specific capabilities, including:
Tools are callable functions that perform actions on the external system, such as adding a vertex or creating an index.
Resources are data items, each addressed by a URI and tagged with a content type, that the MCP server exposes for the client to read. Separately, clients can share roots with a server to indicate which URI spaces it may work within.
Prompts are named, parameterized templates with input schemas that clients can customize and use.
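To make the client/server exchange concrete, here is a minimal sketch of the message an MCP client might send when the LLM decides to call a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The helper name `tool_call_request` and the `query` argument key are assumptions made for illustration; the actual argument names depend on the input schema the server publishes for each tool.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC requires a unique id per request.
_request_ids = count(1)

def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# "query" is a hypothetical argument name, not taken from the server's schema.
request = tool_call_request(
    "read_gremlin",
    {"query": 'g.V().hasLabel("airport").count()'},
)
print(json.dumps(request, indent=2))
```

The host never writes this message by hand; the MCP client builds it from the tool name and arguments the model selects.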
Using MCP with Aerospike Graph
Graph databases can be powerful, but they often feel difficult to approach without specialized knowledge. MCP lowers that entry barrier by letting you use natural language to explore data, ask questions, and even refine your schema. It also provides context, such as configuration, metadata, and version details, so the AI responds with an accurate picture of your database.
In our setup:
MCP host/client: Cursor, the code editor with an integrated MCP client.
MCP server: Aerospike Graph Service MCP Server, our custom server.
Database: Aerospike Graph Service, the graph database containing our dataset. For this demo, we will use the Air Routes dataset from Kelvin Lawrence.
What our MCP server can do
Tools
read_gremlin: Ask questions about the graph without making changes
write_gremlin: Make updates to the graph, with an elicitation step that asks you to confirm first
profile_gremlin: See how a query performs, with timing by step
get_schema: Inspect the structure of your graph: vertex labels, edge labels, and properties
Resources
AGS prompt map: All available prompts in the server
AGS metadata: Counts of vertices, edges, and supernodes
AGS config data: Current Aerospike Server and Graph configuration details
AGS version data: Versions of Aerospike Database, Graph, and Gremlin in use
AGS index cardinality: Distinct counts for each secondary index
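The split between read_gremlin and write_gremlin matters because only the write path asks for confirmation. The sketch below illustrates that gating idea in isolation. It is not the server's actual implementation: `run_write`, `execute`, and `confirm` are hypothetical names, and the confirmation callback stands in for MCP elicitation, which lets a server request input from the user through the client.

```python
from typing import Callable

def run_write(query: str,
              execute: Callable[[str], object],
              confirm: Callable[[str], bool]) -> dict:
    """Run a mutating query only after the user approves it."""
    if not confirm(f"Apply this change to the graph?\n{query}"):
        return {"status": "cancelled", "query": query}
    return {"status": "applied", "result": execute(query)}

# Stand-in executor that records queries instead of touching a database.
executed = []
def fake_execute(query):
    executed.append(query)
    return "ok"

add_airport = 'g.addV("airport").property("code", "AVI")'
declined = run_write(add_airport, fake_execute, confirm=lambda prompt: False)
approved = run_write(add_airport, fake_execute, confirm=lambda prompt: True)
print(declined["status"], approved["status"], len(executed))
```

Keeping the confirmation outside the executor means a declined request never reaches the database at all.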

Demo
For this demo, we are running Docker containers for Aerospike Server, Aerospike Graph Service, and the MCP server. We have also loaded our dataset into our Aerospike Graph instance. Now we can step through examples of how natural language queries translate into meaningful graph insights.
Graph database-focused questions
We begin by asking about the current configuration and version of the database.
Q: Using the MCP server, what is the Configuration, and versions of my current database?
A:
Based on the MCP server data, here’s your current database configuration and version information:
🔢 Version Information
• Aerospike Database: 8.0.0.7
• Aerospike Graph Service: 3.0.0
• Gremlin: 3.7.3
⚙️ Configuration Details
Graph Properties
• Namespace: test
• Graph Implementation: com.aerospike.firefly.structure.FireflyGraph
• Host: aerospike-db
Gremlin Server Configuration
• Host: 0.0.0.0
• Port: 8182
• Evaluation Timeout: 10,000ms (10 seconds)
• Channelizer: WebSocket
• Thread Pool Worker: 10
• Gremlin Pool: 80
• Result Iteration Batch Size: 64
Performance and Memory Settings
• Max Content Length: 10,485,760 bytes (~10MB)
• Write Buffer Low Water Mark: 32,768 bytes
• Write Buffer High Water Mark: 65,536 bytes
• Max Accumulation Buffer Components: 1,024
Security and SSL
• SSL: Disabled
• Strict Transaction Management: Disabled
Monitoring and Metrics
• Console Reporter: Enabled (180-second intervals)
• CSV Reporter: Disabled
• JMX Reporter: Disabled
• SLF4J Reporter: Disabled
Serializers
• GraphSON v3 with TinkerGraph IO Registry
• GraphBinary v1.0 (standard and string variants)
With configuration details established, the next step is to understand the structure of the data.
Q: Whats the schema of the database?
A:
📑 Database Schema
Vertex Labels (with counts)
• airport: 3,504 vertices
• country: 237 vertices
• continent: 7 vertices
• version: 1 vertex
Edge Labels (with counts)
• contains: 7,008 edges
• route: 50,637 edges
Vertex Properties
• country – Country information
• code – Airport/country codes
• longest – Longest runway length
• city – City name
• elev – Elevation
• icao – ICAO airport code
• lon – Longitude coordinate
• type – Type classification
• region – Regional information
• runways – Number of runways
• lat – Latitude coordinate
• desc – Description
• date – Date information
• author – Author/creator information
Edge Properties
• dist – Distance (likely for routes between airports)
Summary
• 🌍 Geographic hierarchy: Continents → Countries → Airports
• ✈️ Airport details: coordinates, elevation, runways, ICAO codes
• 🛫 Route network: 50,637 flight routes with distance information
• 🔗 Containment: 7,008 “contains” edges (continent-to-country, country-to-airport)
This schema confirms that the dataset is a detailed aviation network with rich attributes and relationships.
Real-world questions
Once the basics of configuration and schema are clear, we can turn to more practical, real-world questions.
Q: Im looking to travel to Japan from Victoria BC, what are the 3 best routes to get there?
A:
Based on the aviation database, here are the 3 best routes from Victoria BC (YYJ) to Japan:
🥇 Victoria → Vancouver → Tokyo Narita (BEST)
• YYJ → YVR: 39 nautical miles
• YVR → NRT: 4,659 nautical miles
• Total Distance: 4,698 nautical miles
• Why it’s best: shortest total distance, Vancouver is close to Victoria, Narita is Tokyo’s main hub
🥈 Victoria → Vancouver → Tokyo Haneda
• YYJ → YVR: 39 nautical miles
• YVR → HND: 4,693 nautical miles
• Total Distance: 4,732 nautical miles
• Why it’s good: Haneda is closer to downtown Tokyo than Narita
🥉 Victoria → Seattle → Tokyo Narita
• YYJ → SEA: 97 nautical miles
• SEA → NRT: 4,754 nautical miles
• Total Distance: 4,851 nautical miles
• Why it’s third: slightly longer, but Seattle has strong international connections
Alternative routes to consider:
• YYJ → YVR → KIX (Osaka): 4,965 nm
• YYJ → YVR → NGO (Nagoya): 4,883 nm
• YYJ → YYC → NRT (Calgary): 5,385 nm
✅ Recommendation:
The Vancouver → Tokyo Narita route is your best option: shortest distance, simple connection, and plenty of flights.
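The ranking in this answer comes down to summing leg distances for each one-stop itinerary and sorting by the total. A minimal in-memory sketch of that arithmetic follows, using only the leg distances quoted in the answer. The function name `rank_one_stop_routes` is hypothetical, and the real query runs as a Gremlin traversal against the graph rather than over Python dictionaries.

```python
# Leg distances in nautical miles, copied from the answer above.
FIRST_LEG = {"YVR": 39, "SEA": 97}  # flights out of YYJ
SECOND_LEG = {
    ("YVR", "NRT"): 4659,
    ("YVR", "HND"): 4693,
    ("SEA", "NRT"): 4754,
}

def rank_one_stop_routes(origin, first_leg, second_leg, top=3):
    """Return the `top` one-stop itineraries ordered by total distance."""
    routes = []
    for (hub, destination), distance in second_leg.items():
        if hub in first_leg:
            routes.append((first_leg[hub] + distance, [origin, hub, destination]))
    return sorted(routes)[:top]

best = rank_one_stop_routes("YYJ", FIRST_LEG, SECOND_LEG)
for total, path in best:
    print(" -> ".join(path), total)
```

The totals it prints (4,698, 4,732, and 4,851 nautical miles) match the three routes in the answer.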
Next, let’s step back and look at the bigger picture: which airports are the busiest overall?
Q: What are the 10 Busiest airports?
A:
🌍 Top 10 Busiest Airports by Route Connectivity
1. FRA — Frankfurt, Germany (620 routes)
2. IST — Istanbul, Turkey (618 routes)
3. CDG — Paris, France (587 routes)
4. AMS — Amsterdam, Netherlands (568 routes)
5. MUC — Munich, Germany (541 routes)
6. ORD — Chicago, USA (529 routes)
7. DFW — Dallas/Fort Worth, USA (506 routes)
8. PEK — Beijing, China (497 routes)
9. DXB — Dubai, UAE (496 routes)
10. ATL — Atlanta, USA (484 routes)
Key insights:
• 🇪🇺 Europe dominates with 4 airports, including 2 in Germany.
• 🌍 Airports span 6 continents.
• ✈️ All are major hubs used for transfers.
• 💡 Frankfurt leads with 620 routes, showing its global importance.
This list raises an obvious question: Shouldn’t Beijing be ranked higher?
Q: Not Beijing?
A:
Here’s what’s happening with Beijing:
Beijing Airport Situation
• PEK — Beijing Capital: 497 routes (#8 on the list)
• PKX — Beijing Daxing: 113 routes
• NAY — Beijing Nanyuan: 84 routes
Why it feels low:
• 📊 Passenger volume vs. routes: Beijing is one of the busiest in reality, but this dataset counts routes.
• 📅 Data age: Daxing opened in 2019, and growth may not be fully reflected.
• 🛫 Domestic vs. international: Beijing handles massive domestic traffic that may not show here.
From analyzing existing airports, we can also use MCP to model entirely new ones.
Q: I want to start my own airport and add it to the database, it will be called AerospikeVI, located in Duncan, BC, and have direct routes to Nanaimo, Victoria, Vancouver, and Haneda Japan
A:
🎉 AerospikeVI International Airport Successfully Created!
Airport Details
• Code: AVI
• ICAO: CAVI
• Name: AerospikeVI International Airport
• Location: Duncan, BC, Canada
• Coordinates: 48.7787°N, 123.7067°W
• Elevation: 200 feet
• Runways: 2 (longest: 8,000 feet)
• Region: CA-BC
Direct Routes Added
• AVI ↔ YCD (Nanaimo): 25 nm
• AVI ↔ YYJ (Victoria): 35 nm
• AVI ↔ YVR (Vancouver): 45 nm
• AVI ↔ HND (Tokyo Haneda): 4,720 nm
Route Summary
• Total: 8 routes (4 outbound + 4 inbound)
• Domestic: 3
• International: 1 (Japan)
✨ Fun fact: Your new airport now has a direct Duncan → Tokyo Haneda route, just like Vancouver International.
With AerospikeVI in place, we can now query new routes.
Q: Now from AVI, find the quickest route to Mumbai
A:
🥇 Quickest route
• AVI → YVR: 45 nm
• YVR → DEL: 6,916 nm
• DEL → BOM: 707 nm
• Total: 7,668 nm
• Connections: Vancouver, Delhi
🥈 Alternative route 1
• AVI → YVR → LHR → BOM
• Total: 9,230 nm
🥉 Alternative route 2
• AVI → YVR → FRA → BOM
• Total: 9,131 nm
🌏 Via Asia route
• AVI → HND → DXB → BOM
• Total: 10,843 nm
✅ Recommendation:
The AVI → Vancouver → Delhi → Mumbai path is the fastest at 7,668 nm and uses strong existing connections.
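Finding the "quickest" route here means finding the path with the smallest total distance, which is a classic shortest-path problem. The sketch below applies Dijkstra's algorithm to a small graph built only from leg distances quoted in the answers above. It assumes "quickest" means shortest distance and treats each leg as a directed edge; how the MCP server and Gremlin actually compute the result is not shown in the transcript.

```python
import heapq

# Leg distances (nautical miles) quoted in the answers above.
LEGS = {
    "AVI": {"YCD": 25, "YYJ": 35, "YVR": 45, "HND": 4720},
    "YYJ": {"YVR": 39},
    "YVR": {"DEL": 6916},
    "DEL": {"BOM": 707},
}

def shortest_path(legs, source, target):
    """Dijkstra's algorithm: return (total_distance, path), or (None, []) if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        distance, node, path = heapq.heappop(queue)
        if node == target:
            return distance, path
        if node in settled:
            continue  # a shorter path to this node was already processed
        settled.add(node)
        for neighbor, leg in legs.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (distance + leg, neighbor, path + [neighbor]))
    return None, []

total, path = shortest_path(LEGS, "AVI", "BOM")
print(total, " -> ".join(path))
```

On this toy graph the direct AVI → YVR leg (45 nm) beats the detour through Victoria (35 + 39 = 74 nm), so the result is the same 7,668 nm path the answer recommends.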
Developer-oriented questions
Beyond travel planning, MCP also provides value for developers who want to analyze and optimize their data model.
Q: Is there any way I can improve the efficiency of our data model/schema here?
A:
🚀 Schema Optimization Recommendations
1. Indexing Strategy
• Add indexes on: airport.code, airport.country, airport.city, route.dist
2. Data Normalization
• Current: redundant strings, missing country vertices, inconsistent coordinates
• Fix: normalize countries into vertices, standardize coordinates
3. Property Optimization
• Drop unused (author, date) and redundant (type) fields
• Fix malformed ICAO codes
4. Route Bidirectionality
• Current: ~50k routes stored twice
• Fix: undirected edges or direction logic
5. Geographic Hierarchy
• Add explicit continent → country → airport structure
6. Distance Optimization
• Add distance categories for faster filtering
7. Query Performance
• Code lookups: ~8ms (good)
• City lookups: ~9ms (needs index)
• Route traversals: ~2ms (acceptable)
🎯 Priorities
• High: add indexes, remove duplicates, drop redundant type
• Medium: normalize countries, add distance categories
• Low: build hierarchy, add temporal and capacity data
📊 Expected Gains
• 50% storage reduction
• 3–5× faster country queries
• 2–3× faster city searches
One insight that stands out is the handling of bidirectionality. Because most airport-to-airport routes have a matching return route, the dataset stores many edges twice that could be represented more efficiently.
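As a rough illustration of the bidirectionality point, the sketch below collapses paired directed routes into one undirected entry each, using the eight AerospikeVI routes created earlier in the demo. The function name `collapse_bidirectional` is hypothetical, and the approach assumes both directions carry the same distance.

```python
# The eight directed AVI routes from the demo (4 outbound + 4 inbound), in nautical miles.
DIRECTED_ROUTES = [
    ("AVI", "YCD", 25), ("YCD", "AVI", 25),
    ("AVI", "YYJ", 35), ("YYJ", "AVI", 35),
    ("AVI", "YVR", 45), ("YVR", "AVI", 45),
    ("AVI", "HND", 4720), ("HND", "AVI", 4720),
]

def collapse_bidirectional(routes):
    """Map each unordered airport pair to a single distance."""
    undirected = {}
    for source, destination, distance in routes:
        key = tuple(sorted((source, destination)))  # same key for both directions
        undirected.setdefault(key, distance)
    return undirected

pairs = collapse_bidirectional(DIRECTED_ROUTES)
print(len(DIRECTED_ROUTES), "directed ->", len(pairs), "undirected")
```

This halves the edge count only where a return route exists. Any one-way route would lose its direction under this scheme, so a real migration would need to keep a direction flag for those cases.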
Finally, MCP can also be used to examine query performance in detail.
Q: How expensive is my query? g.V().hasLabel("airport").out().valueMap()
A:
⚡ Query Performance Analysis
• Total execution time: 321.09 ms
• Total processing time: 642.19 ms (profiling overhead included)
📊 Performance Breakdown
• FireflyBatchVertexReadStep — 188.84 ms (59%)
• PropertyMapStep — 102.41 ms (32%)
• FireflyGraphStep — 25.54 ms (8%)
• FireflyScanProfileStep — 4.30 ms (1%)
🔍 Why expensive?
• Traverses all outgoing edges from all airports
• No edge filter: includes both route and contains
• Loads all properties for every result
📈 Scale impact
• 3,505 airports × multiple edges = massive result set
✅ Optimization Suggestions
1. Limit scope: limit(10)
2. Filter edges: outE("route")
3. Select fewer props: valueMap("code","city","country")
4. Add filters: has("country","CA")
This response even provides optimization suggestions.
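The percentages in the breakdown are each step's share of the total execution time, so you can reproduce them from the per-step timings. A short sketch using the numbers reported above follows; the function name `step_shares` is hypothetical.

```python
# Per-step timings in milliseconds, copied from the profile output above.
STEP_MS = {
    "FireflyBatchVertexReadStep": 188.84,
    "PropertyMapStep": 102.41,
    "FireflyGraphStep": 25.54,
    "FireflyScanProfileStep": 4.30,
}

def step_shares(step_ms):
    """Return the summed time and each step's share of it as a whole percentage."""
    total = sum(step_ms.values())
    shares = {name: round(100 * ms / total) for name, ms in step_ms.items()}
    return total, shares

total_ms, shares = step_shares(STEP_MS)
print(f"total: {total_ms:.2f} ms")
for name, percent in shares.items():
    print(f"{name}: {percent}%")
```

The step timings sum to the reported 321.09 ms. Vertex reads and property loading together account for roughly 91% of that time, which is why the suggestions focus on limiting results and selecting fewer properties.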
Why Aerospike Graph with MCP is the future of intelligent data
Our MCP Server lowers the barrier to working with Aerospike Graph by allowing anyone to use natural language for exploration and operations. Team members across roles, including non-technical stakeholders, can ask questions and interact with the graph without needing deep database expertise. It also improves the effectiveness of vibe coding tools used in tech stacks with Aerospike Graph by giving the LLMs the ability to interact with the database directly.
For developers, MCP adds another layer of value by helping analyze queries, refine data models, and surface performance insights. Because large language models struggle to generate complex graph database queries on their own, MCP grounds them in the live schema, configuration, and query results, which makes interactions more accurate and reliable for both client applications and their users.
This makes MCP a powerful way to work with Aerospike Graph Service today, with clear opportunities to expand into areas such as subgraph visualization, bulk loading, suggested index creation, and even deploying new AGS instances on Aerospike Cloud.