Look-Aside Cache for MongoDB
This is a sample notebook for using Aerospike as a read (look-aside) cache.
- This notebook demonstrates the use of Aerospike as a cache, with Mongo as the primary datastore.
- Mongo must be running as a separate container, started with
docker run --name some-mongo -d mongo:latest
To test: run get_data(key, value) once to fetch the data from Mongo and populate Aerospike.
A second run will fetch the data from the Aerospike cache.
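The read path described above can be sketched without either database. This is a minimal illustration only, assuming plain dictionaries stand in for Mongo (`primary`) and Aerospike (`cache`); the names `look_aside_get`, `primary`, and `cache` are hypothetical and do not appear elsewhere in this notebook.

```python
# Hypothetical stand-ins: dicts play the roles of Mongo and Aerospike.
primary = {"key": "value"}   # the primary datastore (Mongo in this notebook)
cache = {}                   # the look-aside cache (Aerospike in this notebook)


def look_aside_get(data_id):
    """Return (value, source), where source is 'cache' or 'primary'."""
    if data_id in cache:
        # Cache hit: serve directly from the cache.
        return cache[data_id], "cache"
    # Cache miss: read from the primary store, then populate the cache.
    value = primary[data_id]
    cache[data_id] = value
    return value, "primary"


first = look_aside_get("key")    # miss: served from the primary store
second = look_aside_get("key")   # hit: served from the cache
print(first, second)
```

The first call misses and fills the cache; the second call is served from the cache, which is the behavior the rest of the notebook reproduces with real clients.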
Ensure that the Aerospike Database is running
!asd >& /dev/null
!pgrep -x asd >/dev/null && echo "Aerospike database is running!" || echo "**Aerospike database is not running!**"
Output:
Aerospike database is running!
Import all dependencies
import aerospike
import pymongo
from pymongo import MongoClient
import sys
Configure the clients
The configuration below assumes:
- Aerospike database running on localhost (IP 127.0.0.1), port 3000, which is the default.
- Mongo running in a separate container, whose IP address can be found with
docker inspect <containerid> | grep -i ipaddress
Modify the constants if your environment is different (for example, Aerospike database running on a different host or port).
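As an alternative to grepping, the JSON printed by docker inspect can be parsed directly. The following is a rough sketch, assuming the usual layout in which each container object lists its addresses under NetworkSettings.Networks; the helper name `container_ips` and the sample document are made up for illustration.

```python
import json


def container_ips(inspect_json):
    """Map each container name to its per-network IP addresses."""
    result = {}
    for container in json.loads(inspect_json):
        networks = container.get("NetworkSettings", {}).get("Networks", {})
        result[container.get("Name", "")] = {
            net: info.get("IPAddress", "") for net, info in networks.items()
        }
    return result


# Trimmed, made-up sample of `docker inspect` output, for illustration only.
sample = (
    '[{"Name": "/some-mongo", "NetworkSettings": '
    '{"Networks": {"bridge": {"IPAddress": "172.17.0.3"}}}}]'
)
ips = container_ips(sample)
print(ips)
```

In a live environment the input could come from subprocess.run(["docker", "inspect", "some-mongo"], capture_output=True, text=True).stdout, and the resulting address would be the value to use for MONGO_HOST.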
# Define a few constants
AEROSPIKE_HOST = "127.0.0.1"
AEROSPIKE_PORT = 3000
AEROSPIKE_NAMESPACE = "test"
AEROSPIKE_SET = "demo"
MONGO_HOST = "172.17.0.3"
MONGO_PORT = 27017
MONGO_DB = "test-database"
MONGO_COLLECTION = "test-collection"
# Aerospike configuration
aero_config = {
    'hosts': [ (AEROSPIKE_HOST, AEROSPIKE_PORT) ]
}
try:
    aero_client = aerospike.client(aero_config).connect()
except Exception as e:
    print("Failed to connect to the cluster with", aero_config['hosts'], "-", e)
    sys.exit(1)
print("Connected to Aerospike")

# Mongo configuration
try:
    mongo_client = MongoClient(MONGO_HOST, MONGO_PORT)
    # MongoClient connects lazily; a ping forces a round trip so that
    # connection failures surface here rather than on the first query
    mongo_client.admin.command("ping")
    print("Connected to Mongo")
except Exception as e:
    print("Failed to connect to Mongo:", e)
    sys.exit(1)
Output:
Connected to Aerospike
Connected to Mongo
Store data in Mongo and clear the keys in Aerospike if any
db = mongo_client[MONGO_DB]
collection = db[MONGO_COLLECTION]
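def store_data(data_id, data):
    m_data = {data_id: data}
    # Start from an empty collection so only this document exists
    collection.drop()
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    # Clear any cached copy so the next read goes to Mongo
    try:
        aero_client.remove(aero_key)
    except aerospike.exception.RecordNotFound:
        pass  # nothing cached for this key
    collection.insert_one(m_data)

store_data("key", "value")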
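Clearing the cached key on every write keeps the cache from serving a stale value after the primary store changes. A minimal sketch of that write path, again assuming dictionaries in place of Mongo and Aerospike; `write_and_invalidate` is a hypothetical name, not a function from this notebook.

```python
primary = {"key": "old"}   # stand-in for the Mongo collection
cache = {"key": "old"}     # stand-in for the Aerospike set, already populated


def write_and_invalidate(data_id, data):
    """Write to the primary store, then evict any cached copy of the key."""
    primary[data_id] = data
    # pop with a default is a no-op when the key was never cached.
    cache.pop(data_id, None)


write_and_invalidate("key", "new")
write_and_invalidate("other", "x")   # nothing cached for this key; still safe
print(primary, cache)
```

Evicting rather than updating the cache entry means the next read repopulates it from the primary store, so the cache never holds a value the primary store did not return.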
Fetch the data. In this instance we are using a simple key-value pair.
If the data exists in the cache, it is served from there. If not, it is read from Mongo and put in the cache, so that the next read is served from Aerospike.
def get_data(data_id, data):
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    #aero_client.remove(aero_key)
    # exists() returns (key, meta); meta is None when the record is absent
    data_check = aero_client.exists(aero_key)
    if data_check[1]:
        (key, metadata, record) = aero_client.get(aero_key)
        print("Data retrieved from Aerospike cache")
        print("Record::: {} {}".format(data_id, record['value']))
    else:
        mongo_data = collection.find_one({data_id: data})
        if mongo_data is None:
            print("Data not found in Mongo for {}".format(data_id))
            return
        print("Data not present in Aerospike cache, retrieved from mongo {}".format(mongo_data))
        # Populate the cache so the next call is served from Aerospike
        aero_client.put(aero_key, {'value': mongo_data[data_id]})

get_data("key", "value")
Output (when the key is already in the cache):
Data retrieved from Aerospike cache
Record::: key value
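A cache entry written this way lives until it is removed. One possible variation, sketched below, returns the value instead of printing it and passes meta={'ttl': ...} to put() so that Aerospike can expire the entry. The function name `get_data_cached`, the `ttl_seconds` parameter, and the two Fake classes are hypothetical; the fakes exist only so the sketch runs without a database, and they mimic just the exists/get/put and find_one calls used in this notebook. Whether a write with a TTL is accepted may also depend on the namespace configuration of the server.

```python
def get_data_cached(aero_client, collection, aero_key, data_id, data, ttl_seconds=60):
    """Look-aside read that returns the value and lets cache entries expire."""
    _, meta = aero_client.exists(aero_key)
    if meta is not None:
        _, _, record = aero_client.get(aero_key)
        return record["value"]
    doc = collection.find_one({data_id: data})
    if doc is None:
        return None  # absent from the primary store too; cache nothing
    # meta={'ttl': n} asks Aerospike to expire the record after n seconds.
    aero_client.put(aero_key, {"value": doc[data_id]}, meta={"ttl": ttl_seconds})
    return doc[data_id]


class FakeAerospike:
    """In-memory stand-in exposing only exists/get/put, for illustration."""

    def __init__(self):
        self.records = {}
        self.last_meta = None

    def exists(self, key):
        return key, ({"gen": 1} if key in self.records else None)

    def get(self, key):
        return key, {"gen": 1}, self.records[key]

    def put(self, key, bins, meta=None):
        self.records[key] = bins
        self.last_meta = meta


class FakeCollection:
    """In-memory stand-in exposing only find_one, for illustration."""

    def __init__(self, docs):
        self.docs = docs

    def find_one(self, query):
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


aero = FakeAerospike()
coll = FakeCollection([{"key": "value"}])
key = ("test", "demo", "key")
first = get_data_cached(aero, coll, key, "key", "value", ttl_seconds=30)   # miss
second = get_data_cached(aero, coll, key, "key", "value", ttl_seconds=30)  # hit
print(first, second, aero.last_meta)
```

With the real clients from this notebook, the call would be get_data_cached(aero_client, collection, (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, "key"), "key", "value").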