Look-Aside Cache for MongoDB
This sample notebook uses Aerospike as a read (look-aside) cache.
- It demonstrates Aerospike as a cache in front of Mongo, which serves as the primary datastore.
- Mongo must be running as a separate container, started with
docker run --name some-mongo -d mongo:latest
To test: run get_data(key, value) once to fetch the data from Mongo and populate Aerospike. A second run will fetch the data from the Aerospike cache.
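The read path described above can be sketched without either database. This is a minimal sketch, assuming plain dictionaries stand in for Aerospike (the cache) and Mongo (the primary store); `look_aside_get`, `cache`, and `primary` are hypothetical names and not part of either client library.

```python
# Plain dicts stand in for the two datastores (an assumption for illustration).
primary = {"key": "value"}   # plays the role of Mongo
cache = {}                   # plays the role of Aerospike


def look_aside_get(key):
    """Return (value, source), where source names the store that served the read."""
    if key in cache:
        return cache[key], "cache"
    value = primary.get(key)
    if value is not None:
        cache[key] = value  # populate the cache for the next read
    return value, "primary"


first = look_aside_get("key")    # miss: served by the primary store
second = look_aside_get("key")   # hit: served by the cache
print(first)
print(second)
```

The first call is served by the primary store and fills the cache; the second call for the same key is served by the cache.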
Ensure that the Aerospike Database is running
!asd >& /dev/null
!pgrep -x asd >/dev/null && echo "Aerospike database is running!" || echo "**Aerospike database is not running!**"
Output:
Aerospike database is running!
Import all dependencies
import aerospike
import pymongo
from pymongo import MongoClient
import sys
Configure the clients
The configuration is for
- Aerospike database running on port 3000 of localhost (IP 127.0.0.1) which is the default.
- Mongo running in a separate container whose IP can be found by
docker inspect <containerid> | grep -i ipaddress
Modify config if your environment is different (Aerospike database running on a different host or different port).
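One way to adapt the configuration without editing the notebook is to read overrides from environment variables. This is a minimal sketch, not part of the original notebook: `load_settings` and the environment variable names are hypothetical, and the fallback values are the defaults used in this notebook.

```python
import os


def load_settings(env=os.environ):
    """Build connection settings, falling back to this notebook's defaults."""
    return {
        "aerospike_host": env.get("AEROSPIKE_HOST", "127.0.0.1"),
        "aerospike_port": int(env.get("AEROSPIKE_PORT", "3000")),
        "mongo_host": env.get("MONGO_HOST", "172.17.0.3"),
        "mongo_port": int(env.get("MONGO_PORT", "27017")),
    }


# Pass a plain dict to simulate an environment where only the Mongo IP differs.
settings = load_settings({"MONGO_HOST": "172.17.0.5"})
print(settings)
```

Only the overridden value changes; every other setting keeps its default.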
# Define a few constants
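AEROSPIKE_HOST = "127.0.0.1"  # localhost, matching the default described above
AEROSPIKE_PORT = 3000
AEROSPIKE_NAMESPACE = "test"
AEROSPIKE_SET = "demo"
MONGO_HOST = "172.17.0.3"  # replace with the IP reported by docker inspect
MONGO_PORT = 27017
MONGO_DB = "test-database"
MONGO_COLLECTION = "test-collection"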
# Aerospike configuration
aero_config = {
    'hosts': [(AEROSPIKE_HOST, AEROSPIKE_PORT)]
}
try:
    aero_client = aerospike.client(aero_config).connect()
except Exception:
    print("Failed to connect to the cluster with", aero_config['hosts'])
    sys.exit(1)
print("Connected to Aerospike")
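# Mongo configuration
try:
    mongo_client = MongoClient(MONGO_HOST, MONGO_PORT)
    # MongoClient connects lazily, so ping the server to verify the connection
    mongo_client.admin.command("ping")
    print("Connected to Mongo")
except Exception:
    print("Failed to connect to Mongo")
    sys.exit(1)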
Output:
Connected to Aerospike
Connected to Mongo
Store data in Mongo and clear the keys in Aerospike if any
db = mongo_client[MONGO_DB]
collection = db[MONGO_COLLECTION]
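def store_data(data_id, data):
    m_data = {data_id: data}
    # Start from an empty collection so that only this document exists
    collection.drop()
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    # Uncomment to clear any cached copy of this key
    #aero_client.remove(aero_key)
    collection.insert_one(m_data)

store_data("key", "value")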
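Clearing the cached key whenever the primary copy changes keeps a later read from returning a stale value. This is a minimal sketch, assuming dictionaries stand in for both stores; `store_and_invalidate` is a hypothetical helper and not part of the notebook.

```python
primary = {}              # stands in for the Mongo collection
cache = {"key": "old"}    # stands in for Aerospike, holding a stale copy


def store_and_invalidate(key, value):
    """Write to the primary store, then drop any cached copy of the key."""
    primary[key] = value
    cache.pop(key, None)  # no error if the key was never cached


store_and_invalidate("key", "value")
print(primary, cache)
```

After the write, the next read for this key misses the cache and picks up the new value from the primary store.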
Fetch the data. In this instance we are using a simple key-value pair.
If the data exists in the cache, it is served from there. If not, it is read from Mongo and put in the cache so that the next call is a cache hit.
def get_data(data_id, data):
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    #aero_client.remove(aero_key)
    # exists() returns (key, metadata); metadata is None when the record is absent
    data_check = aero_client.exists(aero_key)
    if data_check[1]:
        (key, metadata, record) = aero_client.get(aero_key)
        print("Data retrieved from Aerospike cache")
        print("Record::: {} {}".format(data_id, record['value']))
    else:
        mongo_data = collection.find_one({data_id: data})
        print("Data not present in Aerospike cache, retrieved from mongo {}".format(mongo_data))
        # Populate the cache so that the next call is served by Aerospike
        aero_client.put(aero_key, {'value': mongo_data[data_id]})

get_data("key", "value")
Output:
Data retrieved from Aerospike cache
Record::: key value
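get_data prints its result and assumes the document exists in Mongo. A variant that returns the value and tolerates a missing document might look like the sketch below. It is an illustration under stated assumptions: `FakeCache` and `FakeCollection` are hypothetical in-memory stand-ins that mimic only the calls used in this notebook (`exists`, `get`, `put`, `find_one`), so the block runs without either database, and `fetch` is a hypothetical name.

```python
class FakeCache:
    """In-memory stand-in exposing only exists/get/put as used in this notebook."""

    def __init__(self):
        self._records = {}

    def exists(self, key):
        meta = {"gen": 1} if key in self._records else None
        return (key, meta)

    def get(self, key):
        return (key, {"gen": 1}, self._records[key])

    def put(self, key, bins):
        self._records[key] = dict(bins)


class FakeCollection:
    """In-memory stand-in exposing only find_one with an equality query."""

    def __init__(self, docs):
        self._docs = docs

    def find_one(self, query):
        for doc in self._docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


def fetch(cache_client, coll, data_id, data):
    """Return (value, source); value is None when neither store has the data."""
    aero_key = ("test", "demo", data_id)
    _, meta = cache_client.exists(aero_key)
    if meta:
        _, _, record = cache_client.get(aero_key)
        return record["value"], "cache"
    doc = coll.find_one({data_id: data})
    if doc is None:
        return None, "missing"  # nothing to cache
    cache_client.put(aero_key, {"value": doc[data_id]})
    return doc[data_id], "mongo"


cache_client = FakeCache()
coll = FakeCollection([{"key": "value"}])
results = [fetch(cache_client, coll, "key", "value") for _ in range(2)]
missing = fetch(cache_client, coll, "absent", "x")
print(results, missing)
```

Returning the source alongside the value makes it easy to check that the first call is a miss and the second is a hit.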