This notebook describes the pattern of using a local in-memory cache in
front of the Aerospike database. Using a simple model, it shows how a
local in-memory cache can enhance performance for specific hit-ratio and
cache-speed scenarios.
This notebook requires the Aerospike database running on localhost, and
Python and the Aerospike Python client installed (pip install aerospike).
Visit the Aerospike notebooks repo for additional details and the Docker
container.
Introduction
Caches are ubiquitous. Aerospike is commonly and effectively deployed as
a cache in front of a backend database that is remote, slow, and/or
limited in throughput. This notebook illustrates the use of a local cache
on the client machine that sits in front of the Aerospike database, and
the specific scenarios in which it can be beneficial. The pattern is
applicable to a standalone Aerospike database as well as to deployments
where Aerospike itself acts as a cache.
A cache provides faster access and improved throughput by having a copy
of the data closer to where it is processed. Cache libraries, external
caching servers, and distributed cache infrastructure are deployed for
specific caching needs and solutions. Aerospike CacheDB is designed for
fast, reliable, consistent, and cost-effective access across the
globally distributed data infrastructure.
The notebook first illustrates a local in-memory cache fronting the
Aerospike database through a simple interface, and then analyses the
performance impact of a cache in various scenarios through a simple
mathematical model.
Prerequisites
This tutorial assumes familiarity with the following topics:
Aerospike and its API. See Hello World.
This notebook requires that the Aerospike database is running. [Include the
right code cell for Java or Python from the two cells below.]
!asd >&/dev/null
!pgrep -x asd >/dev/null && echo "Aerospike database is running!"|| echo "**Aerospike database is not running!**"
Output:
Aerospike database is running!
Connect to database.
We need a client connected to the database.
# import the module
from __future__ import print_function
import aerospike

# Configure the client
config = {
    'hosts': [ ('127.0.0.1', 3000) ],
    'policy': {'key': aerospike.POLICY_KEY_SEND}
}

# Create a client and connect it to the cluster
try:
    client = aerospike.client(config).connect()
except:
    import sys
    print("failed to connect to the cluster with", config['hosts'])
    sys.exit(1)
print('Client successfully connected to the database.')
Output:
Client successfully connected to the database.
Populate the database with test data.
The following code populates the set "local_cache_tutorial" in namespace
"test" with test data. The data consists of 10000 records with user keys
'id-1' through 'id-10000' and bins populated with random data.
namespace = 'test'
tutorial_set = 'local_cache_tutorial'
max_data_size = 10000
# Records are addressable via a tuple of (namespace, set, key)
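A minimal sketch of such a population loop is shown below; the bin name 'value' and the random-integer bin data are assumptions made for illustration.
import random

# Write max_data_size records with user keys 'id-1' .. 'id-<max_data_size>'.
for i in range(1, max_data_size + 1):
    user_key = 'id-' + str(i)
    key = (namespace, tutorial_set, user_key)
    bins = {'value': random.getrandbits(63)}  # illustrative random bin data
    client.put(key, bins)
print('Test data populated.')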
We will illustrate a local cache fronting the Aerospike database. The
key benefit of a local cache stems from its proximity and hence its speed
advantage: Aerospike provides an average access time of a millisecond or
less for 99%+ of requests, while local in-memory cache access can be in
the microsecond range. Even so, as we will see, the memory size
limitations of a local cache need to be taken into account, as do cost
considerations, since a local cache needs to be implemented at each
client host.
First, let us define a simple interface for the local cache.
Cache Interface
The cache consists of cache entries with the following key operations:
get(key)
Get from cache if available and TTL is not expired, else
retrieve from the database and add to the cache.
update(key, data)
Update the database by replacing the current record, and also
add or replace the cache entry.
add(key, entry)
If the cache is full, evict an appropriate entry, then add the new
cache entry.
Cache Eviction
A local cache can be implemented with TTL-based garbage collection.
Another alternative is to maintain a Least-Recently-Used (LRU) list and
remove the LRU entry when the cache is full to make room for a new entry.
LRU can be implemented by maintaining a doubly linked list of cache
entries - below, the classes CacheEntry and LocalCache allude to LRU
links, but the implementation is not provided here.
A simplistic eviction scheme, used below for illustrative purposes,
selects an arbitrary entry for eviction using dict.popitem(), although
better randomized eviction is possible with implementations like
randomdict.
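A minimal sketch of CacheEntry and LocalCache consistent with the interface above is shown next. The default TTL value, the use of the client connection created earlier, and eviction via dict.popitem() are illustrative choices; the LRU links are omitted.
import time

class CacheEntry:
    """A cached copy of a record with an expiry time (LRU links omitted)."""
    def __init__(self, data, ttl):
        self.data = data
        self.expiry = time.time() + ttl

class LocalCache:
    """A simple local in-memory cache fronting the Aerospike client."""
    def __init__(self, max_size, ttl=60):
        self.max_size = max_size
        self.ttl = ttl          # default TTL in seconds for cache entries (illustrative)
        self.entries = {}       # key tuple -> CacheEntry

    def get(self, key):
        # Serve from the cache if present and not expired,
        # else read from the database and cache the result.
        entry = self.entries.get(key)
        if entry is not None and entry.expiry > time.time():
            return entry.data
        record = client.get(key)          # (key, meta, bins)
        self.add(key, CacheEntry(record, self.ttl))
        return record

    def update(self, key, bins):
        # Replace the record in the database and refresh the cache entry.
        client.put(key, bins)
        self.add(key, CacheEntry(client.get(key), self.ttl))

    def add(self, key, entry):
        # If the cache is full, evict an arbitrary entry (simplistic eviction).
        if key not in self.entries and len(self.entries) >= self.max_size:
            self.entries.popitem()
        self.entries[key] = entry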
Working with Aerospike TTL
Aerospike has TTL-based eviction that allows for:
No eviction, so that the record is permanent.
A specific TTL set at record creation.
A TTL update at any time.
Since the TTL can change at the origin while the record resides in the
local cache, the local cache syncs up with the origin only at the time of
caching. This is no different from the case where the record is deleted
at the origin while a copy remains in the cache.
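For example, a record's TTL can be controlled from the Python client roughly as follows; the key and bin values here are purely illustrative.
key = (namespace, tutorial_set, 'id-1')

# Set a specific TTL (in seconds) when creating or updating a record.
client.put(key, {'value': 1}, meta={'ttl': 300})

# Make the record permanent (no TTL-based eviction).
client.put(key, {'value': 1}, meta={'ttl': aerospike.TTL_NEVER_EXPIRE})

# Update the TTL of an existing record at any time, without changing its bins.
client.touch(key, 600)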
There are two key factors that in general impact cache performance.
Hit ratio H, or the fraction of requests served directly from the
cache without having to go to the origin server. Hit ratio depends
on factors like the cache size, invalidation (update) rate, and
access pattern.
Speed ratio S, or how fast cache access is compared to access to the
origin server; equivalently, the speed gain of a cache hit over a
cache miss.
Below we run a simple test and compare execution times for “without
cache” and “with cache” scenarios over a large number of requests. (Feel
free to experiment with different values of data_size and cache_size to
adjust the hit ratio. Speed ratio for this implementation can vary and
is unknown.)
import random
import time

data_size = 5000
cache_size = 2000
print('Cache hit ratio: ', '%3.2f'%(cache_size/data_size))

num_requests = 10000

start_time = time.time()
for i in range(num_requests):
    user_key = 'id-' + str(random.randint(1, data_size))
    key = (namespace, tutorial_set, user_key)
    _ = client.get(key)
time_without_cache = time.time() - start_time
print('Execution time without cache: ', '%5.3fs'%time_without_cache)

start_time = time.time()
cache = LocalCache(cache_size)
for i in range(num_requests):
    user_key = 'id-' + str(random.randint(1, data_size))
    key = (namespace, tutorial_set, user_key)
    _ = cache.get(key)
time_with_cache = time.time() - start_time
print('Execution time with cache: ', '%5.3fs'%time_with_cache)
print('Speedup with cache: ', '%5.1f%%'%((time_without_cache/time_with_cache - 1) * 100))
Output:
Cache hit ratio: 0.40
Execution time without cache: 0.637s
Execution time with cache: 0.479s
Speedup with cache: 33.0%
Effective Local Cache Scenarios
The theoretical performance boost from a cache ranges from none (no
improvement) when the hit ratio (H) is 0, meaning all requests are
served from the origin server, up to the speed ratio (S) when H is 1,
meaning all requests are served from the cache.
When the Aerospike database is used as a cache, it provides performance
and throughput gains through:
a robust hit ratio, with its ability to scale to the petabyte range and
its mechanisms to keep the cache in sync with the origin.
a huge speed ratio, with its predictable sub-millisecond access time;
the speedup over the origin database can range from 100 to 1000 or more.
So when can a local cache be beneficial in conjunction with Aerospike? A
local cache can be effective if the data for local caching is targeted
judiciously in order to keep the hit ratio high in spite of the local
cache's size limitations. That is, by targeting data that is frequently
read, infrequently updated, and performance critical. The benefit will
need to be balanced against the cost of deploying the cache across
multiple client machines.
Cache Performance Model
We use a simple mathematical model, with the two cache parameters
discussed above (hit ratio and speed ratio), to examine the theoretical
performance.
Speedup Formula
The following discussion is applicable to both a local cache in front of
the Aerospike database, as well as Aerospike Cache in front of a backend
database.
For N random accesses made directly to the origin database, the time
needed is N * T, where T is the average access time for the origin
database.
With a hit ratio of H and speed ratio of S:
N*H requests are directly served from the cache each with T/S
access time. The total access time for cache served requests is
N*H*T/S.
The remaining N*(1-H) requests are served from the origin in
N*(1-H)*T time.
The total access time with a cache is the addition of the above two:
N*H*T/S + N*(1-H)*T.
The speedup in time and throughput is the ratio of total time
without a cache to the total time with a cache. That is, N*T /
[N*H*T/S + N*(1-H)*T] or 1/(1 - H + H/S).
Note, as expected, that as H approaches 1 (all requests served from the
cache), the speedup approaches S, the cache speed ratio, and as H
approaches 0, the speedup approaches 1 (no speedup).
The following code implements the speedup function with parameters H and
S, and computes the speedup for various H and S values.
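A minimal sketch of such a function, evaluated over a few illustrative H and S values, might look like this:
def speedup(H, S):
    # Speedup from the model above: 1 / (1 - H + H/S)
    return 1.0 / (1.0 - H + H / S)

# Evaluate the model for a few illustrative hit ratios and speed ratios.
for S in [10, 100, 1000]:
    for H in [0.5, 0.9, 0.99]:
        print('S = %4d  H = %4.2f  speedup = %6.1f' % (S, H, speedup(H, S)))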
Thus, a cache can provide a performance and throughput boost in both
scenarios:
A local cache fronting the Aerospike database, if a high hit ratio
can be attained by targeting the right data.
Aerospike Cache fronting a backend database, because Aerospike
provides a large cache size (and hence a high hit ratio) and
sub-millisecond access time (and hence a high speed ratio).
Takeaways
The key takeaways are:
A local cache can augment the performance and throughput of the
Aerospike database. A local cache can be expensive as it has to be
deployed at each client host, but it can be effective if a high hit
ratio is achievable with repeat access to a small subset of data.
Aerospike provides a significant performance and throughput boost over
the backend database due to its large cache size and automatic sync
mechanisms (which typically translate into a higher hit ratio), and
its sub-millisecond access time.
Visit the Aerospike notebooks repo to run additional Aerospike notebooks.
To run a different notebook, download the notebook from the repo to your
local machine, and then click on File > Open, and select Upload.