Blog

How a hyperscale fintech company cut infrastructure costs by 75% and read latencies by 88% with Aerospike

A hyperscale fintech company powers millions of daily transactions in India. Discover how Aerospike helped reduce its infrastructure costs by 75% and decrease read latencies by 88%.

November 19, 2025 | 7 min read
Alexander Patino
Solutions Content Leader

A hyperscale fintech processing millions of daily transactions needed a data platform that could keep pace with its rapid growth across payments, rewards, and credit products. What began as a credit card bill payments platform in 2018 has grown into a broader ecosystem of over 10 million high-credit-score users, offering rewards, rent payments, short-term credit lines, and investment access.

But as new business lines grew, the company’s DynamoDB-based system began to strain under the load, slowing performance and driving up costs. This limited the hyperscale fintech company's ability to power use cases such as user experience optimization, fraud prevention, and campaign targeting in real time.

At the Aerospike Bangalore Summit, the company's Engineering Leader shared how the team rebuilt their core platforms around Aerospike, replacing fragmented data paths with a scalable foundation capable of millisecond response times. The rebuild cut infrastructure costs by 75% and read latencies by 88%.

Why the fintech company needed a new data foundation

The hyperscale fintech company's architecture was originally built on Amazon DynamoDB, with DynamoDB Accelerator (DAX) to cache frequent reads in memory and S3 for long-term storage. While that setup worked well in the company’s early growth stages, it began to show strain as traffic surged and new workloads emerged:

  • High-concurrency reads: Personalization and campaign systems needed rapid lookups across millions of user attributes. Even with the DAX cache in place, performance variability under heavy load made it hard to meet latency targets.

  • Duplicate state across clusters: Maintaining separate DynamoDB instances for different products led to data fragmentation and consistency issues.

  • Escalating network costs: Shadow clusters used for fault tolerance generated heavy cross-availability zone (AZ) traffic and inflated egress bills.

  • Limited stream integration: The hyperscale fintech company's personalization and fraud systems relied heavily on Apache Flink for stream processing, but long window joins between event streams and stored user data introduced latency and unpredictable compute costs. This made it difficult to perform real-time campaign optimization or experimentation at scale.

As their Engineering Leader put it: “One of the challenges that we faced in this particular design was that our write latencies were quite high, and the cost of running this platform was coming majorly from the data store that we were using.” To evolve beyond incremental optimizations, the company needed a data platform that could deliver predictable sub-millisecond reads and writes, horizontal scalability, and cost efficiency without compromising on uptime or accuracy.

Five signs you've outgrown DynamoDB

Discover DynamoDB's hidden costs and constraints — from rising expenses to performance limits. Learn how modern databases can boost efficiency and cut costs by up to 80% in our white paper.

A zero-downtime migration to Aerospike

Migrating a live financial platform comes with one rule: no downtime. “The migration from DynamoDB to Aerospike was actually very crucial because this is a Tier-0 (‘T-zero’) system in our organization. If this platform goes down, the app stops working,” the Engineering Leader said. 

The engineering team began by mapping every DynamoDB table and workload to Aerospike namespaces, running both in parallel to validate data accuracy and latency performance. They also used “read repairs,” automatically updating Aerospike whenever older entities were queried, so lagging records could catch up in real time. Each rollout was staged in phases, first reads and then writes, with clear standard operating procedures and rollback plans to ensure the app never experienced downtime.
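
As a rough illustration of that read-repair pattern, here is a minimal Python sketch assuming the `aerospike` and `boto3` client libraries; the host, namespace, set, and table names are hypothetical, not the company's actual configuration.

```python
import aerospike
import boto3
from aerospike import exception as aero_ex

# Illustrative hosts, namespace, set, and table names; not the company's actual setup.
client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
users_table = boto3.resource("dynamodb").Table("user_profiles")

def get_user(user_id: str) -> dict:
    """Serve reads from Aerospike; backfill not-yet-migrated records from DynamoDB."""
    key = ("app", "users", user_id)  # (namespace, set, user key)
    try:
        _, _, bins = client.get(key)
        return bins
    except aero_ex.RecordNotFound:
        # "Read repair": fetch from the source of truth and write it into
        # Aerospike so the next read for this user is served locally.
        item = users_table.get_item(Key={"user_id": user_id}).get("Item") or {}
        if item:
            client.put(key, dict(item))  # type conversion (e.g. Decimal) omitted for brevity
        return dict(item)
```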

There were also some key design principles that guided the migration:

  • Predictable scale and performance: In DynamoDB, partition growth and throughput rebalancing can introduce latency variability during sudden traffic surges. By contrast, Aerospike’s shared-nothing architecture maintains consistent performance as data and load increase. It scales linearly across nodes with full operator control, allowing the company to handle campaign peaks and product launches with predictable response times and cost efficiency.

  • Local-read topology: Today, most reads are served from the nearest node, while asynchronous replication ensures durability across AZs. The result is a 50% reduction in network and infrastructure spend, along with more predictable latency for campaigns and credit products that rely on live eligibility checks.

  • In-place schema evolution: Aerospike’s flexible data model allowed new attributes, such as user risk scores or behavioral segments, to be added without full table rebuilds, reducing rollout times for new product features (see the sketch after this list).
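
To illustrate the in-place schema evolution point from the list above, here is a hedged sketch using the Aerospike Python client; the namespace, set, key, and bin names are assumptions for demonstration only.

```python
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

key = ("app", "users", "user-12345")  # hypothetical namespace, set, and key

# Aerospike records are schemaless: writing a bin that existing records
# don't have simply adds it, with no table rebuild or migration job.
client.put(key, {"risk_score": 0.42, "behavior_segment": "frequent_payer"})

# Records that were never updated just come back without the new bins.
_, _, bins = client.get(key)
print(bins.get("risk_score"))
```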

Powering real-time personalization and campaigns

By the time the migration was completed, Aerospike was serving hundreds of thousands of requests per second with sub-millisecond read times, all without a single service outage. This lower read latency had a cascading effect on the company's experimentation and personalization engines, which rely on live user attributes. 

The Engineering Leader noted that because user-level reads were now near-instant, the experimentation platform could safely run more concurrent tests without slowing the app. Even campaign creation became faster. For example, growth teams could iterate on multiple offers in parallel, confident that the underlying data would update consistently across systems.

From batch updates to live user attributes

Before the migration, campaign operations relied on batch calculations to estimate audience reach. As the Engineering Leader explained, campaign managers often created complex targeting expressions without knowing how many users would actually qualify until hours later, sometimes discovering they had only reached “a thousand users” despite having far larger budgets.

By representing each segment as a bitmap, a compact binary encoding of the set of user IDs, the team could run set operations (union, intersection, exclusion) in memory on Aerospike with near-instant results. “Now, whenever a campaign is created, someone can select segments and get the reach calculation within a second,” the Engineering Leader said. “That solves a lot of misconfiguration and gives a better understanding of what the user is trying to target.”
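
To make the bitmap idea concrete, here is a minimal, self-contained sketch that uses plain Python integers as bit sets; it shows only the set algebra and reach count, not the company's actual segment service.

```python
def make_bitmap(user_ids):
    """Build a bitmap in which bit i is set when user ID i belongs to the segment."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap

# Hypothetical segments keyed by small numeric user IDs.
high_credit = make_bitmap([1, 2, 5, 8, 13])
rent_payers = make_bitmap([2, 3, 5, 21])
churn_risk = make_bitmap([5, 8])

# Targeting expression: (high_credit AND rent_payers) EXCLUDING churn_risk.
audience = (high_credit & rent_payers) & ~churn_risk

# Reach is the population count of the resulting bitmap.
print(f"Estimated reach: {bin(audience).count('1')} users")  # -> 1
```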

To make the system production-ready, the data platform team automated how segments were refreshed. A batch pipeline in Databricks generates bitmap files for each segment and publishes their locations to Kafka. A downstream service then consumes those Kafka messages and writes the updates into Aerospike as bitmap data. 

Because each bitmap for roughly 50 million users is only about six megabytes, the update process is fast and cost-efficient. The result is a continuously updated audience store that supports both batch-driven and on-the-fly targeting, which the campaign teams now rely on for every product launch.
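
Purely as a hedged sketch of that downstream updater, the snippet below assumes the `kafka-python` and `aerospike` client libraries; the topic name, message format, and bin names are invented, and the file download is left as a stub.

```python
import json

import aerospike
from kafka import KafkaConsumer  # kafka-python

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

# Each message names a segment and points to the freshly built bitmap file.
consumer = KafkaConsumer(
    "segment-bitmaps",                       # hypothetical topic name
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

def fetch_bitmap(location: str) -> bytes:
    """Placeholder: download the bitmap file (e.g. from object storage)."""
    raise NotImplementedError

for message in consumer:
    segment_name = message.value["segment_name"]
    bitmap_bytes = fetch_bitmap(message.value["file_location"])
    # Store the bitmap as a blob bin keyed by segment name, ready for set operations.
    client.put(("app", "segments", segment_name), {"bitmap": bytearray(bitmap_bytes)})
```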

Streaming feedback for live campaign performance

Next, the team focused on improving campaign performance analytics. In the past, user engagement with promotions (such as clicks or redemptions) was processed in batch jobs, delaying insights and limiting real-time experimentation. 

“The problem with the earlier architecture was that feedback signals were running on batch; so the insight into how the content was doing was delayed,” the Engineering Leader explained. “We wanted to complement our batch jobs with a real-time counterpart that could do this much faster.”

By integrating Aerospike with Apache Flink and Kafka, the company made it possible to join serving and interaction events in real time. With this architecture, content and campaign performance can now be evaluated continuously, giving growth teams a live view of user engagement and conversion across the app.
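
The production system performs this join in Flink; as a purely conceptual sketch in plain Python, the snippet below matches interaction events to serving events by a shared user/campaign key within a fixed window, with every field name and the window length assumed.

```python
from collections import defaultdict

JOIN_WINDOW_SECONDS = 300  # assumed join window; the real system tunes this per use case

# Serving events seen recently, keyed by (user_id, campaign_id).
pending_serves = defaultdict(list)

def on_serving_event(event: dict) -> None:
    """Remember that a piece of campaign content was shown to a user."""
    pending_serves[(event["user_id"], event["campaign_id"])].append(event["timestamp"])

def on_interaction_event(event: dict):
    """Join a click or redemption back to the serve that produced it."""
    key = (event["user_id"], event["campaign_id"])
    for served_at in pending_serves.get(key, []):
        if 0 <= event["timestamp"] - served_at <= JOIN_WINDOW_SECONDS:
            return {
                "campaign_id": event["campaign_id"],
                "user_id": event["user_id"],
                "time_to_engage": event["timestamp"] - served_at,
            }
    return None  # interaction with no matching serve inside the window
```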

Aerospike vs. DynamoDB: See the benchmark results

DynamoDB struggles to maintain performance at scale, and its pricing only worsens as you grow. If your applications demand predictable low latency, high throughput, and operational affordability, Aerospike is the better choice. The results are clear: Aerospike outperforms on every front, including latency, throughput, and total cost of ownership, at every scale.

Outcomes at scale

The hyperscale fintech company's shift to Aerospike turned a complex, high-cost data infrastructure into a unified real-time platform that now serves multiple product lines with predictable performance and dramatically lower overhead:

  • 75% lower infrastructure costs after removing shadow clusters and reducing cross-AZ egress

  • 88% faster read performance from localized reads and Aerospike’s shared-nothing architecture

  • Sub-second campaign reach calculation using bitmaps and stream joins, replacing batch jobs that once took minutes

  • Consistent millisecond-level latency across credit, payments, and rewards workloads, even during peak event windows

Lessons from a fintech company's experience

The hyperscale fintech company's engineering journey shows how simplifying infrastructure can amplify intelligence across the business. Their experience offers some clear takeaways:

  • Plan for reliability first: The company's “T-zero” rule forced rigor at every layer of the migration plan. Treating reliability as a first-class constraint keeps engineering decisions grounded and makes innovation safer to ship.

  • Simplify early to scale faster: DynamoDB and DAX had reached their limits because of accumulated complexity from multiple clusters, cache dependencies, and rising cross-AZ costs. Moving to a unified Aerospike architecture restored predictability and gave the company direct control over performance and scaling.

  • Design for reuse across workloads: Once campaign targeting, personalization, and fraud detection shared the same real-time data layer, the payoff multiplied. Every new workload could reuse the same fast, reliable foundation instead of building another silo. Efficiency came not from adding tools, but from aligning around one source of truth.

DynamoDB Migration Guide

DynamoDB is great when you’re just getting started, but once traffic spikes, cost and complexity follow. If you’re scaling up and struggling with inconsistent throughput, slow auto-scaling, or unpredictable latencies, it’s time to look at Aerospike. This guide shows you how to make the switch cleanly and confidently.