
How PhonePe runs real-time transactions, AI feature stores, and governance with Aerospike

At the Aerospike Bangalore Summit, PhonePe shared how it uses Aerospike to power 360 million daily transactions with real-time performance, reliability, and compliance.

November 11, 2025 | 7 min read
Alexander Patino
Solutions Content Leader

As one of India’s largest digital payment platforms, PhonePe delivers instant, trusted financial transactions for hundreds of millions of people. On any given day, more than 360 million payments move through PhonePe’s network, from street vendors scanning QR codes to online merchants processing cross-border transactions.

At this scale, speed alone isn’t enough. Data must be correctly partitioned across tenants, replicated accurately across sites, and auditable by regulators. Scaling for reliability, governance, and cost efficiency at the same time has become one of the company’s defining technical challenges.

At the Aerospike Bangalore Summit, Koushik Ramachandra, software architect at PhonePe, shared how Aerospike’s role expanded from powering payment authentication, fraud checks, and caching systems to serving as the backbone of PhonePe’s governance and compliance architecture.

What operating at a national scale demands

Every tap, scan, or transfer on PhonePe triggers dozens of concurrent database calls: verifying user identity, checking balances, updating ledgers, and issuing confirmations. Even the smallest data mismatch can ripple into millions of failed transactions and regulatory risk.

As Ramachandra put it, “We have about 650 million registered users and 4.5 million registered merchants. For any infrastructure component, we want the setup to be lean. You don’t want 100-node, 200-node clusters, which become an operational nightmare.”

To meet these stakes, PhonePe built an infrastructure that combines real-time responsiveness with financial-grade reliability. Across its three data centers in Mumbai, Bangalore, and a hybrid on-prem-Azure environment, PhonePe maintains:

  • Half a million queries per second (QPS) on real-time transactional workloads

  • Sub-millisecond read latencies to support instant payment flows

  • Three-site active-active replication with strong consistency where required

  • 160 terabytes of data per site and over one trillion data records in total

  • 50 Aerospike clusters per site, each designed for rapid provisioning and horizontal growth
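To put these figures in proportion, here is a rough back-of-envelope calculation that uses only the numbers quoted above. It assumes traffic and data are spread evenly, which real workloads are not, so treat the results as orders of magnitude rather than measurements.

```python
# Back-of-envelope arithmetic using only the figures quoted in the article.
SECONDS_PER_DAY = 24 * 60 * 60

daily_payments = 360_000_000   # "more than 360 million payments" per day
quoted_qps = 500_000           # "half a million queries per second"
data_per_site_tb = 160         # "160 terabytes of data per site"
clusters_per_site = 50         # "50 Aerospike clusters per site"

# Average payment rate if traffic were spread evenly over the day
# (an assumption; real payment traffic is bursty).
avg_payments_per_sec = daily_payments / SECONDS_PER_DAY

# Ratio of the quoted query rate to the average payment rate.
# This is a ratio of two headline numbers, not a measured per-payment count.
queries_per_avg_payment = quoted_qps / avg_payments_per_sec

# Mean data volume per cluster, assuming an even split (also an assumption).
avg_tb_per_cluster = data_per_site_tb / clusters_per_site

print(f"{avg_payments_per_sec:,.0f} payments/s on average")
print(f"{queries_per_avg_payment:.0f} queries per average-rate payment")
print(f"{avg_tb_per_cluster:.1f} TB per cluster on average")
```

The averages hide peaks: festival and sale traffic concentrates payments into far shorter periods than an even daily spread suggests.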


When real-time speed isn’t enough

For years, Aerospike has powered PhonePe’s high-performance payment and operational systems. These clusters handle everything from primary key lookups and payment authorization to device registration and fraud checks.

But as PhonePe’s ecosystem matured, new challenges emerged that went well beyond throughput and caching. “You want governance built into scale, not bolted on later,” Ramachandra said.

In financial services, where data moves across multiple partners and regulators, that translates into engineering realities such as:

  • Data isolation between partners and institutions

  • Role-based access controls (RBAC) and fine-grained permissions

  • Retention and purge policies that meet evolving regulatory standards

  • Fault-tolerant, multi-data center consistency

  • Comprehensive audit trails and schema-level governance


Evolving Aerospike into PhonePe’s trust engine

To meet its dual mandate of real-time speed and regulatory trust, PhonePe reimagined how Aerospike fits within its broader architecture. 

Instead of using Aerospike solely for real-time transactional workloads, such as payments and authentication, the engineering team began using it as the core governance layer. 

Aerospike’s shared-nothing architecture and predictable performance gave PhonePe the foundation to extend its database beyond transactional use cases. By embedding governance controls directly within the data layer, the team avoided the need for complex middleware or bolt-on compliance frameworks. The result is a platform that meets the long-term rigor of regulated systems while sustaining real-time, low-latency performance at national scale.

Instant transactions at population scale

Cross-datacenter latency is ultimately limited by the speed of light. As Ramachandra explained, “Setting up geo-replicated clusters in the country is incredibly hard. The network round-trip time between Bombay and Bangalore is about 20 milliseconds. That’s insane if you’re talking transactional workloads.”

That latency constraint shaped how PhonePe built its real-time data fabric. Rather than chasing theoretical zero latency, the team engineered for predictability, so that every operation behaves reliably even under imperfect network conditions. With Aerospike’s Cross Datacenter Replication (XDR) and PhonePe’s in-house conflict-management logic, transactions remain dependable even when users switch regions mid-flow.
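The talk did not describe PhonePe’s conflict-management logic in detail, so the sketch below is not their implementation. It illustrates one common approach to the same problem, last-writer-wins with a deterministic tiebreak, using hypothetical names (`Version`, `resolve`) and made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    updated_at_ms: int  # wall-clock time of the write, in milliseconds
    site: str           # data center that accepted the write


def resolve(local: Version, remote: Version) -> Version:
    """Pick a winner deterministically so every site converges on one value."""
    # The newer timestamp wins. Ties break on site name, so the result does
    # not depend on which site runs the comparison.
    return max(local, remote, key=lambda v: (v.updated_at_ms, v.site))


# Hypothetical concurrent writes to the same record at two sites.
mumbai = Version("PENDING", updated_at_ms=1_000, site="mumbai")
bangalore = Version("AUTHORIZED", updated_at_ms=1_015, site="bangalore")

# Both sites reach the same answer regardless of argument order.
assert resolve(mumbai, bangalore) == resolve(bangalore, mumbai)
print(resolve(mumbai, bangalore).value)
```

Last-writer-wins is simple but relies on reasonably synchronized clocks and silently discards the losing write, which is one reason payment systems often layer domain-specific rules on top of it.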

Real-time intelligence across the data layer

Beyond transactional systems, Aerospike powers PhonePe's real-time data platform, serving as a feature store and aggregate store for analytics, personalization, and risk scoring. These clusters handle hundreds of thousands of queries per second, supporting use cases from location-based services to fraud detection models that rely on instant access to the latest transaction data.

Each Aerospike cluster manages a blend of workloads: small, high-frequency lookups for live transactions and larger scans that feed analytical processes. Strong consistency ensures that data used for feature computation and serving always reflects the latest state of the system.

This layer bridges real-time operations and analytical insight. It enables teams to keep feature data current in real time, unifying transactional and analytical processes across the platform. Aerospike’s predictable low-latency performance allows the same infrastructure to power both immediate decisions and deeper analytics, reducing duplication and simplifying data flow across the system.

Caching for continuous availability

Aerospike also anchors PhonePe’s transient data layer, storing short-lived metadata for payment flows, merchant onboarding, and document handling.

Deployed across both VM-based and bare-metal environments, this layer handles tens of thousands of requests per second with predictable latency and minimal operational overhead. Strict time-to-live (TTL) controls handle automatic cleanup without creating fragmentation or flash wear.
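To illustrate what TTL-based cleanup means in practice, here is a toy in-memory store with per-record expiry. It is a sketch of the semantics only: the class `TtlStore`, the key format, and the 300-second TTL are assumptions made for the example, and none of it is Aerospike’s API or PhonePe’s configuration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlStore:
    """Toy in-memory key-value store with per-record expiry (not Aerospike's API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._records: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any, ttl_seconds: float) -> None:
        # Store the value alongside its absolute expiry time.
        self._records[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._records.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the record lazily on read and report a miss.
            del self._records[key]
            return None
        return value


# A fake clock keeps the example deterministic.
now = [0.0]
store = TtlStore(clock=lambda: now[0])
store.put("otp:txn-123", {"attempts": 0}, ttl_seconds=300)

print(store.get("otp:txn-123"))  # read while the record is still live
now[0] = 301.0
print(store.get("otp:txn-123"))  # read after the TTL has passed
```

The point of the pattern is that callers never issue explicit deletes for short-lived data: they state a lifetime at write time and the store handles removal.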

By replacing multiple cache frameworks with a single Aerospike layer, PhonePe ensures responsiveness and resilience during traffic spikes.

Governance built into the core

Within its governance layer, Aerospike serves as the foundation for PhonePe’s schema registry, ensuring every data definition and event is traceable and auditable from the moment it’s created.

This design meets the stringent demands of financial-grade workloads by embedding governance directly into the data layer rather than adding it afterward. With Aerospike at its core, PhonePe has created an architecture that blends performance with precision, where real-time operations and long-term compliance coexist in the same system.

Aerospike enables this by supporting logical data isolation for each partner or business line, enforced through a custom RBAC layer built on top of its namespaces. Retention policies can be fine-tuned to match each institution’s regulatory requirements (whether data must be preserved for five years, 10 years, or longer) without compromising performance.
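As a rough illustration of how per-tenant isolation, role checks, and retention rules can be expressed together, consider the sketch below. The tenant names, role names, namespace names, and helper functions are all hypothetical; PhonePe’s custom RBAC layer was not described at this level of detail.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class TenantPolicy:
    namespace: str        # storage namespace dedicated to the tenant
    retention_years: int  # how long records must be kept


# Hypothetical tenants and roles, for illustration only.
POLICIES: Dict[str, TenantPolicy] = {
    "bank_a": TenantPolicy(namespace="ns_bank_a", retention_years=5),
    "bank_b": TenantPolicy(namespace="ns_bank_b", retention_years=10),
}
ROLE_GRANTS: Dict[str, FrozenSet[Tuple[str, str]]] = {
    "bank_a_auditor": frozenset({("ns_bank_a", "read")}),
    "bank_a_service": frozenset({("ns_bank_a", "read"), ("ns_bank_a", "write")}),
}


def is_allowed(role: str, namespace: str, action: str) -> bool:
    """Deny by default; allow only explicit (namespace, action) grants."""
    return (namespace, action) in ROLE_GRANTS.get(role, frozenset())


def purge_eligible(tenant: str, created: date, today: date) -> bool:
    """True once a record has outlived its tenant's retention period."""
    years = POLICIES[tenant].retention_years
    try:
        cutoff = created.replace(year=created.year + years)
    except ValueError:
        # 29 February with a non-leap target year: roll forward to 1 March.
        cutoff = date(created.year + years, 3, 1)
    return today >= cutoff


print(is_allowed("bank_a_auditor", "ns_bank_a", "read"))
print(is_allowed("bank_a_auditor", "ns_bank_b", "read"))
print(purge_eligible("bank_a", date(2020, 1, 1), date(2025, 11, 11)))
```

Keeping the grant table keyed by namespace means a role for one partner cannot reach another partner’s data unless someone adds an explicit grant, which is the property auditors typically ask to see.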

Aerospike also underpins compliance-driven purge and archival processes, minimizing fragmentation and flash wear even during large-scale deletions. Active-active replication ensures that these controls are extended across data centers, coupling compliance and resilience within a single architecture.


Results that keep India’s payments moving

When PhonePe’s governance-driven architecture went live, the gains were immediate and measurable. Across environments, the platform now delivers:

  • Consistent sub-millisecond reads even during peak festival or sale traffic

  • More than half a million queries per second without latency spikes

  • Predictable horizontal growth with minimal operational effort

  • Auditable data lineage and isolation verified within the database itself

Lessons from PhonePe’s experience

PhonePe’s journey with Aerospike is a playbook for building systems that blend performance, predictability, and governance at scale:

  • Governance must be native, not layered on: Compliance and isolation shouldn’t live in middleware. By embedding governance within the data layer, PhonePe removed the operational drag that usually follows audits and partner integrations.

  • Predictability matters more than peak performance: In real-time finance, a single millisecond of delay can cascade into customer impact. Aerospike’s predictable latency under mixed workloads gave PhonePe the stability to scale confidently.

  • Unified architecture beats complexity: A three-site active-active setup might sound complicated, but compared to orchestrating multiple database and cache frameworks, Aerospike’s architectural simplicity reduced maintenance overhead and failure points.

  • Observability and automation are the next frontiers: Understanding that even the most reliable systems evolve, PhonePe’s engineers continue to advocate for stronger configuration management, benchmark tooling, and community-driven observability frameworks.

What’s next for PhonePe

After stabilizing real-time performance and governance, PhonePe’s engineers have turned their focus to rethinking how data moves through the system.

The team is experimenting with a micro-batching pipeline built directly on Aerospike, inspired by Apache Spark’s predictable time windows, to replace heavier Apache Flink and Apache Kafka stacks. By processing events in tightly bounded batches of about 100 milliseconds, they aim to reduce infrastructure overhead while preserving real-time guarantees across regions. Early prototypes show promise in lowering costs and simplifying data movement between clusters for both analytical and operational workloads.
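The core idea of micro-batching, assigning each event to a fixed time window and processing one window at a time, can be sketched in a few lines. This is a generic illustration under stated assumptions: the function `micro_batches`, the event shape, and the sample data are hypothetical, and the sketch says nothing about how PhonePe’s prototype stores or moves batches in Aerospike.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_MS = 100  # batch width quoted in the article ("about 100 milliseconds")

Event = Tuple[int, str]  # (event timestamp in ms, payload)


def micro_batches(
    events: Iterable[Event], window_ms: int = WINDOW_MS
) -> Dict[int, List[str]]:
    """Group events into fixed, non-overlapping windows keyed by window start."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for ts_ms, payload in events:
        # Integer division assigns each event to exactly one window.
        window_start = (ts_ms // window_ms) * window_ms
        batches[window_start].append(payload)
    # Return windows in time order so downstream processing is deterministic.
    return dict(sorted(batches.items()))


# Hypothetical events, deliberately out of order.
events = [(5, "a"), (99, "b"), (100, "c"), (250, "d"), (199, "e")]
for start, payloads in micro_batches(events).items():
    print(f"[{start}, {start + WINDOW_MS}) ms -> {payloads}")
```

Bounding the window bounds the added latency: an event waits at most one window width before its batch can be processed, which is the trade the team is making against per-event streaming.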
