How Unity replaced Redis to scale its ad platform to 10 million ops per second
See how Unity moved from Redis to Aerospike to power 10 million operations per second with sub-10ms latency across 100TB of data for real-time mobile game ads.
If you played a mobile game this week, odds are high that Unity was involved. But Unity is more than a game engine. The company also powers multiplayer sessions, analytics, and live operations, and increasingly, the ads that fund free-to-play games.
Mobile game advertising seems simple at first glance: a player sees an ad, clicks it, and maybe installs the advertised game. But behind the scenes, Unity's ad platform must decide which ad to show, to which player, and at which moment, and it must do so unobtrusively, in the few milliseconds before the player notices any delay. Each of these systems faces similar infrastructure challenges: real-time decisions, predictable latency, and global scale.
This is the story of how Unity used Aerospike to build an ads infrastructure that handles over 10 million database operations per second, with sub-10 millisecond latency across 100 terabytes of data.
The broader AdTech ecosystem and ironSource
The monetization engine Unity runs today was originally built at ironSource, a company that pioneered rewarded video ads for mobile games. With ironSource, if a player needed extra in-game currency, they could watch a short video ad and earn a reward rather than having to pay directly. This way, the advertiser got in front of engaged users, the publisher earned revenue, and the player got what they needed without spending money.
The company scaled rapidly. By the time Unity acquired ironSource in 2022, the platform was already reaching millions of users and driving thousands of installs per minute. But that growth came with architectural debt.
ironSource started with Redis for key-value storage, which worked well enough at a smaller scale, but as traffic grew, the Redis cluster proved hard to scale. Sharding, resharding, and data partitioning had to be handled in application code, and eventually, the system couldn't keep up.
As Gil Shoshan, a back-end developer on ironSource's original mobile delivery team, explained: "This legacy architecture that was created over the years became scattered and complex. To get a full view of a transaction, we had to query several databases. This legacy infrastructure became a technical burden, holding us back."
Monetizing games means real-time infrastructure
Most mobile games are free-to-play; ads fund them. According to Unity’s financial reports, the segment that includes ads and monetization contributes over half of Unity's total revenue.
But revenue depends on ads that feel native to gameplay. A rewarded video that interrupts the action frustrates players; one that appears at a natural pause, such as between rounds or after a level, feels like part of the experience. Relevance matters too. An ad for a puzzle game shown to someone who only plays shooters is a wasted impression.
Unity needed three things to make this work. First, the ability to process hundreds of thousands of requests per second, each one handled quickly and reliably. Second, machine learning (ML) that ran fast enough to personalize ads in real time. And third, a database that could keep up, grow with the business, and stay flexible as requirements change.
How Aerospike powers the system
The first service ironSource migrated to Aerospike was one of its most critical: user counter capping.
To avoid showing the same ad to the same player too many times, the platform tracks counters for each campaign, user, and time window, including hourly, daily, and lifetime caps. The requirements were demanding: 100,000 read operations and 70,000 write operations per second, with responses in under 100 milliseconds. Redis had handled this at a smaller scale, but as Shoshan explained, "We couldn't add any more IOPS or storage, and our DevOps team's effort to support the cluster was becoming overbearing."
ironSource tested Aerospike as a replacement. The Redis migration was straightforward; the API felt intuitive to developers, and support for complex data types (CDTs) allowed the team to maintain the same data schema they already had. Counter data fit naturally into Aerospike's native map operations, which are atomic by default.
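The counter scheme described above can be sketched in a few lines. This is a minimal, hypothetical simulation of the capping data model: one map per campaign-user pair, with counters bucketed by time window. In production the map would be an Aerospike CDT bin updated with atomic map operations; here a plain Python dict stands in for the record, and the cap limits are illustrative.

```python
import time

# Illustrative cap limits per time window (not Unity's actual values).
CAPS = {"hourly": 3, "daily": 10, "lifetime": 50}

def window_keys(now=None):
    """Bucket keys for the hourly, daily, and lifetime windows."""
    t = time.gmtime(now if now is not None else time.time())
    return {
        "hourly": time.strftime("%Y%m%d%H", t),
        "daily": time.strftime("%Y%m%d", t),
        "lifetime": "all",
    }

def try_impression(record, now=None):
    """Check every cap; only if all pass, increment each window's counter.
    In Aerospike this check-and-increment would be a single atomic operation."""
    keys = window_keys(now)
    for window, limit in CAPS.items():
        if record.get(window, {}).get(keys[window], 0) >= limit:
            return False  # capped: do not serve this ad
    for window in CAPS:
        bucket = record.setdefault(window, {})
        bucket[keys[window]] = bucket.get(keys[window], 0) + 1
    return True

record = {}
served = sum(try_impression(record) for _ in range(5))
print(served)  # 3 of 5 attempts pass with an hourly cap of 3
```

Keeping all windows in one map means a single read answers every cap question for a user, which is why the structure mapped so cleanly onto Aerospike's map operations.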
The results came fast. "Very quickly, we managed to stabilize the system and achieve our performance goals," Shoshan said. The team reached 130,000 reads and 75,000 writes per second with 16 instances. Uptime hit nearly 100 percent. For the capping service specifically, latency dropped to under 1 millisecond.
But the more telling feedback came from operations. When asked about the Aerospike cluster for a conference presentation, one DevOps team member replied, "I barely touch the cluster. It simply works."
That stability changed how ironSource thought about Aerospike. What started as a fix for one overloaded service became a platform the organization could build on. Additional use cases followed: session state, behavioral histories, and ML feature storage. Each one benefited from the same characteristics that made capping work.
Under Unity, the Aerospike deployment has grown far beyond those early numbers. Today, the system maintains sub-10-millisecond latency across 100 terabytes of data and 95 billion objects while serving 10 million operations per second.
What makes Aerospike fit this workload comes down to a few key capabilities:
Hybrid Memory Architecture: Aerospike keeps indexes in RAM while data lives on SSDs, enabling cache-like read performance without the cost of a pure in-memory database like Redis. This is what allows Unity to achieve sub-millisecond reads at massive scale without requiring tens of terabytes of RAM.
Single-read efficiency: No matter how complex the query, whether it involves nested maps and lists or multiple data types, Aerospike retrieves everything in one read. This keeps response times low even as data structures grow more sophisticated.
Schema flexibility: There is no need to define a schema in advance. As Unity's ML and ad teams evolve their models and feature definitions, the data layer adapts without migrations or downtime.
Atomic operations on CDTs: Maps, lists, and nested structures can be updated in place, atomically, without locking. This is exactly what made the capping service work: counter updates that are guaranteed to be consistent even under massive concurrency.
What has to happen in under 120 milliseconds
When a player triggers an ad request, a dense sequence of operations unfolds, all of which must be completed in under 120 milliseconds. Anything slower risks a visible delay, and in mobile gaming, delay is unrecoverable lost revenue.
As Or Arnon, who works on the infrastructure team at Unity, put it, "You snooze, you lose."
Request enrichment
The moment Unity's SDK sends a request, gateway services attach context: region, device type, app version, session history, and player attributes. This metadata shapes everything downstream; for example, a player in Tokyo on an older Android device will see different ads than a player in London on the latest iPhone.
This enrichment step must complete in single-digit milliseconds, because everything else depends on it.
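A hypothetical sketch of what enrichment looks like: gateway services merge contextual attributes onto the raw SDK request before any downstream decision runs. The field names and lookup tables here are illustrative, not Unity's actual schema.

```python
# Merge geo, device, and session context onto the incoming SDK request.
def enrich(request, geo_lookup, device_db, session_store):
    ctx = dict(request)
    ctx["region"] = geo_lookup.get(request["ip"], "unknown")
    ctx["device_tier"] = device_db.get(request["device_model"], "mid")
    ctx["session_history"] = session_store.get(request["user_id"], [])
    return ctx

req = {"ip": "203.0.113.7", "device_model": "Pixel 4", "user_id": "u42"}
enriched = enrich(
    req,
    geo_lookup={"203.0.113.7": "ap-northeast-1"},
    device_db={"Pixel 4": "mid"},
    session_store={"u42": ["puzzle_game", "runner_game"]},
)
print(enriched["region"])  # ap-northeast-1
```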
Eligibility filtering
Next, the system narrows the candidate pool. Budget pacing checks whether an advertiser has exhausted their daily spend. Frequency caps check whether this player has already seen a particular ad too many times today, this hour, or ever. Publisher rules check whether the game developer allows this type of creative.
Each of these checks requires atomic counter reads and writes against Aerospike, often thousands of them per request, all happening at extreme concurrency across millions of simultaneous players. If the counters are slow or inconsistent, the wrong ads get served, budgets overspend, and players see the same ad repeatedly.
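The filter chain can be sketched as a sequence of checks, each able to remove a candidate. This is a simplified, hypothetical model; all names and thresholds are illustrative, and the counter reads that would hit Aerospike are stand-in dict lookups.

```python
# Each check mirrors one of the eligibility rules: budget pacing,
# frequency caps, and publisher creative rules.
def eligible(ad, player_counters, budgets, publisher_rules):
    if budgets.get(ad["advertiser"], 0) <= 0:
        return False  # budget pacing: daily spend exhausted
    if player_counters.get(ad["campaign"], 0) >= ad["frequency_cap"]:
        return False  # frequency cap reached for this player
    if ad["creative_type"] in publisher_rules.get("blocked_types", ()):
        return False  # publisher disallows this creative type
    return True

ads = [
    {"advertiser": "a1", "campaign": "c1", "frequency_cap": 3, "creative_type": "video"},
    {"advertiser": "a2", "campaign": "c2", "frequency_cap": 3, "creative_type": "playable"},
]
pool = [ad for ad in ads if eligible(
    ad,
    player_counters={"c1": 3},          # this player already hit c1's cap
    budgets={"a1": 120.0, "a2": 40.0},
    publisher_rules={"blocked_types": ("banner",)},
)]
print([ad["campaign"] for ad in pool])  # ['c2']
```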
ML feature retrieval and scoring
Before the models can score anything, they need features: behavioral signals, historical engagement data, player preferences, and contextual attributes. These features now live in Aerospike and must be retrieved in microseconds. Multiple models are then run in parallel, each estimating a different dimension such as likelihood to engage, expected revenue value, or relevance to this specific player.
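Fanning out to several models at once can be sketched with a thread pool. The three "models" below are trivial stand-ins for Unity's actual ML; only the shape of the step, fetch features once, score every dimension in parallel, is the point.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in models, one per scoring dimension.
def p_engage(features):      # likelihood to engage
    return 0.8 if "puzzle_game" in features["history"] else 0.2

def exp_revenue(features):   # expected revenue value
    return 0.05 * len(features["history"])

def relevance(features):     # relevance to this player
    return 1.0 if features["region"] != "unknown" else 0.5

def score(features):
    """Run every model concurrently and collect their results by name."""
    models = {"engage": p_engage, "revenue": exp_revenue, "relevance": relevance}
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, features) for name, fn in models.items()}
        return {name: f.result() for name, f in futures.items()}

scores = score({"history": ["puzzle_game"], "region": "ap-northeast-1"})
print(scores["engage"])  # 0.8
```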
Ranking and selection
Finally, the engine ranks the scored candidates. The winning ad is selected based on a combination of predicted engagement, advertiser bid, and platform objectives. The result is returned to the SDK, and the player sees the ad.
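A minimal sketch of the ranking step, assuming a simple linear combination of predicted engagement, bid, and a platform objective term; the weights and field names are hypothetical, not Unity's actual scoring formula.

```python
# Pick the candidate with the highest combined score.
def rank(candidates, platform_weight=0.1):
    def combined(ad):
        return ad["p_engage"] * ad["bid"] + platform_weight * ad["objective_fit"]
    return max(candidates, key=combined)

winner = rank([
    {"id": "ad_a", "p_engage": 0.8, "bid": 2.0, "objective_fit": 0.5},
    {"id": "ad_b", "p_engage": 0.3, "bid": 6.0, "objective_fit": 0.9},
])
print(winner["id"])  # ad_b: 0.3*6.0 + 0.1*0.9 = 1.89 beats 0.8*2.0 + 0.1*0.5 = 1.65
```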
Attribution, analytics, and FunnelData
The same infrastructure that serves ads in real time also powers what happens after: attribution, analytics, and fraud detection. When a player sees an ad today and installs the game tomorrow, Unity needs to accurately connect those events, even when the process spans hours or days and crosses multiple apps.
To handle this, Unity built FunnelData, a unified system of record that stores each user's full journey in a single Aerospike record. The scale is substantial: 28 terabytes of data, 12 billion objects, with sub-2-millisecond latency.
But the real story is what this architecture enables:
Instant, accurate attribution: When a player installs a game days after seeing an ad, the install event slots into the correct user record. Late conversions, in-app purchases, and downstream events all resolve to the right attribution window.
Fraud detection against complete data: Reconciling user journeys across fragmented systems creates gaps that fraudsters exploit. A unified record closes those gaps.
Better ML training data: Models trained on months of behavioral history outperform models limited to recent sessions. FunnelData gives Unity's data science teams access to the full picture.
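The single-record design can be sketched as one journey map per user, against which a late install resolves to the impression that preceded it. The event fields and the seven-day attribution window below are illustrative assumptions, not FunnelData's actual schema.

```python
ATTRIBUTION_WINDOW = 7 * 24 * 3600  # assumed window, in seconds

def append_event(journey, event):
    """Append an event to the user's single journey record."""
    journey.setdefault("events", []).append(event)

def attribute_install(journey, install_ts):
    """Find the most recent ad impression inside the attribution window."""
    candidates = [e for e in journey.get("events", [])
                  if e["type"] == "impression"
                  and 0 <= install_ts - e["ts"] <= ATTRIBUTION_WINDOW]
    return max(candidates, key=lambda e: e["ts"], default=None)

journey = {}
append_event(journey, {"type": "impression", "campaign": "c9", "ts": 1_000})
append_event(journey, {"type": "click", "campaign": "c9", "ts": 1_050})
install_ts = 1_000 + 2 * 24 * 3600  # install arrives two days later
hit = attribute_install(journey, install_ts)
print(hit["campaign"])  # c9
```

Because the whole journey lives in one record, attribution is a single read followed by a scan of one map, rather than a join across several databases.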
FunnelData runs on Aerospike's all-flash architecture, where both data and indexes live on SSDs. This might seem like a performance compromise, but the numbers tell a different story. As Arnon explained, "In the normal world, if we wanted sub-2 millisecond performance, we would have to copy everything to memory. We would need about 80 terabytes of RAM."
Instead, Unity uses 1.2 terabytes of RAM total, roughly 1.5 percent of what a traditional solution would require. All data and indexes live on disk, yet every record, whether accessed a second ago or untouched for months, returns in under 2 milliseconds.
"There are all kinds of solutions and smart patents here, and we can use Aerospike to do that," Arnon said. "We just enjoy it without thinking about it."
The strategic impact of Aerospike
Unity has been running Aerospike in production since 2016, well before acquiring ironSource. What began as a lightweight use of the Community Edition evolved into a core part of Unity’s infrastructure, entirely separate from ironSource’s adoption.
Today, thousands of production pods talk to 14 Aerospike clusters. Over 7,400 CPU cores run Aerospike workloads, many of them on AWS Graviton instances, which Arnon described as "the meeting point between efficient compute and an excellent database."
The infrastructure enables Unity to deliver better ads at the moment they matter most. Predictable latency means the platform can run more sophisticated targeting and ranking logic without exceeding the 120-millisecond window. Access to long-term behavioral data in FunnelData improves model accuracy. Bounded tail latency reduces the blast radius of traffic spikes. Operational simplicity frees engineering time for product work.
"From the early days to the crazy scale we run at today, Aerospike is an integral part of our story," Arnon said. "This is not some kind of POC. It is a working platform."
