User Profile Store: The gateway to everything personal

George Demarest
Director of Product Marketing
August 30, 2023|8 min read

Any web-facing application with a large user base requiring any kind of personalization starts with a user profile. In fact, user profile stores are one of the initial “killer apps” for NoSQL key-value databases because serving large user bases (say hundreds of thousands to millions of users) was one of the initial challenges that “broke the internet” (or, at least, broke relational databases). With user profile stores running into the millions, billions, and now trillions of records, getting an instantaneous response requires a unique set of capabilities from your real-time data platform.

New requirements of a user profile store

A user profile store combines current, historical, and derived user/consumer data with behavioral attributes. Retail and e-commerce use it to make consumer engagement more effective, but the list of industries that rely on fast user profile access is much longer.

User/consumer expectations for banks, telecommunications providers, e-commerce companies, and retailers are sky-high. When a user visits your website, you only have an instant to decide how best to engage them.

Have they been to your site before? Why are they coming back? Do they want to buy something? Return an item for a refund? File a complaint? Follow up on a previous visit?

In terms of compute power and workloads, here are a few data points:

  • Xandr (formerly AppNexus) performs more transactions in a day than Visa, Nasdaq, and the New York Stock Exchange do in a month, with 100% uptime for “many months.”

  • Adobe Experience Platform sees a peak of 100,000 events per second, with between 1,000 and 10,000 segment evaluations per event, and needs latencies of under 100 milliseconds at the edge.

  • The Trade Desk processes 800 billion queries per day, at a clip of 10 million queries per second with under 8 milliseconds latency.
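
As a quick sanity check on figures like these, a per-second rate and a daily total can be converted into each other. The sketch below assumes a constant rate over a full 24-hour day, which real traffic will not follow, so treat the results as rough bounds rather than reported numbers. The function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_total(rate_per_second: float) -> float:
    """Operations in one day if the per-second rate never changed."""
    return rate_per_second * SECONDS_PER_DAY


def average_rate(total_per_day: float) -> float:
    """Average per-second rate implied by a daily total."""
    return total_per_day / SECONDS_PER_DAY


# 10 million queries per second, sustained all day, is 864 billion queries.
sustained = daily_total(10_000_000)

# 800 billion queries per day averages out to roughly 9.26 million per second.
implied = average_rate(800e9)

print(f"{sustained:,.0f} queries/day, {implied:,.0f} queries/second on average")
```

The two numbers quoted for The Trade Desk are consistent with each other: the daily total implies an average rate just under the 10 million per second figure.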

Given these demands, the requirements for a scalable user profile store include real-time operations on large, continuously updated datasets, 24x7x365 availability, easy integration with business systems, and the scale to handle thousands of cascading database interactions at the same time.

Transformation in Ad Tech: When 5 nines is not enough

In many Ad Tech use cases, you must discover everything possible about a user and apply algorithmic rules to determine, construct, and render a response in just a few milliseconds. This user information resides in cookie stores (soon to be deprecated), which include Internet access patterns, personal data, shopping habits, credit scores, and demographics. To access this cookie store and feed the decisioning and rendering engines, one inbound transaction may result in hundreds or even thousands of cascading database interactions.

Much has been made of the oncoming “cookie apocalypse,” and Ad Tech, where user profile stores are a foundational technology, stands out as their ultimate proving ground. “In AdTech, user profile stores are all about identity,” says Daniel Landsman, global director of AdTech and gaming solutions. You should read his blog on the subject, The Year of Identity.

What many don’t realize about Ad Tech identity workloads is that they are among the most punishing real-time workloads on earth. One firm processes about 10 to 12 billion transactions on any given day. Another reports that it is doing 50 million queries per second.

“The way I like to think about it is we do more in a day than Visa and Nasdaq and NYSE do in a month,” said Timothy Smith, SVP and GM at AppNexus, now a Microsoft company. To get a feel for that kind of scale, listen to what Tim says in his presentation:

“So, getting back to the stock markets in the world, we don’t close ever, seven by 24. There is no closing bell, there is no time that we take to plan offline. So we just have to keep it running all the time, and that’s why the target is 100% uptime versus 99.999. If we don’t deliver at least 99.9 every single day [that’s 86.4 seconds of downtime], it’s a really bad time. And it’s usually 99.99 [that’s 8.64 seconds], and for many months it’s just 100%. It’s not 100% all year, but it’s pretty close.”
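
The bracketed downtime figures in the quote follow directly from the length of a day. Here is a minimal sketch of that arithmetic, assuming availability is measured over a single 24-hour day; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def downtime_seconds_per_day(availability_percent: float) -> float:
    """Seconds of downtime allowed in one day at a given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return SECONDS_PER_DAY * unavailable_fraction


# 99.9% leaves about 86.4 seconds, 99.99% about 8.64 seconds,
# and "five nines" (99.999%) under one second per day.
for percent in (99.9, 99.99, 99.999):
    seconds = downtime_seconds_per_day(percent)
    print(f"{percent}% uptime allows about {seconds:.3f} seconds of downtime per day")
```

Each additional nine cuts the daily allowance by a factor of ten, which is why five nines already leaves less than a second per day.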

Identity graphs and the Adobe Experience Platform

The nature of how companies track users or customers is expanding in unexpected ways. At the recent Aerospike Summit, Sandeep Nawathe, VP of Engineering at Adobe, discussed providing real-time services to their customers via the Adobe Experience Platform.

As one might suspect, identity resolution is part of the architecture of the Experience Platform. Really, any use case that involves interacting with a large user base needs a user profile store. But the sheer volume of data means it must be segmented in real time to deliver the instantaneous results needed. Adobe is seeing a peak of 100,000 segmentation events per second, with between 1,000 and 10,000 segment evaluations per event. Those events need to be processed in under 100 milliseconds at the edge.

To get a real flavor of the complexity of the challenge and the elegance of the solution, view the on-demand presentation from Adobe on the Aerospike Summit site.

User profiles becoming multi-model

While user profile stores started off being served well by a fast, efficient key-value data model, the nature of the profiles themselves is also changing. Nawathe of Adobe also described how a single user profile may require up to 20 different datasets, and a single user might require up to 50 separate identities to be represented in an identity graph. Identity graphs are growing in popularity as one of the next waves of technology needed by Ad Tech and Mar Tech for identity resolution and for a different method of targeting, especially once web cookies become a thing of the past.
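
To make the idea of an identity graph a little more concrete, here is a minimal sketch that links identifiers observed together and resolves them to a single person. It uses a union-find (disjoint-set) structure, which is one common way to group linked identifiers. It is not a description of how Adobe or Aerospike Graph implement identity resolution, and the class name, method names, and identifier strings are all hypothetical.

```python
class IdentityGraph:
    """Groups identifiers that have been linked to each other (union-find)."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, identifier: str) -> str:
        # Register unseen identifiers as their own group.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while identifier != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)

    def identities_of(self, identifier: str) -> set[str]:
        """All known identifiers in the same group as the given one."""
        root = self._find(identifier)
        return {i for i in list(self._parent) if self._find(i) == root}


graph = IdentityGraph()
graph.link("cookie:abc123", "email:hash-1")   # hypothetical identifiers
graph.link("email:hash-1", "device:ios-42")
graph.link("cookie:zzz999", "email:hash-2")

print(graph.same_person("cookie:abc123", "device:ios-42"))
print(sorted(graph.identities_of("device:ios-42")))
```

In this toy version, the cookie and the device are never linked directly, yet they resolve to the same person because both are linked to the same hashed email. A production identity graph would also need to handle link confidence, unlinking, and scale, none of which is attempted here.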

While Adobe and PayPal have implemented their own identity graphs using the Aerospike Database, it is a good time to note that Aerospike has recently added support for the graph data model and graph queries in the form of Aerospike Graph. With new datasets being added all the time to be associated with a customer, a document model will also come in handy.

I’ll let you draw your own conclusions about whether you pursue a best-of-breed vs. a multi-model approach to supporting different data models, but there are significant advantages to using Aerospike for graph, key-value, and document data. You deal with one vendor/support team, you can consolidate skill sets, and you benefit from the overall scalability, efficiency, low latency, and low TCO of the Aerospike Real-time Data Platform.

TransUnion and identity resolution

Signal (acquired by TransUnion) created an identity resolution platform. It is used by marketing teams that have a huge amount of data, some online, some in their CRMs, and some in offline sources. They can bring all that data into the TruAudience platform and use it to target their existing customers all over the internet.

Signal replaced their existing data store because of latency issues, which were slowing down virtually every element of their business processes.

“The main drivers that made Aerospike so attractive was its total cost of ownership was far lower than the competitive offerings that we had evaluated.” – Jason Yanowitz, former Vice President of Engineering at TransUnion.

During the initial deployment, Signal tested up to 8 million transactions per second and saw the p50 at 10 microseconds, which was almost a thousand times faster than what their original solution could deliver. Their p99s went from 3,900 milliseconds to 23 milliseconds, and they gained greater reliability system-wide. Some processes that used to take six days now take 14 hours. Things that took three hours take three minutes now. Check out the case study and video.
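
For a sense of proportion, the before-and-after figures above can be turned into speedup ratios. The sketch below uses only the numbers quoted in this section, converted to common units; the helper name is illustrative.

```python
def speedup(before: float, after: float) -> float:
    """How many times faster 'after' is than 'before' (same units)."""
    return before / after


# p99 latency: 3,900 ms down to 23 ms, roughly 170x.
p99_ratio = speedup(3_900, 23)

# Six days (144 hours) down to 14 hours, roughly 10x.
long_job_ratio = speedup(6 * 24, 14)

# Three hours (180 minutes) down to three minutes, exactly 60x.
short_job_ratio = speedup(3 * 60, 3)

print(f"p99: {p99_ratio:.0f}x, long jobs: {long_job_ratio:.1f}x, short jobs: {short_job_ratio:.0f}x")
```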

Conclusion

Aerospike is present in a significant number of multinational firms that play in the Ad Tech space. As you have seen above, Aerospike withstands a torrent of user profile, fraud detection, and identity resolution operations at a scale of millions per second or billions per day. Not everyone needs to work on that scale, but nearly everyone grapples with the challenge of managing and resolving identity with a database.

Having optimal user profile stores and instant identity resolution is a basic challenge for most large-scale web applications. These are two areas where higher latencies can directly affect – and destroy – the user experience before users even start using your application.

The companies mentioned above have chosen Aerospike to power these linchpin use cases to ensure their customers’ journey isn’t over before it starts.

For a more technical treatment of the topic, I recommend Ronen Botzer’s blog: Aerospike Modeling: User Profile Store – Audience Segmentation for Personalization.