Customer story

Enabling Real-time Personalization via Machine Learning

About Sony Interactive Entertainment

Sony Interactive Entertainment is a multinational video game and digital entertainment company owned by Sony Group Corporation with headquarters in the U.S. and Japan. PlayStation is Sony’s popular video gaming brand consisting of game consoles, handhelds, a media center, a smartphone, online services, and magazines. The company has sold over 545 million PlayStation consoles globally. Its online service, the PlayStation Network, includes a virtual market to purchase and download games and multimedia, a subscription-based online service called PlayStation Plus, and a social gaming networking service called PlayStation Home.

Challenge

Personalized decisions for hundreds of millions of users

In 2016, after massive success with PlayStation 4, Sony decided to become a data-driven company. With huge amounts of data gathered from 103 million active users, 38.8 million PlayStation Plus subscribers, and 5 million virtual reality headsets, Sony was well positioned to create a machine learning platform that its development teams could use to build models for personalization, enterprise reporting for business decisions, fraud detection, and more.

The challenge, however, was making the data accessible to the teams and data scientists. Sony needed a solution that could handle hundreds of millions of users, keep several terabytes of data in an accessible location, show how to use that data to better understand users, and then support decisions that personalize the customer experience. The platform needed to provide:

High availability, low latencies at scale

Needed to reliably handle hundreds of microservices, millions of requests, and more than 100 billion data events per day across many database clusters and multiple regions.

Data integration and accessibility

Needed to bring together multiple data islands and formats, and to create a common data dictionary that all teams could use to implement sophisticated use cases, like machine learning models.

Reasonable total cost of ownership

Wanted to avoid the expenses associated with vertical scaling and be able to plan growth while managing costs.

Solution

A backend database for runtime decisions around 100M+ active users

Sony created a data ocean with federated data ownership for their internal teams, allowing each team that created or sourced the data to own and store that data as a data lake within their own cloud account. A centralized catalog allowed any team or data scientist to access the data to create machine learning models or reports to drive business decisions.
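The federated-ownership pattern described above can be illustrated with a small sketch. It is not Sony's implementation: the `DatasetEntry` and `Catalog` classes, the team names, and the `lake://` locations are all hypothetical, chosen only to show how a central catalog can index metadata while each team keeps the data in its own account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One catalog record: who owns a dataset and where it lives."""
    name: str
    owner_team: str   # team that created or sourced the data
    location: str     # hypothetical URI inside the owning team's cloud account
    schema: tuple     # column names taken from a shared data dictionary


class Catalog:
    """Central index of metadata only; the data stays with each owning team."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Each dataset name may be registered once, by its owning team.
        if entry.name in self._entries:
            raise ValueError(f"dataset already registered: {entry.name}")
        self._entries[entry.name] = entry

    def lookup(self, name):
        # Any team or data scientist can discover where a dataset lives.
        return self._entries[name]

    def datasets_with_column(self, column):
        # Find every dataset exposing a given column, e.g. to join on user_id.
        return sorted(e.name for e in self._entries.values() if column in e.schema)


catalog = Catalog()
catalog.register(DatasetEntry(
    "purchases", "store-team", "lake://store-team/purchases",
    ("user_id", "sku", "price")))
catalog.register(DatasetEntry(
    "play_sessions", "telemetry-team", "lake://telemetry-team/sessions",
    ("user_id", "title_id", "minutes")))

print(catalog.lookup("purchases").owner_team)
print(catalog.datasets_with_column("user_id"))
```

Keeping only metadata in the catalog means ownership and storage costs stay with the producing team, while discovery is centralized.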

Benefits included:

Personalization

Rapidly identify and authenticate customers along with their behaviors and preferences, then customize the user experience in a high-performance environment.

Fraud prevention

Avoid fraudulent transactions across platforms and during surges in real time.

In-app advertising

Subsidize free-to-play games with in-app advertisements before converting them to pay-to-play.

Engaging social feeds

Communicate in-game via chat, messaging, and voice. Find friends in-game or via connected social applications. Support follows, comments, and ratings.

Results

Achieving real-time personalization with a machine learning feature store using Aerospike

With Aerospike, Sony built a lightweight machine learning platform that allowed machine learning engineers to create, deploy and run their models, as well as manage workflows.

8 million TPS

Enabled more than eight million executions per second across database environments, which include RDBMS and NoSQL systems.

Automatic sharding

Made sharding operationally less painful.

Low TCO

A small cluster handled several terabytes of data.

Low latency

Under 10 milliseconds.

Slash data load times

12x reduction in re-indexing time while providing reliable access to fresh data.

Large scale

Handled 100B+ data events/day and 5TB+ data storage.
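To put the daily event volume in per-second terms, here is a quick back-of-the-envelope calculation. It assumes events are spread evenly over the day, which real traffic is not, so it gives an average rather than a peak figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# "100B+ data events/day" from the results above, taken as a lower bound.
events_per_day = 100_000_000_000

# Average rate under the (unrealistic) assumption of perfectly even traffic.
avg_events_per_second = events_per_day / SECONDS_PER_DAY

print(f"{avg_events_per_second:,.0f} events/second on average")
```

That works out to roughly 1.16 million events per second on average; peak load during surges would be higher.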

Testimonials