
How Dream11 built a fantasy sports team service to handle futuristic IPL scale


Aveekshith Bushan:

Hearty welcome to everyone. This is Aveek. I work as general manager and regional director for Aerospike in Asia Pacific. It's my pleasure to welcome two of my friends from Dream11, Atul and Hammy.

Atul Dusane:

Hey.

Hammond Pereira:

Hey.

Aveekshith Bushan:

Dream11, as you know, is India's leading sports tech platform. It has literally come to be associated with the IPL, India's marquee sports league, if you will. So it's exciting to have both of them on board. Just as an introduction to both Hammy and Atul: Atul works as a team lead at Dream11. He's been with Dream11 for one and a half years and is the team lead for the Leaderboard Team. Prior to joining Dream11, Atul was with Amazon, and with Amdocs before that. He has over seven years of experience in software development and is skilled in technical leadership and problem-solving.

We also have Hammy, Hammond Pereira, who works as part of product engineering at Dream11. Hammy is a software development engineer based in Mumbai. He worked with MX Player before, and was also with Morgan Stanley and Amazon India. He has over five years of experience in backend development and has contributed towards building performant, robust, and scalable backend systems in the media and financial domains. Awesome to have you both over for the session today.

Okay, so a few things at the start to give you some context for the session today. We would request everyone to be participative, meaning there are going to be a few polls running; please participate in those. There are also a few attachments uploaded; if you check out those attachments and fill out the questionnaire, you might stand to win one of three Lego rockets. It's really cool stuff, it has the Aerospike logo on top of it, and I'm sure you're going to like it.

In the context of the whole thing, I think what excites me about this session the most is the fact that we are talking about sports tech at a time when the Tokyo Olympics is just a few days away, and I think this is the first time when people won't be able to travel to an Olympics; a lot of the viewing is going to be remote. So I think engagement levels with the audience are going to be tested in a big way during this Olympics.

In some sense, this is going to be the year when sports tech as an industry takes off: in audience engagement, in how athletes improve their performance, and obviously in the in-stadium experience that people get. I think technology is going to drive this in a big way, and in that sense, having somebody like Dream11, with their experience working on the IPL and a whole bunch of other cricket leagues, delivering a great customer experience to everyone, is great.

So maybe we'll start with a fun poll right away. Let's see if I can bring that up and check whether the audience is wide awake at 4:00 PM. Let's start the poll. The first question to everyone, and we'll give it 30 seconds: are you a Dream11 player? Two options, yes and no. I'll keep it open for 30 seconds, 45 seconds really. I'm sure Hammy and Atul, you guys obviously build this stuff, but you also play on Dream11, like I do, right?

Atul Dusane:

Yeah, of course.

Aveekshith Bushan:

Yeah. So I think there are no prizes for guessing; it's a hundred percent yes. This was a very easy question, I guess. Let's go to the second one, which should be more fun. By the way, before I start that, I'm pretty sure that RCB is going to win this year's IPL. Obviously I'm from Bangalore, so I'm going to be supporting RCB. Which teams do you guys support? Atul, Hammy?

Atul Dusane:

I'm supporting Mumbai.

Aveekshith Bushan:

Come on, how about you Hammy?

Hammond Pereira:

I think Delhi Capitals has a good chance, probably Rishabh Pant can take us home.

Aveekshith Bushan:

Possibly. Let's see what the audience thinks. It'll be fun to see if the audience concurs with you, or me, or Atul. I somehow feel this is RCB's year. Like we say in Bangalore, "Ee Sala Cup Namde." Oh my god, it's Chennai Super Kings. All three of us have lost. So obviously there are folks supporting Chennai more than any other team, with Dhoni out there, I guess. Well, interesting. Let's keep the fun element going, guys, but without further ado, let me hand it over to Atul and Hammy to start their talk. Like I said, keep it participative, guys. Over to you, Atul.

Hammond Pereira:

Hi guys, I'm Hammond Pereira. I'm the backend lead of gameplay engineering at Dream11. What comes under gameplay engineering: if you've played on Dream11, the match center, the contests, the leaderboard, all those things come under gameplay engineering, so we take care of that experience. We are here to explain how we built our scalable and fault-tolerant Team Service architecture to withstand a futuristic scale.

So let me start by explaining what exactly Dream11 is, for the audience who hasn't played any contest on Dream11 yet. Dream11 is India's biggest fantasy sports platform, founded in 2008. We were the title sponsors of IPL 2020 and the official partners of IPL 2021. We have around 100 million-plus registered users, and we actually hit around 5.5 million-plus peak concurrency; that was something we achieved last year, during the IPL 2020 final. We also host a bunch of sports, 11 in particular, namely cricket, football, basketball, baseball, NFL, et cetera.

Before actually getting into Team Service and the entire tech part, let's first understand what the Dream11 app looks like and what the flows inside it are. On the left side, the first screen is what you see when you go into the Dream11 app. That's basically the match center, and there you see a segregation of sports: cricket, football, basketball, and so on.

Then, for every sport, you have all the matches that are going to take place sometime down the line. In the highlighted part, you can see Delhi Capitals versus Bengaluru (these are pretty old screenshots). So just for understanding: let's say you want to join a contest in a match between Delhi Capitals and Bengaluru. You click on it, and the center screenshot is the contest listing page.

There are multiple contests that can be played in a given match on Dream11, so for DC versus Bengaluru we have a plethora of contests that users can join. Contests can be of multiple types: you can play paid contests as well as free contests. You can also create your own contest, which becomes a private contest if you want to play with your friends.

When you click on a particular contest, you see the right-most screen, the third one here, which shows the contest details. There you see the amount of money that is going to be given out; in this case, it is 2,375. That is the total prize money for this particular contest, and this money is divided up in something called a prize pool.

In the table below, you can see that Rank 1 gets around 20% of the prize pool, which is 475; Rank 2 gets 12%, Rank 3 gets 8%, and so on. That's how it works: multiple users join a contest, and at the end of the match, when they're ranked based on the points they have, the money is given out according to the pre-decided prize pool.
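To make that payout logic concrete, here is a minimal sketch; the percentages and the 2,375 total come from the example on this screen, while the class and method names are purely illustrative:

```java
// Minimal sketch of a pre-decided prize pool: each rank maps to a share
// of the total prize money. Percentages and the 2,375 total are taken
// from the example above; everything else is illustrative.
import java.util.LinkedHashMap;
import java.util.Map;

public class PrizePool {
    public static void main(String[] args) {
        double totalPrize = 2375.0;
        Map<Integer, Double> shareByRank = new LinkedHashMap<>();
        shareByRank.put(1, 0.20); // Rank 1: 20% of the pool -> 475
        shareByRank.put(2, 0.12); // Rank 2: 12% -> 285
        shareByRank.put(3, 0.08); // Rank 3:  8% -> 190
        shareByRank.forEach((rank, share) ->
            System.out.printf("Rank %d wins %.0f%n", rank, totalPrize * share));
    }
}
```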

Also highlighted here is the amount of money you have to put in to join this contest. In this case it was actually supposed to be 25 rupees, but a promotion was applied, so you can pay 12 rupees and get into the contest. Once you choose a particular contest, the second thing you need to do is create your team, because in order to compete in a contest you must have your own team.

For cricket, we have four facets: you have to choose one to four wicket-keepers, multiple batsmen, multiple all-rounders, and multiple bowlers. Every facet has a particular constraint. So let's say you choose AB de Villiers and Rishabh Pant as wicket-keepers, and so on. In the middle screenshot, after you're done choosing your entire team, you press continue and you see the full team that you've chosen.

Every player is associated with a certain credit value, and you have a total of 100 credits within which you have to create your team; there's a constraint on that as well. Once you have chosen your team, in cricket we also have something called a captain and a vice-captain. The concept of a captain on Dream11 is that if you choose a player as your captain, whatever fantasy points that player scores, you get 2x of that.

Similarly, if you've chosen, say, AB de Villiers as your vice-captain and he scores X points, you get 1.5x of the fantasy points he earned. So once you have your entire team set, with all the players you want and your captain and vice-captain chosen, you click on Save Team. That's when your team is saved, you have joined the contest, and everything is said and done.
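Pulling those constraints together, a minimal validation sketch could look like this. The 11-player team size, the 1-to-4 wicket-keeper range, the 100-credit cap, and distinct captain and vice-captain come from the talk; the other per-role ranges are illustrative assumptions:

```java
// Minimal validation sketch for the team constraints described above.
import java.util.List;
import java.util.Map;

record Pick(String name, String role, double credits) {}

class TeamValidator {
    private static final Map<String, int[]> ROLE_LIMITS = Map.of(
        "WK",  new int[]{1, 4},  // wicket-keepers: 1 to 4 (from the talk)
        "BAT", new int[]{3, 6},  // batsmen (illustrative range)
        "AR",  new int[]{1, 4},  // all-rounders (illustrative range)
        "BWL", new int[]{3, 6}); // bowlers (illustrative range)

    static boolean isValid(List<Pick> team, Pick captain, Pick viceCaptain) {
        if (team.size() != 11 || captain.equals(viceCaptain)) return false;
        if (!team.contains(captain) || !team.contains(viceCaptain)) return false;
        // Total credits across all picks must not exceed 100.
        if (team.stream().mapToDouble(Pick::credits).sum() > 100.0) return false;
        // Each role count must fall within its allowed range.
        for (var e : ROLE_LIMITS.entrySet()) {
            long n = team.stream().filter(p -> p.role().equals(e.getKey())).count();
            if (n < e.getValue()[0] || n > e.getValue()[1]) return false;
        }
        return true;
    }
}
```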

Once the match starts, on the left-side screen you can see the leaderboard. Apart from you, there are multiple other users who have created their teams, and they are competing with you in this particular contest. As the match progresses, we award fantasy points to each player based on whatever actions that player performs on the field.

For example, if he hits a four, we award him four points plus a bonus for the boundary he hit. Based on this, your team gets awarded points, and for your captain and vice-captain it becomes 2x and 1.5x respectively. That's how, after every ball bowled in cricket, your team is awarded points. You can see that in my case here I got 162.5 points; of course I'm not first, as the person in first place has some 284 points.
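The captain and vice-captain multipliers described above boil down to a small calculation; here is a hedged sketch of it, with illustrative names:

```java
// Sketch of the captain / vice-captain multipliers described above:
// the captain earns 2x and the vice-captain 1.5x of their raw fantasy
// points; everyone else counts at 1x.
import java.util.Map;

class FantasyScoring {
    static double teamPoints(Map<String, Double> rawPointsByPlayer,
                             String captain, String viceCaptain) {
        double total = 0.0;
        for (var e : rawPointsByPlayer.entrySet()) {
            double multiplier = e.getKey().equals(captain) ? 2.0
                              : e.getKey().equals(viceCaptain) ? 1.5
                              : 1.0;
            total += e.getValue() * multiplier;
        }
        return total;
    }
}
```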

So after every event that happens in the match, the players are awarded the respective fantasy points, those points are reflected on your team, and you are ranked accordingly on the leaderboard. That's while the match is live. Once the match is completed, that's it: no one is awarded any more fantasy points, and your team's points are final. In this case my team got around 528 points, but there were others who scored more than me.

Once the match is completed, the winning amounts are allocated based on the prize pool we saw at the beginning. The person who ranked first got around 20% of the total prize pool, and so on. Since I came 30th, I didn't get anything; poor me. Once this is done, we do a winnings declaration, and all the money that users have won gets credited to their wallets. That's basically the flow of gameplay on Dream11.

I'm sure by now you've got a fair idea of what Dream11 is and what it does. Now let's jump directly into the use case. As I explained, in order to join a particular contest, you need to have your team created, right? And it's not only you creating teams: we have 100 million-plus registered users and around 5.5 million peak concurrency.

A lot of other users are also creating their own teams. The entire Dream11 platform is built on a microservice architecture, and Team Service is one such microservice. That service gets subjected to a lot of load whenever users do anything with their teams. These are its primary functions: first, users can create their teams, which is what we just saw.

Users can also edit their teams. Maybe you've created a team and you feel you're not that happy with it; you'd perhaps want to swap a player or two or edit something else. In that case you can go back to your team, switch a player, and save it again. Also, the most important part: lineups are what bring us a lot of scale. For an IPL match, lineups are announced around 15 to 20 minutes before the match starts. That's when you get to see the 22 players who are going to play the match that starts in some time.

So let's say, for example, you've chosen someone like Virat Kohli, and almost everyone chooses Virat Kohli, but when the lineups are announced you learn that Virat Kohli is not playing. Now everyone goes into the app and edits their team, because having a player in your team who's not going to play makes no sense: that player is going to be awarded zero points anyway.

If a player is not playing on the field, he won't be awarded any fantasy points and won't do your team any good. So when lineups are announced, we see a spike, and when some really good players turn out not to be playing, the spike becomes that much steeper. That's the main traffic Team Service gets: users editing their teams after lineups are announced.

And of course, the third use case is that whatever teams you create, you should be able to view them as well. Apart from that, we have certain SLAs: for our Team Service, we have a P99.9 latency SLA of 15 milliseconds. When Team Service is subjected to around 1 million-plus requests per second, across both reads and writes, it should respond within 15 milliseconds 99.9% of the time.

Also, we need 99.99% uptime, and we need 100% uptime after lineups are announced. The reason is that lineups are when you actually come to know which players are going to play. If we are not available after that, it means you have paid money to join a contest, but when you see a player not playing, you're not able to edit your team.

That is something I would call very, very catastrophic. After lineups are announced we cannot have any outages; we need to be 100% up, because that's also when we get the most traffic. Since users have paid their hard-earned money to join contests, we should allow them to edit their teams and field the team they actually want, and not be prevented from doing so by an outage.

Let's see what our first shot at solving this problem was. I'm going to explain the Team Service 1.0 design: basically, life before Aerospike. As you can see, we had a Team Service written in Node.js. Since it's a microservice, we have an edge layer which talks to our clients, and that edge layer passes all requests on to the Team Service.

The Team Service had a backing store of Aurora RDS, and since we get subjected to around 1 million-plus transactions per second, just one RDS cluster is not going to cut it; we need some sort of sharding. So instead of one RDS cluster, we had five, and the data was distributed across them using a modulo-five logic on the user ID.

Apart from that, we also have Redis as a read-through cache, which we use for read requests. So let's see how the flow goes. Whenever we get a read request in Team Service, we first check whether that user's team, for that user ID and match ID, is present in Redis. If it is, we serve the request from Redis; if not, we go to the database, because the team is stored in one of the shards.

Since the service maintains a constant connection to each shard, it takes the user ID in the request and computes it modulo five. Based on the result, it goes to Shard 0, 1, 2, 3, or 4 and satisfies the request. For writes, Redis is not consulted first, because the data has to be written to the database; since it's a read-through cache, we write to Redis as well. So whenever we get a write request, we compute the modulo, see which shard the data has to be written to, and go to that shard.
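Here is a condensed Java sketch of that flow (the real 1.0 service was in Node.js; this is just to make the routing concrete, and the Cache and Database interfaces are hypothetical stand-ins):

```java
// Team Service 1.0 flow: reads hit the Redis read-through cache first;
// on a miss, user ID modulo five picks one of the five RDS shards.
record Team(String payload) {}

interface Cache {
    Team get(long userId, long matchId);
    void put(long userId, long matchId, Team team);
}

interface Database {
    Team readTeam(long userId, long matchId);
    void writeTeam(long userId, long matchId, Team team);
}

class TeamServiceV1 {
    private final Cache redis;        // read-through cache
    private final Database[] shards;  // the five Aurora RDS clusters

    TeamServiceV1(Cache redis, Database[] shards) {
        this.redis = redis;
        this.shards = shards;         // shards.length == 5
    }

    Team getTeam(long userId, long matchId) {
        Team cached = redis.get(userId, matchId);
        if (cached != null) return cached;            // cache hit
        Team team = shardFor(userId).readTeam(userId, matchId);
        redis.put(userId, matchId, team);             // populate cache
        return team;
    }

    void saveTeam(long userId, long matchId, Team team) {
        shardFor(userId).writeTeam(userId, matchId, team); // write the shard
        redis.put(userId, matchId, team);                  // and the cache
    }

    private Database shardFor(long userId) {
        // "Synthetic sharding": the application, not the DB, routes by modulo.
        return shards[(int) (userId % shards.length)];
    }
}
```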

Now let's see what the limitations of this design are. It's clear that adding a shard is a manual activity. We don't get IPL-like scale all year round: IPL runs for about two to two and a half months, and that's when we get the most scale. That's when we need five big shards. But when supply is low and there aren't many matches, housing so many shards is overkill.

So getting rid of shards when there's no scale, and adding an extra shard when IPL is coming, is very tedious. Because the application has to know which shard to connect to, you have to go and change the application so that the next request can be routed to a newly added shard.

We call this modulo logic "synthetic sharding": the database isn't taking care of our sharding needs, so we have to have application logic that does. Adding new shards is totally manual and not elastic, I would say. One more thing: the story doesn't end there, because the data in these shards is also required by our other microservices. You need to show teams on other screens of the Dream11 app as well, one being My Matches, where you can see the matches you've joined contests in or created a team for.

We have pipelines from these shards to external services, and we use a binlog mechanism to transport data from the shards to the other microservices. Adding more shards means adding extra pipelines as well, and making sure data is actually flowing to the other services. And adding or removing shards keeps increasing or decreasing the modulo divisor, so you need to change your application logic every time you do a resizing activity.

And finally, querying for data is tedious, because you have to keep a mental model: for this particular user, the data is probably in Shard 3, so let's go to Shard 3 and query it. If you look at all these limitations, they are merely operational: from the looks of it, the design takes care of all our scale needs, but operationally it is a bit tedious. So let's see what happened. We stayed with this architecture until 2020, until the outage happened, during the IPL semi-final match between Sunrisers Hyderabad and Royal Challengers Bangalore. Let's look at the details.

During that match, Shard 4 (on your right, below) had a memory issue, because of which the entire shard became unresponsive. Any communication the Team Service attempted with Shard 4 timed out. Since we have five shards and one was totally unresponsive, basically 20% of our users were not able to create, view, or edit their teams.

And the problem was that the shard went down after lineups were announced. This was exactly the catastrophe I was talking about in the previous slides, and it really happened after lineups were announced in an IPL semi-final. So it was a double whammy, but that was not all. You might say, "Okay, one shard has gone down, so 80% of your users are fine." But the communication between the edge layer and Team Service goes through a circuit breaker. If the circuit breaker sees that the upstream service is misbehaving, it opens the circuit, and for some time you don't have steady traffic flowing between the edge layer and the Team Service.
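To make that behaviour concrete, here is a minimal hand-rolled circuit breaker sketch; the thresholds and names are illustrative, not what Dream11's edge layer actually uses:

```java
// After enough consecutive failures the circuit "opens" and calls are
// rejected until a cooldown elapses; a success closes it again.
import java.util.function.Supplier;

class CircuitBreaker {
    private final int failureThreshold;
    private final long cooldownMillis;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long cooldownMillis) {
        this.failureThreshold = failureThreshold;
        this.cooldownMillis = cooldownMillis;
    }

    synchronized <T> T call(Supplier<T> upstream) {
        if (consecutiveFailures >= failureThreshold
                && System.currentTimeMillis() - openedAt < cooldownMillis) {
            throw new IllegalStateException("circuit open: rejecting call");
        }
        try {
            T result = upstream.get();
            consecutiveFailures = 0;                   // success closes the circuit
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = System.currentTimeMillis(); // open the circuit
            }
            throw e;
        }
    }
}
```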

So in this case, 20% of users were not able to create, edit, or view their teams at all, and on top of that, another 20 to 30% of users, over and above that 20%, were seeing intermittent errors on the app. All of this boiled down to a trust issue, because our business relies a lot on trust.

If users see that after lineups they're not being allowed to edit their teams, they think something fishy is happening and stop trusting the app. That is a huge, huge red flag for us. And what happened after the match was that, since people weren't able to edit their teams, we actually had to refund all the money that users had paid.

So not only was there a loss of revenue for Dream11, but the intangible aspect, trust, was breached, and that was a huge red flag during last year's IPL. With just the operational issues, we had thought, "Okay, as long as the service is performant, fault-tolerant, and reliable, we can probably bite the bullet and live with them." But when this happened, we decided we had to do something about it and revamp the entire thing.

So we zeroed in on these tenets for designing a new architecture. The first tenet was a scalable architecture: whatever new architecture we come up with has to scale, and it has to scale seamlessly. That means if a lot of traffic comes in some months, you just add nodes and that's it; your scale requirements are met and you don't need to do anything else.

And in months when supply is low and there aren't many matches, you can downscale the entire thing. That should be seamless and not painful at all. Apart from that, we wanted the architecture to be fault-tolerant, because we cannot afford another IPL fiasco. And of course, we wanted it to be durable: once a user creates their team, that data should not go anywhere; it should stay put.

The final tenets were strong consistency and high availability. Of course we need high availability all the time, and after lineups are announced we need 100% availability, but we also need strong consistency. The reason: for fault tolerance we have replicas of our service nodes and database nodes, and for this to work the data has to be consistent across all the nodes, because you cannot have a user edit their team and a moment later see their old team.

Once they have committed their team to the database, they should always see that view. So strong consistency and high availability were something we needed very, very badly. Now, per the CAP theorem, you can have either CP, consistency plus partition tolerance, or AP, availability plus partition tolerance; but in this case we wanted both. Let's see how we solved for that. I'll invite Atul to take it forward from here.

Atul Dusane:

Yeah. Thanks, Hammy. Hi guys, I'm Atul Dusane. I've been working at Dream11 for one and a half years, and I was majorly involved in designing this new Team Service architecture. As Hammy explained, we had a lot of tenets: scalability, high performance, fault tolerance, and consistency. We explored a lot of databases and finally drilled down to Aerospike.

This is a very high-level view of our design. This is the edge layer, which the devices and apps, including desktop, all talk to, and this is Team Service. The old Team Service in the previous design was in Node.js; this one is written in Java on Vert.x, a reactive framework, and it talks to the Aerospike database, where all the data is stored.

Okay. Just to tell you about Aerospike: we are using Aerospike as a key-value store. Our key is the user ID plus the round, and for the value we store a map of the user's teams, since a user can have multiple teams. We run Aerospike in hybrid mode: all the keys (the primary index) are stored in memory, and all the data goes to SSD.
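In Aerospike Java client terms, that model might look roughly like this; the namespace, set, and key format are illustrative assumptions, not Dream11's actual schema:

```java
// Key-value model described in the talk: the key combines userId and
// round, and the value is a map of teamId -> team payload in one bin.
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Record;

import java.util.Map;

public class TeamStore {
    private final AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);

    public void saveTeams(long userId, long roundId, Map<String, String> teams) {
        Key key = new Key("teams", "user_teams", userId + ":" + roundId);
        client.put(null, key, new Bin("teams", teams)); // stored as a CDT map
    }

    public Map<?, ?> getTeams(long userId, long roundId) {
        Key key = new Key("teams", "user_teams", userId + ":" + roundId);
        Record rec = client.get(null, key);
        return rec == null ? Map.of() : (Map<?, ?>) rec.getMap("teams");
    }
}
```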

The keys are hashed using the RIPEMD-160 algorithm and divided into 4,096 partitions, so all your data is evenly distributed and we don't have hot-key issues, while the teams themselves are stored on SSD. Also, in this architecture we are using Aerospike with a replication factor of 3 (3RF), so there are always three copies of our data.

Along with that, we are using EBS. What does that mean? Even as the data is written to SSD and copied to the replicas, it is also copied to a shadow device, an EBS drive. So if a node goes down, there will be migrations: the master moves to another node and you still have the data. But say you lose three nodes; we have 3RF, so if three or more nodes go down, there can be data loss. What do we do in this case? We have the EBS shadow device: we just spin up a new node with Aerospike, attach the EBS volume, and the data is loaded back. There may be some data loss, but it will be very minimal if you use the shadow device.
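A hedged sketch of what such a namespace could look like in aerospike.conf, assuming illustrative device paths and namespace name:

```
# Namespace matching the setup described above.
namespace teams {
    replication-factor 3        # 3RF: three copies of every record
    strong-consistency true     # SC mode (discussed a little later)
    storage-engine device {
        # primary NVMe SSD, shadowed to an EBS volume for recovery
        device /dev/nvme0n1 /dev/xvdf
    }
}
```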

Apart from this, Aerospike has something called the Smart Client. The Smart Client knows where the data resides: it maintains a map of masters and their replicas. So whenever your service needs to fetch or write data, it goes directly to the right node; it can go straight to the master and get the data. In many other traditional NoSQL databases, the client talks to a coordinator or seed node, and the seed node decides where the master resides or where to get the data from. Here the number of hops is reduced. The Smart Client also maintains a separate connection pool for each node, which gives much higher performance.

As Hammy explained, consistency is a very critical factor for us. Users are logged in on multiple devices, including desktop, and across all of these they can create or edit teams within fractions of a second. So it is very necessary for us to give them a consistent view.

Not giving the user a consistent view hurts the trust factor a lot, so consistency plays a major role in our architecture, and Aerospike helps us with it. Aerospike works in two modes: available-partition (AP) mode and strong-consistency (SC) mode. In strong consistency, it makes sure that whenever you write to the database, your writes are replicated, and only if they're successful is success returned to the client.

So you are always sure your writes are either successful or unsuccessful; they will never be partially successful. That gives us consistency. Also, in case of network partitions or outages, your cluster may be split into multiple sub-clusters, and the problem with multiple sub-clusters is that each one may think it is the master of its own data.

What happens if the client is connected to the wrong sub-cluster, one which does not have the data? It may think the data is not there and try to overwrite it, and that impacts the consistency of the data. So how does Aerospike protect us against this? Aerospike makes sure that even in a split-brain with multiple sub-clusters, you will only ever be writing to one of the sub-clusters.

Aerospike does this by maintaining something called a roster. The roster is the list of nodes that form the cluster; any node you want to add to the cluster has to be added to the roster. For reads, we are using linearized reads: whenever we read from Aerospike, the master serves the data, but it also consults the replicas to confirm it holds the latest snapshot, and only then does it respond to the client.
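In the Java client, requesting a linearized read is a per-policy flag; a minimal sketch, with the namespace and key format following the earlier illustrative schema:

```java
// In a strong-consistency namespace, linearizeRead makes the read
// linearizable: the master confirms the latest committed snapshot
// before answering.
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.Policy;

public class LinearizedReads {
    public static Record readTeams(AerospikeClient client, long userId, long roundId) {
        Policy policy = new Policy();
        policy.linearizeRead = true; // consult replicas for the latest snapshot
        Key key = new Key("teams", "user_teams", userId + ":" + roundId);
        return client.get(policy, key);
    }
}
```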

With this strong-consistency setup, we are protected against stale reads and dirty reads, and our data is very consistent. So the scaling and performance issues Hammy mentioned are solved, and the consistency issue is solved too. What about availability? Let's say a few nodes go down; it may take some time to bring them back up, and we cannot afford that much time, at least around lineups or when the match is about to start.

To solve this problem, we came up with a new architecture with active and passive DBs. On the left side is the edge layer, talking to the same Vert.x Team Service I explained. Then we have two databases, both Aerospike, one active and one passive, each with EBS shadow devices. The active and passive clusters sit on different sets of AZs, so the probability of both of them going down is much reduced.

Whenever a request comes from the edge layer to Team Service, Team Service does all the validations, and then it first writes to the active database. As soon as the active write succeeds, it immediately returns the response, but at the same time it also writes to the passive database asynchronously.

So if there was some network issue and the asynchronous write to the passive database failed, or the passive Aerospike went down, the user is not impacted: the user gets a response as soon as we write to the active database, whether or not the passive write went through. But whenever data is not written to the passive, we still have to make sure both databases end up in sync. Say a team was created in active but, because of some transient error, it didn't get created in passive; we have to sync the databases.
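A condensed sketch of that dual write, assuming the Aerospike Java client and illustrative wiring: the active write is synchronous and drives the user's response, while the passive write is best-effort:

```java
// Synchronous write to active; asynchronous best-effort write to passive.
// A passive failure is swallowed: the Kafka consumer reconciles it later.
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import java.util.concurrent.CompletableFuture;

public class DualWriter {
    private volatile AerospikeClient active;   // swapped on cluster switch
    private volatile AerospikeClient passive;

    public DualWriter(AerospikeClient active, AerospikeClient passive) {
        this.active = active;
        this.passive = passive;
    }

    public void saveTeams(Key key, Bin teams) {
        active.put(null, key, teams);          // synchronous: must succeed
        AerospikeClient target = passive;
        CompletableFuture.runAsync(() -> {     // best-effort passive write
            try {
                target.put(null, key, teams);
            } catch (Exception e) {
                // Ignore: the Kafka-based consumer will sync this record later.
            }
        });
    }

    public synchronized void switchClusters() { // triggered via Redis Pub/Sub
        AerospikeClient tmp = active;
        active = passive;
        passive = tmp;
    }
}
```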

How do we sync it? We have a custom consumer; on the right side, next to the active cluster, there is a green box, which is that custom consumer. It takes the record from Aerospike and syncs it to the passive. How does it get the record? We have enabled Aerospike Kafka Connect, an application which pushes data from Aerospike to Kafka.

From that Kafka topic, the consumer pulls the record and checks whether it is present in the passive DB. If it is not there, it writes that data into the passive; if it is already present, it just ignores the record. This way we are sure that whatever was written to active, even if the write to passive was missed, the consumer will sync it to the passive.
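Here is a hedged sketch of such a reconciling consumer, using the standard Kafka consumer API and the Aerospike Java client; the topic name, record format, and deserialization are illustrative assumptions:

```java
// Reads change records (pushed by Aerospike Kafka Connect) from Kafka
// and writes any record missing from the passive cluster.
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class PassiveSyncConsumer {
    public static void run(AerospikeClient passive) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("group.id", "team-sync");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("team-writes-active"));
            while (true) {
                for (ConsumerRecord<String, String> rec :
                        consumer.poll(Duration.ofMillis(500))) {
                    Key key = new Key("teams", "user_teams", rec.key());
                    if (!passive.exists(null, key)) {
                        // Missing on passive: replay the write.
                        passive.put(null, key, new Bin("teams", rec.value()));
                    }   // already present: ignore the record
                }
            }
        }
    }
}
```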

In the same way, let's say the active cluster went down because of some issue, a node failure, hardware failure, or a network problem; then we need to switch to the passive DB. How do we switch? On the left side, next to Team Service, there is a Redis Pub/Sub. We publish a message to it, and the Team Service, which is subscribed to that channel, gets a message saying it has to switch clusters and immediately switches to the other, passive DB.

The green line is now the active pipeline: the clusters have been switched, and the green Aerospike box has become the active DB. You publish a cluster-switch message, the Team Service reads it, and it starts writing to what was the passive cluster, which has now become the active one. And when the old active comes back as the new passive, the records written in the meantime start syncing to it via the consumer again. This is how we always keep data in sync between active and passive.
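A minimal sketch of that switch trigger using the Jedis client; the channel name, message format, and the onSwitch callback are illustrative assumptions:

```java
// The Team Service subscribes to a Redis Pub/Sub channel and flips its
// notion of active/passive when a switch message arrives.
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class ClusterSwitchListener {
    public static void listen(Runnable onSwitch) {
        JedisPubSub subscriber = new JedisPubSub() {
            @Override
            public void onMessage(String channel, String message) {
                if ("SWITCH".equals(message)) {
                    onSwitch.run(); // e.g. DualWriter::switchClusters
                }
            }
        };
        try (Jedis jedis = new Jedis("redis-host", 6379)) {
            // subscribe() blocks; run this on a dedicated listener thread.
            jedis.subscribe(subscriber, "team-service-cluster-switch");
        }
    }
}
```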

Okay. There is one more thing we had to put in here: filters. When we do dual writes, records get pushed to Kafka from both the passive DB and the active DB. But we do not want to sync records from the passive DB back to the active, so we need to filter those records out. The filter is an application that does exactly that: it filters out the records coming from the passive.

So only records from the active DB go ahead to the consumer, and the consumer syncs them to the passive. This way we make sure there is no cycle; only half of the flow is working at a time. Moving on: now that you understand we have two Aerospike clusters and we want to show a consistent view, we have two pipelines, one from active and one from passive. So we have one application, something we have written, that understands which is the active and which is the passive.

Any of our downstream applications that need real-time data get one unified view through this switch. Downstream applications like notifications, My Matches, or any real-time analytics are not aware of what is active or passive; they just get real-time updates on a separate topic. Also, you can see an arrow going from the active Aerospike to Spark. After lineups, we do not allow editing or creation of teams.

So if any downstream application needs a static view, the whole view of the match, it is pulled into Spark and loaded into S3, and any downstream application that wants a static view can read it from S3. This is how we maintain the real-time pipeline, so our applications don't have to worry about which cluster is active or passive, and for the static view we upload to S3 using Spark.

Aveekshith Bushan:

Hey Atul, can we do a quick poll to involve the audience as well before you move on?

Atul Dusane:

Yeah. Just to-

Aveekshith Bushan:

Maybe we just do a poll with a more technical question this time. So guys, make sure you answer this. The question is about which high-performance data architecture you use: is it a Redis cache with an operational database? A cloud database platform with a cache? A single layer of Aerospike hybrid memory? Or some other platform? And while you're answering that, for those of you who posted questions, we'll make sure we take them at the end; keep those questions coming in. Okay, so we have a whole bunch of people who are using other platforms right now, which is great. So this should be useful for you guys. Awesome. Oh no, I'm sorry, the final numbers are: Redis cache 37%, Aerospike 37%, cloud database 12%, and other platforms 12%. Okay, awesome. Thank you. So maybe you can proceed, Atul.

Atul Dusane:

Yeah. Thanks. Thanks, Aveek. So overall, to summarize the features of Aerospike we have used: one is strong consistency with 3RF and linearized reads. I want to explain why I keep stressing 3RF and linearized reads: Aerospike actually recommends 2RF. With 2RF, the number of acknowledgements needed when reading and writing data is lower than with 3RF; in 2RF you have two acknowledgements and in 3RF you have three, so 2RF performs at a much higher level than 3RF.

Still, we went with 3RF, because for us we have to make sure we never suffer data loss, no matter what fails. Then there is rack awareness, which we have used: it ensures that the master and the replicas are always distributed across different racks. In our architecture, a rack is an AZ, because we are on AWS.

So our data is actually split across different AZs. Even if one AZ goes down, we can still serve traffic; our architecture will always be on, even if AZ A or AZ B goes down. It's very unlikely, but still, our architecture will survive it. Then there's the hybrid memory architecture we discussed, with primary keys in memory and data on SSD; we have used shadow devices; and we have used transport layer security.

Then there is XDR. We are using XDR as a backstop: normally, when the passive goes down, we sync data by pulling it from Kafka and pushing it to the passive from the consumer. But let's say Kafka also went down; what do we do then? In that case we have the XDR facility provided by Aerospike. What does XDR do? It syncs data from active to passive, or it can sync data from active to Kafka.

Right now we have it configured but dormant. If it is ever required, we can just turn it on and replay the data: we can replay two hours of data, three hours of data, or even the whole dataset if we want. All of this is provided by XDR. Then there is Aerospike Kafka Connect, which is used to push data to Kafka, and Aerospike Connect for Spark, which is a very fast way of pulling data into Spark.

With the Spark connector, you can specify how many partitions your data should be distributed into on the Spark side, so we don't have to re-partition the data in Spark ourselves. It has very high performance: we load almost 20 million records within seven to eight seconds using this Spark connector, which is really very fast. Moving ahead: all this architecture is good in theory, but it has to be battle-tested. There were a lot of test cases, but here I'm covering the test cases from the Aerospike perspective.

Our load is 75K writes per second and 925K reads per second, all linearized reads, so our write-to-read ratio is almost 1:12. With this high scale, we had to see how we scale our cluster, so we tried adding nodes and bringing nodes down at this throughput. Then we moved on to single-node failure and multi-node failures; multi-node failures can mean fewer than RF nodes failing or more than RF nodes failing. And we tested what happens when a whole AZ is down, because we have to sustain operations even if one AZ goes down.

These are the scenarios we had to test at high scale, along with split-brain scenarios. Then, of course, cluster failure, where we switch from active to passive; how fast the data can be loaded into Spark; and what the delay is when we sync through XDR. We had to battle-test all of this, and we got good results. So let's look at some stats.

This is a Datadog dashboard of our Team Service. You can see 1 million requests per second with a P99 of 15 milliseconds, and below that, our two Aerospike clusters, Blue and Green, running on r5dn.8xlarge-class instances, both 8xlarge, with 14 nodes each, and the whole service returns results in 15 milliseconds. If you look at the network in and out, it is much higher, 7 GB to 20 GB. Why is that? Because we are using linearized reads and 3RF: in a linearized read, every read has to consult the other replicas to check that it has the latest snapshot, so linearized reads always produce higher network in and out.

Let's see what we achieved in this IPL so far. Unfortunately the IPL got suspended, so we couldn't test everything, but up to this match, Mumbai Indians vs Sunrisers Hyderabad, we got a peak of 22.5K writes per second and 212K reads per second. And on the right side, you can see read latencies below 2 ms and write latencies also below 2 ms. So even with linearized reads and 3RF, we are able to achieve that latency; 2 ms works very well for us.

I also want to mention Managed Services. This is the first time we are using Aerospike in strong consistency mode; we had usually used Aerospike in AP mode and in-memory. And this is one of the most critical services in the whole Dream11 ecosystem, so we have to make sure it is always on, and that if there are any problems, they are mitigated very fast. So this is the first time we have onboarded to Managed Services.

With Managed Services, all we had to do was give them an AWS account, and they take care of our scaling needs. They scale up and scale down, make sure everything is set up correctly, upload the certificates, and handle certificate updates. If there is a scaling activity, they recluster and set the roster. All this heavy lifting is done by them; we don't have to worry about it and can just take care of our business logic.

They even check the configuration and do the tuning; all those things are handled by Managed Services. Before moving to Aerospike, we actually did a lot of scaling exercises. We went through a lot of iterations of testing and finally came out with a read-optimized topology. We have staging and production clusters: if we want to test anything, we first test it on a staging cluster, which is also on Managed Services. Once we are good with the staging cluster, we get it deployed on the production one.

The staging cluster is an on-demand cluster: whenever we want it, we ask the team to spin it up, we run our test cases, and then we move to the production cluster. Aerospike Managed Services has even helped us review our architecture; we could get comments and advice from Aerospike architects, and that was very helpful for us.

We have a lot of alerting and monitoring with them; they continuously keep monitoring. We have contracts with them for P95, P99, and P99.9, and if there are any anomalies, they just inform us. We even have a dedicated Slack channel for communication, so if anything goes down or there are problems, we can reach them over Slack. That's the kind of support we have from them.

Also, during the IPL we really cannot afford any delay in communication, so during IPL we have dedicated support from the Managed Services team. Someone from the team continuously monitors our database from an hour before the match onwards, so if there are any problems, we are proactively informed. This is how we have used Managed Services to our advantage. We can really rely on them, and we are sure that if something goes down, someone is always there to support us. That's it. Thank you, guys.

Aveekshith Bushan:

Awesome, thank you, Atul. Maybe what we'll do is take one or two questions from the audience, in the interest of time. While we're doing that, I request everyone to fill in the questionnaire, and if there's any question, please feel free to fire it. If we can't answer your questions here, we'll make sure we get back to you offline. Let me take one or two randomly, guys. So the first question is: are you running any aggregate queries on top of Aerospike?

Atul Dusane:

No, we are not running any aggregate queries. We are just using it as a key-value store. Like I said, the key is the user ID, and against that we store a map of the teams. We're not really using it for aggregation.

Aveekshith Bushan:

Sure. And second question, what kind of tests were performed before choosing Aerospike? Were any other alternatives considered? If yes, why did you choose Aerospike?

Atul Dusane:

Yeah. The use case could be satisfied by RDS, or it could be satisfied by Cassandra as well, but it has to be very battle-tested. We did not find a database that delivers strong consistency the way Aerospike does, and strong consistency was one of the critical aspects for us. So that's how we finally landed on Aerospike.

Aveekshith Bushan:

Awesome. The next question is: what's the replication lag between active and passive during the sync? Were there any SLAs? And what's the failover switch lag from active to passive?

Atul Dusane:

Yeah, exactly. Okay, so there is XDR, and XDR can keep active and passive in sync all the time. But we are not using XDR for this, because we want to switch the cluster immediately. With XDR, whenever there is a failure or network issue, there will always be some lag in syncing from active to passive, and we cannot afford that lag. We want to switch the cluster within seconds: if we see 5% errors, we want to switch immediately; we cannot delay it.

That is why we switch the cluster immediately: if we see 5% or even 2% errors, we immediately switch. And the reason we are not using XDR for the switch is this: say we do a cluster switch before the data is synced to the passive, and a user tries to see their data and it is not found; that causes a trust issue. To avoid such scenarios on a cluster switch, we are not using XDR. And about the lag, as I said, there is absolutely no lag; we just switch immediately if we see any errors. [inaudible 00:57:54] So this is the advantage of doing the cluster switch ourselves rather than going with XDR.

Aveekshith Bushan:

Got it. One last question. We have 30 seconds. Can you describe your Aerospike cluster? How many nodes and how are they sized?

Atul Dusane:

So basically, this was our first time with this setup, which is why we went with a higher number of nodes. We went with a 21-node cluster, on r5.16xlarge instances, and we are scaling down to 15. We may scale it down further, but since this was the first cycle we went with 21, and under normal load it is a 15-node cluster. And for Aerospike Kafka Connect, we are using a three-node cluster.

Aveekshith Bushan:

Great. So thank you, everyone. We'll answer your questions offline to those we couldn't answer, but Hammy and Atul, great talk. Thank you very much once again and let's keep in touch. Stay safe. Thank you everyone.

Hammond Pereira:

Thank you.

Atul Dusane:

Thank you guys.

Aveekshith Bushan:

Do answer the questionnaire, everyone. Thanks for joining.

Atul Dusane:

Thank you.

Aveekshith Bushan:

Cheers. Bye-Bye guys.

About this webinar

Dream11 is India’s largest fantasy sports platform, with more than 100 million users, playing fantasy sports across 11 sports like cricket, hockey, football, kabaddi and basketball. Dream11’s key focus is to provide best-in-class digital sports engagement to all sports fans and retain its market leadership. Running with Aerospike’s Cloud Managed Service, Dream11 has their clusters running in the strong consistency model, with high availability achieved using an active-passive setup.

Join this webinar to learn how Dream11 has built a Fantasy Sports Team Service to handle futuristic IPL scale, with a 99.9% SLA of 15 ms at 1M+ transactions/sec and 99.99% uptime with zero outages.

Speakers

Atul Dusane
Team Lead, Dream11
Hammond Pereira
SDE-III, Product Engineering, Dream11
Aveekshith Bushan
Regional Director - APAC, Aerospike