MDB and PVTL

@IRdoc - really interesting blog posts from Fortnite. Companies usually don’t release the details of their issues.

A couple of things I noticed right off. First of all, they are using version 3.2 of Mongo. The current version is 3.6, and there are enormous performance improvements in 3.4 and 3.6.

They had big scaling issues across all sorts of different areas because of the hypergrowth in their user numbers. Mongo cache failures were a key issue, but so were connection pooling and Memcached, which is a distributed cache solution. They are also finding bottlenecks in authentication. What is surprising here is not the Mongo failures - writing, sharding, authentication, and connection establishment are all areas that will be bottlenecks when the system is stressed. What is surprising is the cache failures, and especially the Memcached failures - this is an indication that their entire system design was not built to support the scale they are seeing.

Certainly part of their problem was in their own code - they were using Mongo for ephemeral data, which should be handled by an in-memory process. It looks like they also screwed up their sharding. This is very important to get right, or you can significantly affect your db performance.
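To show why the shard key matters so much, here is a rough sketch. It is a toy model and not how Mongo actually places chunks, and the shard count, key range, and function names are all made up for illustration. It compares where the most recent writes land when an ever-increasing key (like a timestamp or counter) is split by range versus when the same key is hashed first.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4        # hypothetical cluster size
NUM_WRITES = 10_000   # hypothetical number of documents written so far


def range_shard(key: int) -> int:
    """Toy range partitioning: equal contiguous slices of [0, NUM_WRITES)."""
    return min(key * NUM_SHARDS // NUM_WRITES, NUM_SHARDS - 1)


def hashed_shard(key: int) -> int:
    """Toy hashed partitioning: hash the key, then take it modulo the shard count."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def recent_write_distribution(shard_fn) -> Counter:
    """Count which shard receives each of the latest 1,000 writes."""
    recent_keys = range(NUM_WRITES - 1000, NUM_WRITES)
    return Counter(shard_fn(k) for k in recent_keys)


range_counts = recent_write_distribution(range_shard)
hash_counts = recent_write_distribution(hashed_shard)

print("range-partitioned:", dict(range_counts))
print("hash-partitioned: ", dict(hash_counts))
```

In this toy setup every recent write under range partitioning hits the last shard, while hashing spreads them across all four. The tradeoff is that hashing gives up efficient range queries on that key, so the right choice depends on the access pattern.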

One good thing is that Mongo responded immediately to their requirements (and hopefully charged them for it also). From their post: “We’ve flown Mongo experts on-site to analyze our DB and usage, as well as provide real-time support during heavy load on weekends.”

Overall, I don’t see any red flags around Mongo scaling yet. It looks like a group of developers who built a good solution for reasonable scale, and then got overwhelmed.

“It’s been an amazing and exhilarating experience to grow Fortnite from our previous peak of 60K concurrent players to 3.4M in just a few months, making it perhaps the biggest PC/console game in the world!”

At their scale, any data solution would need customized tuning, and they would have to architect their data design, caching, and sharding for the usage they are seeing. Mongo can do all that, and also has an in-memory database that they don’t seem to be using to handle their user accounts.

This would only be a red flag if they find they can’t succeed, and decide to change databases. Thanks for bringing this to my attention. I might get a Mongo expert on my team to look at this, and maybe chat with the Fortnite guys to see how things got settled out.
