Challenge of Scaling Distributed Systems for Single Universe Games

Explore the challenges and complexities of scaling distributed systems for single universe games, as discussed by Jacky Mallett, a Distributed Systems Architect. Learn about the limitations, differences between large and small systems, and the intricacies of managing massive player interactions in virtual worlds.

  • Distributed Systems
  • Single Universe Games
  • Scaling Challenges
  • Online Gaming
  • Jacky Mallett

Presentation Transcript


  1. Scaling the Single Universe. Jacky Mallett, Distributed Systems Architect. jacky@ccpgames.com

  2. Designing Single Universe Games. Known limits on scaling distributed systems. Why large systems are different from small ones. Is there a heart of darkness lurking at the core? Why there can't be a single, optimal solution.

  3. Why is a single Universe such a Challenge?

  4. EVE Online
     • 360,000 subscribers
     • 63,170 concurrent users (January 23rd, 2011)
     • A single cluster: everyone in the same virtual world
     • Up to 1,800 users in the most popular solar system
     • 1,000+ users in fleet fights
     • Over 200 individual servers running solar system simulation and support
     • 24 client access servers
     • One database server

  5. EVE Tranquility Cluster
     • Players connect to Proxies
     • Dedicated game mechanic nodes, e.g. markets, fleets, etc.
     • SOLs (solar systems) host the real-time game simulation

  6. Message-Based Architecture
     • Players connect to Proxies
     • SOLs (solar systems) host the real-time game simulation

  7. Mesh vs. Embarrassingly Parallel. Scaling is relatively simple, as long as it's a case of adding servers that have little or no need to communicate with each other.

  8. Communication. Distributed systems run on discrete units of computation, triggered by messages received on servers. [Diagram: servers exchanging Svc.getCharInfo() requests and Svc.getCharInfoResp() responses.]
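
To make the "discrete units of computation triggered by messages" idea concrete, here is a minimal Python sketch of my own (not CCP's code): a server loop pulls one message off a queue, runs the matching handler, and posts the response. The queue layout, message fields, and getCharInfo handler are hypothetical.

```python
import queue

# Hypothetical message format: (service call name, argument tuple, reply queue).
inbox = queue.Queue()

def get_char_info(char_id):
    # Stand-in for a real character-information lookup.
    return {"char_id": char_id, "name": "Pilot %d" % char_id}

HANDLERS = {"Svc.getCharInfo": get_char_info}

def serve_one():
    """One discrete unit of computation, triggered by one received message."""
    call, args, reply_to = inbox.get()              # block until a message arrives
    result = HANDLERS[call](*args)                  # run the handler for this call
    reply_to.put(("Svc.getCharInfoResp", result))   # send the response message back

# Another node "sends" a request and reads the response.
replies = queue.Queue()
inbox.put(("Svc.getCharInfo", (42,), replies))
serve_one()
print(replies.get())
```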

  9. Incoming Server Load from Player Updates. [Chart: Real Time Players vs Server Load. Messages/s at the server (0 to 800,000) plotted against the number of players, for per-player update rates of 1 msg/minute, 1 msg/second, 5 msg/second, and 30 msg/second.]
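
The chart is just the product of player count and per-player update rate; a small Python sketch of that arithmetic, with player counts chosen purely for illustration:

```python
# Aggregate server load = number of players * per-player update rate.
RATES = {                     # messages per second, per player
    "1 msg/minute": 1 / 60,
    "1 msg/second": 1.0,
    "5 msg/second": 5.0,
    "30 msg/second": 30.0,
}

for players in (1_000, 10_000, 60_000):      # illustrative player counts
    for label, rate in RATES.items():
        print(f"{players:>6,} players at {label:>12}: {players * rate:>11,.0f} msg/s at the server")
```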

  10. Network Communication Limits. Information Capacity is the instantaneous limit on the total amount of Shannon Information a distributed system can handle. Shannon Information here means a unique, single message, e.g. a single remote procedure call.

  11. Where is the bottleneck? SOLs: load is O(Proxies × broadcast msg/s). Proxies: load is O(SOLs × broadcast msg/s). It depends on the SOL : Proxy ratio, and the number of broadcasts/s dominates.
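
As a rough illustration of that ratio argument (numbers assumed for illustration, not measured): each SOL receives the proxies' broadcasts, each proxy receives the SOLs' broadcasts, so with roughly 200 SOLs and 24 proxies the proxy side carries the larger per-node load.

```python
# Per-node broadcast load in each direction, with illustrative (not measured) numbers.
sols, proxies = 200, 24            # cluster sizes quoted on the EVE Online slide
bcast_per_sec = 10                 # assumed broadcasts per second from each sending node

sol_load   = proxies * bcast_per_sec   # each SOL hears every proxy's broadcasts
proxy_load = sols * bcast_per_sec      # each proxy hears every SOL's broadcasts

print(f"Load per SOL:   {sol_load} msg/s")     # 240 msg/s
print(f"Load per proxy: {proxy_load} msg/s")   # 2000 msg/s: dominates while SOLs outnumber proxies
```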

  12. Broadcast Load

  13. Network Communication Limits. Where L is an individual node's link capacity with other nodes (group size) and N is the total number of nodes in the system: a strictly hierarchical topology scales as O(L_server); a full mesh scales as O(L × N) (Gupta & Kumar; Scaglione). Worst-case traffic scales as O(N²), as all the nodes try to talk to all the other nodes simultaneously.

  14. Model of Information Limits. [Chart: maximum load with increasing players. Curves: Mesh O(L × N) with L = 40, and Single Server with L = 40. Chart labels: "Insufficient Information Capacity" and "Accessible to mesh architectures, not to hierarchical".]

  15. Close-up on the Left-Hand Corner. [Chart: the same maximum-load-vs-players curves (Mesh O(L × N) with L = 40, Single Server with L = 40) zoomed in on small player counts: the one place where everything is possible, the laboratory.]

  16. Effect of Increasing the Connectivity Limit. [Chart: worst-case demand N(N-1) plotted against mesh capacity L × N for L = 100, L = 50, and L = 10.]

  17. Conceptual Tools. Define Group Size as the maximum number of direct communicators supported by the available hardware. [Chart: link capacity against number of clients (players), showing the curves L × N and N(N-1) with the Group Size Limit marked.]
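
If the group size limit is read as the point where worst-case demand N(N-1) first exceeds mesh capacity L × N (my interpretation of the chart), then algebraically N(N-1) ≤ L × N reduces to N ≤ L + 1. A small Python sketch of that reading, using the L values from the earlier slides:

```python
def group_size_limit(link_capacity):
    """Largest N for which worst-case demand N(N-1) still fits mesh capacity L*N."""
    n = 1
    while (n + 1) * n <= link_capacity * (n + 1):   # would demand still fit at N = n + 1?
        n += 1
    return n

for L in (10, 40, 50, 100):       # the link capacities used on the earlier slides
    print(f"L = {L:>3}: all-to-all traffic fits up to N = {group_size_limit(L)} nodes")
```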

  18. Hierarchical vs Mesh Topologies
      Hierarchical                                    | Mesh
      Simple                                          | Complicated
      Easy to provide a single point of control       | Hard to provide a single point of control
      Synchronisation is what it does best            | Impossible to guarantee synchronisation
      Fragile: single point of failure                | Robust: hard to bring down all nodes
      Low Information Capacity                        | High Information Capacity
      Cannot scale to large N                         | Scales to large N
      Performs well with high-latency communication   | Performs badly with high-latency communication

  19. Latency Influences Scaling. Message latency: the time to send and receive; round-trip time affects message frequency. Computation latency: how long a message takes to process. What this also means is that several different problems can have the same symptoms.

  20. Using Hierarchy to Load Balance. Character information was sourced by SOLs for their local players. Character information is static, easy to index per character, and has a long-latency communication profile, so it was moved to its own Character Nodes.

  21. Advantages of Mesh Topology. We can divert load with long-latency profiles to other servers, e.g. Character Nodes, Markets, and Fleet Management Nodes.

  22. Using Mesh to Load Balance. A topology change moved Jita from a hierarchical position to a mesh of SOL nodes, thereby increasing Information Capacity.

  23. Fischer Consensus Problem. Consensus, i.e. agreement on shared values, is a critical problem when it occurs in distributed systems. It is impossible to guarantee that a set of asynchronously connected processes can agree on even a single bit value. "Impossibility of distributed consensus with one faulty process", Fischer, Lynch, Paterson, 1985: http://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf. "Consensus: The big Misunderstanding", Guerraoui, Schiper: http://www.cs.cornell.edu/Courses/cs734/2000fa/cached papers/gs97.pdf

  24. Origin of the Consensus Problem. Fischer's consensus problem showed that it is impossible to distinguish between a delayed message and a lost one; lost messages are equivalent to faulty processes. The guarantee is that most runs will achieve consensus, but there is no possible protocol that can be guaranteed to converge. The larger the distributed system, and the more messages and hops involved, the higher the probability of a failure or a delay.
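
To make the delayed-versus-lost point concrete, a tiny sketch of my own: a receiver waiting with a timeout sees exactly the same outcome whether the sender has failed or its message is merely still in flight.

```python
import queue, threading, time

def receiver(inbox, timeout=0.1):
    try:
        msg = inbox.get(timeout=timeout)
        return f"got {msg!r}"
    except queue.Empty:
        # Indistinguishable cases: the sender failed, or the message is just late.
        return "timeout: lost sender or delayed message? cannot tell"

inbox = queue.Queue()
# Simulate a message that is merely delayed past the receiver's timeout.
threading.Timer(0.2, inbox.put, args=("vote: commit",)).start()
print(receiver(inbox))          # times out: looks exactly like a lost message
time.sleep(0.2)
print(receiver(inbox))          # the "lost" message eventually arrives
```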

  25. Introduced by Message Handling. [Diagram: a message from A and a message from B queued behind messages from unrelated tasks.] The reality of today's computing environment is multiple layers of message queuing. As performance degrades, message delays increase, and the probability of a consensus problem increases. This is very hard to debug in real-time systems, since debugging changes the timing.

  26. Symptoms of Consensus Problems
      1. Sleep(1) calls, which merely reduce the probability of a delayed message.
      2. Code complexity: locally analysing every possible delayed/lost message instance and writing code to handle it. Regrettably this merely introduces more occurrences of the problem; in the limit the developer goes quietly crazy.
      3. Note: even with theoretically reliable processes, guaranteeing consensus is non-trivial and requires multiple message rounds, which reduces the available Information Capacity.

  27. Solutions. Consensus will usually occur, but it can't ever be guaranteed. The solutions are: error recovery ("the end user will recover") and synchronization. But we know that synchronizing at a single point will cause potential congestion.

  28. Jumping typically involves several nodes; 30 seconds to resolve all cluster calls should be plenty.

  29. Consensus Problems in EVE. These are the number one cause of stuck petitions in the Technical Support "Stuck" queue; Game Master support is the "the end user will recover" solution. Software design can certainly minimize them, but beyond a certain point they have to be worked around. Solutions to consensus all involve different kinds of performance trade-offs.

  30. Solutions: Consensus, who needs it? It is possible to build systems that work with no consensus guarantees at all, e.g. Google and Facebook. Sensor network solutions have been proposed which exploit the oversupply of information relative to the problem space. Probabilistic consensus algorithms raise the probability of agreement to as close to 1 as possible, despite the presence of failure. Generally speaking, these solutions can be made to work, but they typically rely on multiple rounds of message passing, which are expensive for large systems.

  31. EVE Solution: Cache Early, Cache Often
      • All nodes cache data in EVE: SOLs cache to protect the database from load, proxies cache to protect the SOLs, and clients cache to protect the proxies.
      • Cached data is not synchronised, so at any given instant different nodes may have different views of some data.
      • Programmers can control how out of date a view is allowed to be (see the sketch below).
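
A minimal sketch of the caching idea referenced above (my own illustration, not CCP's implementation): a time-to-live cache in which the TTL is the knob that bounds how out of date a cached view can be, layered client over proxy over SOL over database.

```python
import time

class TtlCache:
    """Cache entries for a fixed time-to-live; staleness is bounded by the TTL."""
    def __init__(self, ttl_seconds, fetch):
        self.ttl = ttl_seconds      # how out of date a cached view may be
        self.fetch = fetch          # fallback to the authoritative source below
        self._store = {}            # key -> (value, expiry time)

    def get(self, key):
        value, expires = self._store.get(key, (None, 0.0))
        if time.monotonic() >= expires:                 # missing or stale
            value = self.fetch(key)                     # hit the layer below
            self._store[key] = (value, time.monotonic() + self.ttl)
        return value

# Layered use, mirroring the slide: client -> proxy -> SOL -> database.
def database_lookup(key):
    return f"row for {key}"

sol_cache    = TtlCache(ttl_seconds=60, fetch=database_lookup)   # protects the database
proxy_cache  = TtlCache(ttl_seconds=10, fetch=sol_cache.get)     # protects the SOLs
client_cache = TtlCache(ttl_seconds=2,  fetch=proxy_cache.get)   # protects the proxies

print(client_cache.get("char:42"))
```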

  32. Solutions: Synchronisation
      • Shared clocks: literal synchronization. Atomic clocks provide highly accurate synchronized time (Stratum 0), and the reliability of timing information provides extra local information that can be used to resolve consensus failure. Practically, network-time-based synchronization has its own problems and works best with long-latency systems.
      • Lamport timestamps: a simple algorithm that provides a partial ordering of events in a distributed system (see the sketch below).
      • Vector clocks: both are ways of providing a logical ordering of events/messages in a distributed system.
      "Time, Clocks, and the Ordering of Events in a Distributed System", Leslie Lamport, 1978: http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html
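
As a refresher on the Lamport timestamp rule mentioned above (a standard textbook sketch, not code from the talk): increment a local counter on every event, attach it to outgoing messages, and on receipt advance to the maximum of the local and received values plus one.

```python
class LamportClock:
    """Logical clock giving a partial ordering of events across processes."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # Timestamp attached to an outgoing message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Advance past both our own history and the sender's.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.send()            # A sends a message stamped 1
b.local_event()              # B does unrelated work (its clock reads 1)
print(b.receive(t_send))     # B receives A's message: its clock jumps to 2
```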

  33. Solutions: Design. Consensus requires synchronisation, so avoid the requirement for consensus wherever possible. The critical question: will players notice or care about inconsistent state? If consensus is required, a single server is needed somewhere to handle it. Where possible, design consensus to be a long-latency process, which minimises the probability of occurrence.

  34. Pod Kills in Last Hour

  35. EVE Fleet Fights. First, players converge on a node, and load is distributed as much as possible over other cluster machines. Sometimes they try to do this all at once.

  36. Latency Analysis
      • Typical per-player load on the cluster is < 0.5 msg/s; during a fleet fight this goes up to > 1 msg/s.
      • The 30s jump timer is a game mechanic that provides long latency.
      • Dedicated nodes are used for known hot spots; load balancing has evolved with the game.
      • CPU load from fleet-fight calculations has high computational latency, highly optimized by CCP's lag-fighting Gridlock team.
      • Load on proxies is engineered with spare capacity for bursts; traffic measured on the proxies is well distributed and stable.
      • Non-real-time traffic is hosted on separate nodes.

  37. EVE Fleet Fights. For large numbers of players, though, we're always fighting the real-time limits.

  38. Design Approach
      • Calculate load as calls per second per player and roughly estimate the number of players. How much CPU will each call require to process? How much network bandwidth? (A back-of-envelope sketch follows below.)
      • Is the design embarrassingly parallel, or can it be made so?
      • Does it have high latency? Some form of hierarchical approach should work.
      • If not: is the problem solvable at all? A mesh approach should work. Don't try to solve the Fischer consensus problem!
      • Can I change the problem and solve it at a higher level? Fortunately, games give us enormous freedom to do this.
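
A hedged back-of-envelope helper along the lines of the first bullet (the function name, inputs, and example numbers are all mine, purely illustrative):

```python
def estimate_load(players, calls_per_player_s, cpu_ms_per_call, bytes_per_call):
    """Rough cluster-level estimate from per-player call rates (illustrative only)."""
    calls_s = players * calls_per_player_s
    return {
        "calls/s": calls_s,
        "CPU cores needed": calls_s * cpu_ms_per_call / 1000.0,
        "bandwidth MB/s": calls_s * bytes_per_call / 1e6,
    }

# Example: 60,000 players at 0.5 calls/s, 2 ms of CPU and 500 bytes per call (assumed).
for name, value in estimate_load(60_000, 0.5, 2.0, 500).items():
    print(f"{name:>18}: {value:,.1f}")
```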

  39. Takeaway. Very large scale distributed computing is a design problem. Have a model of the requirements of your system; multiple issues have the same symptoms; limits apply at all levels of abstraction. The only systems that scale arbitrarily are the ones that don't communicate with each other. Know the limits applicable to your application. Don't try to guarantee consensus.

  40. Any Questions?
