Introducing the upgraded Runtime: the new heart of SpatialOS

Today, we’re incredibly excited to introduce the new SpatialOS Runtime. The Runtime is the heart of SpatialOS. It is the foundational system that leverages distributed computation so that you can exceed the limitations of a single game engine or server. Given the importance of this core system, this upgrade marks a significant milestone for the SpatialOS platform as a whole.

We will be rolling out the new Runtime in steps, initially as opt-in releases. This enables you to verify the performance of your existing projects in the new environment. In this post, we explain what’s changed, the benefits of the new Runtime, and how you can opt into the upgrade today – starting with the new bridge architecture.

  • The first upgrade step is the new bridge architecture, which is available now. You can find more details about this change below or, alternatively, you can get started now and opt in here.
  • The next step is the new load-balancing system, which will be available in around a month.
  • The final step is the new entity database, which will be available in the upcoming months.

 Get started now with the new bridge.

We will make additional announcements when each of these upgrade steps is available. In the meantime, we want your feedback! Please let us know your experience in the forums.

But what is the Runtime?

As we said earlier, the Runtime is the foundation of a game running on the SpatialOS platform. It enables you to leverage distributed computation for your games by “stitching together” multiple game engines and servers.

More specifically, it:

  • Stores all live entity data.
  • Allows workers to make updates to entities in the game world.
  • Routes information between workers, such as commands and updates.
  • Handles interest management, by synchronising worker views with the true state of the world.
  • Dynamically distributes simulation work among the available workers.
  • Starts and stops the server-side workers that simulate the game world's physics, game logic, AI, and so on.
  • Coordinates saving Snapshots, the offline backups of entity data.

What are the benefits?

The immediate benefits of the new Runtime are mostly features related to performance and stability. However, we’re working on features in our development pipeline that focus on new capabilities that are only possible using the upgraded architecture.

Immediate benefits after completing the three-step upgrade

The full set of benefits will only be available after completing the upgrade. However, with each step (except the first, which is a compatibility upgrade) you will get a few new features.

    • Robustness and performance improvements: Compared to the old architecture, the new architecture is more resilient to being overloaded and ultimately enables more complex game worlds.
    • Easier scaling to larger worlds: Scaling up to large world sizes no longer involves tweaking system parameters like chunk sizes. Finding the right values for these parameters has been an issue in the past for many of you; this change reduces the amount of work required.
    • Graceful degradation: A region of a game world can become overloaded. For example, under very high player density, slower clients may no longer be able to keep up with the update rates. In this scenario, the upgraded data distribution system will merge consecutive component updates to lower the observed component update rates so that clients can keep up.
    • Load balancing hysteresis: The load balancer now supports hysteresis for entities moving between workers. This means an entity rapidly moving back and forth across the boundary between worker regions can be configured to have a single worker remain authoritative over it, instead of suffering from ‘authority thrashing’.
    • Worker replacement: We will expose the API of a service to allow the replacement of a specific instance of a server-side worker with one running on your own machine. This allows for live debugging of issues with server-side workers.
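To make the hysteresis idea above concrete, here is a minimal sketch in Python. It is an illustration only, not the SpatialOS load balancer or its configuration: the class name, the one-dimensional boundary, the worker labels "left" and "right", and the `margin` parameter are all assumptions made for this example. An entity only changes authoritative worker once it has moved more than `margin` past the boundary.

```python
from dataclasses import dataclass


@dataclass
class HysteresisBalancer:
    """Hypothetical two-worker balancer split along one axis."""

    boundary: float = 0.0  # x-coordinate where the two worker regions meet
    margin: float = 5.0    # distance past the boundary required for a handover

    def authoritative_worker(self, x: float, current: str) -> str:
        """Return the worker that should be authoritative for an entity at x."""
        if current == "left" and x > self.boundary + self.margin:
            return "right"
        if current == "right" and x < self.boundary - self.margin:
            return "left"
        # Inside the hysteresis band: the current worker keeps authority.
        return current


def count_handovers(balancer, xs, start="left"):
    """Count authority changes for an entity visiting positions xs in order."""
    worker, handovers = start, 0
    for x in xs:
        new_worker = balancer.authoritative_worker(x, worker)
        if new_worker != worker:
            handovers += 1
            worker = new_worker
    return handovers
```

With a path that oscillates across the boundary, such as `[-1, 1, -1, 1]`, a margin of 0 gives three handovers in this sketch, while a margin of 5 gives none, because the entity never leaves the band around the boundary.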

Features in our development pipeline

    • Rich interest management: We will be making the tools used to express worker ‘views’ of the world far more powerful, and also allowing you to make better use of the available bandwidth.
    • Entities without position components: We will be removing the constraint that all entities in the world require a defined position, and adding support for querying and interacting with entity data that’s not based on position. This adds support for game systems that are not spatial in nature (e.g. configuration entities, auction house or inventory systems).
    • Better scaling and cost effectiveness: The performance of the new entity database and its data distribution model allow more players, higher update rates and more entities for the same amount of computational power.
    • More configurable dynamic load balancing: The existing out-of-the-box, dynamic load-balancer did not generalise well to certain kinds of workers. The new load-balancing system is much more flexible, enabling better load-balancing of a broader variety of game worlds and worker performance characteristics. We’ll be adding more strategies and configuration parameters, which will give you much more control over how the load balancer behaves.
    • Support for non-spatial load-balancing strategies: We’re adding support to distribute load in the world in ways that aren’t based on dividing up space. This enables sharding of game systems that need to be distributed among the whole player base – for example, chat channels.
    • User-space load-balancing strategies: The new load-balancing system we’ve built is flexible enough to allow load-balancing strategies to be fully implemented in user space. Users seeking to get the most efficient worker allocation based on game-specific knowledge will now be able to write their own high-level load-balancers.
    • Worker port forwarding and shell access: We will be providing a way to configure and open SSH tunnels to server-side workers to facilitate remote debugging and profiling. This allows for live-debugging of issues that are difficult (or impossible) to reproduce using workers running on your own machine.
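As a rough illustration of what a non-spatial strategy could look like, the sketch below assigns chat channels to workers by hashing the channel name instead of dividing up space. This is not the SpatialOS load-balancing API: the function name `shard_for`, the channel keys and the worker identifiers are all hypothetical, and a real strategy would also need to handle workers joining and leaving.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, workers: Sequence[str]) -> str:
    """Pick a worker for a non-spatial key (for example a chat channel name).

    Uses a stable hash so that the same key always maps to the same worker
    for a given worker list.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]


# Hypothetical usage: spread chat channels over three workers.
workers = ["chat-worker-0", "chat-worker-1", "chat-worker-2"]
assignment = {
    channel: shard_for(channel, workers)
    for channel in ["general", "trade", "guild:42"]
}
```

Simple modulo hashing reassigns most keys whenever the worker list changes; a strategy that cares about this might use consistent hashing instead.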

What’s changed?

The new architecture is characterised by changes to three core components: the new entity database, the new load-balancing system and the new bridge architecture.

First, the new entity database. Whilst the Runtime has always been responsible for storing the state of entities, we have now split this into a separate software component, which we call the entity database. Put simply, this is a sharded, scalable database for storing live entity data. We will cover these changes and their consequences in a subsequent post when this upgrade step is available.

The second change we’re making is to worker ‘load-balancing.’ This determines which workers should be authoritative for which components on which entities, whether more server-side workers should be started or stopped, and how crashed workers are recovered or restarted. Our new load-balancing system is due in around a month and we will cover the changes in a subsequent post when this is available.

[Figure: Runtime diagram]

The new bridge architecture

That said, the first change users will experience is to our bridge architecture. A bridge is a connection between a worker and the Runtime. Its purpose is to provide an up-to-date view of the world to the worker and to allow the worker to modify certain entities within the world.

The original bridge architecture was based on a message pipeline. Updates from within the Runtime were filtered into a stream of data to send to a worker. This approach performed relatively well, but struggled to support more advanced features such as component-level diffing, update/bandwidth rate-limiting, dealing with stale data, and general robustness for poor connections.

The new bridge architecture is based on a ‘diffing’ model. The bridge holds two ‘views’ of the entities in the world that the worker is interested in: one view as they exist in the new entity database, and one view as they are seen by the worker. The bridge is responsible for making the two views ‘consistent’ with each other, by working out which updates to send to the worker so that the two views agree on the state of the world.
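The diffing idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the bridge's actual implementation: both views are modelled as plain dictionaries mapping an entity ID to its components, and the operation names (`add_entity`, `update_component`, and so on) are made up for this example.

```python
def diff_views(database_view, worker_view):
    """Compute the operations that make worker_view match database_view.

    Both views map entity id -> {component name: value}.
    """
    ops = []
    for entity_id, components in database_view.items():
        seen = worker_view.get(entity_id)
        if seen is None:
            # The worker has not seen this entity at all yet.
            ops.append(("add_entity", entity_id, dict(components)))
            continue
        for name, value in components.items():
            if name not in seen or seen[name] != value:
                ops.append(("update_component", entity_id, name, value))
        for name in seen:
            if name not in components:
                ops.append(("remove_component", entity_id, name))
    for entity_id in worker_view:
        if entity_id not in database_view:
            ops.append(("remove_entity", entity_id))
    return ops


def apply_ops(worker_view, ops):
    """Apply the operations produced by diff_views to the worker's view."""
    for op in ops:
        kind = op[0]
        if kind == "add_entity":
            worker_view[op[1]] = dict(op[2])
        elif kind == "update_component":
            worker_view[op[1]][op[2]] = op[3]
        elif kind == "remove_component":
            del worker_view[op[1]][op[2]]
        elif kind == "remove_entity":
            del worker_view[op[1]]
```

In this sketch, the diff depends only on the current state of the two views. If a component changes several times between two calls to `diff_views`, the worker receives a single update carrying the latest value instead of every intermediate one.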

The new approach makes it easier to add improvements in the future. But, fundamentally, it’s built to perform better with the new entity database.

Start now

Click on the link below to start today or discuss in our forums and Discord.