Spilt Milk Studios are currently working with Improbable’s SpatialOS to create Lazarus, a top-down massively-multiplayer shooter set in a universe that can be developed and built up, and that resets every week. In this dev diary, we talk to Andy Grier and Tom Johnson, who are creating the audio for its persistent universe.
In our previous diary, we saw how the Lazarus team has dealt with the unusual issues of scale that come with developing on SpatialOS. However, it turns out that a large, persistent, constantly multiplayer world also creates new challenges for sound design. So say Andy Grier (DiRT 3, Call of Duty: Online, Guitar Hero Live), who is designing the audio for Lazarus, and Improbable’s own Tom Johnson, who has composed the game’s soundtrack.
Andy was quick to realise that the content demands of a SpatialOS game were unconventional, even though his tools are familiar. “The philosophy is a bit different,” says Andy, “because SpatialOS allowed us to make a game world the size of Wales, which is fantastic. But fundamentally at least Lazarus is still just a game being made in Unity3D with audio middleware called Wwise.”
That doesn’t mean the challenge is the same. “An analogy I use for how much audio there needs to be in Lazarus is one that I made for talking at universities,” Andy explains. “You take a C90 cassette tape that only stores 90 minutes of music. Now imagine that you have to drive to Scotland and it’s the only tape you are allowed to take with you for the ride. You’re going to hear that tape many, many times over.” Simply put, in a world the size of Lazarus’s, players could quite easily become frustrated and lose immersion if faced with the same audio tracks and sound effects on a constant loop.
A simple solution to this problem would be to make more content. However, this would be enormously expensive – especially for a small studio like Spilt Milk. Even large AAA studios baulk at the cost of recording a lot of live orchestration for a game. As such, the industry has developed quite a few tricks over the years for getting the most out of the shortest possible recording sessions. There is no reason why these tricks couldn’t work for an indie dev either and, happily, they can be applied to meet the scaled-up audio demands of a SpatialOS title too.
A good start is to split up your recording into its constituent instruments. If your budget only goes as far as capturing 20 minutes of orchestration, you could isolate particular parts and use them as motifs for different sections of the game. Perhaps you could associate a certain character with one instrument in particular? Maybe you could score a fast-paced section of gameplay with just a mix of the percussion you recorded and speed it up to add a little drama and intensity?
The use of techniques like this, along with stings and pulling out stems, helped the audio team cut through player fatigue in Lazarus and kept the audio feeling fresh. They have also been looking into more radical ways of generating masses of music on a modest budget.
One of the current buzzwords across all game content creation is ‘procedural generation’, AKA ‘proc gen’. This is a technique where developers program algorithms that combine content in an exponentially large number of ways to easily generate masses of content – for example, it has been used to good effect in Spelunky’s levels, RoboBlitz’s art assets and Left 4 Dead’s NPC AI.
Given that concept, it’s enticing to imagine how you could simply record the constituent elements of your sound (its core melodies, rhythms and instruments), store them in your game’s audio library, establish some kind of procedural generation algorithm, hit a button and let the game formulate countless “new” music tracks for itself. But in reality, this technology might not always produce a sound that is commercially viable.
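To make the "hit a button" idea concrete, here is a minimal Python sketch of the naive approach: pick one recorded element per layer and stack them. The library contents, layer names and both function names are hypothetical placeholders, not anything from the Lazarus project, and the sketch does nothing to judge whether a combination actually sounds good, which is exactly the weakness described above.

```python
import itertools
import random

# Hypothetical stem library: every name here is a placeholder.
LIBRARY = {
    "melody": ["melody_a", "melody_b", "melody_c"],
    "rhythm": ["rhythm_slow", "rhythm_fast"],
    "instrument": ["strings", "synth", "brass"],
}


def all_arrangements(library):
    """List every possible one-stem-per-layer combination."""
    layers = sorted(library)
    combos = itertools.product(*(library[layer] for layer in layers))
    return [dict(zip(layers, combo)) for combo in combos]


def generate_track(library, seed):
    """Pick one stem per layer; the same seed always gives the same track."""
    rng = random.Random(seed)
    return {layer: rng.choice(stems) for layer, stems in sorted(library.items())}
```

Even this tiny library of eight recordings yields 3 × 2 × 3 = 18 distinct arrangements, and each extra stem multiplies the total, which is where the appeal of the technique comes from.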
“It was around the time of Halo 2 that people began using MIDI orchestra samples to make procedurally generated music. But even now this technique is not where it needs to be in terms of the ultimate goal of ‘push button – computer makes music’,” says Andy. “Procedurally generated music is still a big thing in contemporary art circles because it seems to be good at creating a kind of 21st-century, chaotic modern style of music. But you need to be a bit more commercial about it with games.”
“The type of music that works in a video game isn’t going to be as experimental and weird as the kind that works for art cinema. It typically needs to have a bit of a structure and some rules behind it! For instance, Tom’s music for Lazarus is very much grounded in a kind of nostalgia for 1980s Saturday-morning kids’ cartoons. It has to have that manga sword ‘shinnnng’ sound, so that its lasers sound massive. All these stereotypes need to be followed because otherwise, if one piece of the style doesn’t fit, you break expectations.”
From another angle, the emergent gameplay of a SpatialOS world is also a potential hazard when implementing procedurally generated music. How could you ensure that the music generated by the system would suit the unpredictable situation that the player was in? There is nothing more immersion-breaking than entering a spontaneous dogfight accompanied by the gentle strumming of a harp. It seems that the battle for the soundtrack in Lazarus is one that Tom and Andy will be fighting right up until release.
Music aside, game audio is also filled with countless sound effects. Without being able to hear the sound of our own character or spaceship, how would we know that our controls were having an effect? Without hearing the approach of an enemy, how could we prepare to put our skills into action? This is a highly relevant set of questions for a SpatialOS game where all players and NPCs live in a single persistent world and player numbers in any of the game’s territories could suddenly increase unpredictably as large space battles play out.
“Let’s say you are in a fire-fight with 20 enemy players shooting at you,” says Andy. “There are so many lasers and it’s all getting a bit overwhelming. You ask your buddies to help you in the chat and as opposed to 5 friends coming in, 40 friends come. It might seem great to you as a player but from my point of view as a sound designer, it’s terrible!”
With so many different entities producing their own sound effects, it’s easy to see what he means. One approach to solving this is a technique called ‘capping’, which Andy uses to make emergent firefights less frenetic. “With capping, you can restrict the sound mix to just 10 lasers and even prioritise the sound effects of local players with each player’s laser receiving the highest priority so they can always hear what you are doing with everyone else filtered underneath,” he explains.
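The capping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LaserSound` type, its fields and the `cap_voices` function are hypothetical, and ranking the remaining lasers by distance is our own choice of tie-breaker, not something Andy specifies. In the real project this kind of limit would be configured in the audio middleware rather than hand-written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaserSound:
    source_id: str          # hypothetical identifier for the firing entity
    is_local_player: bool   # True for the listening player's own laser
    distance: float         # distance from the listener, in world units


def cap_voices(sounds, max_voices=10):
    """Return the sounds that stay audible, at most max_voices of them."""
    # Rank the local player's lasers first, then nearer sources before
    # farther ones (the distance tie-breaker is an assumption).
    ranked = sorted(sounds, key=lambda s: (not s.is_local_player, s.distance))
    return ranked[:max_voices]
```

With 60 enemy lasers and one local laser in the scene, only 10 would be mixed, and the local player's laser is always among them regardless of how crowded the fight becomes.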
Andy’s second step is to use basic game attenuation. Based on the distance to the player, you can change the volumes of audio sources, filter effects and pan audio as needed. An example of this is the audio for the “pincer enemy” in Lazarus, which changes dynamically based on several factors.
“The first time I put him (the pincer enemy) in game, he had an earth-shattering, dumpster-truck crunching sound. With about two or three players in a scene and maybe one pincer enemy – it was fine to have a sound effect of that volume and intensity. But as soon as a few of them start spawning, the scale became too much. I now have the potential to swap the assets that the pincer enemy type is using – whether it’s the big crunching sound asset or a smaller one – depending on how many players are active in the scene. That’s my kind of like joker in the back pocket. If I’ve got one player great – make the enemy sound beefy – if I have 10, swap. It’s almost like an audio LOD system actually.”
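Both ideas, distance-based attenuation and the player-count asset swap, can be sketched together in Python. Everything here is an assumption made for illustration: the asset names, the linear rolloff curve, the distance and filter ranges, and the function names are hypothetical, and only the swap threshold of 10 players is taken from Andy's description.

```python
import math

# Hypothetical asset names and tuning values, not taken from the game.
BIG_ASSET = "pincer_crunch_big"
SMALL_ASSET = "pincer_crunch_small"


def attenuate(listener_xy, source_xy, min_dist=5.0, max_dist=100.0):
    """Derive volume, pan and low-pass cutoff from listener-to-source distance."""
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    dist = math.hypot(dx, dy)

    # Linear rolloff: full volume inside min_dist, silent beyond max_dist.
    if dist <= min_dist:
        volume = 1.0
    elif dist >= max_dist:
        volume = 0.0
    else:
        volume = 1.0 - (dist - min_dist) / (max_dist - min_dist)

    # Pan from -1.0 (hard left) to 1.0 (hard right) using the x offset.
    pan = 0.0 if dist == 0.0 else max(-1.0, min(1.0, dx / dist))

    # Duller sound with distance: cutoff slides from 20 kHz down to 2 kHz.
    lowpass_hz = 2000.0 + 18000.0 * volume
    return {"volume": volume, "pan": pan, "lowpass_hz": lowpass_hz}


def pick_pincer_asset(active_players, swap_at=10):
    """Audio 'LOD': use the beefy asset in quiet scenes, the small one in busy ones."""
    return BIG_ASSET if active_players < swap_at else SMALL_ASSET
```

A game loop might call `attenuate` for each audible source every frame and call `pick_pincer_asset` only when a pincer enemy spawns or the player count in the scene changes.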
Before returning to his job of filling an emergent universe with sound, Andy has a couple of bits of advice for SpatialOS audio work.
“Take the time to understand what SpatialOS is – you don’t necessarily need to fully understand how it works, but a quick look at a talk on the tech by CTO Rob Whitehead at GDC17 or a read through the SpatialOS homepage could go a long way.”
“For rapid iteration, work locally off the staging build – because SpatialOS cloud-hosts your game, upload times for testing tweaks you make to audio in-game can sometimes take a long time. Working off a local staging branch, such as the one Steam provides for Lazarus, bypasses this and allows quicker iteration.”
In the long term, our distributed cloud servers offer even more potential for audio design. The theoretically-unlimited computation of SpatialOS has potential for those areas of audio production that are currently processor-limited. Advanced reverbs, occlusion systems and so on, could become more feasible if they ran in the cloud. More than that, as Tom points out, a real-time 3D audio system in a cloud worker could generate reverberation and positional values. “The reason it doesn’t exist anywhere yet is that, beyond the local processor limitations, creating such a system would require a number of programmers, and deep integration in a game engine. So, in theory, some of this is possible. In practice, no one is doing it.” Not yet, at least…