Using Only Sound Makes Investigation Even More Difficult - Dev on Unheard's Game Sound Design



Zhang Lei, NExT Studios Audio Director, has been designing game audio for over 20 years. He has participated in the development of game series such as Rayman, Splinter Cell and Far Cry.

NExT Studios' new title Unheard has now been live for two weeks. Its unique audio-based detective gameplay has received high praise from players, thanks to NExT's audio team. This article discusses the sound design ideas used in Unheard and reviews the thinking and choices behind our attempt to create an immersive effect. If you are interested in game audio design, there may be something here for both of us to learn from. If you are an avid player of the game, you may learn the logic behind its creation and perhaps discover something you didn't notice while playing.


Two-channel Stereo? Surround?
3D Soundscape, Space Reverberation, HRTF, ASMR and others

Usually, the first thing to do when designing the soundscape for a project is to work out its dimensionality. In Unheard this was the first thing I gave a lot of thought to, because the game essentially looks like a room map.



A room map from the game


Sound design for a 2D game is nothing new, since games gradually evolved from 2D to 2.5D and eventually to 3D. In Unheard, however, we were aiming to create an immersive listening experience for the player. That required building a vivid 3D audible environment on top of the 2D visual information, one that reliably gives the player enough information to progress through the game.

The first solution we tried was to attach the scene microphone to the little black silhouette character, the Phantom, and point the microphone in the direction he faces. We also added adjustments such as distance-volume attenuation. The result was okay, but two issues remained:

  1. When the character was not facing the top of the screen, the left and right sides of the screen no longer corresponded to the left and right sides of the sound field. Players would need time to work out the positional relationship between what they heard and what they saw, and our tests showed that it took an average player considerable time and effort to get the hang of it.
  2. If the character kept turning around, the sound field rotated with him. The effect was admittedly cool, but it was not particularly good for players…


The microphone always points upward, regardless of character orientation


Therefore, we tried another solution, which is the one we kept in the final game. We still put the scene microphone on the character, but this time it always points toward the top of the screen. In other words, we tipped the vertical 2D scene you see on the screen over by 90 degrees into a horizontal one. This avoids the confusion caused by a mismatch between visuals and sound position.
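The fixed "screen-up" microphone can be sketched roughly as follows. This is not code from Unheard, just a minimal illustration of the idea: pan depends only on the source's screen-space offset from the listener, never on which way the character faces, so screen-left always sounds left. All names and the linear falloff are illustrative assumptions.

```python
import math

def pan_and_gain(listener_xy, source_xy, max_dist=20.0):
    """Fixed 'screen-up' microphone: pan is derived from screen-space
    position only, so rotating the character never rotates the sound field."""
    dx = source_xy[0] - listener_xy[0]   # positive = right on screen
    dy = source_xy[1] - listener_xy[1]   # positive = up on screen
    dist = math.hypot(dx, dy)
    # Pan in [-1, 1]: a source to the screen-right of the character pans right.
    pan = 0.0 if dist == 0 else max(-1.0, min(1.0, dx / dist))
    # Placeholder linear distance attenuation (silent beyond max_dist).
    gain = max(0.0, 1.0 - dist / max_dist)
    return pan, gain
```

With this model, walking past a sound source sweeps it smoothly through the stereo field while the character's facing direction stays irrelevant.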


Does the volume simply decrease as you move further away?

Defining the soundscape's dimensionality and basic principles is just a start. Since our goal is an immersive effect, we must then recreate the way we hear sounds in real life. For example, a sound source gets louder as we approach it and quieter as we move away. This common technique in games is called distance-volume attenuation.


Distance-Volume Attenuation Model
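A typical attenuation curve of this kind can be sketched in a few lines. The inverse-distance formula below is a generic model used across many engines, not the specific curve from Unheard; `ref_dist` and `rolloff` are illustrative parameter names.

```python
def attenuate(distance, ref_dist=1.0, rolloff=1.0):
    """Generic inverse-distance attenuation: gain is 1.0 inside the
    reference distance and falls off smoothly beyond it."""
    d = max(distance, ref_dist)  # clamp so gain never exceeds 1.0
    return ref_dist / (ref_dist + rolloff * (d - ref_dist))
```

Raising `rolloff` makes a source fade faster with distance, which is one simple knob a designer can tune per source.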


However, simulation is never that easy! The propagation of sound is a complex process, and many factors affect the final outcome. Moreover, different sounds change differently as they propagate. Building an objectively perfect sound propagation model might make a good thesis topic for a doctoral student in physical acoustics, but it is certainly not a job for game sound designers. Our task is to use a set of relatively simple rules to create a credible effect. In Unheard, in addition to the distance-volume attenuation mentioned earlier, we also used settings such as low-pass filtering, reverberation and spread, with different parameters for different sound sources.


Settings such as low-pass filter, reverberation and spread


Front, Back, Left, Right…

After achieving the effect of distance on sound propagation, we had to capture the effect of the microphone's orientation relative to the sound source. Since the VR boom, audio content such as ASMR, which emphasizes a realistic stereoscopic effect, has become very popular among users. That audio, however, is created with dummy-head recording: the data is pre-rendered and cannot reproduce a real-time sound field in a game. The boom has also prompted numerous software and hardware companies to develop products and tools to enhance in-game stereo effects.

On closer analysis, most of the stereo enhancement effects in the industry are based on an HRTF algorithm, but I believe the algorithm itself has a couple of issues:

  1. The algorithm is based on statistics, but auditory perception is highly subjective, so the final result will not satisfy everyone.
  2. The algorithm is older than me (HRTF was first used in the 1950s). Call it a cheap shot, but the fact is that most current algorithms are just the accumulated result of a few decades of university research, and their actual effects are far from satisfactory.

During the development of Unheard, after testing numerous two-channel stereo enhancement algorithms, we finally settled on a solution based on surround mixing. To put it simply, instead of a method that is simple but unreliable, we chose one that is rather complex but reliable.


The adjustable parameters of convolution reverb


Another important factor in defining a soundscape is reverberation. It gives the player a first impression of the space the sound source sits in. We used convolution reverb in Unheard because it offers more adjustable parameters and better captures the character of the environment, which serves our "immersive" design target.

Convolution reverb also has a fun feature: it can control the ratio between direct sound and reflected sound based on distance. Imagine two people talking in a hall. If they are close, they hear each other easily; if they are far apart and speak at the same volume, what they hear is mostly reverberation and echo instead.


The ratio between direct sound and reflected sound
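The direct/reflected ratio described above amounts to a distance-driven wet/dry crossfade. The sketch below is a deliberately simple linear version, with `reverb_radius` as an assumed tuning parameter; a real implementation would follow the impulse response of the room.

```python
def wet_dry(distance, reverb_radius=10.0):
    """Crossfade between direct (dry) and reflected (wet) signal:
    up close you hear mostly direct sound, far away mostly reverb."""
    wet = min(1.0, distance / reverb_radius)
    dry = 1.0 - wet
    return dry, wet
```

At `distance == 0` the mix is fully dry; at or beyond `reverb_radius` the listener hears only the reflected field, matching the two-people-in-a-hall example.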


Two-channel Stereo or Multi-channel Surround


By convention, the two audio formats are generally referred to as "stereo" and "surround". Unheard's main platform is PC, and since most PC users have 2.0 desktop speakers or headphones, our mixing work was based primarily on two-channel stereo. Still, we spent quite a lot of energy making sure the game also performs well on a multi-channel sound system.

My view is that if players have purchased a multi-channel device, even if they did not make that investment because of our game, they spent more than those who didn't, and are therefore entitled to a different game experience. Put another way, if users respect and support your work with real actions, is there any reason not to repay them with the same respect?

Personally, I prefer experiencing Unheard on a multi-channel setup. Once I was checking a build with a programmer in the studio; he was sitting next to the door, with the left surround speaker behind him. During the test a door in the game opened, and he jumped up all of a sudden, thinking someone had actually opened the real door beside him... Unheard supports 2.0, 5.1 and 7.1 playback; try it out if you have the opportunity.


All the mixing was completed in the NExT Sound Room


Voice Production
"You can talk too, so why do you wanna pay me to talk?" –Guo Degang

First of all, players and market acceptance aside, I would like to sincerely thank all the voice actors, dubbing directors and staff from both China and the United States who participated in the making of Unheard. Their professionalism gave this project its strongest foundation.

You must be professional enough yourself if you want to work and communicate well with other professionals! (That's right, you can't get by on enthusiasm alone, sorry...)

During the design, production and integration of the sounds in Unheard, we learned the following:

  1. Complete, clear and detailed character descriptions can help the team and the director to select actors more accurately before recording.
  2. The script must be carefully thought through before recording to prevent last-minute changes to line counts or plot structure during the session; such changes easily mess up the data at later stages and are also very unprofessional.
  3. Professional voice actors can usually voice several characters, but when casting you should do your best to avoid having two characters in the same conversation voiced by the same person.
  4. The recordings in Unheard are mainly plot-related, but before recording we asked the voice actors to provide short phrases and interjections for each character in different emotional states. These materials help a lot at the integration stage. (If you ask nicely, and the requests aren't too demanding, they won't charge you extra.)
  5. File naming should be standardized! File naming should be standardized! File naming should be standardized! The picture below shows how we named our audio files in Unheard. When we receive the post-processed data, we immediately know the basic info of each file, and once the audio data is organized, the game designer can easily find the required event by simple logic.

    Unheard audio file naming criteria

  6. Actively control the number of additional recording sessions. Recording is delicate work: the actor's condition on the day, and even slight changes in the distance between the actor and the microphone, can cause significant discrepancies in the recorded data that are very hard to correct. In Unheard, we spent most of our time making exactly these sorts of corrections. Since the industry status quo cannot be changed overnight, all we can do is keep the number of additional sessions to a minimum.
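The payoff of strict naming (point 5 above) is that file metadata becomes machine-readable. Unheard's actual naming criteria appear only in the image and are not reproduced here; the pattern below is a made-up stand-in purely to show the principle that one parser can recover every field from a conforming filename.

```python
import re

# Hypothetical convention: PROJECT_CHAPTER_CHARACTER_LINE_TAKE.wav
# (the real Unheard scheme differs; this is illustrative only).
PATTERN = re.compile(
    r"^(?P<project>[A-Z]+)_(?P<chapter>\d{2})_(?P<character>[A-Za-z]+)"
    r"_(?P<line>\d{3})_(?P<take>\d{2})\.wav$"
)

def parse_vo_filename(name):
    """Return the metadata encoded in a conforming VO filename."""
    m = PATTERN.match(name)
    if not m:
        raise ValueError(f"non-conforming filename: {name}")
    return m.groupdict()
```

With a convention like this, a designer can locate "chapter 01, line 007, take 02" with a simple search instead of auditioning files by ear.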


It's already late...

That's it for now. There are still plenty of unanswered questions, such as "If you spent all your budget on VO, what about the music?" and "What about using live streamers as voice actors?", but I must focus on another project or it will fall behind schedule!

Sleep No More and keep up the hard work!



