What is binaural 3D audio?

The form of our ears plays a big role in making us hear in 3D. This article covers the basics of natural spatial hearing, how to reproduce it on headphones, and how to create binaural mixes from scratch. 

Let's start with something to listen to. Both tracks below were exported from exactly the same mixing session. Apart from normalizing the tracks and trimming their beginning and end, no postproduction or mastering was done after the export. The only difference is that the left video was exported with standard stereo panning and the right video with binaural 3D sound. Both mixes were made using the 3D sound production software MNTN - The Sound of the Mountain.

The ears

With the help of our ears, we can hear sounds coming from different directions. You can listen to the rustling leaves in the trees above, your footsteps on the asphalt, the voice of a friend walking beside you, and the motorway in the distance, which you're approaching. It's the way we naturally hear sounds existing in all three dimensions (3D), with two ears (bin-aural), and with no technology involved.

The loudspeakers

So what's different when you listen to sound coming from any number of loudspeakers, such as your mono kitchen radio, home stereo, cinema surround, or a 3D multichannel sound installation? Nothing! Take the mono kitchen radio, for example. You can turn your head in any direction and still hear where the radio is placed in the kitchen. We always hear natural, binaural 3D audio. But the range of directions that sounds can come from when played through a loudspeaker is limited. The same goes for the home stereo: you can determine the loudspeakers' position in the room with your binaural 3D hearing, but the music played back is only spread between the two loudspeakers, a relatively small segment of the entire listening space.

The headphones

Everything changes when you put on your headphones: your natural ability for spatial hearing becomes seriously weakened. You can still have a feeling of sounds coming from the left and those coming from the right, and of sounds that are closer or more distant. But you lose the ability to distinguish between front and back, up and down. And you get the impression that all sounds are kind of strung on a string between your ears. Audio geeks call this in-head localization. The reason for the loss of spatial hearing when using headphones is that they neutralize the acoustic influence that the shape of your body, your head, and your outer ears have on the sound you’re hearing.

Another flattening effect is that headphones ignore the room acoustics. Depending on the physical characteristics of the room, any sound, including those played back over loudspeakers, creates reverb. And you always hear the direct sound waves mixed together with the reverb of the listening room. Since the sound from your headphones only passes through the ear canal, the acoustic "footprint" of the listening room doesn't affect your hearing.

Finally, the music is spatially "locked" to your head and not to the external world: left always stays left in your perception, regardless of the direction in which you turn your head. In contrast, with loudspeaker playback, the spatial sound image is locked to the external environment, where the loudspeakers are located.

Luckily, smart people have found a way to make binaural 3D hearing possible even with headphones. The short story is: they figured out how to simulate the acoustic influence of your body.

Our body makes the 3D effect


And here comes the long story. In addition to the sound waves reaching our ears directly, we also hear the influence of our body. Before the sound waves hit our eardrums, the shape of our outer ears, our head, and our torso causes acoustic reflections and acoustic shadows. What we then hear is a mix of direct sound waves, reflections, and acoustic shadows, which is what makes natural binaural 3D hearing possible. Now it's clear why binaural 3D hearing cannot work when we put headphones on: we remove most of the spatial information "encoded" by our body from the mix that reaches our eardrums. What remains is only "what's on the tape," that is, the direct sound waves.

The invention of the dummy head

Some decades ago, sound engineers discovered a remarkable effect: when you place miniature microphones in your ears and make a sound recording (let's say of an orchestra) and then listen to this recording on headphones, it sounds much as if you were in the concert hall! This technique creates a rich, three-dimensional listening experience like the one you would have when listening to the original sound. But instead of your very personal ears, you can also use a mechanical head with "microphone-ears". On September 3, 1973, RIAS Berlin broadcast the first radio drama recorded with the so-called dummy head recording technique: Demolition by Alfred Bester. One year later, Virgin Records published the album Aqua by Edgar Froese, which contains the two binaurally recorded tracks NGC 891 and Upland.

Another album is Flow Motion, from the German experimental rock band Can, published in 1976. In 2015, the Singaporean songwriter JJ Lin produced the pop album From M.E. to Myself, in which many tracks sound like little scenic stories (listen to track 5!).

However, as amazing as the dummy head recording technique is, its breakthrough never really happened. Some say it was because listeners were required to wear headphones. Others say it was because of conservative audio engineers who didn't want to change their technical workflows.

Things have changed a little bit in the meantime: many people now listen to music on headphones. And although audio production has become much easier and more accessible for everybody, the biggest disadvantage of the dummy head remains: once you've recorded a session, things are set in stone, and there is no way to change the direction or the distance of any instrument. Furthermore, you must rely on decent acoustics in the recording room. These are enormous constraints in audio production for most people, and music producers want to be able to go back to their mixes. Most importantly, the dummy head technique doesn't allow for any artistic experimentation: it only documents reality. In fact, this was a core argument against the binaural recording technique.

From 3D sound recording-only to 3D sound design

Digitalization made audio signal processing much easier, and sound engineers figured out how to retrospectively make anything sound like an original binaural 3D recording on headphones, or create a 3D sound composition from scratch. The magic key is a mathematical operation called "convolution." Without wanting to get too much into the techie stuff, let's say it's a way of adding a particular acoustic characteristic to any sound in your mix. It's like saying: "let the second voice sound like half left and a little bit elevated." Or "I want to hear the guitar from the right behind me, and the organ from just above my head."
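To make "convolution" a little less abstract, here is a minimal sketch in plain Python. The function name `convolve` and the tiny `echo_filter` are made up for illustration and are not taken from any particular plugin. The filter stands for an "acoustic characteristic": the direct sound plus one quieter copy that arrives three samples later.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: each input sample triggers a scaled,
    time-shifted copy of the filter, and all copies are summed."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(impulse_response):
            out[n + k] += sample * tap
    return out

# Hypothetical filter: direct sound, then a half-as-loud copy 3 samples later.
echo_filter = [1.0, 0.0, 0.0, 0.5]

dry = [1.0, 2.0, 3.0]             # the original ("dry") sound
wet = convolve(dry, echo_filter)  # the same sound with the characteristic added

print(wet)  # [1.0, 2.0, 3.0, 0.5, 1.0, 1.5]
```

The output is the original sound followed by its quieter, delayed copy. Real binaural filters work the same way, only with many more taps, and real tools use much faster FFT-based convolution than this double loop.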

The technical procedure for creating the binaural 3D hearing effect on headphones goes like this: place a dummy head with "microphone-ears" in a quiet room without acoustic reflections, a so-called anechoic chamber, and put a loudspeaker right in front of it. Then play an acoustic measurement signal (a sine sweep) through that loudspeaker, and record the sound with the dummy head.

If you then digitally process the recorded sine sweep (a deconvolution, typically computed with the Fourier transform), you get something that sounds just like a short crack. This is the so-called impulse response: it contains all the acoustic reflections and shadows caused by the dummy head (which stands in for your body). This crack is essentially a filter and can be used in a convolution plugin to make your guitar sound as if you were sitting in the anechoic chamber in place of the dummy head, listening to the guitar at the position where the loudspeaker was placed.
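The measurement can be simulated in a few lines. The sketch below is a toy model, not the procedure of any specific lab or product: the dummy head is replaced by a made-up six-tap filter (`true_response`), the sweep is a simple linear one (real measurements often use exponential sweeps), and all function names are hypothetical. It "records" the sweep through the filter and then recovers the "crack" (an impulse response, in audio terms) by dividing the two spectra.

```python
import cmath
import math

def fft(values, inverse=False):
    """Recursive radix-2 FFT. The length must be a power of two.
    The inverse transform is left unscaled (divide by the length yourself)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def convolve(signal, impulse_response):
    """Plain time-domain convolution (used here to simulate the recording)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(impulse_response):
            out[n + k] += sample * tap
    return out

def sine_sweep(length):
    """Linear sine sweep from 0 Hz up to the Nyquist frequency."""
    return [math.sin(math.pi * n * n / (2 * length)) for n in range(length)]

def deconvolve(recording, sweep, size, eps=1e-9):
    """Estimate the impulse response by regularized spectral division.
    `size` is the zero-padded FFT length; `eps` guards against division by ~0."""
    rec_spec = fft(recording + [0.0] * (size - len(recording)))
    swp_spec = fft(sweep + [0.0] * (size - len(sweep)))
    ratio = [r * s.conjugate() / (abs(s) ** 2 + eps)
             for r, s in zip(rec_spec, swp_spec)]
    return [v.real / size for v in fft(ratio, inverse=True)]

# Made-up stand-in for the dummy head: direct sound plus two "reflections".
true_response = [1.0, 0.0, 0.0, 0.5, 0.0, -0.25]

sweep = sine_sweep(200)                     # the measurement signal
recording = convolve(sweep, true_response)  # what the "microphone-ears" pick up
estimate = deconvolve(recording, sweep, size=256)

# The first six values should closely match true_response; the rest are near zero.
print([round(v, 3) for v in estimate[:8]])
```

Once the response is recovered, it can be applied to any other sound with the same `convolve` function, which is what a convolution plugin does with the measured filter.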

[SIDE TRACK: Some of you might have heard about the convolution reverb, which technically works pretty much the same way. You record a single hand clap (or a shot from a fake pistol) in a church, which you can then use in a convolution reverb to simulate the room acoustics of the church.]

You might be thinking: Great, I can now put any sound I want in this one position in front of me. But what about all the other directions? Well, you’d have to measure all of those directions too. If you think that it would take you a very long time to do hundreds of measurements, you're right. Nonetheless, some people did it because if you do it once, you can use it forever.
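Here is a rough sketch of how such a set of measurements could be used. Everything in it is invented for illustration: the three-entry `measurements` table, the two-tap responses, and the convention that 90 degrees means "to the right". A real set contains hundreds of directions and much longer "cracks" (impulse responses), one per ear. The idea is simply to look up the nearest measured direction and filter the sound once for each ear.

```python
def convolve(signal, impulse_response):
    """Plain time-domain convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(impulse_response):
            out[n + k] += sample * tap
    return out

# Hypothetical measurement table: azimuth in degrees -> (left-ear, right-ear) response.
measurements = {
    0: ([1.0, 0.0], [1.0, 0.0]),    # straight ahead: both ears hear the same
    90: ([0.0, 0.5], [1.0, 0.0]),   # from the right: left ear later and quieter
    270: ([1.0, 0.0], [0.0, 0.5]),  # from the left: the mirror image
}

def render_binaural(mono, azimuth):
    """Pick the nearest measured direction and filter the sound once per ear."""
    def angular_distance(measured):
        diff = abs(measured - azimuth) % 360
        return min(diff, 360 - diff)  # shortest way around the circle

    nearest = min(measurements, key=angular_distance)
    left_response, right_response = measurements[nearest]
    return convolve(mono, left_response), convolve(mono, right_response)

left, right = render_binaural([1.0, 2.0], azimuth=80)
print(left)   # [0.0, 0.5, 1.0]
print(right)  # [1.0, 2.0, 0.0]
```

For a sound at 80 degrees, the nearest measurement is the one at 90 degrees, so the left channel comes out delayed and quieter than the right one.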

The binaural 3D audio mixing software

Now, here’s the twist: with hundreds of measurements, you'd need to prepare hundreds of tracks with hundreds of convolution plugins. Our 3D sound mixing software MNTN provides a user interface to move sounds around the 3D space intuitively. It does all the math, including some additional magic such as interpolation. So the only thing you need to do is to place your sounds, instruments, or samples in any position that you want to hear them from. It works in 360°, and above your head.
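To give an idea of what interpolation means here, the following sketch blends two neighboring measurements to approximate a direction that was never measured. This is a deliberately naive, sample-by-sample linear blend with made-up numbers. It is an assumption for illustration only and not a description of how MNTN works internally; production tools typically use more refined methods, since a simple blend can smear the timing differences between the ears.

```python
def interpolate_response(angle, angle_a, response_a, angle_b, response_b):
    """Linearly blend two measured responses for an angle between them."""
    if not angle_a <= angle <= angle_b:
        raise ValueError("angle must lie between the two measured angles")
    weight_b = (angle - angle_a) / (angle_b - angle_a)
    weight_a = 1.0 - weight_b
    return [weight_a * a + weight_b * b
            for a, b in zip(response_a, response_b)]

# Made-up left-ear responses "measured" at 0 and 10 degrees.
measured_0 = [1.0, 0.5, 0.0]
measured_10 = [0.5, 1.0, 0.5]

# A sound placed at 2.5 degrees gets 75% of the first and 25% of the second.
blended = interpolate_response(2.5, 0.0, measured_0, 10.0, measured_10)
print(blended)  # [0.875, 0.625, 0.125]
```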

If you've never tried MNTN before, we encourage you to download the latest version from http://www.mntn.rocks/downloads and get a 30-day free trial. And for all of you who are already using MNTN: we took the binaural 3D audio algorithm a step further and just published an update with a significant improvement in sound quality.

What's next?

Beyond the techniques described above, a few more things can drastically improve the 3D effect when producing for headphones, and you might already have heard about them: head tracking, individualized HRTFs (head-related transfer functions), and the binaural simulation of room acoustics. Each of these subjects is worth an article of its own, and we will cover them in upcoming posts. Stay tuned!