Spatial Audio

Tips for Capturing and Recording Spatial Audio

Illustration showing the outlines of two silhouetted heads with a pulsing purple orb in between. The heads are outlined in white against a black background, and are rendered inside of a white cube.

Illustrations by Yoshi Sodeoka

This post is part of our Guide to Creating Spatial Audio Podcasts series. Here, we share a range of approaches you can use to record audio for use in spatial mixes, and offer our reflections on what formats work best for different kinds of projects.

Introduction to Recording Audio for Spatial Projects

The way you record sounds for a narrative spatial audio project will have a big effect on how the piece comes together in post-production. While high-quality recordings are essential for effective storytelling, we’ve found that you don’t need to record audio in a special way to integrate it into a spatial audio project. However you currently record audio (with your iPhone, a boom or shotgun mic, or some other way), a similar approach will likely work for creating a spatial audio mix.

We’ve been able to use a wide variety of recordings in spatial audio mixes, ranging from audio that a producer captured with a dedicated mono or stereo microphone to tape that a reporter captured on the fly with an iPhone. That said, if you go into a project knowing it is destined for a spatial audio experience, you may want to consider using a dedicated spatial recording setup. These setups can be especially useful when a story takes place out in the field and establishing a dynamic soundscape is important for the listener’s experience.

By Jon Cohrs, Chris Wood, Willa Köerner

Diagram titled "Recording with Spatial in Mind," which uses three columns to show three different methods for recording audio. On the left, an ambisonic microphone picks up audio waves from multiple sound sources, including a truck, motorcycle and cars stuck in traffic. In the middle, a single microphone picks up a mono recording of a single bird chirping. On the right, two microphones pick up a stereo recording of a plane flying overhead.

When recording audio for use in a spatial podcast, consider your end goal. If you’re recording ambiance out in the field and want to retain the full soundscape, an ambisonic recording is a good choice (pictured at left). If you’re recording a single sound source, a mono recording may make the most sense (pictured at center). And if you’re recording ambiance but don’t need the full 360-degree sound field, a stereo recording could work best (pictured at right).

Audio Recording Approaches for Spatial Mixes

Mono recording techniques require just a single microphone. 

Mono is by far the most common and well-known recording technique, and mono recordings can serve as the backbone of a spatial audio mix. For the best quality, place the mic as close to your sound source as possible and position it to minimize background noise.

  • Mono works best when you’re recording a single sound source (such as a voice), or for the directional recording of environmental sounds out in the field.
  • To record mono audio, place a microphone on or near your subject in a way that enables the main sound source to come through clearly, while minimizing background noise.
  • Be sure to keep the mic pointed at the sound source to avoid off-axis recording, which can make the source sound quieter and duller.

Stereo recording techniques produce a left and right channel.

These channels are typically routed to left and right speakers, or to left and right earbuds. While many techniques can be used to capture two channels of sound, we’ve found that binaural and ORTF formats work well for spatial audio mixes.

ORTF is a stereo recording technique that uses two cardioid microphones.

ORTF picks up audio in a way that emulates how we hear: the microphones are placed 17cm apart (roughly the distance between human ears) and at an angle of 110°. It can work well within a spatial mix, as it balances immersive stereo separation with the clarity of directional recording.

  • ORTF works best when you want a listener to feel like they’re in the middle of a soundscape.
  • To record in ORTF, use two cardioid microphones with their capsules roughly 17cm apart, angled outward so that they point 110 degrees away from each other.
  • Keep in mind that the two microphones used should be as similar to each other as possible (or, ideally, identical).
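
To get a feel for why this geometry produces a sense of width, here is a rough sketch in Python of the two cues an ORTF pair gives a distant sound: a small arrival-time difference from the 17cm spacing, and a level difference from the 110-degree angle. It assumes ideal cardioid microphones, a far-away source in front of the pair, and a speed of sound of 343 m/s. The function names are hypothetical and chosen for illustration only, and real microphones will deviate from these numbers.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly room temperature
CAPSULE_SPACING_M = 0.17    # ORTF spacing between the two capsules
MIC_ANGLE_DEG = 110.0       # total angle between the two microphones


def cardioid_gain(off_axis_deg):
    """Sensitivity of an ideal cardioid: 1.0 on-axis, 0.0 directly behind."""
    return 0.5 + 0.5 * math.cos(math.radians(off_axis_deg))


def ortf_cues(source_deg):
    """Return (time_difference_ms, level_difference_db) for a distant source.

    source_deg is the direction of the sound measured from straight ahead,
    positive toward the right microphone. Intended for sources in front of
    the pair (-90 to +90 degrees). Positive results mean the right channel
    hears the sound earlier and louder.
    """
    half_angle = MIC_ANGLE_DEG / 2
    delay_s = CAPSULE_SPACING_M * math.sin(math.radians(source_deg)) / SPEED_OF_SOUND_M_S
    left = cardioid_gain(source_deg + half_angle)   # left mic points to -55 degrees
    right = cardioid_gain(source_deg - half_angle)  # right mic points to +55 degrees
    level_db = 20 * math.log10(right / left)
    return delay_s * 1000, level_db


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        delay_ms, level_db = ortf_cues(angle)
        print(f"{angle:>3} degrees: {delay_ms:.2f} ms, {level_db:.1f} dB")
```

Under these assumptions, a sound straight ahead reaches both channels identically, while a sound far off to one side arrives about half a millisecond earlier and noticeably louder in the nearer channel.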

Binaural recordings capture audio using two microphones placed within a dummy head.

The goal is to replicate the human hearing experience. This is done by using two matched omni-directional microphones placed in a dedicated dummy head (or the recordist’s own ears) to make the recording sound more lifelike.

  • Binaural works best when you don’t need to build a project that supports head-tracking, but want to add a bit of immersive spatiality to a stereo project.
  • To record binaural audio, position your binaural recording setup in a sound-rich space. The closer the microphones are to the sound source, the more directionality the recording will have.
  • Keep in mind that if you’re recording outside, you should use wind protection. Special earmuffs are available for this purpose, but use what you can if you’re in a windy environment (in a pinch we’ve used a beanie hat pulled down tight over the ears).
  • You’ll need an ear-mounted pair of omni-directional microphones or a dedicated dummy head with built-in microphones.

Ambisonic recording is a way to capture 360-degree audio from a single microphone.

Ambisonic recordings are defined by orders of resolution. “1st order” ambisonics uses a specialized microphone to capture four channels of audio, while “2nd order” ambisonics encodes nine channels, captured by a specialized microphone with more capsules (eight, in some designs). Higher-order ambisonics (HOA) can create a greater degree of definition and detail in a soundscape, sharpening how precisely sounds can be localized.

  • Ambisonic recording works best when you want broad coverage of an environment to create sonic ambiance or a full 360-degree recording.
  • To create an ambisonic recording, you’ll need a specialized 1st-order or 2nd-order ambisonic microphone and a multi-channel recorder with 4 to 8 inputs, depending on the microphone.
  • Keep in mind that ambisonic audio offers spatiality but will lack the clarity and focus often necessary for a voice or single audio source. 
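
When planning track widths for a mix, it can help to know that a full-sphere ambisonic signal of order N carries (N + 1)² channels once encoded. The short Python sketch below just applies that formula; the function name is hypothetical. Note that the number of capsules on a microphone, and so the number of recorder inputs it needs, can differ from the encoded channel count, so check your microphone’s documentation.

```python
def ambisonic_channel_count(order):
    """Number of channels in a full-sphere ambisonic signal of a given order."""
    if order < 0:
        raise ValueError("order must be zero or greater")
    return (order + 1) ** 2


if __name__ == "__main__":
    for order in (1, 2, 3):
        print(f"order {order}: {ambisonic_channel_count(order)} channels")
```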

While ambisonic recording can seem like the obvious choice for spatial audio because it captures a 360-degree sound field, it can be challenging to work with because it records an overwhelming amount of spatial information. That information is useful for ambiences and immersive field recordings, but a clean mono recording is often more versatile: it includes less ambient noise and reverb, which makes it easier to place in a spatial mix.
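
To illustrate what “placing” a mono recording can mean, here is a minimal Python sketch of panning mono samples into a first-order ambisonic signal. It assumes the AmbiX convention (channel order W, Y, Z, X with SN3D gains, and azimuth measured counterclockwise so that positive angles are to the listener’s left). The function name is hypothetical, and in practice your mixing software’s panner handles this step for you.

```python
import math


def encode_mono_to_foa(samples, azimuth_deg, elevation_deg=0.0):
    """Pan mono samples to first-order ambisonics (assumed AmbiX: ACN order, SN3D).

    azimuth_deg: 0 is straight ahead, positive is to the listener's left.
    elevation_deg: 0 is ear level, positive is above.
    Returns four lists of samples in the order W, Y, Z, X.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = (
        1.0,                          # W: omnidirectional component
        math.sin(az) * math.cos(el),  # Y: left-right component
        math.sin(el),                 # Z: up-down component
        math.cos(az) * math.cos(el),  # X: front-back component
    )
    return [[gain * sample for sample in samples] for gain in gains]


if __name__ == "__main__":
    # Place a short mono snippet directly to the listener's left.
    w, y, z, x = encode_mono_to_foa([0.0, 0.5, -0.5], azimuth_deg=90.0)
    print(w, y, z, x)
```

Because a clean mono source carries no room sound of its own, the direction you choose here is the only spatial information the listener hears, which is part of what makes such recordings easy to position.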

Next Up: Mixing Spatial Audio

Once you have a good sense of how to record audio for spatial mixes, we recommend moving on to the next guide in our series, Approaches to Mixing for Spatial Audio. There you’ll find a detailed rundown of the approaches we’ve found most successful for producing narrative spatial audio mixes.
