A Guide to Audio Processing and FX For Podcasting
When you listen to your favourite podcasts, what is it that draws you in the most? A gripping story? Chemistry between hosts? Compelling research? While these elements are undeniably important for a great podcast, there’s one thing that can make or break a show, something that is arguably more important than anything else – sound quality.
There are many factors that influence how a podcast will sound. We’ve covered a few of these in our previous article, How to Get the Best Audio Quality Out of Your Podcast, but here we want to focus specifically on audio processing and effects. While the basic elements of sound production (your recording space, microphones, preamps, mixers and interfaces etc) should never be overlooked – as these are what lay the foundation of any great recording – audio processing and effects are what make a podcast shine. All professional podcasters will use them to some degree. Below we’ll take a look at the most common and most relevant to podcast production.
In a nutshell, equalisation – or EQ – is the process of boosting or cutting the volume (dB) of certain frequencies in an audio signal in order to manipulate the overall tone of a recording. It is one of the most ubiquitous processes in sound production and is an important step in mixing all kinds of audio for all kinds of mediums.
When you record a sound, the audio signal is made up of layers of different frequencies. These sit on a spectrum that ranges from 20Hz to 20,000Hz (the audible range for humans). The lower frequencies, from 20-250Hz, are bass frequencies; the midrange – or ‘mids’ – comprises frequencies from 250Hz to 4kHz; the upper frequencies range from 4kHz to 20kHz.
Adjusting EQ will have a different impact on the tonality of the audio depending on what frequencies you boost or cut, and there are infinite ways you can sculpt a signal. Don’t let this deter you though – there are simple, steadfast rules and types of equalisers you can apply to achieve a certain tonal quality with the click of a button or turn of a knob.
Different Kinds of EQ
Equalisers come in many different formats – from software in your Digital Audio Workstation (DAW), to knobs on your mixer, to switches on your microphone (you’ll also find them on guitars, guitar amps, sound systems, studio hardware like preamps and compressors, and many more music-making tools and audio devices). The most commonly used equalisers in podcasting are those found in DAWs, such as ProTools, Logic or Audacity, and on mixers or consoles.
Most basic mixers will have three EQ knobs for each channel – one each for lows, mids, and highs. Some may have just one or two, while more advanced consoles, like those found in professional studios, may have more. When at noon, these knobs are at unity, meaning they are neither boosting nor cutting the frequency. Turning them counterclockwise will cut the frequency while turning clockwise boosts it. In most cases for podcasting, these three knobs are all you'll need to sculpt your voice the way you’d like.
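To make ‘boosting’ and ‘cutting’ concrete: a knob position in dB maps to a simple linear amplitude multiplier. Here is a minimal Python sketch (the function name is our own, purely for illustration):

```python
def db_to_gain(db):
    """Convert a boost/cut in dB to a linear amplitude multiplier.

    0 dB (the knob at noon, i.e. unity) returns 1.0 – the signal
    passes through unchanged.
    """
    return 10 ** (db / 20)
```

A boost of +6dB roughly doubles the amplitude, a cut of -6dB roughly halves it, and 0dB leaves the signal untouched.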
If you are doing your EQ’ing in a DAW, you will likely have many more options to work with. All DAWs come pre-loaded with some kind of EQ software, and there are tonnes of incredible aftermarket EQs out there if you want to fork out some cash. These will allow you to adjust what frequency you need to boost or cut with pinpoint accuracy, add filters to curtail unwanted frequencies at certain cut-off points (low shelf, high shelf, low pass and high pass being the most common), and use presets designed specifically for different vocal types and applications.
Logic Pro X comes standard with a great Channel EQ plug-in. Here is a preset for cleaning up a spoken voice recording. Notice the cuts around the 20-250Hz range and the slight boosts around the 1-20kHz range.
Vocals on The Frequency Spectrum
The fundamental frequencies of speech primarily occupy the 85-250Hz range of the frequency spectrum, meaning the core of the voice is quite bassy – so this is generally where you will need to make your EQ adjustments when podcasting.
A high pass filter is an effective tool for shaping vocals for podcasting. This will only allow frequencies above a certain cut-off point to pass through the filter, attenuating lower frequencies. The result is a crisper, less ‘woofy’ vocal recording.
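To illustrate what a high-pass filter does under the hood, here is a minimal sketch of a first-order (one-pole) high-pass filter in Python. Real EQ plug-ins use steeper, more sophisticated designs – this is purely illustrative:

```python
import math

def high_pass(samples, cutoff_hz, sample_rate):
    """First-order (6 dB/octave) high-pass filter.

    Frequencies below cutoff_hz are progressively attenuated,
    which helps remove low-end rumble from a vocal recording.
    """
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # y[i] = alpha * (y[i-1] + x[i] - x[i-1])
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

Run a 50Hz hum through this with a 200Hz cut-off and its level drops dramatically, while speech content well above the cut-off passes through almost untouched.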
Vowels, which are relatively pronounced sounds in human speech, typically occupy the 350Hz-2kHz range, while consonants occupy the 1.5kHz-4kHz range. You may want to make adjustments around these frequencies if you are noticing anything undesirable in your recordings. Sibilance – the grating ‘s’ and ‘t’ sounds that stick out in a recording like a sore thumb – typically lives in the 4-7kHz range, so you will likely want to adjust here too. But be careful: this top end is where your vocals get their sheen from, so it’s a delicate area you don’t want to cut into too heavily. Thankfully there are tools that can help tame sibilance, which brings us to…
De-essing is the process of curtailing sibilance in a recording. You can do this manually – many audio engineers swear by this process, claiming it sounds more natural – however, it is time-consuming, and for the everyday podcaster it’s probably just not viable, especially if you are pumping out an episode every day or so. That’s where de-essers come in.
De-essers are processing tools designed specifically to eliminate the harsh frequencies that create sibilance. They usually come in the form of plug-ins, and many DAWs come stock with a de-esser, ready to be used. They are also found in hardware formats, in channel strips or, as in the case of the RØDECaster Pro, as an onboard feature on a console.
A de-esser is essentially a compressor that reduces the volume of a certain frequency band every time it is detected. Depending on what kind you are using, you may be able to control parameters such as the cut-off frequency (which controls what frequency triggers the de-esser), how much the frequency is cut (usually labelled ‘amount’ or 'depth'), the level the band must reach before the de-esser is triggered (usually called ‘threshold'), the speed at which the de-esser is triggered ('sensitivity'), and more.
The DigitalFishPhones SpitFish is a simple free de-esser. You can tune the frequency cut-off point anywhere between 4-12kHz.
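The threshold/depth behaviour described above can be sketched as a simple gain computer: given the level of the sibilant band, it returns how much to turn that band down. This is an illustrative Python sketch (the function name and defaults are our own), not how any particular plug-in is implemented:

```python
def deess_gain_db(band_level_db, threshold_db=-30.0, depth_db=6.0):
    """Gain reduction (in dB) to apply when the sibilant band
    (e.g. 4-7 kHz) exceeds the threshold.

    Reduction grows with the overshoot but is capped at depth_db,
    mirroring the 'threshold' and 'amount'/'depth' controls.
    """
    overshoot = band_level_db - threshold_db
    if overshoot <= 0:
        return 0.0          # band below threshold: leave the signal alone
    return -min(overshoot, depth_db)
```

A band sitting at -40dB passes untouched, a band 2dB over the threshold is pulled down by 2dB, and anything far over the threshold is only ever reduced by the full depth – so the top end keeps its sheen.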
Using a De-esser
Usually de-essers are a ‘set and leave’ kind of tool. If you are used to recording your own voice, you will likely know where sibilance occurs, meaning you can set your cut-off point where it needs to be and forget about it. If your podcast regularly features guests, you may want to tackle sibilance in post-production. This is a process of hunting down the harshest sibilant peak in the recording and setting your frequency cut-off there. It’s not a majorly complicated process and can make a world of difference to the overall quality of your podcast.
Compression is another one of those ubiquitous audio processes used in all facets of audio production. It’s a broad topic that could very well use its own article (or book), so below we’re going to be talking specifically about using compression for recording speech.
What Is It?
In essence, compressors reduce the dynamic range of an audio signal by attenuating its loudest parts; with makeup gain applied afterwards, the quieter parts end up relatively louder. They are generally used to make a recording sound clearer, louder and more natural without adding distortion, and for this reason are a go-to tool for audio engineers in all fields. That being said, they can be overused or used incorrectly, and can quickly ruin a recording.
When it comes to tracking speech for a podcast, adding compression can make a recording sound punchy, polished and professional. However, heavy compression can make a voice sound lifeless and unnatural. It’s generally a subtle effect but can be overdone quite easily, so having a grasp of the basic controls and what they do is important.
The Logic Pro X Compressor plug-in has controls for threshold, ratio, gain, attack, release and knee – all you'd need to sculpt your vocal sound, plus more.
Common Compression Controls
Compressors will usually share the same set of adjustable parameters, though depending on how advanced your compressor is, some may be missing (either fixed internally or not incorporated at all), or there may be extra controls to play with.
These are the common parameters you’ll find on a compressor:
Threshold – The volume (dB) at which the compressor is triggered. A higher threshold results in a subtler effect – set it too high and the compressor may never trigger at all; set it too low and the signal will sound over-compressed and ‘squashed’.
Ratio – How much the signal above the threshold is reduced relative to the input. This is expressed as [input change]:[output change] – for example, at a ratio of 2:1, a signal that exceeds the threshold by 2dB will be reduced so that it only exceeds it by 1dB. When recording vocals for a podcast, between 2:1 and 4:1 is generally where you’ll want to be working.
Input Level – The level of the signal that is fed into the compressor.
Output Gain – As compression is essentially a gain-reducing process, the output signal may need to be boosted to get it back to where it was when it was fed into the compressor. This is sometimes called ‘makeup gain’. Typically, you’ll want to make sure the output level matches the input level (most compressors will have level meters side-by-side so you can compare them easily).
Attack – How quickly the compressor starts reducing gain after the signal crosses the threshold. When recording vocals, you’ll want to set your attack quite fast to avoid sounding unnatural.
Release – How quickly the compressor stops reducing gain after the signal falls back below the threshold. If this is too fast or too slow, it can make a vocal recording sound strange, so find a good middle ground.
Knee – How smooth the transition is between the compressed and un-compressed signal. This won’t have a huge impact on your vocal recording, but the higher the knee is set, the smoother the transition is, resulting in a subtler, softer effect, and vice versa. A soft knee works well for vocals.
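The threshold, ratio and knee controls combine into what engineers call the compressor's static gain curve. Below is an illustrative Python sketch of that curve – attack and release smoothing are omitted, and the parameter defaults are arbitrary, not a recommendation:

```python
def compressor_gain_db(level_db, threshold_db=-20.0, ratio=3.0,
                       knee_db=6.0, makeup_db=0.0):
    """Static gain curve of a compressor (no attack/release smoothing).

    Well below the threshold the signal passes unchanged; well above
    it, each dB over the threshold is reduced to 1/ratio dB; inside
    the knee region the two regimes blend smoothly.
    """
    overshoot = level_db - threshold_db
    if overshoot <= -knee_db / 2:
        reduction = 0.0                             # below the knee: no compression
    elif overshoot >= knee_db / 2:
        reduction = overshoot * (1 - 1 / ratio)     # fully above the knee
    else:
        # inside the knee: quadratic interpolation for a smooth transition
        x = overshoot + knee_db / 2
        reduction = (1 - 1 / ratio) * x * x / (2 * knee_db)
    return makeup_db - reduction
```

With these defaults, a -40dB signal passes at unity gain, while a 0dB peak (20dB over the threshold at 3:1) is pulled down by about 13.3dB – exactly the ‘loud parts reduced, quiet parts untouched’ behaviour described above.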
In an ideal world, every podcaster would have a comfortable, professional space to work in, with studio-grade soundproofing, the best gear money can buy, and tea and coffee on demand. But as you very well know, this is often not the case.
Compared to radio, podcasting is in many ways an amateur’s game – anyone can do it, even with the most basic gear. And this is the magic of it. But this also means many of you are working in far-from-ideal spaces, with everything from traffic noise, to reverb from your floorboards, to your housemates talking on the phone threatening to ruin your recording. How do you manage all this unwanted background noise?
There are a few fundamental things you can do to help, and we covered these in our How To Get The Best Audio Quality Out of Your Podcast article. But what about in pre- and post-production? Using a noise gate is a simple, effective solution.
The Logic Pro X Noise Gate plug-in has controls for threshold, attack, hold, release and level reduction, plus sidechain parameters, which are useful for ducking effects (more on that soon).
What’s a Noise Gate and How Do You Use It?
A noise gate is a dynamic processor that controls the content of an audio signal based on the volume of what is being recorded. A relative of compressors and de-essers, noise gates are often used to remove unwanted noise in a recording by setting a gain threshold at which anything underneath it is removed or reduced in volume.
Think of the threshold quite literally as a gate: if you set it at -40dB, anything louder will be allowed through, while for anything quieter the gate shuts. See how this might be useful for reducing background noise?
The controls on a noise gate are similar to those of a compressor:
Threshold – Sets the level (dB) above which the gate opens (and below which it closes).
Attack – How quickly (ms) the gate opens once the signal crosses the threshold.
Hold – How long (ms) the gate stays open after the signal falls back below the threshold.
Release – How quickly (ms) the gate closes once the hold time has elapsed.
Range (also called floor or level reduction) – Controls how much of a signal is let through the gate once closed. If this is set to zero, no sound will be let through; as you increase it, more and more is let through; fully dialled, the gate is essentially always open. This control is useful if you don’t want to completely eliminate background noise, but just reduce it, making your recording sound a bit more natural.
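Putting the threshold, hold and range controls together, a noise gate can be sketched at the block level like this. It's an illustrative Python sketch only – attack and release ramps are omitted for clarity, and the names and defaults are our own:

```python
def noise_gate(levels_db, threshold_db=-40.0, hold_blocks=3, range_db=60.0):
    """Per-block noise gate: returns the gain (dB) to apply to each block.

    The gate opens when a block exceeds the threshold, stays open for
    hold_blocks after the signal drops below it, then closes,
    attenuating by range_db (a small range just reduces background
    noise rather than silencing it).
    """
    gains = []
    hold = 0
    for level in levels_db:
        if level >= threshold_db:
            hold = hold_blocks          # (re)start the hold timer
            gains.append(0.0)           # gate open: unity gain
        elif hold > 0:
            hold -= 1                   # hold keeps the gate open briefly
            gains.append(0.0)
        else:
            gains.append(-range_db)     # gate closed: attenuate
    return gains
```

Feeding in a quiet room with one burst of speech, the gate passes the speech (and a short hold tail) at unity and pulls everything else down by the range amount.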
Ducking – also known as sidechain compression – is an effect that’s used in all kinds of audio production, from EDM to radio broadcasting.
In a nutshell, ducking reduces the level of an audio signal in the presence of another audio signal. In podcasting, it’s an extremely useful tool for ensuring the host’s voice always sits above the guests’, and is particularly handy for reining in rowdy or overbearing interviewees.
Setting up sidechain compression is a relatively involved process: you place a noise gate (or compressor) on the channel(s) you want to ‘duck’ and feed its sidechain input from the track that should take priority (i.e. the host’s). Not all noise gates have this functionality.
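Conceptually, though, ducking is simple: watch the host's level, and attenuate the other channel whenever the host is speaking. An illustrative Python sketch working on per-block levels (names and defaults are our own):

```python
def duck_gains_db(host_levels_db, threshold_db=-30.0, duck_db=10.0):
    """Gain (dB) to apply to the guest/music channel for each block.

    Whenever the host's level exceeds the threshold, the other
    channel is 'ducked' by duck_db so the host stays on top.
    """
    return [-duck_db if level >= threshold_db else 0.0
            for level in host_levels_db]
```

While the host is quiet the guest channel passes at unity; the moment the host speaks, the guest is pulled down by the duck amount.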
The RØDECaster Pro, on the other hand, has a ducking tool pre-loaded onto its powerful on-board effects processor, ready to go at the touch of a button. Handy!
The RØDECaster Pro: An All-In-One Podcasting Solution
In fact, all of the audio processing and effects mentioned above are packed into the RØDECaster Pro, the only console of its kind to offer such a wide range of podcast-ready features.
Compression, de-essing, noise gating and ducking are all accessible for each channel (ducking only for channel 1). The on-board APHEX audio processors (found in top broadcast studios around the world) offer further clarity and booming vocal presence. There are also presets crafted specifically for certain voice types (deep, medium and high; soft, medium and loud), giving your voice the broadcast quality you’re looking for. It is truly professional podcasting made easy. Find out more here.