Hello [dot]cool glitchers,
This is the first in a series of three posts that I hope will address some common concerns and questions I see producers ask about mixing, mastering, and getting tracks to sound consistent across multiple playback mediums. The focus of this article is that last point, which I'll refer to from here on as "translatability".
Why is translatability important when it comes to music? Personally, I find translatability most useful when anticipating your music being played across multiple types of audio systems. Over-ear headphones, earbuds, and speakers all share a similar conceptual design; however, each has different acoustics associated with its playback. Translatability helps ensure that no one of these mediums drastically alters how your mixdown or master is perceived, for better or worse. If your mix sounds excellent on your home setup but sounds different over a car's speakers or in a SoundCloud link you send a friend, then some of the info below should be pretty helpful for you.
There are a few ways "translatability" is conveyed across multiple speaker setups. Over my 11 years of music production I've done a fairly rigorous amount of testing on getting a mix to sound the same (or similar) across many systems, ranging from professional hi-fi to enthusiast audio. The results of my testing have shown that the biggest concepts to keep in mind for translatability are: dynamics, frequency response, and the "sweet spots" of human hearing. In that order. Though they all tie into each other when it comes to audio translatability.
DYNAMICS:
Dynamics are huge when it comes to having your music translate across systems. I don't mean the overall high/low signal level of a track, but something more like RMS: an average of the dynamic range present within your music. If the average dynamic range is too low, the speaker setup you're testing on could have trouble reproducing stereo information, it could emphasize certain parts of a mix undesirably, and it could warp the sound of the mix overall by simply not having a very responsive frequency curve. For the sake of whoever reads this wall of text, I'm going to avoid diving too deep into speaker design. For a general producer's context: assume the worst.
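If you want to put a rough number on "average dynamic range", the peak-to-RMS ratio (the crest factor) is a common proxy. Here's a minimal sketch in Python; the 440 Hz test tone is just an arbitrary example signal:

```python
import math

def peak(samples):
    return max(abs(s) for s in samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def crest_factor_db(samples):
    # Peak-to-RMS ratio in dB: a rough stand-in for "average dynamic range".
    # Heavily limited masters measure low; dynamic material measures high.
    return 20 * math.log10(peak(samples) / rms(samples))

# A pure sine sits at ~3 dB crest factor (peak 1.0, RMS 1/sqrt(2)).
sine = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
print(round(crest_factor_db(sine), 2))  # -> 3.01
```

Comparing this number on your mix against a commercial reference that translates well can tell you quickly whether you're in the right ballpark.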
The solutions are multiband compression and other multiband audio processing, or alternatively compression on each individual element, which brings up the average dynamic range of each track and, as a result, the track as a whole. Even 1-3 dB of gain reduction from compression should noticeably improve how a track translates across speaker setups.
I do strongly recommend tweaking compression for each track element, as parameters like attack and release times are very important for maintaining what sounds like a very dynamic sound while also reducing the RMS to a point where it can be limited nicely. If you find yourself struggling to hear the compression, it could be due to a number of factors (such as the particular amp/headphone/speaker you are using), but an easy way to really hear what the compressor is doing is to add a temporary limiter with gentle (1-3 dB) gain reduction. You're most likely not going to want to keep this limiter on the track or element you are compressing, but it can really help emphasize exactly what the compressor is doing, which is very useful when tweaking attack/release times. In general, pushing a signal like this isn't great for mixing or mastering, but it can be great for estimating how much gain reduction you'd like to apply via compression. If you find that you need to crank the limiter a bit more in order to hear the compressor, you probably need a bit more compression. If you find that the signal sounds too squashed (basically flat, lacking depth, lows sound distorted or are distorting the input), then it may be a sign to dial back on the compression.
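For intuition about what those attack/release parameters are doing, here's a bare-bones feed-forward compressor sketch. It's a toy, not a substitute for your plugin of choice, and the threshold, ratio, and times are arbitrary assumptions:

```python
import math

def compress(samples, threshold_db=-18.0, ratio=4.0,
             attack_ms=10.0, release_ms=120.0, sr=44100):
    # Minimal feed-forward compressor: a peak envelope follower feeds a
    # gain computer. Attack/release times shape how fast the gain reacts,
    # which is exactly what you're tuning per element.
    atk = math.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = math.exp(-1.0 / (sr * release_ms / 1000.0))
    env = 0.0
    out = []
    for s in samples:
        level = abs(s)
        # Attack coefficient while the level rises, release while it falls.
        coef = atk if level > env else rel
        env = coef * env + (1.0 - coef) * level
        env_db = 20.0 * math.log10(max(env, 1e-9))
        over = env_db - threshold_db
        gain_db = -over * (1.0 - 1.0 / ratio) if over > 0.0 else 0.0
        out.append(s * 10.0 ** (gain_db / 20.0))
    return out

loud = [math.sin(2 * math.pi * 220 * n / 44100) for n in range(44100)]
quiet = [0.02 * s for s in loud]  # well under threshold: passes untouched
```

Note how a signal below the threshold comes out untouched, while a hot signal gets pulled down by an amount set by the ratio; a faster attack clamps transients harder, and a slower release is what preserves the "dynamic" feel the paragraph above describes.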
As a small aside: saturation can also be useful for this purpose, but note that the track will be perceived as 'louder' due to the harmonics generated; this will most likely not reduce dynamic range unless you are clipping. It's your perception of loudness that is changing, and the overall dynamics may even increase depending on the signal you put through it. If you are clipping, you'll also need to adjust the output so that the volume increase from clipping is accounted for, by reducing the gain after the clip. Otherwise, you won't actually be reducing the RMS; you'll just be boosting past the clipping point. Clipping more than 0.5-1 dB into the ceiling will also introduce harmonics that skew the perception of loudness, especially if the input signal has a lot of low-end energy eating up the headroom.
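The gain-compensation point can be sketched like this: drive a signal into a hard clipper, then pull the output back down by the same amount. The 6 dB drive is just an arbitrary example value:

```python
import math

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def clip_with_makeup(samples, drive_db=6.0):
    # Hard-clip a driven signal at 1.0, then pull the output back down by
    # the same amount. The compensation is the important part: without it
    # you're just boosting past the clip point, not reducing dynamics.
    drive = 10.0 ** (drive_db / 20.0)
    return [max(-1.0, min(1.0, s * drive)) / drive for s in samples]

sine = [math.sin(2 * math.pi * 110 * n / 44100) for n in range(44100)]
clipped = clip_with_makeup(sine, drive_db=6.0)
```

With the makeup applied, the peaks are genuinely flattened (here to roughly half scale), so the crest factor actually drops instead of the whole signal just getting louder.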
Multiband processing has been a particular favorite 'translatability solution' of mine because, unless you're dealing with hi-fi audio setups, the playback chain is most likely going to be, essentially, an analog multiband setup. There will be crossover points between the sub, the mid driver, and the tweeter, and these sit in more or less the same ranges across most systems: ~100 Hz and below for sub frequencies, with a ~2,000-4,000 Hz crossover point for the tweeters. By processing around those crossovers you are reducing the average dynamics while also treating the audio in anticipation of those crossover points, which can otherwise leave mixes feeling skewed when played across multiple systems.
By applying different compression across the sub, mid, and tweeter ranges of an entire mix/master, you accommodate any variance present in the speaker configuration while also ensuring that the speakers don't need to be driven as hard to reproduce the full signal. This is why turning up the speakers can help mitigate bad translatability: you're giving the speakers more power, helping ensure that the weaker signals in the mix hit the sweet spots of the speaker's impedance.
As a general example, when applying multiband compression to a drum bus: the kick and snare punch should both reside within the mid band, while the crossover point for the high band should hover around where the snare "snap" resides. When dialing in the high-band crossover, pay attention to where the snare or clap in your track starts to lose a bit of its energy. It will typically thin out somewhere around 2-4 kHz, and may even show up as a small bump on a spectrum analyzer. That's right around where you should put the crossover point; place it a bit above or below that spot to avoid any potential phasing or multiband crossover weirdness.
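As a toy illustration of the band split itself (not a substitute for a proper crossover design like Linkwitz-Riley), here's a sketch using one-pole filters and subtraction, with crossover points assumed at 100 Hz and 3 kHz per the ranges above:

```python
import math

def onepole_lowpass(samples, cutoff_hz, sr=44100):
    # One-pole lowpass: crude next to a real crossover filter,
    # but enough to illustrate the band split.
    a = math.exp(-2.0 * math.pi * cutoff_hz / sr)
    y, out = 0.0, []
    for s in samples:
        y = (1.0 - a) * s + a * y
        out.append(y)
    return out

def three_band_split(samples, low_x=100.0, high_x=3000.0, sr=44100):
    # Split into sub / mid / high around the assumed crossover points.
    # Deriving the mid and high bands by subtraction guarantees the
    # three bands sum back to the original signal.
    low = onepole_lowpass(samples, low_x, sr)
    low_and_mid = onepole_lowpass(samples, high_x, sr)
    mid = [lm - l for lm, l in zip(low_and_mid, low)]
    high = [s - lm for s, lm in zip(samples, low_and_mid)]
    return low, mid, high

# Example input: a 60 Hz "sub" plus a 5 kHz "snap" component.
sig = [math.sin(2 * math.pi * 60 * n / 44100)
       + 0.5 * math.sin(2 * math.pi * 5000 * n / 44100)
       for n in range(44100)]
low, mid, high = three_band_split(sig)
```

Once the signal is split like this, each band can get its own compressor settings before the bands are summed back together, which is all a multiband compressor is doing internally.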
FREQUENCY RESPONSE:
Each speaker setup will have a different frequency response curve. Full stop. By ensuring that your mixdown, on average, isn't completely flat and isn't dominated by a cluster of frequencies in the mid-range, you'll achieve better translatability overall. I find that a somewhat skewed mix that peaks at the sub/low end and tilts slightly downwards towards the high end usually translates really nicely. The highs and lows should be balanced relative to each other as well: too much low end makes the high end sound weak, and too much high end makes the low end feel buried.
As an example, here's a very good video on balance in mixing: https://www.youtube.com/watch?v=1DX_1c47s48 . Highly recommend checking out this entire channel (shoutout Kush Audio) because Gregory Scott is the MAN and Kush makes some absolutely incredible plugins for mixing.
The reason for this is that lows/mids take up a greater amount of headroom than higher frequencies, so if your mix sounds "quiet" despite a good amount of gain reduction on the master, the mid and low end is most likely eating up too much headroom and preventing the highs from translating nicely. If the opposite is true, and the mix is super harsh and loud sounding while feeling sort of empty, then the high end is most likely getting most of the gain reduction and the mix should be tweaked a bit.
The “white noise mixing” technique (highly recommend looking it up on a search engine of your choice) is really useful for determining this, as it can help simulate low dynamic range (by providing an artificial noise floor to monitor against) while also illuminating the loudest parts of the track (which will be the most consistent elements when testing against different systems).
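If you want to experiment with the technique, generating a calibrated noise floor to monitor against is simple. A minimal sketch; the -30 dB RMS default here is an arbitrary assumption, not a recommendation, so set the level to taste:

```python
import math
import random

def rms_db(samples):
    # RMS level of the signal in dBFS.
    return 20.0 * math.log10(
        math.sqrt(sum(s * s for s in samples) / len(samples)))

def noise_floor(level_db=-30.0, seconds=1.0, sr=44100, seed=1):
    # White-noise monitoring floor at roughly level_db RMS. Uniform noise
    # in [-1, 1] has an RMS of 1/sqrt(3), so scale up by sqrt(3) to hit
    # the requested level.
    rng = random.Random(seed)
    scale = 10.0 ** (level_db / 20.0) * math.sqrt(3.0)
    return [rng.uniform(-1.0, 1.0) * scale for _ in range(int(sr * seconds))]

floor = noise_floor(-30.0)
```

Loop this under your mix while monitoring: whatever pokes clearly above the noise is what will survive on the least flattering playback systems.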
In terms of monitoring tools, I recommend SPAN by Voxengo for visually testing this, as well as for comparing against your favorite mixdowns that translate well. Don't pay much attention to the peaks, but definitely note at what volume the lowest parts of the track sit and where they generally fall in the frequency range. Peaks just mean the signal is most likely hitting the limiter; valleys and their frequency positions indicate how much dynamic range is present in that section of the spectrum.
A nice video on SPAN and ways to tweak user settings to gain additional mixing insights: https://www.youtube.com/watch?v=iZrWMv02tlA
THE HUMAN EAR:
A final note on how humans hear: we hear best between 1 kHz and 5 kHz. By soloing this band of frequencies and listening back, you can get a fantastic indication of how well your mix will translate across many systems. I wouldn't recommend extended listening or mixing at this range, but it's a very good indicator for translatability. If an element sounds quiet in this frequency range during mixing, it's gonna sound fairly invisible on a new system with a master on it, mostly because of your familiarity with whatever tools you are using for playback during the track construction and mixing stage.
Your ear has already “adjusted” the sounds coming out of your speakers for these ranges of sensitivity, so the contrast alone between two methods of playback will skew the familiarity to the point where drastic changes may occur. I wouldn’t recommend trying to EQ an entire mix to solve these issues, though if you hear points of resonance I certainly recommend attenuating them and then listening to the mix again as a whole. Sometimes you can get away with a 1-3 dB shift, but you WILL be putting a massive hole in the mix right where your ear is most sensitive to change. I recommend attempting the mix again for more clarity in this range, or EQing individual track elements until the balance has been restored for the 1-5 kHz range.
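Soloing the 1-5 kHz band can be sketched with two crude one-pole lowpass filters: subtract everything below 1 kHz from everything below 5 kHz, and roughly the band in between remains. A real EQ's bandpass will be much steeper; this is just to show the idea:

```python
import math

def onepole_lowpass(samples, cutoff_hz, sr=44100):
    # Simple one-pole lowpass; gentle 6 dB/octave rolloff.
    a = math.exp(-2.0 * math.pi * cutoff_hz / sr)
    y, out = 0.0, []
    for s in samples:
        y = (1.0 - a) * s + a * y
        out.append(y)
    return out

def solo_presence(samples, lo=1000.0, hi=5000.0, sr=44100):
    # Crude 1k-5k "ear sensitivity" solo: the difference of two lowpass
    # outputs leaves roughly the band between the two cutoffs.
    below_hi = onepole_lowpass(samples, hi, sr)
    below_lo = onepole_lowpass(samples, lo, sr)
    return [a - b for a, b in zip(below_hi, below_lo)]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

bass = [math.sin(2 * math.pi * 100 * n / 44100) for n in range(44100)]
snap = [math.sin(2 * math.pi * 2000 * n / 44100) for n in range(44100)]
```

Running a bassline and a snare "snap" through this, the 100 Hz content mostly cancels out while the 2 kHz content passes through, which is the behavior you want when checking what survives in the ear's most sensitive range.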
One more note on ears: ear fatigue plays a large part in short-term comparisons between systems, but something to keep in mind even more than fatigue is familiarity. Your ears get used to the tools you use most, which can be very useful for dialing in a mixdown or listening to your mix critically, but can also hamper your ability to hear what's truly different about a mix's playback on other systems. Definitely worthwhile to keep in mind, so make sure to give your ears some time to rest between mixing and mastering sessions.
Thanks for Reading! If you have any questions, comments, or just want to learn more about audio, I highly recommend checking out our discord for glitch.cool. Our #audio-production channel has tons of great insights and goodies from myself and the glitch.cool community at large. Hope to see you there! :)