Audio Files

John Gruber, quoting Macotakara:

…[Apple] has been developing Hi-Res Audio streaming up to 96kHz/24bit in 2016.

The Lightning terminal with iOS 9 is compatible up to 192kHz/24Bit, but we do not have information on the sampling frequency of Apple Music download music. […]

Yet another indication that the analog headphone jack might be a goner.

In my commentary on this rumour, I pointed out the lack of perceptible differences between lossy and lossless audio, but I didn’t address the Lightning connector or this rumour’s intersection with the also-rumoured removal of the headphone jack.

These rumours are easily conflated for lots of reasons, but I think the main one is how confusing the world of digital audio is. There are myriad combinations of file types, compression formats, sampling rates, and bit depths, and that’s without exploring the various factors in between the sounds leaving the device and reaching our ears. Clarifying this rumour, however, needs some explanation.

Generally speaking, most music you have on your computer is likely to be in 44.1 kHz, 16-bit files, regardless of whether they’re lossy or lossless. The frequency rating, in kHz, is the sample rate. It sets the upper limit on the pitch the file can reproduce, which is half the sample rate: a file with a 44.1 kHz sample rate can store frequencies up to 22.05 kHz. This is just past the upper limit of human hearing, which ranges between about 20 Hz and 20 kHz.
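As a quick illustration of that halving rule, here is a minimal sketch in Python. The function name `nyquist_khz` is my own label for this example, and the list of sample rates is just the set mentioned in this post.

```python
def nyquist_khz(sample_rate_khz: float) -> float:
    """Return the highest frequency (in kHz) a given sample rate can store."""
    return sample_rate_khz / 2


# Sample rates discussed in this post, in kHz.
for rate in (44.1, 96.0, 192.0):
    print(f"{rate} kHz sample rate -> frequencies up to {nyquist_khz(rate):.2f} kHz")
```

Every one of those ceilings sits above the roughly 20 kHz upper limit of human hearing.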

With age and noise exposure, the human ear’s sensitivity to high frequencies begins to deteriorate. You can test this by using Audacity and generating sine waves beginning at 20 kHz and reducing by 500 Hz or so until you can hear the tone. I’m 25; the upper bound of my hearing is about 19 kHz, which is pretty much normal. An older person who has spent a lot of time surrounded by loud noises (a musician whose career began in the ’60s, for example) will have a much lower sensitivity to higher pitches.
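If you would rather script the test than click through Audacity, something like the following should work. It is a rough sketch using only Python’s standard library; the function names, the 0.3 amplitude, and the 10 kHz stopping point are my own choices for illustration, not anything Audacity does.

```python
import math
import struct
import wave

SAMPLE_RATE = 44_100  # Hz; high enough to store tones up to 22.05 kHz


def sine_samples(freq_hz, seconds=1.0, amplitude=0.3, sample_rate=SAMPLE_RATE):
    """Return a sine tone as a list of signed 16-bit integer samples."""
    count = int(seconds * sample_rate)
    peak = int(amplitude * 32767)  # keep the tone quiet to protect your ears
    return [
        int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(count)
    ]


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def sweep_frequencies(start_hz=20_000, stop_hz=10_000, step_hz=500):
    """Return test frequencies from start_hz down to stop_hz, inclusive."""
    return list(range(start_hz, stop_hz - 1, -step_hz))
```

Calling `write_wav(f"tone_{f}.wav", sine_samples(f))` for each `f` in `sweep_frequencies()` produces one file per step; play them from the highest downward, at a modest volume, and note the first one you can hear.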

Then there’s bit depth; typically, 16 or 24 bits. This determines the dynamic range of the recording: in simple terms, the difference between silence and the loudest non-distorting sound. When the volume of a recording with a reduced bit depth (say, 8 to 12 bits) is increased, the “noise floor” becomes more noticeable. The pervasive hissing sound you heard when playing cassettes? That’s the noise floor creeping in on an analog format, much as it does in a low-bit-depth digital recording.
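To make the noise floor a little more concrete, here is a small Python sketch that rounds a sine wave to a given bit depth and measures how far the resulting error sits below the signal. The helper names and the 997 Hz test tone are my own choices, and real converters add dither and other processing that this ignores.

```python
import math


def quantize(x, bits):
    """Round a sample in [-1.0, 1.0] to the nearest level of a signed integer grid."""
    levels = 2 ** (bits - 1) - 1  # e.g. 32767 for 16 bits
    return round(x * levels) / levels


def snr_db(bits, freq_hz=997.0, sample_rate=44_100, count=44_100):
    """Signal-to-noise ratio, in dB, of a full-scale sine stored at `bits` bits."""
    signal = [
        math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)
    ]
    # The "noise" is whatever rounding to the coarser grid threw away.
    noise = [quantize(s, bits) - s for s in signal]
    signal_power = sum(s * s for s in signal) / count
    noise_power = sum(e * e for e in noise) / count
    return 10 * math.log10(signal_power / noise_power)


for bits in (8, 12, 16):
    print(f"{bits}-bit: noise sits roughly {snr_db(bits):.0f} dB below the signal")
```

Each extra bit pushes the noise about 6 dB further down, which is why an 8-bit recording hisses when you turn it up and a 16-bit one mostly does not.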

In recording studios, 96 kHz or even 192 kHz sample rates and a 24-bit depth are not uncommon because recording engineers, mixers, and producers want the most dynamic range to play with, even if they don’t use it all. That gives them freedom to boost the volume of too-quiet recordings, mix loud and soft sounds together without hearing background hiss, and generally muck around as much as they like. It’s similar to how professional photographers use uncompressed RAW files while shooting and editing, so they have a maximum amount of flexibility and freedom.

Human ears can’t tell the difference between lower sample rates and much higher ones. As we’ve discussed, the sample rate affects the maximum frequency; as the standard 44.1 kHz sample rate of most recordings allows for the reproduction of sounds beyond the upper bounds of human hearing, the effects of significantly higher sample rates aren’t going to be audible.

Bit depth, on the other hand, has more noticeable real-world effects. A 16-bit recording allows for a 96 decibel dynamic range, but the upper (and very painful) limits of the human ear are around 140 dB. As dB is a logarithmic measurement, that’s far louder than 96 dB, and increasing a 16-bit recording to 140 dB would produce a noticeable noise floor. By contrast, a 24-bit recording allows for about 144 dB of dynamic range in theory, and somewhere between 110 and 120 dB with real-world converters, which allows for more room between softer, quieter sounds and silence. Most popular music doesn’t make use of the full dynamic range of what 16-bit recording allows, but jazz, classical, and folk recordings tend to benefit from the increased range of 24-bit audio.
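The 96 dB figure falls out of a simple formula: each bit doubles the number of amplitude levels, and dynamic range in decibels is 20 times the base-10 logarithm of that ratio. Here is a short Python sketch of the arithmetic; the function names are mine. Note that the theoretical ceiling for 24 bits comes out higher than the 110 to 120 dB range, which is closer to what real hardware manages.

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range, in dB, of linear PCM audio at `bits` bits."""
    return 20 * math.log10(2 ** bits)


def amplitude_ratio(db_difference):
    """How many times larger one sound pressure is, given a gap in dB."""
    return 10 ** (db_difference / 20)


print(f"16-bit: {dynamic_range_db(16):.2f} dB")  # about 96.33 dB
print(f"24-bit: {dynamic_range_db(24):.2f} dB")  # about 144.49 dB

# The gap between a 96 dB range and the ear's ~140 dB pain threshold.
print(f"44 dB gap: about {amplitude_ratio(140 - 96):.0f}x the sound pressure")
```

That last line is what “logarithmic” means in practice: a 44 dB gap is a sound pressure difference of well over a hundred times, not a 46% increase.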

Of course, once you have a high-res audio recording, you need to play it, and this is where it all comes back to the Lightning port. There is no limit on what an analog audio port — like the headphone jack — can send, but there is a limit to the headphone jack in the iPhone. This is determined by the digital-to-analog converter, or DAC.

The one in the iPhone is, as best as anyone can guess, a 16-bit DAC and it probably supports a maximum sample rate of 44.1 kHz. Regardless of the audio format on the device, it’s going to pass through a DAC that supports CD quality audio, but no greater. The HTC One, on the other hand, has a 192 kHz, 24-bit DAC that outputs through the headphone jack.

I explained everything prior to that last paragraph first because I think it’s important to understand what those two measurements mean. As Macotakara points out, 192 kHz, 24-bit audio is already supported via the Lightning port, but it’s probably intended to facilitate better support for iOS devices as recording and mixing platforms. Apple might wish to expand its use and make lossless tracks available, which should appeal to jazz and classical fans underserved by other major music services.

But high-res audio formats do not necessarily foretell the demise of the headphone jack, nor are they something most people will be able to perceive. I really care about audio quality, but I keep virtually all of my music in high-quality lossy compressed formats because the difference is imperceptible. A 24-bit standard would be appealing, but only if modern recordings were mixed and mastered to expand their dynamic range and take full advantage of it. Maybe this is another push towards the demise of the loudness war, but I doubt it will make a difference. And, as I said previously, the loudness of modern recordings is what makes them sound like crap, not the format they’re in.