The Case for Lossless Audio Encoding

July 24, 2022

A bit of history

According to Wikipedia, the compact disc was released in 1982, which is too early for me - but I take it CDs were considered advanced technology at the time and were priced accordingly.

I was into computers in the mid 90s, though, before smart phones were a thing, and when basic MP3 players were still fancy expensive gadgets.

To give an example of the computers I was dealing with, I had:

1. An XT with a 10MB HDD and a 5MB Syquest drive.

2. An Atari ST with only floppy disks, and then later an Atari TT with a 30MB "Megafile" drive. I remember being very excited to be able to upgrade that to 80MB later.

An audio CD could hold roughly 800MB of data, so it was about 10x as large as my newly upgraded drive. Ripping CDs was possible, but it wasn't a practical thing to do.

I remember ripping a single song just to try it out. It took a long time and filled up most of my hard disk. Later, I tried cutting the bit depth in half (from 16 bits to 8 bits), and then the sampling rate in half (from 44.1kHz to 22.05kHz), which got it from around 50MB to around 12MB. I further converted it to mono, which reduced the song to around 6MB. This meant I might be able to fit 10 songs at this lower quality on my HDD, if I didn't want to store anything else. That's a crazy use for an expensive HDD.
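To make that arithmetic concrete, here is a minimal Python sketch of how the size of uncompressed audio follows from the sampling rate, bit depth, and channel count. The 4:45 song length is an assumption, chosen so the result lands near the 50MB figure; the function name is just for illustration.

```python
def pcm_size_mb(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size of uncompressed PCM audio in megabytes (1 MB = 1,000,000 bytes)."""
    bytes_total = seconds * sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_total / 1_000_000

song = 4 * 60 + 45  # assumed song length in seconds (not a figure from the post)

print(round(pcm_size_mb(song), 1))                         # CD quality: 50.3
print(round(pcm_size_mb(song, bits_per_sample=8), 1))      # 8-bit: 25.1
print(round(pcm_size_mb(song, 22_050, 8), 1))              # 8-bit, 22.05kHz: 12.6
print(round(pcm_size_mb(song, 22_050, 8, channels=1), 1))  # ...and mono: 6.3

print(round(pcm_size_mb(74 * 60)))  # a full 74-minute CD: 783
```

Each halving (bit depth, sampling rate, channel count) cuts the size in half, which is why three of them take a 50MB song down to roughly 6MB.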

Bear in mind that most people were still on dial-up Internet connections (if they had them at all), so it wasn't practical for most people to upload or download 800MB of data just to transfer a single CD.

Lossless compression like FLAC (which only came out later) can reduce this to roughly 400MB, but even that would have been larger than the entire capacity of most hard drives of that era.

When MP3 encoding came out, it did drastically reduce the file size, but in exchange for that smaller size a great deal of processing power was required to play the files in real time. In fact, I could download an MP3, but I couldn't play it in real time on my Atari TT. Instead, I had to decode it offline using mpg123 or such. This would write out a WAV file or similar (taking up a huge amount of the space on my hard drive) which I could then play with another utility. That meant running the decoder, waiting 10 or 15 minutes, and then playing the song, and erasing it when I needed to free up space. Not very convenient.

Rich fancy people with faster i386 machines could manage to play songs in real time with software like WinAMP, which was hand-coded in assembly language in the timing-critical sections to get the speed needed. Even then, the computer would usually slow to a crawl as it used all of its might to decode and play the audio. The MP3 files still took up a relatively large amount of space, though, and so people tended to only keep a few songs they really liked, and also tended to use lower bit rates. The quality was relatively poor due to a combination of poor encoder software and the low bit rates used.

As time went by, the i486 and later PowerPC and Pentium chips came out, which could decode MP3 audio in real time with a little more breathing room. Encoding software got better and better, variable bit rate came into common use, and the relative importance of file size decreased as larger hard drives came into use. Due to these factors, the quality and size of MP3 files tended to increase.

These factors, combined with the slow increase in network speeds, meant that listening to music on your computer moved from something that was possible for geeks to something that was practical for many people. This was especially true on college campuses, since they tended to have ethernet networks instead of modems, and the computers were private rather than company owned.

Everyone knows the story of Napster, etc., but on college campuses local file sharing was probably the larger avenue for swapping music.

Eventually dedicated hardware (ASICs) for decoding MP3 compressed data streams was invented, leading to portable music players.

Since MP3 had patent issues, OGG Vorbis was invented as an even more efficient, but more importantly patent-free, alternative to MP3. Other competitors similar to MP3 started to be used as well, including the AAC standard (backed by Apple and later Sony), and the ATRAC standard used in Sony's MiniDisc units and some early Network Walkman models.

Size vs. Quality

But no matter which file format you used, there was always a quality trade-off.

There are, broadly speaking, two basic types of compression used for audio (and photos and video):

a. Lossless - A good example of this for general data would be the PKZIP format. You can take a text file, compress it into a smaller ZIP file, and decompress it later without losing any data. There are other schemes such as bzip, lzh, and gzip, but the idea is the same. For photos there is PNG, and for audio there is FLAC and now Apple Lossless. The advantage here is that you don't lose any quality at all, but the disadvantage is that the compression is not that great for audio.

b. Lossy - The theory behind MP3 that allows such good compression is that it literally throws away data that is less important. But which data is less important? Basically speaking, there are weaknesses in the human hearing system that can be exploited.

Specifically, if you play two tones similar to one another (but of different volume) at the same time, then the louder one tends to dominate and mask the softer one such that you can't hear it well.

If you can't hear it anyway, then you can simplify the signal by removing that component. When taken to extremes, the result is obvious artifacts, but when done in a less extreme way, most people won't notice the difference in casual listening on inexpensive equipment.

Effectively, the quality of sound produced by so-called "perceptual codecs" like MP3, AAC, OGG, and ATRAC is objectively worse than the original, but as the saying goes, "If you can't tell the difference, does it matter?"
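The difference between the two types can be shown with a small Python sketch. The lossless half uses zlib from the standard library (the same family of compression as ZIP and gzip). The lossy half is only a crude stand-in, not how MP3 works: it simply drops the lower 8 bits of each sample, which is enough to show that thrown-away data can't be brought back. The sample values are made up.

```python
import zlib

# Made-up 16-bit sample values standing in for a short piece of audio.
samples = [0, 1200, -3400, 32767, -32768, 15, 16, 17, 1200, 1200]
raw = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)

# Lossless: compress, decompress, and get back exactly the same bytes.
restored = zlib.decompress(zlib.compress(raw))
print(restored == raw)  # True

# Lossy (crude stand-in): keep only the top 8 bits of each sample.
def to_8_bit(sample):
    return sample >> 8

def back_to_16_bit(value):
    return value << 8

round_trip = [back_to_16_bit(to_8_bit(s)) for s in samples]
print(round_trip == samples)  # False
print(round_trip)  # [0, 1024, -3584, 32512, -32768, 0, 0, 0, 1024, 1024]
```

Notice that the quiet samples (15, 16, 17) all collapse to 0. A real perceptual codec is far smarter about what it discards, but the principle is the same: once it's gone, it's gone.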

In the early days of MP3, many people chose to use 128kbps encoding to save space, since it was "good enough". As more people started listening on devices like iPods and smart phones, and started buying better speakers for their PCs, some people started to notice distortion - particularly with certain types of music. Since hard drives were getting larger and networks were getting faster, people naturally started to gravitate towards higher bit rates, with most services that use MP3 standardizing around 320kbps. A good example would be the Amazon music store. (At 320kbps, it is very difficult to tell the difference between losslessly compressed CD audio and an MP3, even for someone with good ears using high-grade equipment.)

Another innovation to improve the quality of MP3 sound while keeping files small is a feature called "Variable bit rate". Originally most MP3 files that consumers came into contact with were "Constant bit rate", which means what it sounds like. With CBR the audio was broken up into frames, and the same bit rate was used for each frame, over the length of the entire song. This meant that the same amount of data was used to compress a very simple (or silent!) section of the song as a very complex part. Inevitably this meant that in order to get decent audio for the complex parts, the bit rate for the whole file had to be raised, and the file would grow in size. Conversely, lowering the bit rate might sound fine for most of the song, but leave part of it sounding distorted. VBR encoding fixed this by allowing the encoder to shift the data around to where it was most needed when encoding. This meant that more complex parts of the song would use more data, and simpler parts would use less. The person compressing the song can still control the overall file size by adjusting a quality parameter - but for any given file size, VBR usually sounds better than CBR.

Note that CBR is still important for streaming, since over a network connection you want to use a predictable amount of bandwidth.
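Here is a toy Python sketch of the CBR vs. VBR idea. It is only an illustration under big assumptions: the per-frame "complexity" scores and the bit budget are made up, and real encoders decide where to spend bits with a psychoacoustic model rather than a single number per frame.

```python
def cbr_allocation(complexities, total_bits):
    """Give every frame the same share of the bit budget."""
    per_frame = total_bits // len(complexities)
    return [per_frame] * len(complexities)

def vbr_allocation(complexities, total_bits):
    """Split the same budget in proportion to each frame's complexity."""
    total_complexity = sum(complexities)
    return [total_bits * c // total_complexity for c in complexities]

# Made-up complexity scores: near silence, quiet intro, busy chorus, quiet outro.
frames = [1, 3, 12, 4]
budget = 4000  # total bits to spend on these four frames (made-up number)

print(cbr_allocation(frames, budget))  # [1000, 1000, 1000, 1000]
print(vbr_allocation(frames, budget))  # [200, 600, 2400, 800]
```

Both versions spend the same 4000 bits, so the file size is the same. The difference is that VBR stops wasting bits on the near-silent frame and gives the busy chorus more than twice what CBR could.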

Besides VBR, other add-ons have been incorporated into MP3 over the years, which is both good and bad. The good part is that the format continues to improve - but the bad part is that older players can't handle newer files. For example, my Zaurus MI-E1 can't play many recent MP3 files.

Despite the add-ons, MP3 is still not the most advanced format, however.

The AAC format is a bit better than MP3, and files offered by Apple and Sony (among others) use this format. AAC is also usually supported by hardware decoders.

OGG Vorbis is also better in terms of efficiency, meaning you can have smaller files or higher quality with OGG, compared with MP3. The disadvantage, though, is that it is supported by fewer platforms. For example, the standard music player in the iPhone can't play OGG files. Even if VLC on Android can do it, your phone might not have a hardware decoder, and so it might use up more battery life than MP3 or AAC would.

How to Decide which format to use?

This may sound like a simple situation: use AAC if you have Apple or Sony gear, use OGG if you will only be playing on the computer, and use MP3 if you aren't sure, right?

Sure, this can work, but what if you buy a new piece of gear later?

I myself have had this situation. I ripped everything to OGG because it offered the best quality for the file size, but then started using iTunes on the Mac, where Apple stubbornly refuses to support OGG. My Sony Walkman also doesn't support OGG with the native player.

Okay, so just don't use OGG then, right? Everyone supports MP3, that's safe, so let's use that!

Not so fast, the MP3 format has evolved too. Imagine you encoded at 128k CBR MP3 long ago and that was the only file you had left now. If you want to upgrade to 320k VBR, you need to go back to the CD and rip it again, if you can even find it. The format may evolve more in the future, too!

Conversely, what if you want to rip high bit rate VBR MP3 to use on your fancy home audio player for your HiFi system, but you also want to be able to play the files on your old portable Rio or something that doesn't support those newer files?

Newer formats support things like surround sound, lyrics, high resolution audio, and all sorts of other features.

That sounds great, you can upgrade later, right? Well... not really.

There are two problems:

1. You can't really add things like high resolution or surround sound to files that don't have it. The information just isn't there.

and

2. Suppose you want to upgrade to a new format so you can add something like lyrics that isn't supported by your current format. This means you have to transcode the audio in most cases. For example, to convert MP3 to OGG, you have to decompress the MP3 and recompress it to OGG. This means you will have all of the lossiness of the MP3 file, plus the new losses from the data that the OGG compressor throws away. OGG might be a better format than MP3, but the converted file will actually sound worse than the MP3 you started with. It's like making a copy of a copy.

This is of course also true if you have big fat files on your PC and you want to downgrade them to fit on your phone. Maybe you use 320k MP3 files on your PC and you want 128k files for your phone. Not only will 128k files sound worse than 320k files (of course), but 128k files converted from 320k files instead of from the original CD audio will sound even worse.
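The "copy of a copy" effect can be sketched in a few lines of Python. This is a toy model, not a real codec: the made-up "lossy encoder" here just rounds each sample to the nearest multiple of a step size, with a bigger step standing in for a lower bit rate. The sample values are invented too.

```python
def lossy_encode(samples, step):
    """Toy lossy codec: round each sample to the nearest multiple of step."""
    return [(s + step // 2) // step * step for s in samples]

def total_error(original, decoded):
    return sum(abs(o - d) for o, d in zip(original, decoded))

original = [4, 14, 23, 37, 41, 58, 62, 75, 89, 93]  # made-up "CD" samples

# Encode the low-quality version straight from the original.
direct = lossy_encode(original, step=10)

# Encode to a higher-quality lossy format first, then transcode that down.
high_quality = lossy_encode(original, step=6)
transcoded = lossy_encode(high_quality, step=10)

print(total_error(original, direct))      # 28
print(total_error(original, transcoded))  # 34
```

In this toy the direct encode always picks the closest value the low-quality format can represent, so going through an intermediate lossy step can only match it or do worse. Real codecs are much more complicated, but the same reasoning is why it pays to encode from the original.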

You never know what politics and technological advances may come in the future. For example, given that OGG is better than MP3, why should Apple or Google want to pay to license MP3? Maybe they will just stop supporting it completely if enough people are on AAC and/or OGG.

More subtly, perhaps Apple will decide that since they sell AAC files, they will quietly remove the hardware decoding support for MP3, and your battery life will get worse. In that case you will wish you had ripped AAC if you use an iPhone.

Maybe the magic format that takes up 50% of the space of MP3 with twice the quality will come out next year! (doubtful...)

A more likely scenario: Maybe you upgrade your equipment now that you are working from home, and long to hear better quality than what you ripped already.

The point is, if you rip to a lossy format now, you are basically stuck with it forever.

The Solution?

Rip and store everything as lossless audio, at least on your PC.

FLAC was once in the realm of computer geeks, but has now become a consumer level technology. Sony and others sell FLAC files, many music players support them, and similar lossless formats are supported by others.

These days, computer SSDs are large enough that there's not really any excuse to need to save space. A single album at CD quality compressed with FLAC will take up 400MB or less, while even the most entry level computers have 256GB drives. That means if you rip 20 albums from CD, you are looking at 8GB - or less than 5% of your drive on an entry level machine. (As a side note, please don't buy a 256GB computer with non-upgradable storage, you will probably regret it.)
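If you want to plug in your own numbers, the arithmetic is simple enough to sketch in Python. The 400MB per album and 256GB drive come from the paragraph above; the 200-album library is just an example figure.

```python
ALBUM_MB = 400  # upper estimate for one album in FLAC
DRIVE_GB = 256  # entry level drive size

def library_share(albums, album_mb=ALBUM_MB, drive_gb=DRIVE_GB):
    """Return (space used in GB, percentage of the drive)."""
    used_gb = albums * album_mb / 1000
    return used_gb, 100 * used_gb / drive_gb

print(library_share(20))   # (8.0, 3.125)
print(library_share(200))  # (80.0, 31.25)
```

So 20 albums is about 3% of the drive, and it takes a couple of hundred albums before FLAC starts to eat a serious chunk of it.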

Even if you have an enormous music library and only a small drive on your computer, you can rip to FLAC, convert the files to something less space hungry to use on your PC, and store the FLAC files on a cheap external drive. You'll be glad you did when you lose your CDs in a move or find they don't work anymore because they are old and scratched up.

Perhaps an important thing to note is that you can rip to FLAC and still create MP3 or AAC for your phone, etc. You don't need to keep those files around either, since you can always easily regenerate them from the FLAC files.
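As a rough sketch of what that regeneration could look like, here is a small Python helper that builds an ffmpeg command line for turning one FLAC file into a VBR MP3. Using ffmpeg is my assumption here (any converter will do), and it needs an ffmpeg build that includes the libmp3lame encoder. The file names and the helper name are hypothetical. The helper only builds the command; it doesn't run anything.

```python
from pathlib import Path

def mp3_command(flac_path, out_dir, vbr_quality=2):
    """Build (but do not run) an ffmpeg command that encodes one FLAC file to VBR MP3."""
    flac_path = Path(flac_path)
    target = Path(out_dir) / (flac_path.stem + ".mp3")
    return [
        "ffmpeg",
        "-i", str(flac_path),
        "-codec:a", "libmp3lame",         # MP3 encoder (assumes ffmpeg was built with it)
        "-qscale:a", str(vbr_quality),    # VBR quality setting, lower means higher quality
        str(target),
    ]

cmd = mp3_command("music/album/01 - intro.flac", "phone")
print(cmd)
# To actually convert the file: subprocess.run(cmd, check=True), with ffmpeg installed.
```

Since the FLAC file is never touched, you can rerun this later with a different quality setting, or a different format entirely, and always be encoding from the full-quality original.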

A bonus is that you can convert from, say, FLAC to Apple Lossless Audio (or any other lossless format) without any loss either, since both formats are lossless. There may be differences in the metadata that different formats can handle, but almost all of them can handle things like album art, track names, etc.

What about Streaming?

I know the younger generation is all about streaming, rather than spending hours of time carefully curating a collection of files on their PC - but I ask you to try this:

Hook up a decent pair of wired headphones to a decent CD player, if you can find one, and listen to the CD. Now try to listen to the same song from something like YouTube with the same pair of headphones. I think you'll be surprised at the difference.

(Alternatively, you could try the Tidal streaming service if it is available in your country.) I've also gotten used to using streaming audio, but I recently compared the quality with my FLAC files using decent headphones and the difference was astounding. If you can't hear the difference, well then good for you!


Shiruba on Technology
