Author Topic: Music Tech Note: Converting to 22 KHz (for OSF Encoding)  (Read 7742 times)

Sidhe Priest
Music Tech Note: Converting to 22 KHz (for OSF Encoding)
« on: October 24, 2010, 01:19:45 AM »
This little article deals with how to convert a music piece to 22 KHz with as little damage as possible.

First of all, a simple formula. The detail limit used in this article is sampling frequency/8. Strictly speaking, the sampling theorem only asks for a sampling frequency of more than twice the highest signal frequency, and that assumes ideal reconstruction filtering. The f/8 figure is a much stricter rule of thumb: it takes 8 coordinates to trace the shape of a sine wave directly, 4 positive and 4 negative.

By that rule, you'd need 160 KHz sampling to represent 20 KHz in full detail (so much for CD audio's claims of "22 KHz bandwidth" - 22 KHz, yeah, but for noise, not music, detail limit is much lower).

Thus: 44100 Hz (CD audio) divided by 8 = 5512.5 Hz. Enough for midrange, but there's already distortion in the treble and high frequencies. 22050 Hz (Descent 3 sampling format for music) divided by 8 = 2756.25 Hz. Not very detailed, is it?

So what happens when you go over the magical f/8 limit? Aliasing happens. Rectification distortion, foldover, etc. Everything becomes "squarish" (or more precisely, triangular). The minimum time frame also becomes 0.36 msec (8 samples at 22050 Hz), which is slow (to human perception that sounds "stiff", "woody").
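The arithmetic above is easy to check. Here is a minimal Python sketch of the f/8 rule as used in this article; the function names and the `points_per_cycle` parameter are made up for illustration.

```python
def detail_limit_hz(sample_rate_hz, points_per_cycle=8):
    """Highest frequency that still gets `points_per_cycle` samples per cycle."""
    return sample_rate_hz / points_per_cycle


def min_time_frame_ms(sample_rate_hz, points_per_cycle=8):
    """Duration of `points_per_cycle` samples, in milliseconds."""
    return 1000.0 * points_per_cycle / sample_rate_hz


# The two rates discussed above, plus the 160 KHz example.
for rate in (160000, 44100, 22050):
    print(rate, "Hz ->", detail_limit_hz(rate), "Hz limit,",
          round(min_time_frame_ms(rate), 3), "ms time frame")
```

Changing `points_per_cycle` shows how much the result depends on the chosen rule: with 2 points per cycle the same functions give the usual f/2 figure.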

And that tends to sound weird. Take a recording of an acoustic instrument, downsample it to 22050 Hz, then listen to what happens to the treble and space harmonics. But never fear, for there are ways to smooth out the damage.

The first tool is a lowpass filter. Depending on your editor, there might already be one built in. Make sure it can attenuate by at least 24 dB, and set the lowpass to ~10200 Hz or 11025 Hz. That'll make a crude DF filter so the "space range" (everything between 10-20 KHz gives clues to space, like the shape of a cave's or concert hall's walls, etc.) gets muted, or completely cut off with a brickwall filter. The effect this may have on music is that it'll sound as if it were playing in open space, but reverb shape/cues will still be there (the size and density of a typical reverb are mostly in the midrange and treble).

Lowpassing at ~10 KHz will reduce space-mangling by removing or muting rectified harmonics. You can experiment with something like 5 KHz, but the fewer high-frequency harmonics are left (even rectified ones), the duller and woodier everything will sound. By the way, rectified harmonics are perceived as "no or little movement", "flat", "hollow", "still".

By lowpassing you compensate for the lack of detail by simply hiding the undetailed harmonics.
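For anyone who would rather script the lowpass than use a plugin, here is a minimal sketch in plain Python. It builds a windowed-sinc FIR filter with a Blackman window, which is my own choice for illustration and not what TAL Filter or any particular editor does. The function names, the 101-tap length and the 44100 Hz source rate are assumptions.

```python
import math


def windowed_sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps=101):
    """Return FIR lowpass coefficients (Blackman-windowed sinc, unity DC gain)."""
    fc = cutoff_hz / sample_rate_hz          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = (0.42 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]         # normalise so 0 Hz passes at 0 dB


def fir_filter(samples, taps):
    """Apply the FIR filter by direct convolution (slow, but dependency-free)."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            if i - k >= 0:
                acc += t * samples[i - k]
        out.append(acc)
    return out


def gain_db(taps, freq_hz, sample_rate_hz):
    """Filter gain at one frequency, in dB, from the frequency response."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-12))


taps = windowed_sinc_lowpass(10200, 44100)
for freq in (1000, 5000, 10200, 18000):
    print(freq, "Hz:", round(gain_db(taps, freq, 44100), 1), "dB")
```

Run the filter on the full-rate material before downsampling. More taps give a steeper cutoff at the cost of speed.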

A good free VST filter is TOGU TAL Filter. Set it to the "Clean" preset, turn LFO off, and turn the frequency dial to about 2/3.

A previous version of the TAL Filter was MFilter - Moog VCF - and since it's been quietly put to rest, here it is for download.

The second tool is a maximiser, or "compressor" as they're sometimes called. A standard commercial tool is Waves Ultramaximiser, and there are lots of free VST plugins, but the one in use here is Blockfish.

Now, people out there all go crying over the "Loudness War" and how everyone should not be squashing dynamics, but there's a simple reason for mastering engineers squashing the dynamics of CDs, and it's the deficiency of the format. CD Audio is only 16-bit. Here's a little table:

Amplitude    Levels available
  0 dB       65536
 -6 dB       32768
-12 dB       16384
-18 dB        8192

So to keep some dynamic accuracy, you have to keep the dynamics within the top -6 dB, -12 dB, with the realistic limit being -18 dB. Drop below -18 dB and everything becomes cold, cold, cold as there's little voltage variation - the format runs out of coordinates.
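The table can be reproduced with a couple of lines of Python. This is only a sketch: `levels_available` is a made-up name, and it uses the exact amplitude formula, where the number of levels halves every ~6.02 dB, so the results come out slightly above the table's round figures.

```python
def levels_available(level_db, bits=16):
    """Approximate number of quantisation levels spanned by a signal
    peaking at `level_db` relative to full scale."""
    return (2 ** bits) * 10 ** (level_db / 20)


for level_db in (0, -6, -12, -18):
    print(f"{level_db:>4} dB  {levels_available(level_db):8.0f}")
```

Passing `bits=24` shows why mixing at a higher bit depth leaves so much more room before detail runs out.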

Outrage's OSF encoder takes 22 KHz/16-bit wave files, so you pretty much have to make sure the average dynamics are close to -18 dB. Of course you don't have to squash everything into the top -6 dB, and there should be some breathing space, but remember this: it's not recommended to go below -18 dB (except for passages which are not intended to be very textured/detailed) or you'll lose detail.

Back to squashing though: a squash of 6 dB or so is usually enough.
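To get a feel for what a 6 dB squash does to the numbers, here is a deliberately crude Python sketch: a fixed boost followed by a hard ceiling. It is not how Blockfish or any real maximiser works (those shape the gain over time instead of clipping), and the function name and defaults are assumptions. Samples are floats in the range -1.0 to 1.0.

```python
def maximise(samples, boost_db=6.0, ceiling=1.0):
    """Boost every sample by `boost_db`, then clamp peaks at the ceiling."""
    gain = 10 ** (boost_db / 20)   # 6 dB is roughly a factor of 2
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]


quiet_and_loud = [0.1, -0.25, 0.9, -0.9]
print(maximise(quiet_and_loud))
```

Quiet samples come out about twice as large, while anything that would overshoot full scale is held at the ceiling. That is the squash.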

And the third and final tool is the resampling tool itself. Secret Rabbit Code, AKA libsamplerate, is already included in Audacity, or you can get the very nice Foobar2000 plugin and discover that Foobar2000 is a great mass-conversion tool as well as a player. When resampling, make sure to add some dither (0.1 - 0.5 bits of dither will do), as the wave will be quantised to 16-bit (most DAWs nowadays mix to 24-bit and 32-bit, which is as it should be, as it gives an abundance of coordinates). Dither is low-level noise (usually around -60, -70 dB) used to cover up any spikes that can pop up when there are suddenly only 65536 coordinates instead of millions at higher bit depths. Some people detest the very notion of adding any noise to the mix, but the trick is, it's noise that prevents bigger noise spikes, so it's fair. More or less. Again, it's a workaround for the format's deficiencies.
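Here is roughly what dithered quantisation to 16-bit looks like in Python. Treat it as a sketch under assumptions: I'm reading "bits of dither" as the peak noise amplitude in 16-bit steps (LSBs), the triangular noise shape is my own choice, and the function name is made up. Input samples are floats in the range -1.0 to 1.0.

```python
import random


def quantise_16bit(samples, dither_lsb=0.5, seed=0):
    """Quantise float samples to 16-bit integers with triangular dither.

    `dither_lsb` is the peak dither amplitude in 16-bit steps (assumed meaning
    of "bits of dither"); 0.0 turns dither off.
    """
    rng = random.Random(seed)
    out = []
    for x in samples:
        scaled = x * 32767.0
        # The difference of two uniform randoms has a triangular distribution.
        noise = (rng.random() - rng.random()) * dither_lsb
        value = int(round(scaled + noise))
        out.append(max(-32768, min(32767, value)))   # stay inside 16-bit range
    return out


ramp = [i / 1000.0 for i in range(-5, 6)]
print(quantise_16bit(ramp, dither_lsb=0.0))
print(quantise_16bit(ramp, dither_lsb=0.5))
```

The fixed `seed` only makes the output repeatable for testing; real tools use fresh noise every time.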

Resampler quality will affect the final quality a lot. An older editor will usually produce something fairly crude-sounding, whereas SRC is probably the most accurate tool there is, even when downsampling.

So once it's all done, the 22 KHz/16-bit wave file, lowpassed, squashed and downsampled from whichever higher-frequency format you work with (here it's 96/32), can be converted by the Outrage OSF encoder. Hooray.
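Before feeding the file to the encoder, it doesn't hurt to check that it really is 22 KHz/16-bit. A small sketch using Python's standard `wave` module follows; it assumes "22 KHz" means exactly 22050 Hz, and it doesn't check the channel count, since the encoder's requirements there aren't covered above. The function names and the file name in the usage note are made up.

```python
import wave


def wav_format(path):
    """Return (sample rate in Hz, bit depth, channel count) of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getsampwidth() * 8, wf.getnchannels()


def ready_for_osf(path, rate=22050, bits=16):
    """True if the file matches the 22050 Hz / 16-bit format described above."""
    actual_rate, actual_bits, _channels = wav_format(path)
    return actual_rate == rate and actual_bits == bits
```

Usage would be something like `ready_for_osf("mytrack.wav")`, which returns False for a 44100 Hz or 24-bit file.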
« Last Edit: October 24, 2010, 01:30:34 AM by S-Priest »

