float sample conversion (was Re: Calculating peak level in db)
- Subject: float sample conversion (was Re: Calculating peak level in db)
- From: Brian Willoughby <email@hidden>
- Date: Tue, 15 Oct 2002 15:18:58 -0700
[ Programmers everywhere complain about having to clip ~somewhere~,
[ when converting between 16bit and floats, because it puts
[ conditional tests on every sample.
...
[ The alternative is to scale by 1/32768, which ensures a full range
[ of normalized floats, but means you have to map 1.0 (which is
[ ~within range~ for floats) to 32767 on output, clipped from 32768.
Yes, but 1.0001 is also within range for floats, and there is nothing to
prevent a piece of CoreAudio processing code (an AudioUnit or application) from
generating sample values like +2.01 or +4.5, etc. They *all* have to be
clipped when converting to 16 bit, or even when converting to 24 bit fixed.
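To make the clipping point concrete, here is a minimal Python sketch of the symmetric power-of-two conversion. It is not CoreAudio's actual converter: the function names and the use of round-to-nearest on output are assumptions for illustration, and only the scale factor of 32768 comes from the discussion above.

```python
FULL_SCALE = 32768  # 2**15, the power-of-two scale factor discussed above


def int16_to_float(sample: int) -> float:
    """Map a 16 bit sample to the range [-1.0, +1.0). (Hypothetical helper.)"""
    return sample / FULL_SCALE


def float_to_int16(value: float) -> int:
    """Scale back to 16 bit, clipping anything outside [-32768, 32767]."""
    scaled = round(value * FULL_SCALE)
    return max(-32768, min(32767, scaled))


# Every 16 bit code survives the round trip unchanged.
assert all(float_to_int16(int16_to_float(s)) == s for s in range(-32768, 32768))

# Out-of-range floats all clip, and +1.0 is just the first of them.
clipped = [float_to_int16(v) for v in (1.0, 1.0001, 2.01, 4.5)]
```

In this sketch +1.0 is treated exactly like +1.0001 or +4.5: the clamp is needed for processed audio in any case, so the one extra clipped value costs nothing new.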
[ The issue is really: is -32768 the anomalous value, or is +32768
[ a 'missing but legitimate' value?
This is a question that is inherent to 16 bit sampling, and should be studied
from the viewpoint of the choice to use twos-complement binary numbers.
Despite the importance of your question, it should not affect the purity of the
math used to convert between float and 16 bit, especially not on input where
your samples may be converted to 24 bit without your knowledge.
[ My radical speculation was to scale by 32767/32768, and as noted
[ elsewhere it does of course amount to a very small drop in level,
[ which indeed does become noticeable on changes to high-level
[ samples (which I should have checked before commenting - I tend
[ to avoid running signals above -6dB), but means that the complete
[ path from ADC to DAC could in principle be conducted without
[ clipping.
Your radical approach does more than drop the level and prevent peak clipping.
What you've done is add quantization noise and distort the wave shape (by
introducing a nonlinearity at the -6 dB transition, on both the positive and
negative sides). You've effectively shifted the clip to the middle of the
curve from the peak! The difference between the techniques being compared is
that waveforms will rarely touch +1.0 precisely without going beyond that, but
waveforms will frequently hit the -6 dB level and go beyond. So your approach
will cause problems more frequently than the one used in CoreAudio. Studies
have been done on human perception of hard clipping, but I don't know how bad
it sounds to introduce nonlinearities in the middle of the wave.
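The -6 dB transition can be checked by brute force over all 65536 codes. The following is a rough Python sketch under stated assumptions: the function name is hypothetical, input is scaled by 1/32768 and output by 32767 as proposed, and Python's built-in round (which rounds halves to even) stands in for convergent rounding.

```python
def round_trip_32767(sample: int) -> int:
    """Input scaled by 1/32768, output scaled by 32767 (the proposal under discussion)."""
    value = sample / 32768          # exact: division by a power of two
    return round(value * 32767)     # Python rounds halves to even


# Collect every 16 bit code that does not come back unchanged.
changed = [s for s in range(-32768, 32768) if round_trip_32767(s) != s]
quietest_changed = min(abs(s) for s in changed)
```

Under these assumptions every code with a magnitude above 16384 (half of full scale, about -6 dB) comes back one step closer to zero, while everything at or below that level is untouched. That is a step in the transfer curve in the middle of the range, not at the peak.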
I just finished mastering a CD of a live recording using Logic. I recorded at
-12 dB on average, but had peaks above -6 dB. During mastering, the levels
were brought much closer to 0 dB (but not all the way) to reach standard CD
levels. That's a lot of processing, and I want it all to be linear (except for
the limiting)!
[ Either way, the input driver would scale by 1/32768; the question
[ is how does the system convert +-1.0 back to 16bit?
I thought of another way to illustrate this: Imagine a magical 32 bit floating
point D/A converter. If you connected the analog output of this 32 bit
floating point converter to a 16 bit A/D, you would still end up clipping the
one sample value at +1.0, so there's no way to avoid it with fancy math,
either. Besides, if it came from 16 bit, you'll never see +1.0 unless you
process, and processing might produce +1.0001 and up, as well.
[ Re convergent rounding - this is widely used in DSP chips such as
[ the Motorola 56K series, in their MACR instruction. It is an
[ operation of "round to nearest", and on Intel involves only three
[ ~asm instructions - it is a lot cheaper than a C cast. Maybe it
[ is more expensive on PPC (and is certainly expensive if coded in
[ C). But it is an important process, especially in the absence of
[ dither.
In the absence of dither, rounding differs from truncation by only 1/2 LSB
(I'm excluding "round towards zero" which is asymmetrical).
The human ear cannot hear a DC offset in an audio signal, and nothing but
super-high-end audio gear maintains DC from input to output. So not only will
you not hear the DC offset of 1/2 LSB, but it probably won't reach the speakers
electrically anyway.
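The 1/2 LSB figure can be illustrated numerically. This is only a sketch: "truncation" is taken here to mean rounding toward negative infinity (floor), since round-towards-zero is excluded above, and the test values are an arbitrary even spread between two adjacent codes.

```python
import math

STEPS = 1000
# Evenly spread fractional positions between two adjacent 16 bit codes.
# None of them lands exactly on a half, so no tie-breaking rule is involved.
values = [100 + (i + 0.5) / STEPS for i in range(STEPS)]

# Average quantization error, in LSBs, for each method.
trunc_error = sum(math.floor(v) - v for v in values) / STEPS
round_error = sum(round(v) - v for v in values) / STEPS
```

The average error comes out at about -0.5 LSB for truncation and about 0 for rounding, so the two methods differ only by a constant offset, which is the DC term described above.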
[ For a straight i/o copy, it is transparent, but minimizes
[ quantisation errors while keeping them symmetrical, which will in
[ principle be an issue as soon as the signal is modified in any
[ way.
You are not minimizing quantization errors. You are merely shifting them by
1/2 LSB in a signal dependent fashion (i.e. whether the signal is above or
below the -6 dB threshold). This is waveform distortion. You have in effect
mixed in a square wave that tracks the incoming signal. Even though this
square wave has an amplitude of 1/2 LSB, it's still noise that you've added!
And all because 16 bit twos complement numbers have a missing code for +32768.
[ The scale factor 32767/32768 is so small that it in effect falls
[ inside the 16bit quantisation distance, except, as noted
[ elsewhere, for signals above -6dB.
(See above)
[ So um, well, a lot of the samples are unchanged! I need to make
[ more tests, but initial experiments indicate that the same method
[ applied to 24bit samples does give identity output, as the (32bit)
[ scale factor is within the quantisation range even at peak values.
I don't quite follow. You were mistaken about your radical process on 16 bit
samples, where you thought it gave identity output, but found that it did not.
Now you are saying it will give identity output for 24 bit? Again, if you
have a radical rounding technique which produces identity, then how can it be
superior to simply scaling - which also produces identity?
[ So no worries, I am not about to release any pathological drivers
[ (I have no plans indeed to develop any!);
Well, I am concerned that your CoreAudio applications will distort the digital
audio files that they process (whether they save to files or play them live).
But I suppose there's no harm, so long as you document your radical noise
shaping algorithm.
[ there are several ways of doing 16bit/float mapping,
There's only one that doesn't introduce distortion or noise! :-)
I'm excluding dithering, of course.
Brian Willoughby
Sound Consulting
_______________________________________________
coreaudio-api mailing list | email@hidden