Re: sample data types (iOS)
- Subject: Re: sample data types (iOS)
- From: Brian Willoughby <email@hidden>
- Date: Mon, 7 Feb 2011 21:11:24 -0800
On Feb 7, 2011, at 19:21, Schell Scivally wrote:
> Hi all. I'd like some information on using different data types as
> sample values. I've read that using 8.24 fixed-point samples
> (AudioUnitSampleType) is fastest, though I admit I don't quite know
> what 8.24 means in terms of bits (I'm guessing it means 24 bit
> precision right of the decimal and 8 bits left of the decimal). Is
> this a 32 bit signed float?
Nope, 8.24 fixed-point is not 32-bit signed float. Fixed-point means
that the value is specifically not floating-point. In floating
point, the whole and fractional parts of the number are constantly
shifted around, or normalized. But in fixed-point the whole and
fractional parts always have a "fixed" number of bits. Also, there
is never an exponent for fixed-point - only floating point has the
separation of exponent and mantissa.
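(Your guess about the bit split is right, by the way: 24 bits to the
right of the point and 8 - including the sign - to the left.) To make
the layout concrete, here is a rough sketch in plain C; the type name
and helper functions are only my own illustration, not anything from
Apple's headers, though on iOS AudioUnitSampleType is the real 32-bit
type with this layout:

#include <stdint.h>
#include <stdio.h>

/* 8.24 fixed point: 8 whole bits (including the sign) and 24
   fractional bits, stored in an ordinary signed 32-bit integer. */
typedef int32_t Fixed824;

#define FIXED824_ONE (1 << 24)   /* 1.0 in 8.24 */

/* Illustration-only conversions; these truncate toward zero. */
static Fixed824 DoubleToFixed824(double x)   { return (Fixed824)(x * FIXED824_ONE); }
static double   Fixed824ToDouble(Fixed824 x) { return (double)x / FIXED824_ONE; }

int main(void)
{
    Fixed824 a = DoubleToFixed824(1.5);   /* 1.5 * 2^24 = 0x01800000 */
    printf("1.5 in 8.24 is 0x%08X (%f)\n", (unsigned)a, Fixed824ToDouble(a));
    return 0;
}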
Basically, 8.24 fixed-point will mostly work the same as 32-bit
signed integer types, at least for addition and subtraction. But
division and multiplication will have to follow the mathematical
rules for shifting the point between the whole and fractional parts.
One thing to keep in mind is that, from the hardware arithmetic
unit's point of view, multiplying two 32-bit values actually produces
a 64-bit product. Some fixed-point processors simply truncate the
32-bit inputs to 16 bits so that the product will safely fit within
32 bits, while a general-purpose processor doing plain 32-bit
multiplication keeps only the low 32 bits and will simply overflow if
the actual product does not fit. Hopefully it makes sense that
multiplying two 8.24 fixed values results in a 16.48 fixed value.
Somewhere in your code, you have to shift the values so that the
16.48 result gets back to 8.24 (possibly via an 8.56 intermediate
value). If you don't have the processing power to hold intermediate
values in 64-bit precision, then you'll need to pre-scale by changing
the multiplicands to 4.12 representation before the multiplication,
so that the results come out as 8.24 in the end.
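In C, assuming a 64-bit intermediate is available, the whole
multiply-and-shift dance looks roughly like this (my own function
name, purely for illustration):

#include <stdint.h>

/* Multiply two 8.24 values. The raw product of two 32-bit integers
   is 64 bits wide; in fixed-point terms 8.24 * 8.24 = 16.48, so
   shifting right by 24 brings the result back to 8.24. Addition and
   subtraction, by contrast, need no shift at all - they are plain
   integer operations. */
static int32_t Fixed824Multiply(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;   /* 16.48 in a 64-bit register */
    /* Truncate back to 8.24; add (1 << 23) to the product first if
       you want rounding instead of truncation. */
    return (int32_t)(product >> 24);
}

The 4.12 pre-scaling trick would do the same job with only 32-bit
arithmetic, at the cost of precision.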
There are actually many ways to accomplish the shifting, before and/
or after the multiplication, and sometimes it's easier if you know
that one of the multiplicands only has a few bits of precision,
especially if the significant bits are always in certain bit
positions. In other words, you don't have to shift both multiplicands
by the same amount; you can shift one multiplicand by more bits so
that the other multiplicand can be shifted by fewer bits and still
produce the correct result. The trick is to maintain precision
without too much truncation or excessive rounding.
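For example, if a gain coefficient only needs a couple of whole bits,
you might keep it in 2.14 so that the product of gain and sample only
needs a 14-bit shift - a sketch, again with made-up names:

#include <stdint.h>

/* Multiply an 8.24 sample by a gain stored as 2.14 in 16 bits. The
   product is 10.38 and fits in 48 bits, and only the gain's 14
   fractional bits need to be shifted away to land back on 8.24. */
static int32_t ApplyGain824(int32_t sample824, int16_t gain214)
{
    int64_t product = (int64_t)sample824 * gain214;   /* 10.38 */
    return (int32_t)(product >> 14);                  /* back to 8.24 */
}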
Division can be more difficult. Basically, you start with 64-bit
precision, and dividing by a 32-bit value gives a 32-bit result -
that's the hardware's point of view. But division is always trickier,
because if the denominator is too small then the result can blow up
to need far more than 32 bits, and in the general case more than even
64. The shifting and scaling are basically the same as for
multiplication, though.
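A matching sketch for division, with the numerator pre-shifted into a
64-bit register so that the quotient comes out in 8.24 (my own names
again, and only a crude overflow guard):

#include <stdint.h>

/* Divide two 8.24 values. Shifting the numerator left by 24 bits
   before the integer division keeps the point in the right place.
   If the denominator is very small the true quotient will not fit in
   32 bits, so real code needs to clamp or otherwise detect overflow. */
static int32_t Fixed824Divide(int32_t a, int32_t b)
{
    if (b == 0)
        return (a >= 0) ? INT32_MAX : INT32_MIN;   /* crude guard */
    int64_t quotient = ((int64_t)a << 24) / b;
    return (int32_t)quotient;                      /* may wrap on overflow */
}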
There are lots of considerations - too many to detail here - but the
key is that the C programming language does not have direct support
for fixed-point operations (besides addition and subtraction, where
it's the same as integer math). You have to use math subroutines or
really know what you're doing to process in fixed-point when writing
in C.
The problem is that iOS devices are not very efficient with float, and thus if
you take the easy path of dealing with floating-point, then your code
will run a lot slower than it would with 8.24 fixed-point code. It's
probably easier to write in assembly, where there aren't as many
hurdles when dealing with 8.24 fixed as there are when writing in C.
Of course, assembly is already much more difficult than C, but at
least if you master assembly then the 8.24 will look easy in
comparison. Besides assembly, you could also use primitive functions
for multiply and divide rather than the usual '*' and '/' operators.
I suppose it's also possible that someone has bothered to create a
C++ fixed-point type that would hide all of these details.
> It is confusing, then, that the canonical
> input/output type (AudioSampleType) is a 16 bit signed integer. Can
> anyone provide some clarification on how and when to use which type?
The choice of 8.24 is determined by the processor. The choice of 16-
bit signed integer is determined by the audio I/O hardware. Keep in
mind that the processor is not the same chip as the audio converters
(although sometimes a CPU will have built-in converters). The ARM
processors in iPod, iPhone, and iPad devices handle fixed-point more
efficiently than floating-point, and since we want 24-bit precision
if at all possible, 8.24 fixed becomes the optimum choice.
That's why 8.24 is the canonical format. On the other hand, low-
power devices like these generally cannot handle 24-bit A/D and D/A
because of the increased power and cost. Instead, they have simpler
conversion hardware that handles CD quality 16-bit I/O. Thus, it
doesn't make much sense to communicate 8.24 or even straight 24-bit
data with the audio converter when it's only dealing with 16 bits
anyway.
In other words, use 16-bit signed integer when you grab audio data
directly from the microphone or line input. Also, provide 16-bit
signed integer when you do your final render before sending audio to
the speaker. Basically, that's the HAL part of CoreAudio. At all
other times, e.g., when dealing with sample rate conversion or
anything else that would alter the raw audio samples, then you should
use 8.24 fixed point for precision - and because that's the canonical
format for AudioUnits.
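For reference, the two stream descriptions might be filled in roughly
like this. The sample rate and channel count are placeholders of
mine; the format flags and the AudioUnitSampleType / AudioSampleType
typedefs are the ones from the iOS headers of this era:

#include <AudioUnit/AudioUnit.h>
#include <string.h>

static void FillCanonicalFormats(AudioStreamBasicDescription *unitFormat,
                                 AudioStreamBasicDescription *ioFormat)
{
    /* Canonical AudioUnit format: 8.24 fixed point, non-interleaved. */
    memset(unitFormat, 0, sizeof(*unitFormat));
    unitFormat->mSampleRate       = 44100.0;
    unitFormat->mFormatID         = kAudioFormatLinearPCM;
    unitFormat->mFormatFlags      = kAudioFormatFlagsAudioUnitCanonical;
    unitFormat->mBitsPerChannel   = 8 * sizeof(AudioUnitSampleType);   /* 32 */
    unitFormat->mChannelsPerFrame = 2;
    unitFormat->mFramesPerPacket  = 1;
    unitFormat->mBytesPerFrame    = sizeof(AudioUnitSampleType);  /* per channel */
    unitFormat->mBytesPerPacket   = unitFormat->mBytesPerFrame;

    /* 16-bit signed integer, interleaved, for the hardware I/O side. */
    memset(ioFormat, 0, sizeof(*ioFormat));
    ioFormat->mSampleRate       = 44100.0;
    ioFormat->mFormatID         = kAudioFormatLinearPCM;
    ioFormat->mFormatFlags      = kAudioFormatFlagIsSignedInteger
                                | kAudioFormatFlagIsPacked;
    ioFormat->mBitsPerChannel   = 8 * sizeof(AudioSampleType);         /* 16 */
    ioFormat->mChannelsPerFrame = 2;
    ioFormat->mFramesPerPacket  = 1;
    ioFormat->mBytesPerFrame    = ioFormat->mChannelsPerFrame * sizeof(AudioSampleType);
    ioFormat->mBytesPerPacket   = ioFormat->mBytesPerFrame;
}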
It's a little confusing, I agree, because there is AUHAL that links
the AudioUnit and HAL worlds together. I can't remember at the
moment whether iOS supports AUHAL, but that would be an exception to
the rule - I'm pretty sure you'd use 8.24 for the client side of
AUHAL, and it will automatically convert to 16-bit signed integer for
the audio converter hardware.
> Now, in a render callback if my ASBD needs float values, should I be
> generating values between +-1.0 or +-CGFLOAT_MAX or neither?
Why do you say that your render callback needs float values? Who is
setting this requirement? Your render callback would run more
efficiently if you rewrite it to deal with the canonical 8.24 fixed-
point format.
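For what it's worth, a render callback that produces 8.24 directly
might look roughly like the following. Only the callback signature
and AudioUnitSampleType come from the API; the SawState struct and
its fields are my own invention for the sake of a sketch:

#include <AudioUnit/AudioUnit.h>

/* Naive sawtooth state: phase and per-sample increment, both in 8.24.
   On iOS, AudioUnitSampleType is the 8.24 SInt32. */
typedef struct { SInt32 phase; SInt32 increment; } SawState;

static OSStatus MySawRender(void *inRefCon,
                            AudioUnitRenderActionFlags *ioActionFlags,
                            const AudioTimeStamp *inTimeStamp,
                            UInt32 inBusNumber,
                            UInt32 inNumberFrames,
                            AudioBufferList *ioData)
{
    SawState *state = (SawState *)inRefCon;

    for (UInt32 frame = 0; frame < inNumberFrames; frame++) {
        /* Ramp from -1.0 to +1.0 in 8.24, wrapping back down by 2.0. */
        state->phase += state->increment;
        if (state->phase >= (1 << 24))   /* reached +1.0 */
            state->phase -= (2 << 24);   /* subtract 2.0 */

        /* Canonical format is non-interleaved: one buffer per channel. */
        for (UInt32 buf = 0; buf < ioData->mNumberBuffers; buf++) {
            AudioUnitSampleType *out =
                (AudioUnitSampleType *)ioData->mBuffers[buf].mData;
            out[frame] = state->phase;   /* already 8.24 */
        }
    }
    return noErr;
}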
In any case, the expected range is +/-1.0, but you just have to be
mindful of how that translates to the 8.24 canonical format. Also,
you won't literally reach +1.0 on the final output without clipping;
only -1.0 can be expressed exactly in two's complement. But it's
perfectly fine if your AudioUnits temporarily exceed +/-1.0, so long
as you set the gain to bring samples within +/-1.0 before output.
Also, when using 16-bit signed integers for audio hardware I/O, the
values range from -32768 to +32767.
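As a sketch of those two scalings (my own helper names again):

#include <stdint.h>

/* Nominal +/-1.0 float to 8.24. The 8.24 format has headroom to
   roughly +/-128.0, so intermediate values beyond +/-1.0 survive
   without clipping. */
static int32_t FloatToFixed824(float x)
{
    return (int32_t)(x * (float)(1 << 24));
}

/* 8.24 to the 16-bit hardware range, clipping at the ends. +1.0
   would be +32768, which does not exist in 16-bit two's complement;
   only -1.0 maps exactly, to -32768. */
static int16_t Fixed824ToSInt16(int32_t x)
{
    int32_t scaled = x >> 9;               /* drop 24 - 15 = 9 fraction bits */
    if (scaled >  32767) scaled =  32767;
    if (scaled < -32768) scaled = -32768;
    return (int16_t)scaled;
}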
Sorry if my answers don't make this easy. It's actually quite a
challenge to get a battery-operated device to perform significant
audio DSP without carefully controlling the amount of processing
power that you use. Unfortunately, we've been spoiled over the last
couple of decades by desktop computers that can handle floating point
calculations as fast, or almost as fast, as straight integer
calculations. The new generation of programmers might never need to
think about this distinction, or the shortcomings of the C language
for dealing with fixed-point, except for the new low-power portable
platforms that are now becoming prevalent.
Brian Willoughby
Sound Consulting