Re: IOAudioMixerEngine.cpp
- Subject: Re: IOAudioMixerEngine.cpp
- From: Jeff Moore <email@hidden>
- Date: Thu, 8 Aug 2002 12:45:02 -0700
Bill's recommendations for a general approach to implementing hardware mixing are pretty reasonable.
Another way of presenting the controls of each channel would be to
implement a HAL plug-in that provides your custom controls as
properties. That way your device's controls are presented in a way
consistent with the rest of the HAL's controls.
A useful feature that could be added is dynamic channel allocation. This would spare all the processes trying to use the card at the same time from having to agree on a channel allocation (which could be difficult and confusing for the user), and would keep the amount of data moving around to a minimum when channels aren't in use.
I'm not sure how I would go about implementing dynamic channel
allocation though. There may need to be some changes in both the HAL
and the IOAudio family to facilitate this sort of thing. I haven't
thought too deeply about it yet.
It would be worthwhile to file a Radar bug describing exactly what your needs and desires are in this area.
On Thursday, August 8, 2002, at 12:32 AM, Bill Stewart wrote:
on 8/7/02 11:10 PM, Nathan Aschbacher wrote:
Well, I can think of one quick way to boost performance of the default mixOutputSamples function. You're doing 4 scalar single-precision floating point additions in order. This seems like a perfect candidate for loading the four values from the mixBuf into a 128-bit vector, loading the four values from sourceBuf into another 128-bit vector, and then just doing a vector addition operation on them and storing the result back to the mixBuf.
At the very least you'd be moving some computation from a general purpose processing unit off to a more specialized unit that sits there idle plenty of the time. I may try changing the code to function like that, recompile the base system audio drivers from CVS, and see how it works.
There are some situations where AltiVec won't help us - i.e., a small enough chunk of data, data whose length is not a multiple of 4 floats, etc. - so we haven't done enough work with this to know when it will help and when it will hinder. But sure, this is definitely something worth doing (though not as clear cut as it would first seem).
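A minimal sketch of the 4-at-a-time mix being discussed, in portable C (on PowerPC each group of four additions would map onto a single AltiVec `vec_add` of two 128-bit vectors). The scalar tail loop handles the case Bill raises of buffers whose length is not a multiple of 4; the function name mirrors `mixOutputSamples` but is illustrative, not the IOAudioFamily signature:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of mixing four samples per iteration.  On AltiVec hardware
 * the body of the main loop would become one vector load per buffer,
 * one vec_add, and one store. */
static void mixOutputSamples4(float *mixBuf, const float *sourceBuf,
                              size_t numSamples)
{
    size_t i = 0;
    /* Main loop: four additions per iteration (one 128-bit vector op). */
    for (; i + 4 <= numSamples; i += 4) {
        mixBuf[i + 0] += sourceBuf[i + 0];
        mixBuf[i + 1] += sourceBuf[i + 1];
        mixBuf[i + 2] += sourceBuf[i + 2];
        mixBuf[i + 3] += sourceBuf[i + 3];
    }
    /* Scalar tail for lengths that are not a multiple of 4. */
    for (; i < numSamples; ++i)
        mixBuf[i] += sourceBuf[i];
}
```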
Anyhow, that's beside the point. Matt is right. What I'm looking to do is be able to take load off the CPU.
Sure, understand that.
Whether the sound processor is faster or not isn't as important to the process as taking the load off the CPU.
In this particular situation, sure
What I'm trying to build here is what's called "hardware accelerated audio" in the Windows world, where the audio hardware can reach into the system's audio kernel buffers very early in the process and perform much of the required work so the CPU doesn't have to. It makes a measurable performance difference on Intel PCs even in non-gaming scenarios, simply because the CPU is freed up while audio is playing. So what I've been trying to determine is: 1) Is this even possible? 2) If not, why not? What's the limiter in MacOS X that prevents this kind of functionality? 3) If so, then where ought I to be looking to hook into MacOS X's audio processing stages to have the most success using an external audio processor to do some of the work and free up the main CPU?
So if IOAudioEngine::mixOutputSamples is the last stage of the CPU handling the audio buffers, then I'm going to want to attack this problem from higher up the chain. My question then becomes: where? I'm given to understand that this has never been done before on the Mac, and I and an eager group of other developers are interested in seeing this happen. However, where the Windows driver development documentation for a DirectSound driver makes this process and its purpose very clear, the lack of an obvious parallel in the MacOS X sound system is making things complicated.
This really is a difference because Windows has always shipped on hardware that has external sound cards, and most of these are SoundBlaster-type cards with these kinds of capabilities. Apple has never shipped hardware that can do this (and most of our CPUs don't have PCI slots to add it) - not excusing this, just trying to explain why...
It also sounds like Jaguar may provide me with some better tools to work with, however. The native format capability ought to be very handy. My concern was that the CPU's burden of running the float -> int conversions, so that the card (which only works on 32-bit integers) can do some of the work, would add more overhead than would be saved by doing the mixing on the card.
The native format (like with ANY usage of this feature) is there for apps to CHOOSE to use. Apps that don't use it will incur the overhead of the float-to-int conversion - but presumably they can afford this because that overhead doesn't kill their capability to do what they need to. The native format, then, is a nice optimisation for those that need it. Your driver doesn't really care whether the client is dealing with floats or ints, because the hardware engine will only ever see the int data (and of course inherits the loss of headroom that dealing with int data entails).
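A hedged sketch of the float -> int conversion being discussed: the HAL's canonical Float32 samples are nominally in [-1.0, 1.0], so feeding a card that only accepts 32-bit integers means scaling and clipping. The clipping is where the loss of headroom comes from: float samples above full scale fold flat instead of passing through. The function name and scaling convention are illustrative assumptions, not IOAudioFamily API:

```c
#include <assert.h>
#include <stdint.h>

/* Convert one canonical Float32 sample to a 32-bit integer sample,
 * clipping anything outside [-1.0, 1.0) to full scale.  This clip is
 * the headroom that int data gives up relative to float. */
static int32_t floatToInt32(float sample)
{
    if (sample >= 1.0f)
        return INT32_MAX;   /* clip positive overs */
    if (sample <= -1.0f)
        return INT32_MIN;   /* clip negative overs */
    return (int32_t)(sample * 2147483648.0f);  /* scale by 2^31 */
}
```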
So, what is it that your card does? We should concentrate the discussion on this - the float->int conversion is really the LEAST of your problems...
Let's presume that your card can take, say, 32 channels of input, mix them, and apply some kind of 3D sound to them - then it outputs 4 channels.
Let's also presume that your card is doing this work and outputting the data itself - i.e., there is NO need or capability to have the card process the samples and then re-present them back to the CPU for further processing...
My first approach would be to publish those 32 channels as the format of the device. You could also publish the 4 real channels of the device as an alternative output format if you can turn off these hardware-based processes and just go straight to the DACs... (You should do this if you can.)
The trick is to get the control information or parameters to the card...
One potential way to do this is to publish another non-mixable stream (alongside your 32-channel version) from the card that is your control stream. It should be a CBR stream (i.e., the same number of bytes every I/O cycle), sized to the maximum number of bytes that would represent the maximum number of commands the device can accept for the maximum number of sample frames you can process. It's arguable that this format should ONLY be usable when the app has your device in hog mode - but I'm not sure that we actually support that semantic (and maybe we should)...
The PCM streams should be mono buffers - with Jaguar a client can turn
streams on and off on a device (for instance, if it wants to use only 8
channels for output, it can turn off the streams of the unused ones)...
Then, the data contents of that stream could be an array of structs:

    struct {
        short channel;
        short frameNumber;    // an offset into the current buffer
        short controlNumber;
        short controlValue;
    };
The HAL will always zero out the IOProc's data before it passes it to a client, so as long as you don't use a channel number of zero, you can tell that the valid list has ended when you see a record with channel == 0 - you're done! (This would actually also tie in nicely with the HAL's concept of numbering channels - it reserves channel == 0 for the device itself, and channel 1 is the first real channel in the device's streams...)
(I'd also prefer this to publishing separate streams of control data on a per-channel basis, as the density of the control data is in most cases very sparse and there'd be too much data going over the bus if you did it that way.)
You also know (and the client knows) that the data it supplies and the control data are paired to that I/O cycle... So there's a nice locality of the PCM data with the control data.
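The zero-terminated command list described above can be sketched in a few lines of C. The struct layout follows the message; the field comments, type name, and the parsing helper are assumptions added for illustration:

```c
#include <assert.h>
#include <string.h>

/* One control record in the CBR control stream, as laid out in the
 * message.  Real channels are numbered from 1; a record with
 * channel == 0 terminates the valid command list, which works because
 * the HAL zeroes the IOProc buffer before handing it to the client. */
typedef struct {
    short channel;       /* 1-based channel number; 0 ends the list */
    short frameNumber;   /* offset into the current I/O buffer      */
    short controlNumber; /* which control on that channel           */
    short controlValue;  /* new value for that control              */
} ControlCommand;

/* Count the valid commands at the head of a buffer of maxCommands
 * records, stopping at the first zeroed (channel == 0) record. */
static int countControlCommands(const ControlCommand *cmds, int maxCommands)
{
    int n = 0;
    while (n < maxCommands && cmds[n].channel != 0)
        ++n;
    return n;
}
```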
Make any sense?
Bill
Anyhow, I'm still trying to piece together a clear picture of how what I desire to do fits into the MacOS X CoreAudio API's stages and capabilities. Though I VERY much appreciated the thoughtful responses.
Thank You,
Nathan
_______________________________________________
coreaudio-api mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/coreaudio-api
Do not post admin requests to the list. They will be ignored.
--
mailto:email@hidden
tel: +1 408 974 4056
__________________________________________________________________________
"...Been havin' some trouble lately in the sausage business," C.M.O.T. Dibbler replied.
"What, having trouble making both ends meat?"
__________________________________________________________________________
--
Jeff Moore
Core Audio
Apple