Re: Ducking
- Subject: Re: Ducking
- From: John Clayton <email@hidden>
- Date: Mon, 20 Oct 2008 11:19:18 +0200
Hi Brian, comments below...
note: down towards the end of my response, I believe the pin drops...
On 20/10/2008, at 10:06 AM, Brian Willoughby wrote:
I believe it would help to examine how this would be done in the
analog world - not because you want to restrict yourself to those
potentially narrower possibilities, but because it might help
illustrate a better way than you've described so far.
In the analog world, ducking is achieved by sending two signals to a
device - a primary signal and a controlling signal - and then the
device performs automatic ducking. Such analog ducking devices are
not too dissimilar from a compressor or limiter. But one
significant characteristic is that the process is generally dynamic
and continuous, such that the volume of the primary signal is always
being altered by the instantaneous level of the control signal. The
amount of ducking is generally proportional to the level of the
control signal, and I assume is rarely a fixed amount. You can set
ratios, and might even be able to get more of a gate effect, or
partial gate, but there generally isn't any delay in the control
signal to process other than attack and decay settings. In the
analog ducking world, a signal is usually mono or stereo, possibly
surround, but treated as 'one' signal conceptually. My apologies if
I have over-simplified analog ducking or left out features that are
available.
What you've described differs from the above because it would
combine more than one control signal (i.e. every signal that you
have marked as 'isAutoDucking'); the amount of ducking would be a
proportion of the primary signal but not the control signal(s); and
there would be a random delay in the effect depending upon the
buffer boundaries.
I was thinking of 'mixing' all the data coming from the non-ducked
tracks into a singleton class instance, although that feels hackish to
me in this form: it ties my components very tightly together, which is
not something I necessarily love, and avoiding that coupling is
certainly the value of an AUGraph-based approach.
I imagine I would collect a one-frame union of all the non-ducked
audio - a bit like one really large union of all of the signals. Any
zeroes within this 'union' would represent silence, anything else
represents noise/audio, and I could drive ducking from that union in
the pre-render #2 call.
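Concretely, the detection half would be something like this inside a
render-notify callback (a rough sketch: gNonDuckedActive and the -60 dB
threshold are my own invented name/value, I'm assuming the canonical
de-interleaved Float32 format, and I'd look at the buffers on the
post-render pass, since that's when they actually contain audio):

#include <AudioToolbox/AudioToolbox.h>
#include <math.h>

// The 'boolean in a singleton' - shared with the ducked tracks' hook.
volatile Boolean gNonDuckedActive = 0;

// Attached (via AudioUnitAddRenderNotify) to each non-ducked track.
OSStatus NonDuckedActivityNotify(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData)
{
    // The buffers are only filled once the unit has rendered.
    if (!(*ioActionFlags & kAudioUnitRenderAction_PostRender))
        return noErr;

    const Float32 kSilenceThreshold = 0.001f;   // roughly -60 dBFS
    for (UInt32 buf = 0; buf < ioData->mNumberBuffers; buf++) {
        const Float32 *samples = (const Float32 *)ioData->mBuffers[buf].mData;
        UInt32 count = ioData->mBuffers[buf].mDataByteSize / sizeof(Float32);
        for (UInt32 i = 0; i < count; i++) {
            if (fabsf(samples[i]) > kSilenceThreshold) {
                gNonDuckedActive = 1;   // cleared again by the ducked track's hook
                return noErr;
            }
        }
    }
    return noErr;
}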
By the way, I presume a 'bus' and a 'side chain' are in effect the
same thing, correct?
I think you could improve your design by using AUGraph to create a
bus. You would then mix all of the control signals
('isAutoDucking') onto this bus. From then on, you could use a
normal ducking plugin (*) which has a side-chain input for the
control signal. Processing on the audio would then be continuous,
not broken into buffer chunks or otherwise delayed. If you still
prefer the amount of ducking to be a percentage of the primary
(which would effectively be a constant dB drop), then you could
implement more of a gate dynamic with a threshold. Otherwise, you
could implement a standard ducker where the amount of ducking is
proportional to the control signal (instead of the primary signal).
Agreed - I believe it's also a more elegant approach, and its use
eliminates the need to have a specific pre-render notification in
place to process the control signals.
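Just to check that I follow, the wiring you're describing would be
roughly this (a sketch only: 'MyDucker' stands for an AU I would have
to write, with input bus 0 as the primary signal and bus 1 as the
side-chain; error checking omitted):

// Hypothetical topology (not a working program):
//
//   non-ducked tracks --> controlMixer --> ducker bus 1 (side-chain)
//   ducked track   ---------------------> ducker bus 0 (primary)
//   ducker --> mainMixer --> output
//
// Getting each non-ducked track into BOTH the control mixer and the
// final mix is exactly the routing you call tricky further down.
#include <AudioToolbox/AudioToolbox.h>

static void BuildDuckingGraph(void)
{
    AUGraph graph;
    AUNode  controlMixerNode, mainMixerNode, duckerNode, outputNode;

    AudioComponentDescription mixerDesc = {
        kAudioUnitType_Mixer, kAudioUnitSubType_MultiChannelMixer,
        kAudioUnitManufacturer_Apple, 0, 0 };
    AudioComponentDescription outputDesc = {
        kAudioUnitType_Output, kAudioUnitSubType_HALOutput,
        kAudioUnitManufacturer_Apple, 0, 0 };
    AudioComponentDescription duckerDesc = {
        kAudioUnitType_Effect, 'duck', 'Mine', 0, 0 };   // hypothetical custom AU

    NewAUGraph(&graph);
    AUGraphAddNode(graph, &mixerDesc,  &controlMixerNode);
    AUGraphAddNode(graph, &mixerDesc,  &mainMixerNode);
    AUGraphAddNode(graph, &duckerDesc, &duckerNode);
    AUGraphAddNode(graph, &outputDesc, &outputNode);
    AUGraphOpen(graph);

    // Every 'isAutoDucking == NO' track gets mixed onto the control bus,
    // and the control submix drives the ducker's side-chain input:
    AUGraphConnectNodeInput(graph, controlMixerNode, 0, duckerNode, 1);

    // The ducked track's chain (QT Movie -> AUTimePitch -> ...) feeds the
    // ducker's primary bus; the ducker feeds the final mix and output:
    AUGraphConnectNodeInput(graph, duckerNode, 0, mainMixerNode, 0);
    AUGraphConnectNodeInput(graph, mainMixerNode, 0, outputNode, 0);

    AUGraphInitialize(graph);
}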
But whether I use an Audio Unit (my own, ref: your *) or pre-render
notifications (ref: my design, pre-render #1), the data flow is the
same, no? The ducker would be processing buffers of audio and
generating a series of signals (one buffer wide), and isn't it
irrelevant whether this happens in a 'pre-render notification #1' or
an Audio Unit? Either way, the consumer of that signal would still be
likely to get it after at least one data frame of delay.
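For what it's worth, the hook route really is just a registration call -
the same callback fires on both the pre- and post-render passes and
tells them apart via ioActionFlags:

#include <AudioToolbox/AudioToolbox.h>

OSStatus NonDuckedActivityNotify(void *, AudioUnitRenderActionFlags *,
                                 const AudioTimeStamp *, UInt32, UInt32,
                                 AudioBufferList *);   // the sketch above

static void InstallActivityHook(AudioUnit timePitchUnit)
{
    // The callback ignores kAudioUnitRenderAction_PreRender and inspects
    // the buffers on kAudioUnitRenderAction_PostRender, when they are full.
    AudioUnitAddRenderNotify(timePitchUnit, NonDuckedActivityNotify, NULL);
}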
The connectivity of Audio Units either directly or via an AUGraph is
probably my weakest point in terms of experience - so I'm more than
happy to have criticism here.
It might take some tricky AUGraph design to get this to work, since
you'd be mixing a bunch of control signals together before you could
process any of the ducked signals (which I am calling primary
signals, for lack of another word).
And it's at this exact point that I start to ask myself: how can I
ensure that:
a) the control signals don't go any further - I suppose if I wrote my
own AU then I could simply take the input on the bus, process it and
then throw it away;
b) the timing works out - i.e. that the control signals get processed
before the ducked signals?
(A rough sketch of one way to get both follows below.)
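One way I can imagine getting both, without writing a full AU, is an
input render callback (installed with kAudioUnitProperty_SetRenderCallback
on the unit downstream of AUTimePitch): it pulls the primary audio from
upstream, pulls the control mix by hand into a scratch buffer, measures
it, applies the gain and throws the scratch away. All the g* names below
are invented, error handling and gain smoothing are omitted, and the
scratch list's mDataByteSize fields would need resetting to match
inNumberFrames before each pull.

#include <AudioToolbox/AudioToolbox.h>
#include <math.h>

static AudioUnit        gUpstreamUnit;        // e.g. the track's AUTimePitch
static AudioUnit        gControlMixerUnit;    // mixes the non-ducked tracks
static AudioBufferList *gScratchBufferList;   // pre-allocated, >= max frames
static Float32          gDuckGain = 0.4f;     // from the user's percentage

static OSStatus DuckedInputCallback(void *inRefCon,
                                    AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp *inTimeStamp,
                                    UInt32 inBusNumber,
                                    UInt32 inNumberFrames,
                                    AudioBufferList *ioData)
{
    // 1. Pull the primary (to-be-ducked) audio from upstream.
    OSStatus err = AudioUnitRender(gUpstreamUnit, ioActionFlags, inTimeStamp,
                                   0, inNumberFrames, ioData);
    if (err) return err;

    // 2. Pull the control mix for the same timestamp.  Because it is
    //    pulled here, it is rendered before the gain is applied (b),
    //    and because its output connects nowhere else it never reaches
    //    the speakers (a).
    AudioUnitRenderActionFlags flags = 0;
    err = AudioUnitRender(gControlMixerUnit, &flags, inTimeStamp,
                          0, inNumberFrames, gScratchBufferList);
    if (err) return err;

    // 3. If the control mix is audible, duck the primary.
    Float32 peak = 0.0f;
    for (UInt32 b = 0; b < gScratchBufferList->mNumberBuffers; b++) {
        const Float32 *s = (const Float32 *)gScratchBufferList->mBuffers[b].mData;
        UInt32 n = gScratchBufferList->mBuffers[b].mDataByteSize / sizeof(Float32);
        for (UInt32 i = 0; i < n; i++)
            if (fabsf(s[i]) > peak) peak = fabsf(s[i]);
    }

    if (peak > 0.001f) {                      // roughly -60 dBFS
        for (UInt32 b = 0; b < ioData->mNumberBuffers; b++) {
            Float32 *d = (Float32 *)ioData->mBuffers[b].mData;
            UInt32 n = ioData->mBuffers[b].mDataByteSize / sizeof(Float32);
            for (UInt32 i = 0; i < n; i++)
                d[i] *= gDuckGain;
        }
    }
    return noErr;
}

But I'm not sure whether pulling the control mixer by hand like that,
mid-render-cycle, is kosher - is it?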
Then you would mix all of the control signals and primary signals
into the final audio output (**). In other words, such a graph
would involve some amount of carefully directed feedback, so I'm not
sure what sorts of problems you might run into with AUGraph if you
make any mistakes with the feedback routing.
Oh wait - perhaps I'm thinking too procedurally here. (sound of pin
dropping?) You're suggesting that the control signals are in fact the
'non-ducked' audio? That they are not some extra logical control that
a process/loop uses to manipulate the audio data - they *are* the
audio data.
(*) Note that I am suggesting that you write a ducking plugin,
unless AUDynamicsProcessor already has the ability to take a side-
chain and manage the gain ratio to meet your needs.
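Roughly speaking, the heart of such a plugin is only a few lines per
buffer - something like the outline below, where the attack/release
coefficients come from the usual time-constant parameters and 'depth'
sets how far to duck. This is just the general technique, not a
description of AUDynamicsProcessor's internals.

#include <math.h>

// Follow the control signal's envelope and attenuate the primary in
// proportion.  attackCoeff/releaseCoeff are one-pole smoothing
// coefficients in [0,1); depth runs from 0 (no ducking) to 1 (full
// mute); ioEnvelope carries state across buffer boundaries.
static void DuckBuffer(float *primary, const float *control,
                       unsigned frameCount, float attackCoeff,
                       float releaseCoeff, float depth, float *ioEnvelope)
{
    float env = *ioEnvelope;
    for (unsigned i = 0; i < frameCount; i++) {
        float level = fabsf(control[i]);

        // one-pole envelope follower with separate attack and release
        if (level > env)
            env = attackCoeff  * env + (1.0f - attackCoeff)  * level;
        else
            env = releaseCoeff * env + (1.0f - releaseCoeff) * level;

        // gain drops as the control envelope rises, clamped at 'depth'
        float clamped = (env > 1.0f) ? 1.0f : env;
        primary[i] *= 1.0f - depth * clamped;
    }
    *ioEnvelope = env;
}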
(**) I believe that we've learned (from Apple engineers who have
contributed to this list) that it is rather inefficient to make
multiple connections to AUHALOutput in lieu of creating a mixer
which feeds a single AUHALOutput. I could be remembering that
wrong, or the CPU load differences could be minor. I point this out
because the AUGraph that I describe above would necessarily involve
all of your audio signals in one place, and therefore would be very
easy to mix and connect to a single AUHALOutput.
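For what it's worth, the single-mixer arrangement is only a handful of
AUGraph calls - a sketch with made-up node names, performed before
initializing the graph, error checking omitted:

#include <AudioToolbox/AudioToolbox.h>

static void WireTracksThroughOneMixer(AUGraph graph, AUNode mainMixerNode,
                                      AUNode outputNode,
                                      const AUNode *trackNode, UInt32 trackCount)
{
    AudioUnit mixerUnit;
    AUGraphNodeInfo(graph, mainMixerNode, NULL, &mixerUnit);

    // Give the mixer one input bus per track.
    AudioUnitSetProperty(mixerUnit, kAudioUnitProperty_ElementCount,
                         kAudioUnitScope_Input, 0,
                         &trackCount, sizeof(trackCount));

    // Each track's last unit lands on its own mixer bus...
    for (UInt32 i = 0; i < trackCount; i++)
        AUGraphConnectNodeInput(graph, trackNode[i], 0, mainMixerNode, i);

    // ...and only the mixed result touches the single output unit.
    AUGraphConnectNodeInput(graph, mainMixerNode, 0, outputNode, 0);
}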
I hope this helps. If you implement the idea, then please be sure
to license my patent :-)
The application is on its way to you :-)
Brian Willoughby
Sound Consulting
On Oct 20, 2008, at 00:33, John Clayton wrote:
Hi One and All,
I'd like to perform audio ducking in my app and thought I'd throw
the idea out in the open for criticism / review before I go ahead
and write it - as I'm not sure that this is the best way - perhaps
I'm missing something fundamental.
My app has multiple video/audio 'tracks' (basically just a core-data
object that represents some form of media), and at present each
track contains its own, self-contained series / chain of audio
units. The chain looks like this:
QT Movie -> AUTimePitch -> AU HAL Output
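In AUGraph terms one track's chain looks roughly like the sketch below;
the QT Movie end is hand-waved behind a placeholder render callback,
since pulling the movie audio is a separate topic. Error checking
omitted.

#include <AudioToolbox/AudioToolbox.h>

OSStatus QTMovieRenderCallback(void *, AudioUnitRenderActionFlags *,
                               const AudioTimeStamp *, UInt32, UInt32,
                               AudioBufferList *);   // placeholder source

static void BuildTrackChain(AUGraph *outGraph)
{
    AUGraph graph;
    AUNode  timePitchNode, outputNode;

    AudioComponentDescription timePitchDesc = {
        kAudioUnitType_FormatConverter, kAudioUnitSubType_TimePitch,
        kAudioUnitManufacturer_Apple, 0, 0 };
    AudioComponentDescription outputDesc = {
        kAudioUnitType_Output, kAudioUnitSubType_HALOutput,
        kAudioUnitManufacturer_Apple, 0, 0 };

    NewAUGraph(&graph);
    AUGraphAddNode(graph, &timePitchDesc, &timePitchNode);
    AUGraphAddNode(graph, &outputDesc,    &outputNode);
    AUGraphOpen(graph);

    // The movie supplies samples through an input render callback...
    AURenderCallbackStruct movieSource = { QTMovieRenderCallback, NULL };
    AUGraphSetNodeInputCallback(graph, timePitchNode, 0, &movieSource);

    // ...and AUTimePitch feeds the output unit directly.
    AUGraphConnectNodeInput(graph, timePitchNode, 0, outputNode, 0);

    AUGraphInitialize(graph);
    *outGraph = graph;
}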
The ducking part of the app calls for an attribute on a track called
'isAutoDucking', to allow any track to be ducked (or not). If this
is set to true - then the track should reduce its volume by some
percentage (as defined by the user) during playback, but only if
there is another non-ducked track with audio playing at the same
time. I could in theory reduce the volume of ducked tracks by
calculating the relative volume of other tracks on the fly, but for
now I'm trying to keep the problem simple, so I'm trusting the user to
set the amount of ducking [as a percentage].
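So a 60% duck, for example, works out to a gain of 0.4, or roughly an
8 dB drop:

#include <math.h>

// Convert the user's ducking percentage into the equivalent dB drop.
// E.g. 60% -> linear gain 0.4 -> about -8 dB.
static float DuckedGainDBForPercent(float duckPercent)
{
    float linearGain = 1.0f - (duckPercent / 100.0f);   // 60% -> 0.4
    return 20.0f * log10f(linearGain);                  // 0.4 -> about -7.96 dB
}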
In my opinion, the problem is twofold:
1. figure out when ducking should occur
2. determine by how much a track should be ducked
In my design, (2) is solved by the user specifying a percentage to
duck by, and I'm left thinking that I can implement (1) as follows:
Non-Ducked Tracks:
QT Movie -> AUTimePitch -> [pre-render notification #1 here] -> AUHALOutput
The pre-render notification #1 - on the non-ducked track - is a way
for me to work out whether or not there is audio being processed by a
non-ducked track. I'd likely store a simple boolean in a singleton
somewhere (or maybe a time range of booleans), the goal being to
answer the question: 'is audio being played on a non-ducked track
right now?'
Ducked Tracks:
QT Movie -> AUTimePitch -> [pre-render notification #2 here] -> AUDynamicsProcessor -> AUHALOutput
I'd then use the pre-rendering notification #2 to simply set up the
AUDynamicsProcessor to perform ducking, based on the data produced
by the #1 pre-render hook.
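Roughly, the #2 hook would be something like the following -
gDynamicsUnit and gDuckedGainDB are made-up names, and I'm assuming the
unit's master gain parameter (kDynamicsProcessorParam_MasterGain) is an
acceptable knob to drive:

#include <AudioToolbox/AudioToolbox.h>

static volatile Boolean gNonDuckedActive;   // the boolean set by hook #1
static AudioUnit gDynamicsUnit;             // this track's AUDynamicsProcessor
static Float32   gDuckedGainDB = -8.0f;     // from the user's percentage

static OSStatus DuckedTrackNotify(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData)
{
    // Act on the pre-render pass only, before this track renders.
    if (!(*ioActionFlags & kAudioUnitRenderAction_PreRender))
        return noErr;

    // Drop the gain while a non-ducked track was heard, unity otherwise.
    Float32 gainDB = gNonDuckedActive ? gDuckedGainDB : 0.0f;
    AudioUnitSetParameter(gDynamicsUnit, kDynamicsProcessorParam_MasterGain,
                          kAudioUnitScope_Global, 0, gainDB, 0);

    gNonDuckedActive = 0;   // re-armed by hook #1 on the next cycle
    return noErr;
}

Note this runs straight into the synchronization worry in (1) below:
the flag may well describe the previous render cycle rather than the
current one.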
My concerns are:
0) Am I missing the point? Is there an easier way to achieve this
with core-audio?
1) this design offers no synchronization - I can't be sure that any
single track is synchronized with the others - so my ducking will
likely be out of sync by a couple (a few, more, hundreds?) of frames.
2) I have outlined two distinct audio unit chains above, but I think
in practice I'd have only one, and that I'd just bypass the dynamics
processor for non-ducked tracks (sketched below).
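Flipping that bypass should just be a property set, assuming it can be
changed while the chain is running:

#include <AudioToolbox/AudioToolbox.h>

// Switch the AUDynamicsProcessor in or out for a given track.
static void SetDuckingBypassed(AudioUnit dynamicsUnit, Boolean shouldBypass)
{
    UInt32 bypass = shouldBypass ? 1 : 0;
    AudioUnitSetProperty(dynamicsUnit, kAudioUnitProperty_BypassEffect,
                         kAudioUnitScope_Global, 0,
                         &bypass, sizeof(bypass));
}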
I'm keen on any input to this design - feel free to point me to any
docs etc that you think would help me get a better grip on the
subject(s).
References:
- Ducking (From: John Clayton <email@hidden>)
- Re: Ducking (From: Brian Willoughby <email@hidden>)