Re: Convolution Audio Unit how to?
- Subject: Re: Convolution Audio Unit how to?
- From: Mark Heath <email@hidden>
- Date: Wed, 23 Nov 2011 22:26:45 +1100
On 23/11/2011, at 4:34 PM, Brian Willoughby wrote:
On Nov 22, 2011, at 13:55, Mark Heath wrote:
On 22/11/2011, at 4:27 PM, Brian Willoughby wrote:
You may be more likely to find examples that are FFT-based where
you could get an idea of how to handle the windowing, and then it
would be a simple matter to substitute your convolution matrix for
the FFT processing.
I had found a page that described the problem, http://web.willbenton.com/writing/2004/au-effect, but it had no code examples.
There is both a 'latency' attribute and a 'tail' attribute. As
you correctly surmise, there are issues with the first samples and
therefore the necessity of a tail. The size of the window
determines your latency and tail. After you determine the needed
latency and tail, you will code these values into your AU, either
as a constant or as a value that is calculated on the fly from run
time information such as the sample rate. The AU spec also
provides a reset function so that you can clear out any memory for
the window or state variables when the transport changes position
in a discontinuous fashion. These aspects of the AudioUnit
specification allow the AU host to coordinate all necessary
aspects of windowing with each individual plugin.
I do have my buffering code sorted out, but I didn't quite know about the latency, how to implement the tail, or whether this was indeed the correct way to do it.
The documentation I read regarding tail was talking about reverb filters, where N samples in produce N + decay samples out. My filter is still N samples in to N samples out.
No offense intended, but I have my doubts that your buffering code
is sorted just yet.
I can see why you think this, as I may know what I'm doing but I do not know what I'm talking about :-)
I have only described my initial problem; I haven't begun to describe my buffering because I wasn't sure whether it was implemented correctly. I wondered how I would determine whether my filter was Process()ing the left or right channel. This appears to be a moot point, as the host application (or parent class, or whatever) starts one instance per channel.
Also, I didn't know how to correctly handle the samples in the first call to Process(), since I need to read ahead windowsize/2 samples, so I would only be able to output inFramesToProcess minus windowsize/2 samples. For example, if the partition size was 512 samples and my window size was 16 samples, then I could only output 504.
It appears that I simply zero-pad the first 8 samples of the output partition and set a latency (of 8 / sample rate seconds). Correct?
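As a standalone sketch of the scheme just described, with 512-sample partitions and a 16-sample window; the convolution itself is stubbed out as a plain half-window delay, since only the zero-padding is being illustrated:

// Sketch of the first Process() call: the first window/2 output samples
// cannot be computed yet, so they are zero; the reported latency tells
// the host about this shift.
void FirstCallOutput(const float *input, float *output,
                     unsigned frames, unsigned halfWindow)
{
    for (unsigned i = 0; i < halfWindow; ++i)
        output[i] = 0.0f;                   // e.g. output[0..7]
    for (unsigned i = halfWindow; i < frames; ++i)
        output[i] = input[i - halfWindow];  // stand-in for the real
                                            // convolution result
}

With frames = 512 and halfWindow = 8 this writes 8 zeros followed by 504 computed samples, which is the 504 figure mentioned above.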
I was under the false assumption that I must process input sample 1 into output sample 1's position, but wondered what I did towards the end of the input buffer. I guess that this assumption is only a requirement for non-realtime filters, where I don't output anything until I have read enough input; it may be flawed when applied to Audio Units (or other realtime audio frameworks).
Is it true that, for filters which require look-ahead, the very start of the output must be padded with 0s?
There are two aspects to each of the latency and tail parameters: 1)
reporting the time durations, and 2) implementing the actual
features. In terms of tail, you probably should report 0.0 seconds
of tail time and implement nothing. I probably should not have even
mentioned tail. The latency parameter is the one where you'll
report your window duration, but reporting it is separate from
actually implementing it.
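In code, that reporting might look something like the following in the AUEffectBase subclass (a sketch: the class name MyConvolutionAU and constant kWindowSize are made up, and this assumes the base class's GetSampleRate() helper for the output stream's rate):

// Latency: half the convolution window, expressed in seconds.
Float64 MyConvolutionAU::GetLatency()
{
    return (kWindowSize / 2) / GetSampleRate();
}

// No reverb-style decay beyond the input, so no tail to report.
Float64 MyConvolutionAU::GetTailTime()
{
    return 0.0;
}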
If I report 0 as the tail time, does this mean that I will lose windowsize/2 samples from the end?
As to your 2nd question, most AudioUnits have what is known as a Kernel object. These objects are dedicated, one to a channel. If
you need state variables such as memory for the windowing, then
you need to add these variables to the Kernel object, not the
overall audio engine object. Using your terminology, the "stored
sample buffer" should be a member of the Kernel object, and then
the incoming channel buffer will always match the stored state.
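For example, a per-channel kernel along these lines (the names are hypothetical, and the Process signature should be checked against the AUEffectBase.h in your SDK; the point is only that the buffer lives in the kernel):

class MyConvolutionKernel : public AUKernelBase {
public:
    MyConvolutionKernel(AUEffectBase *inAudioUnit)
      : AUKernelBase(inAudioUnit),
        mWindowBuffer(kWindowSize, 0.0f),  // per-channel "stored sample buffer"
        mValidSamples(0) {}

    virtual void Process(const Float32 *inSourceP, Float32 *inDestP,
                         UInt32 inFramesToProcess, UInt32 inNumChannels,
                         bool &ioSilence);
    virtual void Reset();

private:
    std::vector<Float32> mWindowBuffer;  // one copy exists per channel
    UInt32               mValidSamples;  // how much of it currently holds input
};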
So, to clarify: there is one instance of my AU class per channel, and any buffering I am doing in this instance will not clash with another channel?
This is probably the missing information I'm after.
Sorry to be pedantic, but your question is too vague. As mentioned
in the article that you linked above, there are two AU classes:
AUEffectBase and AUKernelBase. You will be extending both of those
classes to implement your plugin.
From reading the Tremolo example on Apple's site, they only extended AUKernelBase. Hence my confusion in saying AUEffectBase.
If you were to place your buffer in the AUEffectBase subclass, then you'd have a problem. Any such channel buffering belongs in the
AUKernelBase subclass. AUEffectBase takes care of creating one
instance of the kernel for every channel in your plugin
instantiation, thus there should be no clash between channels.
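Concretely, AUEffectBase asks the subclass for those per-channel kernels via NewKernel(), so an override along these lines (using the hypothetical class names sketched above) gives every channel its own buffer:

// Called once per channel by AUEffectBase, so each channel gets an
// independent MyConvolutionKernel instance and its own stored buffer.
AUKernelBase *MyConvolutionAU::NewKernel()
{
    return new MyConvolutionKernel(this);
}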
For the first call to my Process method, instead of trying to
output windowSize/2 less than the inFramesToProcess, I simply pad
it with 0s at the start and set the latency and tail.
A) I have no idea where you got the idea that you would output
windowSize/2 samples.
My head might be running quicker than my fingers; I meant inFramesToProcess - windowSize/2.
B) You do not set latency and tail during Process. They are
reported separately, and are not part of the render process at all.
They serve as descriptions of your algorithm that the AU host will
need to know about in advance of render time so that preparations
can be made to latency-compensate your plugin in order to time-align
it with other plugins that might have a different latency. If your
window size depends upon sample rate, then you need to make sure
that this is reflected in the reported value. Otherwise, you can
just report a fixed number if your convolution window size is always
constant.
This is interesting, as the window size is a user-settable parameter. According to the example, I must be able to handle the user changing these parameters while the filter is running. Does this mean that I cannot change the latency, or that I must write my filter in such a way that the latency is fixed regardless of the parameters?
Basically, your convolution core will need a couple of working
buffers, one for input and one for output. These will be sized
according to your windowing needs, plus some optional overhead for
copying, and thus the size will be independent of the
inFramesToProcess. You will need to copy inFramesToProcess samples
from the CoreAudio buffers to your working buffers, and you need to
keep count of how many valid samples are in your working buffer.
Before your working buffer accumulates enough samples to run a
convolution, you'll have nothing to provide to the output, and thus
you'll need to pass all 0s to the output before the first
convolution calculation is done. After your working buffers first
reach the point of being filled sufficiently, at that point you can
pad your output with 0s at the start and the new output samples from
the output working buffer at the end.
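A minimal, SDK-independent sketch of that bookkeeping might look like the following; the names and the trivial moving-average stand-in for the real convolution are made up for illustration:

#include <cstddef>
#include <vector>

// Per-channel working state: input FIFO plus a count of valid samples.
struct ConvolutionState {
    std::vector<float> history;  // most recent input samples, oldest first
    std::size_t window;          // convolution window length in samples

    explicit ConvolutionState(std::size_t w) : window(w) {}

    // N samples in, N samples out: zeros are emitted until a full window
    // of input has accumulated, which is what the reported latency covers.
    void Process(const float *in, float *out, std::size_t frames) {
        for (std::size_t i = 0; i < frames; ++i) {
            history.push_back(in[i]);
            if (history.size() < window) {
                out[i] = 0.0f;                    // still priming the window
            } else {
                float acc = 0.0f;                 // stand-in "convolution":
                for (std::size_t j = 0; j < window; ++j)
                    acc += history[history.size() - window + j];
                out[i] = acc / float(window);     // a simple moving average
            }
            if (history.size() > window)          // keep only what the next
                history.erase(history.begin());   // output sample will need
        }
    }
};

A real implementation would replace the inner loop with the actual convolution and use a ring buffer instead of erasing from the front of a vector, but the FIFO-plus-counter shape is the point here.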
I've made a buffering implementation that assumes that inFramesToProcess is larger than my window size. There is possibly a case where this is not true, which I have not handled.
Keep in mind that it may take more than one render call before your
convolution can produce output. Taking a perhaps extreme example,
let's say your convolution needs 8192 samples for its window. Then
let's say the AU host is rendering 4096 samples per render call.
You might be able to start returning non-zero output samples on the
second render, but you might starve the stream unless you wait until
the third render. Also, consider what happens to your required 8192-
sample window when the AU host uses 512 samples per render. Worse
yet, what about 512 samples typically, but even fewer than 512
samples during certain ramped parameter render calls that might be
shorter than 512 samples? You can see that it might even take
hundreds of calls before your window size is met.
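Driving the ConvolutionState sketch above with those numbers shows the effect: with an 8192-sample window fed 512 frames per render call, the output stays all zeros until part-way through the 16th call (8192 / 512 = 16).

// Hypothetical driver: 8192-sample window, 512 frames per call.
ConvolutionState state(8192);
std::vector<float> in(512, 1.0f), out(512, 0.0f);
for (int call = 1; call <= 16; ++call)
    state.Process(in.data(), out.data(), in.size());
// Calls 1-15 write only zeros; call 16 produces its first non-zero
// sample on its final frame.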
How does the framework handle a call that produces no output samples? Zero padding? I assumed that if I was given 512 samples to process, then I must output 512 samples.
Which is why I was able to produce a working proof of concept showing that the theory is correct, but because I don't process the edges of the partitions, it generated distortion.
There are a number of ways to handle this, but you basically need a
FIFO that is large enough to hold everything that your convolution
calculations need, plus you must have some sort of counter to keep
track of how much input and output data is waiting in the input and
output working buffer (FIFO).
I can see that this would be needed if the window were larger than the partition size. Maybe I need to take another step back and implement the generic case, rather than keep my current implementation, which assumes my window will be smaller than the partition size.
Mark