First, I would like to thank you again for your valuable insights. I was able to create a Direct Form Transposed Type I filter (not shown here)
I then figured out how to create a cascaded version of the filter – It just required a little imagination.
However, in the equation of the Direct Form Transposed Type II filter, I borrowed the equation from Nigel.
I thought that someone could benefit from this, which is written in Swift 2. I did my best to make it fast while keeping readability
The entire program is not provided because it would be entirely too long but all the essential parts needed for understanding are present.
See the function “presentForTesting()” which calls all the required functions or methods
>> I'd skip using that vDSP function, personally. First, it's single precision….
I was severely trounced, beaten about the head and shoulders by vDSP_biquad, implemented in the function processBiquadFilter_Accelerate. Below is a test of 1 minute of audio source filtering. Most test runs
looked much like this.
Noise source frame length 2646000
processBiquadFilterCascadeWithBufferForm2 Time elapsed for filtering: 0.102008044719696 s
processBiquadFilter_Accelerate Time elapsed for filtering: 0.0303190350532532 s
Speed Differ %: 236.448849841449
>> The IIR implementation itself is trivial, and very fast as an inline…
The Direct Form Transposed Type II filter was a little more than twice as fast and the Type I, so I abandoned it the Type I, though it was a good exercise for me.
I dont know if it is Swift or me but (not shown here) even after I brought in the coefficient array by reference and laid it out the same way that Apple wants it for vDSP_biquad I found that processBiquadFilterCascadeWithBufferForm2
when I changed my coefficient array from an array of arrays that hold doubles e.g.[[ a0, a1, a2, b1, b2], [ a0, a1, a2, b1, b2], [ a0, a1, a2, b1, b2]], which makes the most sense to me because the cascades are just groups of 5 coefficients to just an array
of doubles [ a0, a1, a2, b1, b2, a0_1, a1_1, a2_1, b1_1, b2_1, a0_2, a1_2, a2_2, b1_2, b2_2] like apple wants it, the function was actually slower
My largest concern was problem was that using vDSP_biquad, I really had to process memory twice ( once for each channel of non-interleaved stereo audio). If I were reading in a large amount of data that was
pulling from disk, I would incur a heavy penalty from accessing the disk. This is because I would process all the left channel data and then turn around and do the same for the right channel. There is a vDSP biquad version that deals with multichannel data
but that really seems tortuous to implement.
Out of curiosity, I would like processBiquadFilterCascadeWithBufferForm2 to be faster but right now it a choice between a blink of an eye verses a fraction of a blink of an eye.
Oh, just in case anyone at Apple is taking recommendations, it really should
NOT be hard to figure out how to call vDSP_biquad. Having to figure out that to pass the proper value for things like a Float audio channel, that you must convert sourceChannels[0] to an UnsafePointer<Float>. That was a miserable experience, even when
looking at the headers.
var inSource = UnsafePointer<Float>(sourceChannels[0]) // sourceChannels[0], left channel
Thank you all for the help you provided.
W.
func processBiquadFilterCascadeWithBufferForm2( inout inSource: AVAudioPCMBuffer, inout outSource: AVAudioPCMBuffer , inout coeff: [[Double]] )
{
let delaysPerStage = 2
let stageCount = coeff.count
let numOfCoeff = 5 // [ a0, a1, a2, b1, b2 ]
let totDelayElems = delaysPerStage * stageCount
let doubleZero = Double(0.0)
let _a0 = 0, _a1 = 1, _a2 = 2, _b1 = 3, _b2 = 4 // coefficient naming for clarity
let _xD0 = 0, _xD1 = 1 // delay 0 and delay 1 for clarity
var a0 = doubleZero, a1 = doubleZero, a2 = doubleZero, b1 = doubleZero, b2 = doubleZero
// create delays per channel, chanDelay[stage][_xD0], chanDelay[stage][_xD1]
// create an array of elements, initialized to 0.0 and copy the array for channel 1
var chan0Delay = [Double](count: totDelayElems, repeatedValue: 0.0)
var chan1Delay = chan0Delay
/*
Written for Swift, adapted from Nigel Redmond
http://www.earlevel.com/main/2003/02/28/biquads/
coeff[stage][_a0] = a0
coeff[stage][_a1] = a1
coeff[stage][_a2] = a2
coeff[stage][_b1] = b1
coeff[stage][_b2] = b2
chanDelay[stage][_xD0] = z1 -- delay 0
chanDelay[stage][_xD1] = z2 -- delay 1
the for-loop implements a biquad filter, Transposed Direct Form II:
coeff = [ a0, a1, a2, b1, b2 ]
for index in 0 ..< numSamples
{
x = input[index]
out = x * coeff[0] + z1
z1 = x * coeff[1] + z2 – coeff[3] * out
z2 = x * coeff[2] – coeff[4] * out
output[index] = out
}
*/
var numSamples = Int(inSource.frameLength )
var sourceChannels = UnsafeBufferPointer(start: inSource.floatChannelData, count: Int(inSource.format.channelCount))
var sourceLeftChan = sourceChannels[0]
var sourceRightChan = sourceChannels[1]
var resultChannels = UnsafeBufferPointer(start: outSource.floatChannelData, count: Int(outSource.format.channelCount))
var resultLeftChan = resultChannels[0]
var resultRightChan = resultChannels[1]
var inputChannel0 = doubleZero, inputChannel1 = doubleZero
var outChan0 = doubleZero, outChan1 = doubleZero
for index in 0 ..< numSamples
{
inputChannel0 = Double(sourceLeftChan[ index ])
inputChannel1 = Double(sourceRightChan[ index ])
for stageIndex in 0..<stageCount
{
var xDel0 = stageIndex * delaysPerStage + _xD0 // calc offset for the equivalent of chanDelay[stage][_xD0]
var xDel1 = stageIndex * delaysPerStage + _xD1 // calc offset for the equivalent of chanDelay[stage][_xD1]
a0 = coeff[stageIndex][_a0]
a1 = coeff[stageIndex][_a1]
a2 = coeff[stageIndex][_a2]
b1 = coeff[stageIndex][_b1]
b2 = coeff[stageIndex][_b2]
outChan0 = inputChannel0 * a0 + chan0Delay[xDel0]
chan0Delay[xDel0] = inputChannel0 * a1 + chan0Delay[xDel1] - b1 * outChan0
chan0Delay[xDel1] = inputChannel0 * a2 - b2 * outChan0
inputChannel0 = outChan0
outChan1 = inputChannel1 * a0 + chan1Delay[xDel0]
chan1Delay[xDel0] = inputChannel1 * a1 + chan1Delay[xDel1] - b1 * outChan1
chan1Delay[xDel1] = inputChannel1 * a2 - b2 * outChan1
inputChannel1 = outChan1
// take the result and feed into the next cascade section
}
resultLeftChan[ index ] = Float(inputChannel0)
resultRightChan[ index ] = Float(inputChannel1)
}
}
func processBiquadFilter_Accelerate( inout inSource: AVAudioPCMBuffer, inout outSource: AVAudioPCMBuffer, coeff: [[Double]] )
{
var numStages = coeff.count
var numSamples = Int(inSource.frameLength )
var sourceChannels = UnsafeBufferPointer(start: inSource.floatChannelData, count: Int(inSource.format.channelCount))
var resultChannels = UnsafeBufferPointer(start: outSource.floatChannelData, count: Int(outSource.format.channelCount))
var numOfDelayCount = 2 * numStages + 2
/*
Written for Swift, adapted from Alex Kenis
https://alexkenis.wordpress.com/2014/02/05/updating-apples-filterdemo-to-use-the-vdsp-biquad/
float setup delays for the biquad loop
array number = delays[2 * N + 2] where N = number of cascaded biquads
add a section to the {0.0} for each stage (i.e. for 2 biquads = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0}, etc)
*/
var delays_L = [Float](count: numOfDelayCount, repeatedValue: 0.0) // create an array of floats, initialized to 0.0
var delays_R = delays_L
// filterCoeffs array count = [5 * N] where N=number of cascaded biquads
// present the Apple layout by unrolling the coefficients
var filterCoeffs_L = setupAppleCoeffArray( coeff ) // unroll
var filterCoeffs_R = filterCoeffs_L
var setupL = vDSP_biquad_CreateSetup(filterCoeffs_L, vDSP_Length(numStages))
var setupR = vDSP_biquad_CreateSetup(filterCoeffs_R, vDSP_Length(numStages))
var stride = vDSP_Stride(1)
var length = vDSP_Length(numSamples)
var delayPtr = UnsafeMutablePointer<Float>(delays_L)
var inSource = UnsafePointer<Float>(sourceChannels[0]) // sourceChannels[0], left channel
var outSource = UnsafeMutablePointer<Float>(resultChannels[0])
vDSP_biquad(setupL, delayPtr, inSource, stride, outSource, stride, length)
delayPtr = UnsafeMutablePointer<Float>(delays_R)
inSource = UnsafePointer<Float>(sourceChannels[1]) // sourceChannels[1], right channel
outSource = UnsafeMutablePointer<Float>(resultChannels[1]) // resultChannels[1], right channel
vDSP_biquad(setupR, delayPtr, inSource, stride, outSource, stride, length)
vDSP_biquad_DestroySetup(setupL) // Destroy
vDSP_biquad_DestroySetup(setupR) // Destroy
}
func buildCoefficientArray( inArray: [[Double]], a0:Double, a1:Double, a2:Double, b1:Double, b2:Double ) -> [[Double]]
{
var coeff = inArray
coeff.append([a0, a1, a2, b1, b2])
return coeff
}
func setupAppleCoeffArray( inCoeffArr: [[Double]] ) -> [Double]
{
var coeff = [Double]() // init a simple array of doubles
for outerIndx in 0..<inCoeffArr.count
{
for innerIndx in 0..<inCoeffArr[ outerIndx].count
{
// unroll all the elements and bind them to a single array of doubles
coeff.append(inCoeffArr[outerIndx][innerIndx])
}
}
return coeff
}
func setup8thOrder300HzHighpass() ->[[Double]]
{
var coeff = [[Double]]() // inits an array that holds an array of doubles
let doubleZero = Double(0.0)
var a0 = doubleZero, a1 = doubleZero, a2 = doubleZero, b1 = doubleZero, b2 = doubleZero
/*
FilterType: highpass
F0: 300.0
SampleFreq: 44100.0
PeakGain: 0.0
FilterOrder: 8
Q: 0.509795579104159
Q: 0.601344886935045
Q: 0.899976223136416
Q: 2.56291544774151
*/
// Q: 0.509795579104159
a0 = 0.959338692805727
a1 = -1.91867738561145
a2 = 0.959338692805727
b1 = -1.91780079001361
b2 = 0.919553981209292
coeff = buildCoefficientArray( coeff, a0, a1, a2, b1, b2 )
// Q: 0.601344886935045
a0 = 0.965249483498449
a1 = -1.9304989669969
a2 = 0.965249483498449
b1 = -1.92961697041488
b2 = 0.931380963578915
coeff = buildCoefficientArray( coeff, a0, a1, a2, b1, b2 )
// Q: 0.899976223136416
a0 = 0.976365039255089
a1 = -1.95273007851018
a2 = 0.976365039255089
b1 = -1.95183792509063
b2 = 0.953622231929729
coeff = buildCoefficientArray( coeff, a0, a1, a2, b1, b2 )
// Q: 2.56291544774151
a0 = 0.991279866688683
a1 = -1.98255973337737
a2 = 0.991279866688683
b1 = -1.98165395153631
b2 = 0.983465515218421
coeff = buildCoefficientArray( coeff, a0, a1, a2, b1, b2 )
return coeff
}
func presentForTesting()
{
/*
Partly pseudocode
Read input data -- white noise source
Create a PCM Buffer to output the filtered data
Open file for writing output data to disk
Setup filter
Process input data
Write output to file
*/
// var amEngine = AudioManipulateFilterEngine()
// var noiseSource = amEngine.readAudioDataIntoBuffer( whiteNoiseURL! )
// var filterOutSource = amEngine.createPCMBuffer(noiseSource.frameLength)
// var outfile = amEngine.openAudioFileForWriting( targetURL, writingDict: amEngine.getDicionary_WAVE() )
var coeff = setup8thOrder300HzHighpass()
processBiquadFilterCascadeWithBufferForm2( &noiseSource, &filterSource , &coeff ) // Example 1
processBiquadFilter_Accelerate( &noiseSource, &filterSource , coeff ) // Example 2
// amEngine.appendBufferToOpenFile( outfile, inBuffer: filterSource )
}
Writing from an iPhone, won't be too eloquent...
I'd skip using that vDSP function, personally. First, it's single precision. If it's a direct form II transposed implementation, it's probably adequate for most cases (but you're cascading four...possibly noise issues at low frequency settings,
I guess still ok). Second, I don't think you're going to see a big performance improvement in a vectored IIR, in typical use cases. So I'd be running a plain-ol' inline IIR with doubles.
The IIR implementation itself is trivial, and very fast as an inline (otherwise the function call overhead is disproportionate, if performance is an issue). Calculating the coefficients takes a little more knowledge, but you can either
use my coefficient calculator (for fixed filters), or the C++ code on my site or elsewhere to calc at runtime.