I'm writing a simple convolution plugin and have run into a problem with the convolution 'tail' that extends beyond the buffer size.
The convolution works fine within the buffer boundaries, but at the end of each buffer the tail of the convolution is either truncated or fed back.
void MultitapAU::MultitapKernel::Process(const Float32 * inSourceP,
						Float32 *inDestP,
						UInt32	inFramesToProcess,
						UInt32	inNumChannels,
						bool &ioSilence )
{
	float coefficients [10] = {1.,0.9,0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1} ;	// my coefficients
	int numTaps = 10;
	float delay[inFramesToProcess+numTaps];						// delay line: numTaps-1 saved samples followed by the current buffer
	
	memcpy(delay, delay+inFramesToProcess, sizeof(float) * (numTaps-1));		// i.e. keep the last numTaps-1 samples by moving them to the beginning
	memcpy(delay+numTaps-1, inSourceP, sizeof(float) * inFramesToProcess);  	// i.e. fill the last samples with input signal
	
	vDSP_conv(delay,1,coefficients+numTaps-1,-1,inDestP,1,inFramesToProcess,numTaps);// convolve "delay" with "coefficient" & output into inDestP
}
Any hints on how to make the convolution preserve the samples that fall beyond "inFramesToProcess"?
I also tried a circular-buffer implementation but ran into exactly the same problem, so I posted this version because it's simpler to read.
Thanks!
Salvator
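
One possible direction, as a rough sketch rather than a definitive fix: in the code above `delay` is a local array, so its contents are gone by the next call to Process and the first memcpy has no saved samples to move. The sketch below keeps the last numTaps-1 input samples in a member instead. `StreamingFIR`, `process`, `history_` and `work_` are hypothetical names, not part of the AudioUnit SDK, and a plain loop stands in for vDSP_conv so the example compiles anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a FIR filter whose input history is a member,
// so it survives from one process() call to the next.
class StreamingFIR {
public:
    explicit StreamingFIR(const std::vector<float>& coefficients)
        : coeffs_(coefficients),
          history_(coefficients.empty() ? 0 : coefficients.size() - 1, 0.0f) {
        assert(!coeffs_.empty());
    }

    // Filters `frames` samples from `in` into `out`.
    // `in` and `out` may be the same buffer, because the input is copied first.
    void process(const float* in, float* out, std::size_t frames) {
        const std::size_t taps = coeffs_.size();
        const std::size_t hist = taps - 1;

        // Work buffer layout: [last `hist` input samples | current block].
        work_.resize(hist + frames);
        std::copy(history_.begin(), history_.end(), work_.begin());
        std::copy(in, in + frames,
                  work_.begin() + static_cast<std::ptrdiff_t>(hist));

        // out[n] = sum over k of coeffs[k] * x[n - k],
        // where x[n] is stored at work_[hist + n].
        for (std::size_t n = 0; n < frames; ++n) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < taps; ++k) {
                acc += coeffs_[k] * work_[hist + n - k];
            }
            out[n] = acc;
        }

        // Keep the newest `hist` samples for the next call.
        std::copy(work_.end() - static_cast<std::ptrdiff_t>(hist),
                  work_.end(), history_.begin());
    }

private:
    std::vector<float> coeffs_;   // filter taps, coeffs_[0] applies to the newest sample
    std::vector<float> history_;  // last taps-1 input samples from the previous call
    std::vector<float> work_;     // scratch buffer, reused between calls
};
```

In a kernel, an object like this would presumably be a member of MultitapKernel, with Process calling `process(inSourceP, inDestP, inFramesToProcess)` on it. The inner loop computes the same sum as the vDSP_conv call in the original post (reversed coefficients, stride -1), so that call could likely be used on `work_` instead. Sizing `work_` once for the maximum frame count would avoid allocating on the audio thread.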