Re: vDSP_conv very slow?


  • Subject: Re: vDSP_conv very slow?
  • From: Chris Johnson <email@hidden>
  • Date: Thu, 8 Feb 2007 15:47:13 -0500


On Feb 8, 2007, at 3:15 AM, Cor Jansen wrote:

// add input samples to end of bufSignal
// first move bufIn contents
for (i=0 ; i<ROOMCORRECTION_FILTSIZE-1 ; i++) {
    bufSignal[i] = bufSignal[nSampleFrames+i];
}

This is very bad: it copies ROOMCORRECTION_FILTSIZE-1 samples on every call. Ian Kemmish suggests:
3) Instead of making bufSignal[] just large enough and copying samples around on every call, make it a ring buffer twice (or more) as big as necessary, and gradually march through it. Then you'll only need to copy samples from the end to the beginning when it wraps around. You may need to experiment to find the best size here: the bigger the buffer is, the less copying you do, but the greater the chance of taking a performance hit from cache misses.
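
One way to read Ian's suggestion, as a rough sketch only: keep the history in an oversized linear buffer, write forward through it, and copy the tail back to the front only when the write position hits the end. The names here (History, history_push, FILT_SIZE, HIST_CAPACITY) are made up for illustration and are not from the original code, and it pushes one sample at a time rather than a block of nSampleFrames.

```c
#include <string.h>

#define FILT_SIZE 4                    /* hypothetical filter length */
#define HIST_CAPACITY (4 * FILT_SIZE)  /* "twice (or more) as big as necessary" */

typedef struct {
    float data[HIST_CAPACITY];
    int   pos;  /* index where the next sample will be written */
} History;

static void history_init(History *h) {
    memset(h->data, 0, sizeof h->data);
    h->pos = FILT_SIZE - 1;  /* FILT_SIZE-1 zeros of history before the first sample */
}

/* Append one sample. Returns a pointer to the FILT_SIZE most recent samples,
   oldest first, stored contiguously so a convolution can read straight through. */
static const float *history_push(History *h, float sample) {
    if (h->pos == HIST_CAPACITY) {
        /* Only here, once per trip through the buffer, do we copy:
           move the newest FILT_SIZE-1 samples back to the start. */
        memmove(h->data,
                h->data + HIST_CAPACITY - (FILT_SIZE - 1),
                (FILT_SIZE - 1) * sizeof(float));
        h->pos = FILT_SIZE - 1;
    }
    h->data[h->pos] = sample;
    h->pos++;
    return h->data + h->pos - FILT_SIZE;
}
```

With these sizes the copy happens once every HIST_CAPACITY - (FILT_SIZE - 1) samples instead of on every call. A bigger HIST_CAPACITY means fewer copies but a larger footprint in the cache, which is the trade-off Ian mentions.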


I find you can either do a single looping buffer, the same size you'd expect it to be:

count = your offset within the buffer;
if (count < 0 || count >= ROOMCORRECTION_FILTSIZE) {count = ROOMCORRECTION_FILTSIZE - 1;}
//stay within the buffer size: sanity check BEFORE accessing any array value
//(valid indices are 0 to ROOMCORRECTION_FILTSIZE-1)


	buf[count] = newsample;
	do stuff with buf[(count+whatever) % ROOMCORRECTION_FILTSIZE];
		//the modulo stops things from running off the end of the buffer

	count--;
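
Spelled out as compilable C, the single looping buffer might look something like this. It is a sketch under assumptions: Ring, ring_init, ring_fir and FILT_SIZE (standing in for ROOMCORRECTION_FILTSIZE) are illustrative names, and "do stuff" is taken to be a plain FIR sum.

```c
#define FILT_SIZE 4  /* hypothetical; stands in for ROOMCORRECTION_FILTSIZE */

typedef struct {
    float buf[FILT_SIZE];
    int   count;  /* index of the newest sample; walks downward */
} Ring;

static void ring_init(Ring *r) {
    for (int i = 0; i < FILT_SIZE; i++) r->buf[i] = 0.0f;
    r->count = FILT_SIZE - 1;
}

/* Store one sample and return sum over k of kernel[k] * x[n-k]. */
static float ring_fir(Ring *r, const float *kernel, float newsample) {
    /* sanity check BEFORE touching the array */
    if (r->count < 0 || r->count >= FILT_SIZE) r->count = FILT_SIZE - 1;

    r->buf[r->count] = newsample;

    float acc = 0.0f;
    for (int k = 0; k < FILT_SIZE; k++) {
        /* count+k is the sample from k steps ago; the modulo wraps it
           back into the buffer, and it runs for every tap */
        acc += kernel[k] * r->buf[(r->count + k) % FILT_SIZE];
    }

    r->count--;
    return acc;
}
```

Because count walks downward, buf[count+1] is always the previous sample, so the kernel index and the buffer offset run in the same direction.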

But then you still have to wrap every access in the loop with that modulo, which is where 'twice as big as necessary' comes in.

count = your offset within the buffer;
if (count < 0 || count >= ROOMCORRECTION_FILTSIZE) {count = ROOMCORRECTION_FILTSIZE - 1;}
//stay within single buffer size: sanity check BEFORE accessing any array value


buf[count + ROOMCORRECTION_FILTSIZE] = buf[count] = newsample;
do stuff with buf[count+whatever];
//buf holds 2*ROOMCORRECTION_FILTSIZE samples, and whatever stays below ROOMCORRECTION_FILTSIZE.
//Now if you overflow it's the same as if it had wrapped. Notice that instead of accessing an array value of
//count+whatever and then having to mod it to stay in the array, you just add count+whatever, knowing that
//the overflow area's going to be correct data. One extra data assign, twice the buffer, but one less operation
//for every single sample in the kernel.
//Actually, I do a lot of stuff just straight-up hardcoded, not even using loops for my kernels


	count--;
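
Here is the doubled buffer as a compilable sketch, under the same kind of assumptions: MirrorRing, mirror_init, mirror_fir and FILT_SIZE are illustrative names only, and "do stuff" is again taken to be a plain FIR sum.

```c
#define FILT_SIZE 4  /* hypothetical; stands in for ROOMCORRECTION_FILTSIZE */

typedef struct {
    float buf[2 * FILT_SIZE];  /* second half mirrors the first */
    int   count;               /* index of the newest sample; walks downward */
} MirrorRing;

static void mirror_init(MirrorRing *r) {
    for (int i = 0; i < 2 * FILT_SIZE; i++) r->buf[i] = 0.0f;
    r->count = FILT_SIZE - 1;
}

/* Store one sample (twice) and return sum over k of kernel[k] * x[n-k]. */
static float mirror_fir(MirrorRing *r, const float *kernel, float newsample) {
    /* sanity check BEFORE touching the array */
    if (r->count < 0 || r->count >= FILT_SIZE) r->count = FILT_SIZE - 1;

    /* one extra assign keeps the mirror copy in step */
    r->buf[r->count + FILT_SIZE] = r->buf[r->count] = newsample;

    /* window[k] is the sample from k steps ago, contiguous in memory;
       the highest index touched is count + FILT_SIZE - 1, inside the array */
    const float *window = r->buf + r->count;

    float acc = 0.0f;
    for (int k = 0; k < FILT_SIZE; k++) {
        acc += kernel[k] * window[k];  /* no modulo in the inner loop */
    }

    r->count--;
    return acc;
}
```

Since the window is one contiguous run of FILT_SIZE floats, the same pointer could presumably be handed to a vectorized routine instead of the hand-written loop, though that is not something I have tried here.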

Does that help? I do time-based convolution kernels this way, and it's hard to get more CPU-intensive than that. I'm going to have to learn FFT code to do larger kernels, but these small kernels sound extra nice with convolution-the-hard-way. You do have to think like a game programmer to get the stuff to run efficiently, though. I've seen side-scroller games use similar tricks for background images.

	Chris Johnson
	airwindows