Re: Multi Channel FFT Audio Anaylzer + Question

Subject: Re: Multi Channel FFT Audio Anaylzer + Question
From: Ian Ollmann <email@hidden>
Date: Tue, 26 Jun 2007 14:02:27 -0700

> A question to the list about the vDSP routines: How do these perform/behave on Intel Macs? I learned they don't use the AltiVec processor anymore. Does that result in any disadvantages, I mean in terms of processing power? Thanks for any clues.


Just to provide a more direct version of the Accelerate.framework story:

On Intel, the functions should be vectorized to use SSE and later revisions, as available on your hardware. Very little has changed on the PowerPC side. Accelerate.framework still uses AltiVec on machines that support AltiVec. We still support AltiVec in Accelerate.framework, and add AltiVec code as new APIs are added.

We did not add many new APIs for Leopard this time around. The vast majority of our time over the last three years has gone into retuning for Intel, to make sure that our APIs meet performance expectations. (Most parts of the OS can get away with making a few fixes and throwing a compiler switch when transitioning to a new chip. We usually have to rewrite from scratch, which takes a while with 7000 entry points.) Happily, the PowerPC story hasn't changed much in that time, so there hasn't been much to retune there. We did add a handful of new APIs to vImage. These were vectorized for both SSE and AltiVec.

The vast majority of the Leopard work has been along a couple of fronts:

1) Fixing Intel implementations to more closely match what PowerPC did.
2) Rewriting Intel code for Intel Core 2. (Many factor of 2 performance wins here.)
3) Fixing various bugs that affected usability on either platform.
4) Some small amount of retuning for G5, in a couple of cases where our Intel implementation happened to work better on G5 than the old G5 implementation did.
5) 64-bit tidyup (particularly the vDSP section, which wasn't 64-bit for Tiger).

The most common reason to fall into scalar code is that your data is not aligned properly. Check that first. Different parts of Accelerate.framework require different levels of alignment. Documentation on the specific part/function that you are working with should say something about what is required to land in the vector path. A few functions are vectorized for PowerPC and not Intel. Usually, this happens because the algorithm requires some fancy permute work (e.g. three channel RGB buffers) that can't be done cheaply on Intel at the moment due to inadequate hardware permute support. Finally, we have a lot of APIs, so there might also be one or two left over on Intel that we missed. If you find one, file a bug.
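As a rough illustration of the alignment check suggested above, here is a minimal sketch in C. It assumes a 16-byte boundary, which matches the 128-bit register width of both SSE and AltiVec; the alignment a given Accelerate.framework function actually needs may differ, so treat the constant as a placeholder and check that function's documentation. The names VEC_ALIGN, is_aligned and alloc_aligned_floats are hypothetical helpers, not part of Accelerate.framework, and posix_memalign is used only as one portable way to get an aligned buffer.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign under strict C modes */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 16 bytes, the size of one SSE or AltiVec vector register.
   The requirement for a specific function may be different. */
#define VEC_ALIGN 16

/* Returns 1 if ptr sits on an `alignment`-byte boundary, 0 otherwise.
   `alignment` must be a power of two. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Allocates `count` floats on a VEC_ALIGN boundary.
   Returns NULL on failure; release the buffer with free(). */
static float *alloc_aligned_floats(size_t count)
{
    void *p = NULL;
    if (posix_memalign(&p, VEC_ALIGN, count * sizeof(float)) != 0)
        return NULL;
    return (float *)p;
}
```

A caller might test each input and output pointer with is_aligned(buf, VEC_ALIGN) before handing it to a vectorized routine. Note that an aligned base pointer is not enough if you then pass an offset into the buffer: buf + 1 on a float array is only 4-byte aligned.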

In most cases, Intel Core 2 performance should be in the neighborhood of G5 AltiVec performance. Intel Core is often half as fast as Intel Core 2. The story changes depending on how closely the particular function matches the instructions available in the ISA, where AltiVec often has an edge, but not always. Some functions benefit on Intel from larger caches. Either platform can benefit from other difficult-to-predict factors, like how much sleep a particular engineer got on a particular day.

Ian Ollmann
Vector & Numerics Group
Core OS