Mailing Lists: Apple Mailing Lists


Re: __fres() pipeline depth




On Apr 18, 2006, at 12:48 PM, Shaun Wexler wrote:

Has something changed? I recently profiled some G3 code, and NOW... Shark 4.3.2 reports that __fres() has a 14-cycle latency and/or is not pipelined! Quite obviously the code generated by GCC 3.3 thinks this instruction has a 5-cycle latency, the same as __frsqrtes(), __fnmsubs(), etc. Can I get some clarification regarding ppc, ppc7400, ppc970 behavior of this intrinsic? I understand its accuracy, but now certain G3 versions of my vec functions appear to have huge bubbles. :\

As far as I know, fres has never been pipelined. frsqrte and vrefp are pipelined and have latency similar to a multiply. With a bit of ingenuity you can use frsqrte to do a pipelined divide, but since it doesn't accept negative arguments, it takes a bit more work.

I wouldn't make any assumptions about how GCC 3.3 schedules code around any of the asms in ppc_intrin.h. In my experience, it presumes each one has a one-cycle latency and can make some pretty astounding scheduling choices. At times, we were forced to write large segments as asms to defeat bad scheduling. Perhaps there have been improvements since then that I am not aware of. My workflow switched over to GCC 3.5 and then GCC 4 pretty early on to support Intel and ppc64, while GCC 3.3 continued on for a while after that.


Some G3s (the later ones from IBM but, as I understand it, not all G3s) deliver better than required accuracy for fres and frsqrte: about 12 bits, versus the 8 and 5 bits, respectively, that they are required to have. I am not aware of an available test to determine which type of G3 you have. The danger, of course, is that you will optimize your code to work correctly on the 12-bit flavor and return insufficiently refined results on older G3s.

Ian
_______________________________________________
PerfOptimization-dev mailing list      (email@hidden)
References: 
 >__fres() pipeline depth (From: Shaun Wexler <email@hidden>)




Copyright © 2007 Apple Inc. All rights reserved.