Thank you, Ted, for the detailed research.
The result is very impressive. So far the part that calculates the Coulombic
force is not SIMD-vectorized, but this would be a good time to vectorize
it and re-implement it with SIMD instructions.
The code I posted aims to be fast even at the cost of accuracy,
as you pointed out in your last post.
If you want the code to be more accurate, more Newton-Raphson
steps should be added.
Of course, adding them makes it slower.
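To illustrate the trade-off between Newton-Raphson steps and accuracy, here is a minimal sketch. It is not the code from this thread: the names inv_sqrt_steps and rel_error are made up for illustration, it assumes a 32-bit IEEE 754 float, and it copies the bits with memcpy rather than a union.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the thread): fast inverse square root with a
// selectable number of Newton-Raphson steps. Assumes a 32-bit IEEE 754 float.
float inv_sqrt_steps(float x, int steps) {
    assert(x > 0.0f);                       // the bit trick only works for positive x
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);    // read the float's bit pattern
    bits = 0x5f375a86u - (bits >> 1);       // initial guess y0
    float y;
    std::memcpy(&y, &bits, sizeof y);       // reinterpret the bits as a float again
    const float xhalf = 0.5f * x;
    for (int k = 0; k < steps; ++k) {
        y = y * (1.5f - xhalf * y * y);     // one Newton-Raphson step
    }
    return y;
}

// Relative error of the approximation against 1/sqrt(x) computed in double.
double rel_error(float x, int steps) {
    const double exact = 1.0 / std::sqrt(static_cast<double>(x));
    return std::fabs(inv_sqrt_steps(x, steps) - exact) / exact;
}
```

Calling rel_error(2.0f, steps) for steps = 0, 1, 2 should show the error shrinking with every step, since each Newton-Raphson step roughly squares the relative error until float precision is reached. Each step costs a few extra multiplications.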
On 2008/03/29, at 15:31, Edward K. Chew wrote:
On 28-Mar-08, at 8:40 PM, ABE Hiroshi wrote:
On 2008/03/29, at 6:00, Eric Postpischil wrote:
union { float f; int32_t i; } u = { x };
// Now u.i is a 32-bit integer containing the encoding of
// floating-point number f.
This is a very useful alternative to casting. Thank you.
I'd be a little careful with this as well. I got into an argument
about using unions to convert types this way over on the
performance list, and as I recall, I lost. :-( Apparently, the C/C++
standard does not guarantee that f and i will always map onto the
same memory as written above. That said, every compiler I have
ever used (including gcc) seems to do this.
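One way to sidestep the question is to copy the bytes with memcpy, which is well defined in both C and C++ for objects of the same size. The following is a minimal sketch, assuming a 32-bit IEEE 754 float; the helper names float_to_bits and bits_to_float are made up for illustration and do not come from this thread.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the bit pattern of a float as a 32-bit integer.
std::int32_t float_to_bits(float f) {
    static_assert(sizeof(std::int32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);   // byte-wise copy, no aliasing problem
    return i;
}

// Hypothetical helper: build a float from a 32-bit bit pattern.
float bits_to_float(std::int32_t i) {
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}
```

With optimization enabled, compilers commonly reduce such a fixed-size memcpy to a plain register move, so this form is usually no slower than the union.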
Out of curiosity, I corrected the code to use a union, as follows:
template<class TYPE> TYPE InvSqrt( TYPE x0 ) {
    volatile float x = (float)x0;
    union { float x; int32_t i; } u = {x};
    float xhalf = 0.5*x;
    u.i = 0x5f375a86 - (u.i>>1); // gives initial guess y0
    x = u.x;
    x = x*(1.5-xhalf*x*x); // Newton step, repeating increases accuracy
    x = x*(1.5-xhalf*x*x); // Newton step, repeating increases accuracy
    // printf("InvSqrt: %dl 1./sqrt(%e) = %e\n",u.i,x0,x);
    return (TYPE)x;
}
The code works fine at the -Os optimization level on both PPC and Intel
Macs. The results also seem to be correct.
Even so, I will set this aside and take the SIMD way.
Thank you so much.
ABE Hiroshi
from Tokorozawa, JAPAN