best libm time for 1000 calls: 3.06011e-06 seconds (792544)
Hmm... that time suggests 6 cycles for an atan, which I think is not
reasonable. I probably got defeated by a compiler optimization here
somewhere...
Alrighty, once the compiler is defeated, we see that the speed
improvement is there:
ollmia:/tmp iano$ gcc -O3 main4.c -Wmost
ollmia:/tmp iano$ ./a.out
best libm time for 1000 calls: 0.000113 seconds (801939)
best cheesy time for 1000 calls: 0.000017 seconds (800454)
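
For anyone wanting to reproduce this kind of measurement: main4.c is not shown in this thread, so the following is only a minimal sketch of one common way to keep the optimizer from deleting the timed calls. The names (g_sink, time_calls, best_time, baseline_add) are made up for illustration. It assumes that storing the accumulated result in a volatile variable and feeding inputs that change each iteration is enough to force every call to run, and that clock() has usable resolution on your platform.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper type: any float f(float y, float x), e.g. atan2f. */
typedef float (*binary_fn)(float, float);

/* Volatile sink: writing the accumulated result here means the compiler
   cannot prove the loop's work is unused and throw it away. */
static volatile float g_sink;

/* Trivial stand-in used to estimate loop and call overhead. */
static float baseline_add(float y, float x) { return y + x; }

/* Time n calls of fn; returns elapsed CPU seconds as reported by clock(). */
static double time_calls(binary_fn fn, size_t n)
{
    assert(fn != NULL && n > 0);
    float sum = 0.0f;
    clock_t start = clock();
    for (size_t i = 0; i < n; i++) {
        /* Inputs depend on i, so the call cannot be hoisted or folded. */
        float y = (float)i * 0.001f - 0.5f;
        float x = 1.0f + (float)i * 0.002f;
        sum += fn(y, x);
    }
    clock_t end = clock();
    g_sink = sum; /* publish the result so the calls stay live */
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Best (minimum) time over several trials, to filter out scheduling noise. */
static double best_time(binary_fn fn, size_t n, int trials)
{
    double best = time_calls(fn, n);
    for (int t = 1; t < trials; t++) {
        double s = time_calls(fn, n);
        if (s < best) best = s;
    }
    return best;
}
```

To time libm, include <math.h> and call best_time(atan2f, 1000, 100), linking with -lm where the platform needs it. Subtracting best_time(baseline_add, 1000, 100) gives a rough idea of how much of the figure is loop overhead rather than the function itself.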
My vectorized version of atan2 that I wrote for MacFOH performs the
full rectangular-to-polar conversion, including UNWRAPPED normalized
phase and magnitude in decibels, and profiles 30x faster than
(float)atan2(y,x) with floats. The atan2f "cheesy" portion of the
vectorized code is 24x faster than libm.
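
To give a feel for what a "cheesy" atan2f can look like: the code below is NOT the MacFOH routine (which is not shown in this thread) and is scalar rather than vectorized. It is just a sketch of a well-known cheap approximation, atan(z) ~= z*(pi/4 + 0.273*(1 - z)) for z in [0, 1], extended to all quadrants by reflection. The function name is hypothetical, and the error is assumed to be on the order of a few thousandths of a radian, which may or may not be acceptable for your use.

```c
/* Rough atan2f approximation: trades accuracy for speed.
   Illustrative sketch only; not the code discussed in this thread. */
static float cheesy_atan2f(float y, float x)
{
    const float PI         = 3.14159265f;
    const float HALF_PI    = 1.57079633f;
    const float QUARTER_PI = 0.78539816f;

    float ax = x < 0.0f ? -x : x;
    float ay = y < 0.0f ? -y : y;

    /* atan2(0, 0) is conventionally 0; this also avoids dividing by zero. */
    if (ax == 0.0f && ay == 0.0f)
        return 0.0f;

    /* Reduce to z = min/max so that z lies in [0, 1]. */
    float lo = ax < ay ? ax : ay;
    float hi = ax < ay ? ay : ax;
    float z  = lo / hi;

    /* Quadratic approximation of atan(z) on [0, 1]. */
    float a = z * (QUARTER_PI + 0.273f * (1.0f - z));

    /* Undo the reduction: second octant, then left half-plane, then sign. */
    if (ay > ax)  a = HALF_PI - a;
    if (x < 0.0f) a = PI - a;
    if (y < 0.0f) a = -a;
    return a;
}
```

A full rectangular-to-polar conversion would pair this with a magnitude term such as 10*log10(x*x + y*y) for decibels; phase unwrapping and normalization are separate steps not sketched here.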
--
Shaun Wexler
MacFOH http://www.macfoh.com
PerfOptimization-dev mailing list