Oh, and if Shark is showing this as some kind of hotspot (and I assume
it is, otherwise why are you bothering with it?) then it's simple to
AltiVec this code. All the work can be done in two loads, a sub, a madd
and a store.
Not sure you'd get a lot of speedup because you're probably memory
bound, ... and you'd need to pad your structure out to 16 bytes (add an
unused component to your 'vector' object), ... and make sure the objects
are 16-byte aligned, ... and work on more than one object at a time to
give yourself enough work to hide the latencies, and ...
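
For what it's worth, here is a rough sketch of what I mean, not a drop-in replacement. I'm guessing at the operation (out = a + (b - a) * t), and the names Vec4 and blend are made up for illustration, so adjust to whatever your loop really does. It uses the standard AltiVec intrinsics (vec_ld, vec_sub, vec_madd, vec_st) when compiled with AltiVec enabled and falls back to plain scalar code otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Hypothetical 'vector' object: x, y, z plus one unused float, so each
   element is exactly 16 bytes, the width of an AltiVec register. */
typedef struct {
    float x, y, z, pad;
} Vec4;

/* Assumed operation (a guess, not the original code):
   out = a + (b - a) * t.  All three arrays must be 16-byte aligned. */
void blend(Vec4 *out, const Vec4 *a, const Vec4 *b, float t, size_t n)
{
    assert((uintptr_t)out % 16 == 0);
    assert((uintptr_t)a % 16 == 0);
    assert((uintptr_t)b % 16 == 0);

#if defined(__ALTIVEC__)
    vector float vt = (vector float){t, t, t, t};
    size_t i = 0;

    /* Two objects per iteration so independent loads and madds can
       overlap and hide some of the latency. */
    for (; i + 2 <= n; i += 2) {
        vector float a0 = vec_ld(0, (const float *)&a[i]);      /* load  */
        vector float a1 = vec_ld(0, (const float *)&a[i + 1]);
        vector float b0 = vec_ld(0, (const float *)&b[i]);      /* load  */
        vector float b1 = vec_ld(0, (const float *)&b[i + 1]);
        vector float r0 = vec_madd(vec_sub(b0, a0), vt, a0);    /* sub, madd */
        vector float r1 = vec_madd(vec_sub(b1, a1), vt, a1);
        vec_st(r0, 0, (float *)&out[i]);                        /* store */
        vec_st(r1, 0, (float *)&out[i + 1]);
    }
    if (i < n) { /* odd element left over */
        vector float a0 = vec_ld(0, (const float *)&a[i]);
        vector float b0 = vec_ld(0, (const float *)&b[i]);
        vec_st(vec_madd(vec_sub(b0, a0), vt, a0), 0, (float *)&out[i]);
    }
#else
    /* Portable fallback: same arithmetic, one component at a time.
       The pad lane is unused, so it is simply zeroed here. */
    for (size_t i = 0; i < n; ++i) {
        out[i].x = a[i].x + (b[i].x - a[i].x) * t;
        out[i].y = a[i].y + (b[i].y - a[i].y) * t;
        out[i].z = a[i].z + (b[i].z - a[i].z) * t;
        out[i].pad = 0.0f;
    }
#endif
}
```

Note that vec_madd does not round the intermediate product, so the AltiVec path can differ from the scalar path in the last bit. And if the loop really is memory bound, none of this will buy you much.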
Anyway I'm just having fun. Just be sure to stop at 'fast enough'.
Paul
PerfOptimization-dev mailing list (email@hidden)