Wow, did I ever get hammered on that little "optimization". For
those watching at home, the optimization Chris is talking about is
probably this one:
    tmp = alpha * red;
    remainder256 = tmp & 0xFF;  // alpha * red (mod 256)
    dividend = tmp >> 8;        // floor(alpha * red / 256)
    // congruent to alpha * red (mod 255), but not fully reduced: range 0..509
    remainder255 = dividend + remainder256;
    // add 0, 1, or 2 depending on how large the remainder (mod 255) is
    dividend += ((remainder255 >= (255 + 128)) & 1) + ((remainder255 >= 128) & 1);
This is provably correct and even rounds correctly. And yes, it
would be much, much easier in AltiVec.
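
For anyone who wants to check the "provably correct" claim rather than take it on faith, here is a minimal sketch that compares the snippet against a plain rounded division for every 8-bit alpha/red pair. The function names are made up for illustration, and the 8-bit input range is an assumption (the original post does not state the types). The reasoning: alpha * red = 256*q + r = 255*q + (q + r), so the rounded quotient by 255 is q plus 0, 1, or 2 depending on whether q + r reaches 128 or 383.

```c
#include <stdint.h>

/* The snippet above wrapped in a function. The name and the 8-bit
   argument types are assumptions, not from the original post. */
static uint8_t mul_div255_round(uint8_t alpha, uint8_t red)
{
    uint32_t tmp = (uint32_t)alpha * red;             /* 0..65025 */
    uint32_t remainder256 = tmp & 0xFF;               /* tmp mod 256 */
    uint32_t dividend = tmp >> 8;                     /* floor(tmp / 256) */
    uint32_t remainder255 = dividend + remainder256;  /* 0..509 */
    /* Each comparison yields 0 or 1 in C, so this adds 0, 1, or 2. */
    dividend += (remainder255 >= 255 + 128) + (remainder255 >= 128);
    return (uint8_t)dividend;
}

/* Reference: round-to-nearest of alpha * red / 255 using a real divide.
   Ties cannot occur because 255 is odd. */
static uint8_t mul_div255_reference(uint8_t alpha, uint8_t red)
{
    return (uint8_t)(((uint32_t)alpha * red + 127) / 255);
}

/* Returns how many of the 65536 input pairs disagree with the reference. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned r = 0; r < 256; ++r) {
            if (mul_div255_round((uint8_t)a, (uint8_t)r) !=
                mul_div255_reference((uint8_t)a, (uint8_t)r)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling count_mismatches() from a small main() and printing the result is enough to run the check.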
No, there are faster ways of doing it, without branches.
Chris
Er, that code doesn't have any branches in it that I can see...
And if you know such optimizations, why not contribute to the thread
and enlighten us by posting them rather than just sounding superior
about your knowledge?
Thanks,
Keith
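
For reference, one widely used formulation of the same rounded divide by 255 needs only an add and two shifts. This is not necessarily the method Chris had in mind, since the thread does not say. The sketch below assumes 8-bit inputs, and the function names are hypothetical.

```c
#include <stdint.h>

/* Rounded alpha * red / 255 using adds and shifts only.
   Assumes both inputs are 8-bit, so the product is at most 65025. */
static uint8_t mul_div255_shift(uint8_t alpha, uint8_t red)
{
    uint32_t t = (uint32_t)alpha * red + 128;  /* bias for rounding */
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Reference: round-to-nearest using an actual division. */
static uint8_t div255_reference_round(uint8_t alpha, uint8_t red)
{
    return (uint8_t)(((uint32_t)alpha * red + 127) / 255);
}

/* Returns how many of the 65536 input pairs disagree with the reference. */
static unsigned count_shift_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned r = 0; r < 256; ++r) {
            if (mul_div255_shift((uint8_t)a, (uint8_t)r) !=
                div255_reference_round((uint8_t)a, (uint8_t)r)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Whether this is actually faster than the comparison-based version depends on the compiler and the target CPU, so it is worth measuring both.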
PerfOptimization-dev mailing list (email@hidden)