On Sun, 27 Nov 2005, Ian Ollmann wrote:
[...]
> You can also do it in one instruction with vec_mladd, if you'd rather
> use the VCIU:
>
> v = vec_mladd( v, (vector signed short) (1, -1, 1, -1, 1, -1, 1,
> -1), (vector signed short) (0) );
>
Much too obvious to occur to me. I mean, who in their right mind would
think of multiplying by -1 when a sign switch is needed? :-)
BTW, the multiply-adds are especially useful on G4+, because that nifty
little processor can issue any pair of vector instructions in parallel
(not just a permute plus something else, as on the G4 and G5). So the
various multiply-adds can occasionally stand in for an add or a shift and
run in the VCIU alongside the real thing, potentially doubling throughput.
(Or for a shifted add, or for a bitfield insertion when the target bits
are known to be zero. Or for masking out some elements by multiplying
with zero/one, potentially merging in other elements with the add step.
And probably more that I haven't thought of. Often used as one half of a
"two instruction miracle".)
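To make the stand-ins above concrete, here is a minimal sketch in plain C.
It does not use real AltiVec: it models vec_mladd per element as
(a * b + c) modulo 2^16, so it runs anywhere. The type vec16 and the
helpers model_mladd, splat, alternate_signs, shifted_add and merge_by_mask
are made-up names for illustration, not part of any AltiVec header.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8  /* a vector signed short holds eight 16-bit elements */

/* Hypothetical stand-in for "vector signed short". */
typedef struct { int16_t e[LANES]; } vec16;

/* Scalar model of vec_mladd: per element, (a * b + c) modulo 2^16.
   The final conversion back to int16_t assumes the usual two's
   complement wrap-around. */
static vec16 model_mladd(vec16 a, vec16 b, vec16 c)
{
    vec16 r;
    for (int i = 0; i < LANES; i++) {
        uint32_t wide = (uint32_t)(uint16_t)a.e[i] * (uint32_t)(uint16_t)b.e[i]
                      + (uint32_t)(uint16_t)c.e[i];
        r.e[i] = (int16_t)(uint16_t)wide;  /* keep the low 16 bits */
    }
    return r;
}

/* Replicate one value into every element. */
static vec16 splat(int16_t x)
{
    vec16 r;
    for (int i = 0; i < LANES; i++)
        r.e[i] = x;
    return r;
}

/* Sign switch on every other element: multiply by (1, -1, 1, -1, ...). */
static vec16 alternate_signs(vec16 v)
{
    vec16 signs = {{1, -1, 1, -1, 1, -1, 1, -1}};
    return model_mladd(v, signs, splat(0));
}

/* Shifted add: (a << shift) + b, done as a * 2^shift + b.
   shift == 0 gives a plain add; b == 0 gives a plain shift.
   If the bits of b at the target position are zero, the add behaves
   like an OR, which makes this a bitfield insertion. */
static vec16 shifted_add(vec16 a, int shift, vec16 b)
{
    assert(shift >= 0 && shift <= 14);  /* keeps 1 << shift inside int16_t */
    return model_mladd(a, splat((int16_t)(1 << shift)), b);
}

/* Masking and merging: mask elements are 0 or 1. Elements of a with
   mask 0 are dropped; c supplies the values merged in by the add step
   (c should be 0 wherever mask is 1 for a pure merge). */
static vec16 merge_by_mask(vec16 a, vec16 mask, vec16 c)
{
    return model_mladd(a, mask, c);
}
```

With v = (1, 2, ..., 8), alternate_signs(v) gives (1, -2, 3, -4, 5, -6, 7,
-8), and shifted_add(v, 3, splat(1)) gives 8*v + 1 in each element. On the
real thing, each helper should correspond to a single vec_mladd with a
constant vector.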
Holger