Mailing Lists: Apple Mailing Lists


Re: loop unrolling and AltiVec register utilization



> I'm not sure how to do that sum at the end nicely though. Maybe somebody else has a suggestion.

I'm no longer sure where I got this, but I think it was in a piece of assembly from the Apple OpenGL team. I am not sure it qualifies as "nicely"; see the comments at the end.


So, assume diS contains the 4 values you want to add at the end of these instructions:

diS = vec_madd(cv2, di2, diS); // diS loaded last cycle - 3 Cycle Stall
diS = vec_madd(cv3, di3, diS); // diS loaded last cycle - 3 Cycle Stall

then you can use vsldoi to rotate diS.abcd by 4 bytes into a temporary register, which then holds diS.bcda

then you add the two registers, which gives diS.(ab)(bc)(cd)(da)

then you repeat the process by rotating this new result by 8 bytes, again using vsldoi, so you have diS.(cd)(da)(ab)(bc)

then you add the partial sum to its shifted version

diS.(ab)(bc)(cd)(da) + diS.(cd)(da)(ab)(bc) = diS.(abcd)(abcd)(abcd)(abcd)

The net result is a vector where all 4 components are the sum of the 4 original components. Variants of the same scheme can be used to sum more dimensions, or fewer; you mostly have to get used to vsldoi.

In a less verbose format:

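temp = vec_vsldoi(diS, diS, 4);   // temp = diS.bcda
diS = vec_add(diS, temp);         // diS = (ab)(bc)(cd)(da)
temp = vec_vsldoi(diS, diS, 8);   // temp = (cd)(da)(ab)(bc)
sum = vec_add(diS, temp);         // sum = (abcd) in all 4 components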
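
For anyone who wants to check the shuffling without a G4 at hand, here is a rough scalar model of the same two rotate-and-add steps in plain C. It is only a sketch: vec4f, rotate_left, add4 and horizontal_sum are names made up for this illustration, not AltiVec intrinsics, and the rotation is counted in elements rather than bytes (so a vsldoi by 4 bytes becomes a rotation by 1 float).

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for a 4 x float AltiVec register. */
typedef struct { float v[4]; } vec4f;

/* Models vec_vsldoi(x, x, 4 * n): rotate the elements left by n positions. */
static vec4f rotate_left(vec4f x, size_t n)
{
    vec4f r;
    for (size_t i = 0; i < 4; ++i)
        r.v[i] = x.v[(i + n) % 4];
    return r;
}

/* Models vec_add: element-wise addition. */
static vec4f add4(vec4f x, vec4f y)
{
    vec4f r;
    for (size_t i = 0; i < 4; ++i)
        r.v[i] = x.v[i] + y.v[i];
    return r;
}

/* Two rotate-and-add steps leave the full sum in every element. */
static vec4f horizontal_sum(vec4f diS)
{
    vec4f pairs = add4(diS, rotate_left(diS, 1));  /* (ab)(bc)(cd)(da)   */
    return add4(pairs, rotate_left(pairs, 2));     /* (abcd) in all four */
}
```

Feeding it a vector such as (1, 2, 3, 4) and looking at the four elements of the result is a quick way to convince yourself that every element ends up with the same total.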

However, the problem with this is, again, that each instruction depends immediately on the result of the previous one. So it's best done in a larger block of code, interleaved with other independent calculations.

If you have a larger set of dimensions to calculate and add, you can postpone the above "mixing" until the end: simply add the intermediate vectors together into an accumulator, then add the subelements of that accumulator only once, when you are done.
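
As a rough illustration of postponing the mixing, here is a scalar C sketch that keeps four per-lane partial sums inside the loop and adds them across lanes only once, after the loop. The dot product, the function name dot_product_by_lanes, and the requirement that n be a multiple of 4 are all assumptions made for the example, not something taken from the original code.

```c
#include <stddef.h>

/* Hypothetical example: dot product of two arrays, n assumed to be a multiple of 4. */
static float dot_product_by_lanes(const float *x, const float *y, size_t n)
{
    /* acc stands in for one vector register used as the accumulator. */
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    /* Main loop: one multiply-add per lane, in the spirit of vec_madd.
       No lane ever reads another lane here. */
    for (size_t i = 0; i < n; i += 4)
        for (size_t lane = 0; lane < 4; ++lane)
            acc[lane] += x[i + lane] * y[i + lane];

    /* The cross-lane sum happens once, after the loop. */
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

In real AltiVec code the inner lane loop would be a single vec_madd into the accumulator register, and the last line would be the vsldoi/add sequence shown earlier.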

Ludo

PerfOptimization-dev mailing list      (email@hidden)
References: 
 >loop unrolling and AltiVec register utilization (From: Stan Jou <email@hidden>)
 >Re: loop unrolling and AltiVec register utilization (From: Paul Sargent <email@hidden>)




Copyright © 2007 Apple Inc. All rights reserved.