Your example isn't entirely fair, since you're making your C code do at
least one third more work than it needs to. However, to measure the
real-world improvement from AltiVec, you should use vadd() and vsmul()
from Accelerate.framework. The C code is below:
This should run considerably faster. There's also a function called
vam() which will do exactly what you want, but it needs a third vector
containing the multiplier. If you create a third vector (say dd) and
fill it with the value 2.0, you can use vam() and your code should run
even faster.
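For reference, the arithmetic behind these three routines can be written
out in plain C. This is only a scalar model of what the calls compute,
not the Accelerate implementation: the names ref_vadd, ref_vsmul and
ref_vam are made up for illustration, and the stride arguments that the
real routines take are left out. (Current SDKs spell the real routines
vDSP_vadd, vDSP_vsmul and vDSP_vam.)

```c
#include <stddef.h>

/* Scalar model of vadd(): c[i] = a[i] + b[i]. Hypothetical helper. */
void ref_vadd(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Scalar model of vsmul(): c[i] = a[i] * s. Hypothetical helper. */
void ref_vsmul(const float *a, float s, float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] * s;
}

/* Scalar model of vam(): d[i] = (a[i] + b[i]) * c[i]. Hypothetical helper. */
void ref_vam(const float *a, const float *b, const float *c, float *d,
             size_t n)
{
    for (size_t i = 0; i < n; i++)
        d[i] = (a[i] + b[i]) * c[i];
}
```

Calling ref_vadd() and then ref_vsmul() with 2.0f gives the same values
as one call to ref_vam() with a third vector filled with 2.0f. The
single-call form makes one pass over the output instead of two, which is
presumably where the extra speed would come from.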
Now, using the Accelerate framework should give you a good idea of what
to shoot for in your own code. Without getting into too much detail,
there are quite a few improvements you can make which should speed up
your own code. Here's a revised version of your C code:
// NOTE: I typed this directly into my mail client; it should work, but
// I haven't tested it.
void my_vec_add(float *a, float *b, float *c, int *n) {
    int i, length = *n;
    for(i=0; i < length ; i += 16) {
        vector float vec_a, vec_b, vec_sum, vec_c;
The relevant changes are:
1. Using vec_ld() explicitly rather than dereferencing the pointers.
At least on earlier versions of GCC, I've found that this helps the
compiler to unroll the loop and optimize a bit more.
2. Using only two vec_add()'s. In performance-critical code, you
never, ever want to use a suboptimal algorithm.
3. Miscellaneous improvements like taking out the "inline" and the
#define's, neither of which are necessary or particularly good form.
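Changes 1 and 2 might look roughly like the following. This is a sketch
rather than the listing from this message: the function name add_twice
is hypothetical, and it assumes the goal is c = 2*(a + b), that n is a
multiple of 4, and that all three arrays are 16-byte aligned. The
AltiVec path is compiled only when the compiler defines __ALTIVEC__
(for example GCC with -maltivec); everywhere else a scalar loop with the
same arithmetic is used.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Hypothetical example: c[i] = 2 * (a[i] + b[i]) using two additions.
 * Assumes n is a multiple of 4 and 16-byte aligned arrays. */
void add_twice(const float *a, const float *b, float *c, size_t n)
{
    assert(n % 4 == 0);
#ifdef __ALTIVEC__
    for (size_t i = 0; i < n; i += 4) {
        /* Explicit aligned loads instead of dereferencing vector pointers. */
        vector float vec_a   = vec_ld(0, a + i);
        vector float vec_b   = vec_ld(0, b + i);
        /* Two adds in total: (a + b), then that sum added to itself. */
        vector float vec_sum = vec_add(vec_a, vec_b);
        vector float vec_c   = vec_add(vec_sum, vec_sum);
        vec_st(vec_c, 0, c + i);
    }
#else
    /* Scalar fallback with the same two additions per element. */
    for (size_t i = 0; i < n; i++) {
        float sum = a[i] + b[i];
        c[i] = sum + sum;
    }
#endif
}
```

Writing the result as sum + sum rather than a + b + a + b is what drops
the third addition per element.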
The revised version should run a little faster than what you had before.
If you really want to make this run as fast as possible, you might have
to do some loop unrolling by hand (depending on whether the compiler
does a good enough job of it for you) and perhaps add some cache hints.
Honestly, though, cache hints and things like vec_ldl() are only a last
resort and will only give you a small percentage improvement.
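Hand unrolling here just means handling several independent chunks in
each loop iteration, so there is less loop overhead and more work the
processor can overlap. A minimal sketch in plain scalar C, assuming n is
a multiple of 4 (the name add_twice_unrolled is hypothetical; in an
AltiVec version each line would be a vec_ld/vec_add/vec_st group on
consecutive 16-byte blocks instead of a single float):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: c[i] = 2 * (a[i] + b[i]), unrolled four times
 * by hand. Assumes n is a multiple of 4. */
void add_twice_unrolled(const float *a, const float *b, float *c, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        /* Four independent sums: none depends on another's result. */
        float s0 = a[i]     + b[i];
        float s1 = a[i + 1] + b[i + 1];
        float s2 = a[i + 2] + b[i + 2];
        float s3 = a[i + 3] + b[i + 3];

        /* Double each sum with one more addition. */
        c[i]     = s0 + s0;
        c[i + 1] = s1 + s1;
        c[i + 2] = s2 + s2;
        c[i + 3] = s3 + s3;
    }
}
```

Whether this beats the compiler's own unrolling is something to measure
rather than assume.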
Brendan Younger
On Sep 16, 2004, at 2:35 AM, Kyros Yakinthos wrote:
OK,
After the very nice talk about AltiVec,
I suggest the following:
Let's proceed to a small test.
Here is a small FORTRAN code.
It calculates a "stupid" sum, writes the time needed for this
calculation, and then calculates the same sum once again by calling a
C subroutine where AltiVec is used.
----------------------------------------------------------------------
real,dimension(:),allocatable:: aa,bb,cc
c real(8)::t1,t2,t3,t4
double precision time, overhead,time1,time2,time3,time4
Let's compile these sources and measure the times on various machines
(G4s, G5s).
A good parameter to vary is ni. By changing this number, we can see how
AltiVec behaves.
(It would be nice to show the speedups when using various compilers.
I think it is a very nice and simple test.)
My experience with xlf is that (when I use the -O5 option) AltiVec
can show a 2X speedup, but this case is rare.
Of course, a 4X speedup is a dream.
OK, I know the problem is maybe with prefetching or something else...
My results (xlf with -O5, gcc with -O3, G4 867 PB):
In another C subroutine, when I was using an extra if statement for the
cases where the number of floats was not a multiple of 4, AltiVec was
slower. This seems to me a logical result.
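One common way to handle lengths that are not a multiple of 4 without a
test inside the hot loop is to split the work: run the blocked loop over
the largest multiple of 4, then finish the last few elements with a
scalar loop. A minimal sketch, assuming the computation is
c = 2*(a + b); the function name is hypothetical and the inner scalar
loop stands in for one AltiVec load/add/store group:

```c
#include <stddef.h>

/* Hypothetical example: c[i] = 2 * (a[i] + b[i]) for any length n,
 * with the remainder handled outside the main loop. */
void add_twice_any_length(const float *a, const float *b, float *c,
                          size_t n)
{
    size_t blocked = n - (n % 4);  /* largest multiple of 4 not above n */
    size_t i;

    /* Main loop: whole groups of 4 floats, no length test per element. */
    for (i = 0; i < blocked; i += 4) {
        for (size_t k = 0; k < 4; k++) {  /* stand-in for one vector op */
            float sum = a[i + k] + b[i + k];
            c[i + k] = sum + sum;
        }
    }

    /* Tail: at most 3 leftover elements, done one at a time. */
    for (; i < n; i++) {
        float sum = a[i] + b[i];
        c[i] = sum + sum;
    }
}
```

The remainder check then runs once per call instead of once per
iteration.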
The results on a G5 are not better. The (xlf) FORTRAN add sometimes
shows nearly equal results to AltiVec.
Unfortunately, I had to set up my very small cluster of Xserves again
yesterday, and I do not have any fresh numbers to report.
So, could anyone suggest how I should proceed?
By taking into account the cache directives?
If yes, does this also apply to the G5?
By looking at Shark's messages, which are very "mystic" for a simple
engineer who programs in FORTRAN?
Or by searching for a computer engineer to help me?
Thank you again,
Kyros
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Scitech mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/scitech/email@hidden
This email sent to email@hidden