Mailing Lists: Apple Mailing Lists


Re: altivec/velocity engine examples



> Message: 6
> Date: Tue, 14 Sep 2004 10:23:50 -0500
> From: "Sean C. Garrick" <email@hidden>
> Subject: Re: altivec/velocity engine examples
> To: Todd Dailey <email@hidden>
> Cc: Apple Scitech Mailing List <email@hidden>
> Message-ID: <email@hidden>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> Todd:
> 
> I've looked at that many times and it's not really helped me.
> Any real-world examples? Any before/after code snippets?
> 
> Thanks!
> Sean
> 
> 
> ------------------------------
> 
> Message: 7
> Date: Tue, 14 Sep 2004 18:40:32 +0300
> From: Kyros Yakinthos <email@hidden>
> Subject: Re: altivec/velocity engine examples
> To: email@hidden
> Cc: Apple Scitech Mailing List <email@hidden>,    "Discussion
> list for clustering Apple server technologies \(previously
> clusters\)." <email@hidden>
> Message-ID: <email@hidden>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> I would like to add one more question to this one posted by Sean:
> 
> Has anyone found nice speedups when using AltiVec in FORTRAN code,
> especially with IBM's xlf?
> 
> In the end, is it worth programming with AltiVec in FORTRAN code by
> calling C subroutines?
> 
> Kyros
> 
> 


So I don't bore the old-timers with a repeat of an old war story, anyone
interested can search the archives for "Jet3D" (I'd search myself, but the new
list server requires a password and I can't remember mine at the moment).  I am
fairly certain I have talked about my previous experience with FORTRAN and
AltiVec in Jet3D, which is a typical CFD postprocessor code.  I probably
even included examples.  I got about a 5-9X speedup, depending on how you count
your beans, all from replacing 10-12 lines of FORTRAN code (in my innermost
loop) with a few calls to C subroutines containing AltiVec instructions and
vecLib calls.  There are probably fewer than 40 lines of C code in total.  At
the time I didn't know C and was new to vector programming, so if I could
figure it out, anybody can.  It is not too hard to roll your own vector code
in C, but there are a few important rules to follow, the most important being
to keep data properly aligned (F77 malloc and F90 allocate will take care of
this for you).  In my particular case I got a huge payoff because the
vectorization implemented 4-way parallelism in the kernel of a nested loop,
and the speedup really compounded.  I would heartily recommend that anyone in
the same situation take a serious look at AltiVec.
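
For anyone who wants a feel for the general shape of this, here is a minimal
sketch.  It is not the Jet3D code: the routine name vsaxpy_, its argument list,
and the y = a*x + y kernel are all hypothetical, picked only for illustration.
It assumes the FORTRAN compiler passes arguments by reference and appends a
trailing underscore to external names, which many compilers do but not all
(some need a compiler option or a different C name).  The AltiVec path is used
only when both arrays are 16-byte aligned; otherwise, and on machines without
AltiVec, the plain scalar loop does all the work.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/*
 * Hypothetical kernel: y(i) = a*x(i) + y(i) for i = 1..n.
 * Intended to be called from FORTRAN as:  call vsaxpy(n, a, x, y)
 * All arguments arrive as pointers because FORTRAN passes by reference.
 */
void vsaxpy_(const int *n, const float *a, const float *x, float *y)
{
    int i = 0;

    assert(*n >= 0);

#ifdef __ALTIVEC__
    /* Use vector loads/stores only if both arrays are 16-byte aligned. */
    if ((((uintptr_t)x | (uintptr_t)y) & 15u) == 0) {
        /* Copy the scalar a into all four lanes of a vector. */
        union { float f[4]; vector float v; } s;
        s.f[0] = s.f[1] = s.f[2] = s.f[3] = *a;

        /* Four floats per iteration: y = a*x + y as one multiply-add. */
        for (; i + 4 <= *n; i += 4) {
            vector float vx = vec_ld(0, x + i);
            vector float vy = vec_ld(0, y + i);
            vec_st(vec_madd(s.v, vx, vy), 0, y + i);
        }
    }
#endif

    /* Scalar loop: handles the leftover elements, or everything when the
       vector path above was not taken. */
    for (; i < *n; i++)
        y[i] = *a * x[i] + y[i];
}
```

On the FORTRAN side, the inner loop would then shrink to a single call along
the lines of "call vsaxpy(n, a, x, y)", with x and y declared as
single-precision arrays.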

Craig

-- 
Dr. Craig Hunter
NASA Langley Research Center
AAAC/Configuration Aerodynamics Branch
email@hidden  (new!!)
(757) 864-3020
(Dual G4 - OS X)





Copyright © 2007 Apple Inc. All rights reserved.