Mailing Lists: Apple Mailing Lists


Re: altivec/velocity engine examples



On 14 Sep 2004, at 20:16, Jay A. Kreibich wrote:

On Tue, Sep 14, 2004 at 06:40:32PM +0300, Kyros Yakinthos scratched on the wall:

Is it ultimately worth programming with AltiVec in FORTRAN code by
calling C subroutines?

As a professional software engineer, my answer would be "no," but there is a lot in that answer-- I might not even be answering the question you're asking.

  If you are asking about using existing libraries or frameworks, such
  as Apple's Accelerate framework (which contains vector-optimized
  versions of vDSP, vImage, BLAS, LAPACK, vMathLib, and vBigNum)
  then I would say, "yes!!!"  If any of Apple's libraries does what you
  need, it is likely worth the trouble to stub the libs out in C to
  FORTRAN and/or link against them.  They have both vectorized and
  non-vectorized versions of most of the calls, so they'll run on G3s
  as well as G4s and G5s as required.  You really don't need to think
  or care whether the call you are making is vectorized-- you can just
  be reasonably confident that the library will get whatever you want
  done as fast as it can given the current processor and datatypes--
  including future hardware.  No need to re-invent the wheel.
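
  A minimal sketch of what such a C stub might look like, assuming a
  FORTRAN compiler that lowercases external names, appends one trailing
  underscore, and passes every argument by reference (a common
  convention, but check your compiler). The name dotprwrap_ and the
  USE_ACCELERATE build flag are hypothetical; vDSP_dotpr is the
  single-precision dot product in Accelerate's vDSP. Without the flag
  the stub falls back to a plain loop so it builds on any system.

```c
#include <stddef.h>

#if defined(USE_ACCELERATE)
/* Mac OS X only; build with -DUSE_ACCELERATE -framework Accelerate */
#include <Accelerate/Accelerate.h>
#endif

/*
 * Hypothetical stub, callable from FORTRAN as
 *     CALL DOTPRWRAP(N, X, Y, RESULT)
 * FORTRAN passes arguments by reference, so everything arrives as a
 * pointer. The result comes back through an argument rather than a
 * return value, which sidesteps differences in how compilers return
 * REAL function values.
 */
void dotprwrap_(const int *n, const float *x, const float *y, float *result)
{
#if defined(USE_ACCELERATE)
    /* Unit stride on both inputs; the library picks the code path. */
    vDSP_dotpr(x, 1, y, 1, result, (vDSP_Length)*n);
#else
    /* Portable fallback with the same interface. */
    float sum = 0.0f;
    int i;
    for (i = 0; i < *n; ++i) {
        sum += x[i] * y[i];
    }
    *result = sum;
#endif
}
```

  The name deliberately contains no underscore of its own, since some
  FORTRAN compilers append a second trailing underscore to such names.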

  If you are looking at auto-vectorization tools, I would say a much
  less enthusiastic "yes," or even just a "maybe." If the tools aren't
  very expensive, they are worth a shot, but you should
  understand what you have (or don't have) to gain so you can look at
  the cost and the result and see what works for you.  For many, these
  tools are a gift from the heavens; for others they only offer
  disappointment.


On the other hand, if you are asking about hand-coding operations on
the vector unit (I assume you are), I would seriously question this
practice. Vector programming is very tricky. It is not a simple
"array processor"... how you pack your data into vector units is very
critical to performance and you have to understand a lot of the low
level details to wrap your mind around how the vector unit was
designed to be used. Even if you're writing in C or FORTRAN you need
to *think* in individual assembly instructions; that also means knowing
your tools and systems well enough to know how and why specific
program statements are compiled into machine instructions. Doing
this kind of thing well is extremely difficult, just as hand-coding
instructions for the G5's dual FPUs would be extremely difficult. I
would never attempt it without the PowerPC-970 Instruction Reference
Manual on the desk next to me. If you've never looked at a processor
reference manual, save yourself and don't start. Most CS undergrads
have never looked at one (although most CE or EE undergrads have!).
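
To give a flavor of the bookkeeping involved, here is a hedged sketch
of about the simplest possible case, y = a*x + y on single-precision
arrays, using the C AltiVec intrinsics from <altivec.h>. The function
name saxpy_sketch is hypothetical. The vector path is only compiled
when the compiler defines __ALTIVEC__, and it assumes both arrays are
16-byte aligned; on any other target only the scalar loop is built.

```c
#include <stddef.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy_sketch(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;

#if defined(__ALTIVEC__)
    /*
     * ASSUMPTION: x and y are 16-byte aligned. vec_ld and vec_st
     * silently ignore the low four address bits, so misaligned
     * pointers would read and write the wrong elements.
     */
    float a_buf[4] __attribute__((aligned(16))) = { a, a, a, a };
    vector float va = vec_ld(0, a_buf);   /* a replicated in all 4 lanes */

    for (; i + 4 <= n; i += 4) {
        vector float vx = vec_ld(0, x + i);
        vector float vy = vec_ld(0, y + i);
        /* Fused multiply-add on four floats at once: va * vx + vy. */
        vec_st(vec_madd(va, vx, vy), 0, y + i);
    }
#endif

    /* Scalar loop: the leftover tail, or everything without AltiVec. */
    for (; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Even this toy needs an alignment assumption, a replicated scalar, and
separate tail handling, and it does nothing about unaligned data,
strides, or scheduling, which is where the real difficulty starts.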


  For the bench scientist, researcher, and/or engineer, this kind of
  very low-level mucking about is very rarely worth the effort.  I
  assume most of the people on this list are scientists or engineers
  first and computer programmers second.  This is a good thing
  (actually, it isn't.  I'd rather you guys were computer programmers
  "tenth" or some larger number, but that's a different story).  The
  computer is simply a means to an end, not an obsession in itself.
  Spend your time doing good research, not fighting compilers.

Faster code may lead to faster and better research, but consider
this. The ideal vector code will, at best, give you 2x the performance
over the ideal non-vector code on the G5 (assuming single-precision
floating point; double-precision can't be vectorized; best-case
integer performance may be higher). One could also make a strong
argument that it is easier to write "good" non-vector code than it is
to write "good" vector code, effectively making that 2x even smaller.
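
One plausible way to arrive at that 2x figure is a peak-throughput
count. The sketch below assumes two scalar FPUs and one four-lane
single-precision vector unit, each able to issue one fused
multiply-add (two flops per lane) per cycle. Those unit counts and the
helper name are assumptions made for illustration, not figures from
the message above.

```c
/*
 * Peak single-precision flops per clock cycle, under the stated
 * assumptions: units * lanes per unit * flops per lane operation.
 */
int peak_flops_per_cycle(int units, int lanes, int flops_per_op)
{
    return units * lanes * flops_per_op;
}

/* Ratio of vector peak to scalar peak for the assumed G5 layout. */
double assumed_g5_vector_advantage(void)
{
    int scalar = peak_flops_per_cycle(2, 1, 2);  /* two FPUs, one lane each */
    int vector = peak_flops_per_cycle(1, 4, 2);  /* one unit, four lanes    */
    return (double)vector / (double)scalar;      /* 8 / 4                   */
}
```

Under these assumptions the scalar side peaks at 4 flops per cycle and
the vector side at 8, which is where a best-case factor of two comes
from; real code rarely sustains either peak.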


  If all you want is 2x performance, go buy another machine.  It is
  likely to be much cheaper than the people-time to make the code
  faster by hand-vectorizing it.  Even if that requires rewriting
  sections to allow distributed computing, that time is better
  spent.  At least distributed versions typically scale
  past two.

  OK, I'll admit that "buy more machines" isn't an option if you
  already have a 1000-node cluster since another 1000 machines will
  pay for a *lot* of programming time (I'll trade you!).  On those
  kinds of scales, it is an individual call.




Everyone's situation is different, and there are times when cost/performance is outweighed by raw performance. Just understand the high costs of this kind of work, and the rather slim results even if you do a great job. That said, there is no reason not to take advantage of it if you can-- the Accelerate libs from Apple make that easy and can reduce a lot of other programming work. They'd be highly desirable even if they weren't vectorized. Throw in the IBM compilers, which are fairly inexpensive next to the programming time they can save, and you're fairly well off. But tweaking the vector pipeline by hand is high wizardry.

   -j



Jay,

I am 100% satisfied with your answer.
It is the clearest, or maybe the most cynical, answer I have ever had about
where AltiVec finally stands for simple scientific computing.
Thank you for the enlightenment you have offered me.


A mechanical engineer who is fighting with compilers, vectorizers, AltiVecizers, etc. etc. etc.

Kyros



_______________________________________________
Scitech mailing list      (email@hidden)




Copyright © 2007 Apple Inc. All rights reserved.