Mailing Lists: Apple Mailing Lists


Re: altivec/velocity engine examples



> I got about 5-9X speedup depending on how you count
> your beans, all from replacing 10-12 lines of FORTRAN code 

> In my particular case I got a huge payoff because the
> vectorization implemented 4-way parallelism in the kernel of a nested loop,
> and the speedup really compounded.  I would heartily recommend anyone in the
> same situation take a serious look at AltiVec.

  This brings up a few points that perhaps I should have made more
  clear in my first "rant."  I realize I came across a bit more
  strongly than I really meant.

  First off, if you have a simple situation, it can't hurt to try.  If
  all you do is spend an afternoon trying to get it to work, even a small
  pay-off will justify the time.  My main point was that people often
  seem to have a strange obsession with performance and let it blind
  them to the simple economics of the situation (open-source programmers
  are some of the worst at this).  They'll happily let themselves get
  dragged into an optimization fight for weeks that, in the end, leads
  to some fairly minor improvements.  They'll come back after a month
  and say, "Look, it runs 10% faster!" and I'll say, "That's great.
  Moore's Law did better in the same amount of time."
  
  These kinds of battles are just not worth the time and stress unless
  you're running on a multi-million dollar machine or have some other
  unusual situation that justifies the time and cost.

  Giving it a basic try on a simple program (or simple loop) will,
  however, occasionally produce results well worth the time and effort.
  Where you get into trouble is when you have to redesign a large
  existing FORTRAN program that is already stable and known to work
  correctly; it often just isn't worth the effort.  The time it takes
  to rerun the tests (for those rare programs that *have* tests) to
  verify correctness can be significant, depending on the complexity of
  the application.  You just have to be smart about the true cost of
  getting the modifications to work and the returns those modifications
  will give you.

  Another thing that might not have been clear is that the advantages
  of the AltiVec when dealing with 32-bit floating point numbers on the
  G5 are not awesome -BUT- there is a much bigger difference on the G4.
  If you need your code to run on G4 systems (e.g. PowerBooks), it might
  be worth a little extra time to look at the vector stuff.  There are
  much bigger improvements available on that platform.  The same is
  true with integer stuff on the G4 and the G5.  If you're doing
  DSP-like algorithms on 8- or 16-bit integer data, the vector stuff
  can provide *huge* improvements.
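
  As a rough illustration of the kind of integer loop meant here, the
  sketch below mixes two buffers of 16-bit samples with a saturating
  add, in plain scalar C.  The names sat_add16 and mix_samples are
  made up for this example.  AltiVec registers are 128 bits wide, so
  they hold eight 16-bit (or sixteen 8-bit) integers against only four
  32-bit floats, and the instruction set includes saturating adds
  (vec_adds), which is why loops shaped like this one tend to pay off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add of two 16-bit samples: clamps instead of wrapping. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;  /* widen so the sum cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Mix two sample buffers into dst, one element per iteration.
   mix_samples is a hypothetical name, not part of any library.
   A vectorized version would handle eight samples per iteration. */
void mix_samples(int16_t *dst, const int16_t *a, const int16_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = sat_add16(a[i], b[i]);
}
```

  Each iteration is independent of the others and the data is narrow,
  which are the two properties that make a loop a good candidate.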
  
  Nearly all of the work I've done in labs (mostly low-energy physics)
  was strictly double-precision floating point.  It is sometimes hard
  to remember that people still do interesting stuff with integers.  8-)

  Also, as someone who usually works on large software systems where
  long-term maintenance and support are key concerns, I'm usually a bit
  pessimistic about departing from the norm.  That's why I like the
  Accelerate libs-- they give me all the performance a specific
  platform can give without any of the work or trouble.  I don't need
  to special case G3s or anything like that.  I also like having the
  higher-level APIs available, which means less work for me.  I'm also
  often working with improving large systems that don't have easy or
  obvious bottlenecks.  Applications often require reworking whole
  computational sections, not just a loop here or a step there. These
  are not the kind of things you want to dive into lightly and start to
  tear apart.  Small tight loops or single steps are prime candidates
  for this kind of key-hole optimization.  Unfortunately (for me,
  anyway), by the time I see most code, these kinds of easy steps
  have already been done.
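
  To put rough numbers on why the lack of an obvious bottleneck
  matters, here is a minimal sketch of Amdahl's law in C.  The function
  name overall_speedup and the example fractions are illustrative
  assumptions, not measurements from any real program.

```c
#include <assert.h>

/* Overall speedup when a fraction `hot` (0..1) of the run time is made
   `k` times faster and the rest is left untouched (Amdahl's law). */
double overall_speedup(double hot, double k)
{
    assert(hot >= 0.0 && hot <= 1.0 && k > 0.0);
    return 1.0 / ((1.0 - hot) + hot / k);
}
```

  With a 4x vector speedup, a loop that accounts for 90% of the run
  time gives roughly 3x overall, while one that accounts for 20% gives
  less than 1.2x.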
  
  I've also worked in enough research institutions to know that isn't
  how all the world works.  If you do want to take a crack at the
  vector stuff, the best advice I can offer is to read everything on
  Apple's site about the AltiVec unit.  As others have said, memory
  alignment is key to getting any kind of performance out of the
  system. Also, before you start anything RUN SHARK!!!  I can't stress
  this enough. No matter how sure you are of the bottleneck, profile
  the code so that you know EXACTLY where the problem is.  Don't waste
  your time fixing what isn't broken.  Shark is an **amazing** tool,
  and it comes free with the developer tools. Take advantage of it.
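
  On the alignment point: AltiVec's vector loads and stores operate on
  16-byte boundaries, so it helps to know where your arrays actually
  start.  The sketch below is plain C with made-up helper names
  (is_aligned16, elems_until_aligned16), and it assumes 4-byte floats.
  It checks a pointer and counts how many leading elements a scalar
  loop would have to handle before the data reaches a boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Number of leading floats to process one at a time before p reaches
   the next 16-byte boundary.  Assumes p is itself float-aligned and
   that sizeof(float) == 4; both are assumptions of this sketch. */
size_t elems_until_aligned16(const float *p)
{
    size_t misalign = (size_t)((uintptr_t)p & 15u);  /* bytes past the last boundary */
    assert(misalign % sizeof(float) == 0);
    return ((16u - misalign) & 15u) / sizeof(float);
}
```

  If the count is nonzero for your hot arrays, fixing the allocation
  is usually simpler than peeling elements off the front of every loop.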

  Finally, I want to apologize to anyone who got the impression that I
  was putting down their 'l33t h4ck3r ski11z'. While it may be safe to
  assume that most of you get the most enjoyment out of your primary
  profession (and computer programming isn't it), I never meant to imply
  that none of you can develop software.  The vector stuff *is* a bit
  tricky, and you need to keep all your Is dotted and Ts crossed, but
  someone who has a good grasp of how computers work beyond FORTRAN or
  C itself *can* do this.  It may be wizardry, but it isn't impossible.

  That said, there are a lot of people in academia (well, and industry,
  for that matter) that have no concept of how much skilled development
  time (by anyone!) costs.  No, price/performance isn't everything, but it
  is a starting point.  If you want to cross that line and go for a
  finely-tuned, well-oiled program that just screams, go for it!  Just
  be aware of the decisions you're making and why you're making them.
  Then it just becomes an engineering compromise.

  Best of luck to everyone,

   -j

-- 
                     Jay A. Kreibich | Comm. Technologies, R&D
                        email@hidden | Campus IT & Edu. Svcs.
          <http://www.uiuc.edu/~jak> | University of Illinois at U/C

References: 
 >Re: altivec/velocity engine examples (From: Craig Hunter <email@hidden>)


