Re: improving numerical applications performance
  • Subject: Re: improving numerical applications performance
  • From: "Edward K. Chew" <email@hidden>
  • Date: Mon, 17 May 2004 18:06:46 -0400

This is a complex topic, but allow me to weigh in with a number of points:

1. The first thing to do is profile your application to see which functions are taking the most time.

2. Of these functions, determine which can be handled in single-precision without overly compromising the accuracy of your results. These, then, would be prime candidates for AltiVec optimization.
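
One rough way to judge point 2 is to run the same kernel in both precisions and compare the results. The sketch below is only an illustration: the function name single_precision_error and the sum-of-squares kernel are my own assumptions, standing in for whatever your hot function actually computes.

```c
#include <stddef.h>

/* Hypothetical check: accumulate a sum of squares in double and in float,
   then return the relative error of the single-precision result. */
double single_precision_error(const double *x, size_t n)
{
    double ref = 0.0;     /* double-precision reference */
    float approx = 0.0f;  /* single-precision candidate */

    for (size_t i = 0; i < n; i++) {
        float xf = (float)x[i];
        ref += x[i] * x[i];
        approx += xf * xf;
    }

    if (ref == 0.0)
        return 0.0;       /* nothing to compare against */

    double err = ((double)approx - ref) / ref;
    return err < 0.0 ? -err : err;
}
```

If the error this reports on representative data is well inside the tolerance of your application, the function is a reasonable candidate for single precision.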

3. Check to see if Apple's AltiVec-optimized libraries can handle any of the functionality you are looking for. Otherwise, you may need to write some assembly-like code using the AltiVec C extensions. In most cases, however, a careful analysis of your program will reveal that only a few places require such tweaking, so the amount of code you would need to change is probably small. Automatic vectorization through compiler options, VAST, etc. may help to a degree, but you can usually do better by hand (though NOT better than Apple's libraries).
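
To give a flavour of what the AltiVec C extensions in point 3 look like, here is a minimal sketch of an element-wise multiply-add. The function name multiply_add is hypothetical, and I am assuming all four arrays are 16-byte aligned and that the compiler defines __ALTIVEC__ when AltiVec is enabled. When it is not, the code falls back to a plain scalar loop.

```c
#include <stddef.h>
#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Hypothetical helper: out[i] = a[i] * b[i] + c[i] for n floats.
   The AltiVec path assumes every pointer is 16-byte aligned. */
void multiply_add(const float *a, const float *b, const float *c,
                  float *out, size_t n)
{
    size_t i = 0;

#if defined(__ALTIVEC__)
    /* Process four floats per iteration with one fused multiply-add. */
    for (; i + 4 <= n; i += 4) {
        vector float va = vec_ld(0, a + i);
        vector float vb = vec_ld(0, b + i);
        vector float vc = vec_ld(0, c + i);
        vec_st(vec_madd(va, vb, vc), 0, out + i);
    }
#endif

    /* Scalar loop: handles the tail, or everything without AltiVec. */
    for (; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}
```

Real kernels usually need more care than this (unaligned data, prefetching, loop unrolling), which is one reason to check Apple's libraries first.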

4. Consider parallelizing your code with one pre-emptive thread per processor. High-end Macs have dual processors nowadays, so you might as well put them both to work.
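
A minimal sketch of point 4 using POSIX threads follows. The names sum_task, sum_worker and parallel_sum are hypothetical, the array sum is just a stand-in for real work, and I am assuming sysconf(_SC_NPROCESSORS_ONLN) is available to report the processor count (your system may offer a different call for that).

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* One slice of the array, handled by one thread. */
typedef struct {
    const double *data;
    size_t begin, end;
    double partial;
    int started;          /* nonzero if a thread was created for this slice */
    pthread_t thread;
} sum_task;

static void *sum_worker(void *arg)
{
    sum_task *t = arg;
    double s = 0.0;
    for (size_t i = t->begin; i < t->end; i++)
        s += t->data[i];
    t->partial = s;
    return NULL;
}

/* Hypothetical example: sum an array using one thread per processor. */
double parallel_sum(const double *data, size_t n)
{
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    size_t nthreads = ncpu > 0 ? (size_t)ncpu : 1;
    sum_task *tasks = malloc(nthreads * sizeof *tasks);
    double total = 0.0;

    if (tasks == NULL) {              /* fall back to a serial sum */
        for (size_t i = 0; i < n; i++)
            total += data[i];
        return total;
    }

    size_t chunk = (n + nthreads - 1) / nthreads;
    for (size_t k = 0; k < nthreads; k++) {
        size_t begin = k * chunk < n ? k * chunk : n;
        size_t end = begin + chunk < n ? begin + chunk : n;
        tasks[k].data = data;
        tasks[k].begin = begin;
        tasks[k].end = end;
        tasks[k].partial = 0.0;
        tasks[k].started =
            pthread_create(&tasks[k].thread, NULL, sum_worker, &tasks[k]) == 0;
        if (!tasks[k].started)
            sum_worker(&tasks[k]);    /* could not spawn: do the slice here */
    }

    for (size_t k = 0; k < nthreads; k++) {
        if (tasks[k].started)
            pthread_join(tasks[k].thread, NULL);
        total += tasks[k].partial;
    }

    free(tasks);
    return total;
}
```

Each thread works on its own contiguous slice and writes only to its own task record, so no locking is needed until the partial results are combined after the joins.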

5. Try to determine if your code is CPU-bound or memory-bound. If you see only a modest improvement with AltiVec and a negligible improvement with multithreading, it is likely memory-bound. In other words, the PowerPC chip is stuck doing nothing because it is waiting to fetch something from RAM. This is a difficult problem to solve. Careful management of cache memory may help to a degree, but a change to your algorithm may be the ultimate solution. For example, if you have pre-calculated some sort of interaction matrix, consider removing the matrix altogether and calculating just the cells you need on the fly. I was shocked to discover that a 30-step calculation (including square roots) can be much faster to perform than a single array lookup (i.e. if you are missing the cache a lot with a huge data set)!
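
The matrix trade-off in point 5 can be sketched as follows. Everything here is hypothetical: the point3 type, the function names, and the interaction formula 1/(1 + r^2), which I picked only to keep the example short. The table version and the on-the-fly version return the same numbers, but for n points the table needs n*n doubles (roughly 800 MB for 10,000 points), while the on-the-fly version reads only the 24*n bytes of point data.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { double x, y, z; } point3;

/* Hypothetical pairwise interaction, computed on demand from two points. */
double interaction(const point3 *p, size_t i, size_t j)
{
    double dx = p[i].x - p[j].x;
    double dy = p[i].y - p[j].y;
    double dz = p[i].z - p[j].z;
    return 1.0 / (1.0 + dx * dx + dy * dy + dz * dz);
}

/* Pre-calculated approach: an n-by-n table. Caller frees the result.
   Returns NULL if the allocation fails. */
double *build_table(const point3 *p, size_t n)
{
    double *t = malloc(n * n * sizeof *t);
    if (t == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            t[i * n + j] = interaction(p, i, j);
    return t;
}

/* Sum of the interactions of point i with every other point, via lookups. */
double row_sum_from_table(const double *t, size_t n, size_t i)
{
    double s = 0.0;
    for (size_t j = 0; j < n; j++)
        if (j != i)
            s += t[i * n + j];
    return s;
}

/* The same sum with no table: each cell is recomputed when needed. */
double row_sum_on_the_fly(const point3 *p, size_t n, size_t i)
{
    double s = 0.0;
    for (size_t j = 0; j < n; j++)
        if (j != i)
            s += interaction(p, i, j);
    return s;
}
```

Which version wins depends on the data set size and the access pattern, so time both on your real problem before committing to either.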

6. If AltiVec is unsuitable for your algorithm (or even if it isn't), a G5-class computer is your best bet for numerical work. It has a dual FPU with hardware square root instructions and the like, and memory access is improved through a MUCH faster data bus. Compared to other chips of its class, its floating-point performance is noticeably superior, while its integer performance is comparable.

-Ted
________________________________________________________________
//////////////////
// LAMONTAGNE // GEOPHYSICS LTD
////////////////// GEOPHYSIQUE LTEE
Kingston ON Canada

References: 
 >improving numerical applications performance (From: Amilcar Meneses Viveros <email@hidden>)
