
Re: are static operations optimized?


  • Subject: Re: are static operations optimized?
  • From: Rick Altherr <email@hidden>
  • Date: Tue, 27 Jan 2009 10:34:30 -0800


On Jan 27, 2009, at 9:22 AM, Chris Williams wrote:

> Virtually all applications I’ve seen that are “slow” are slow not because of CPU time, but because of other operations (such as disk and network accesses) that are orders of magnitude more expensive.  Or because they needlessly do operations many times where once would do.


This is true for many traditional desktop applications (word processing, spreadsheets, etc.), but not for scientific applications, where the working set often fits the prefetcher and cache well and the CPU is the bottleneck simply because of how long the computations take to execute.
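
As a toy example of what I mean by CPU-bound (sizes made up, not from any real application): the whole working set below is half a megabyte, so after the first pass it sits in cache and the run time is essentially pure floating-point work.

    /* Toy CPU-bound kernel.  N doubles = 512 KB, which fits in a typical
     * L2 cache, so after the first pass there is almost no memory traffic
     * and the time goes entirely to the floating-point unit. */
    #define N      (64 * 1024)
    #define PASSES 1000

    double kernel(const double *x)
    {
        double sum = 0.0;
        for (int p = 0; p < PASSES; p++) {
            for (int i = 0; i < N; i++) {
                double v = x[i];
                /* Horner's rule: three multiplies and three adds per element */
                sum += ((1.0e-3 * v + 2.0) * v + 3.0) * v + 4.0;
            }
        }
        return sum;
    }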

> Certainly you know more about the specifics of your application and its larger objectives than a compiler can know.  And therefore you CAN write code that is faster in the individual case than a compiler.  That is quite true.
>
> But as I have said before, in 25+ years in the software business I’ve seen precisely one case where someone made huge performance gains in an application by hand-coding or tricking the compiler, and that was almost 20 years ago, when the person wrote code that just fit in the 80286 pre-fetch cache.


You'd be surprised at how often that same trick is used today.  This is also why things like cache affinity in the scheduler have become important.  Not only is the code being written to fit in the cache, but the data is being partitioned to be used by threads cooperatively so that multiple cores can share cache lines.
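
A rough sketch of the blocking side of that (block size made up; you'd tune it to the target's cache): do all the work on one cache-sized chunk before moving to the next, and in the threaded version hand each core its own disjoint range of blocks.

    /* Sketch of cache blocking.  BLOCK is chosen so one block of each
     * array fits comfortably in L1; the second pass over a block then
     * hits cache instead of going back to memory. */
    #define BLOCK 1024   /* elements per block (1024 doubles = 8 KB per array) */

    void scale_then_square(const double *a, const double *b,
                           double *out, int n)
    {
        for (int base = 0; base < n; base += BLOCK) {
            int end = (base + BLOCK < n) ? base + BLOCK : n;
            for (int i = base; i < end; i++)   /* pass 1: fills cache */
                out[i] = 2.0 * a[i];
            for (int i = base; i < end; i++)   /* pass 2: hits cache  */
                out[i] = out[i] * out[i] + b[i];
        }
    }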

> In every other case, the return on time investment for this kind of stuff is tiny (or negative) and serves merely to amuse the inner geek in the coder rather than the larger goal of making the application faster.


Look at any large scientific application that has no user interface and does nothing but number crunching.  An improvement in the computational kernel translates directly into shorter execution time.  Most of the time this involves manually scheduling the code or using processor features that the compiler doesn't exploit.  This is why many math libraries are written in assembly and ship variants for each processor microarchitecture.
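
The usual shape of those variants is runtime dispatch: pick the kernel once, then call through a function pointer.  Rough sketch (both variants here are plain C so it compiles; the feature check is a placeholder for whatever CPUID/sysctl test you actually use):

    /* Sketch of per-microarchitecture kernel dispatch.  In a real math
     * library dot_unrolled() would be hand-scheduled or assembly. */
    typedef float (*dot_fn)(const float *, const float *, int);

    static float dot_generic(const float *a, const float *b, int n)
    {
        float s = 0.0f;
        for (int i = 0; i < n; i++)
            s += a[i] * b[i];
        return s;
    }

    /* Stand-in for a tuned variant: four independent accumulators so the
     * floating-point unit isn't stalled on a single dependency chain. */
    static float dot_unrolled(const float *a, const float *b, int n)
    {
        float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int i;
        for (i = 0; i + 3 < n; i += 4) {
            s0 += a[i]     * b[i];
            s1 += a[i + 1] * b[i + 1];
            s2 += a[i + 2] * b[i + 2];
            s3 += a[i + 3] * b[i + 3];
        }
        for (; i < n; i++)
            s0 += a[i] * b[i];
        return (s0 + s1) + (s2 + s3);
    }

    static int cpu_has_fast_path(void) { return 1; }  /* placeholder check */

    float dot_product(const float *a, const float *b, int n)
    {
        static dot_fn dot = 0;
        if (!dot)
            dot = cpu_has_fast_path() ? dot_unrolled : dot_generic;
        return dot(a, b, n);
    }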

In any case, profile first.  Optimizing a function that takes 1% of the execution time of an action will at most improve overall performance by 1%.  For performance work, intuition is frequently wrong.  Use the tools available to you (Shark, Instruments, DTrace) and collect information showing what the bottleneck is and then tackle it.
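
To put a number on that 1% claim, it's just Amdahl's law:

    #include <stdio.h>

    /* If a function is fraction f of the run time and you speed it up by
     * a factor s, the whole program speeds up by 1 / ((1 - f) + f / s). */
    static double overall_speedup(double f, double s)
    {
        return 1.0 / ((1.0 - f) + f / s);
    }

    int main(void)
    {
        printf("1%% of the time, made 10x faster:       %.3fx overall\n",
               overall_speedup(0.01, 10.0));   /* ~1.009x */
        printf("1%% of the time, made infinitely fast:  %.3fx overall\n",
               overall_speedup(0.01, 1e12));   /* ~1.010x */
        return 0;
    }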

</soapbox>  :)

--
Rick Altherr
Architecture and Performance Group
email@hidden



