Puzzling performance difference after refactor


  • Subject: Puzzling performance difference after refactor
  • From: Bill Monk <email@hidden>
  • Date: Fri, 30 Jun 2006 15:13:24 -0500

(If this would be better directed to PerfOptimization-dev please let me know.)


I have here a client app which began life as Pascal code, was translated to C, Carbonized in CodeWarrior, moved to Xcode, and now lives as a Mach-O bundleized app.


Originally, much of the code was in a single 2MB file. In CW this was workable (if not entirely to my personal taste). However, a 2MB file brings Xcode to its knees, with debugging being particularly painful. So the file was refactored into a number of smaller files.

So now Xcode no longer requires 45-90 seconds to open a file. That's good.

Here's the puzzling thing: the app built from the single file turns out to be about twice as fast as the app built from multiple files.

The Xcode project file for the refactored version is a direct copy of the older, single file version. The only difference is that the new project contains all the refactored files and headers, and the old project contains just the one large file.

I've put the build settings of the two projects side-by-side and painstakingly compared them line-by-line. No differences that I can see (except the multi-file project has require prototypes turned on, see below).

I've done a diff on the .pbxproj files from each project. There seem to be no substantive differences. (Obviously there are differences in such things as the number of files in PBXGroup and PBXSourcesBuildPhase sections, etc.)

Yet the one which builds from a single file is 2X faster.

How do I know this?

The app does some number crunching; each calculation takes a well-defined number of iterations through a hot loop, where a fair bit of work is done. If you tap the Option key, it logs the number of iterations completed and the ETA for finishing. The single-file app consistently logs 2X the speed of the multi-file version.

Maybe the logging code is broken? Well, setting aside the fact that it's identical in each version, if you let a calculation run to completion and time it with a stopwatch, the actual results are about what the logging code predicted. Example: on a 1.25GHz/1.25GB G4, the single-file app takes about 15 minutes to process a certain data set; the multi-file version, about 28 minutes.
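
For reference, the ETA estimate is simple arithmetic on the iteration count. Here is a minimal sketch of that kind of calculation; the function name and parameters are hypothetical and this is not the app's actual logging code, just an illustration of why the logged figure and the stopwatch should agree.

```c
/* Hypothetical helper, NOT the app's actual logging code.
   Estimates the seconds remaining from the iterations completed so far. */
double eta_seconds(long done, long total, double elapsed_seconds)
{
    if (done <= 0 || elapsed_seconds <= 0.0)
        return -1.0;  /* nothing measured yet, so no estimate */

    /* remaining iterations times the average seconds per iteration */
    return (double)(total - done) * elapsed_seconds / (double)done;
}
```

With the stopwatch numbers above, 28 minutes against 15 minutes is a ratio of roughly 1.9, which is consistent with the "about 2X" the log reports.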

The results are consistent across processor types: on a 500MHz G3, various single and dual-core G4s, and a dual G5, the actual times of course differ, but the single-file version is always about twice as fast as the multi-file version.

Sharking the hot loop for the two versions is interesting. In both, of course, there's a function, let's call it A, which takes most of the time, since that's where the work is done. Function A calls B, C, and D.

Shark shows that in the slow, multi-file app, function B is taking 44% of the time spent in A.

In the single-file, fast version, Shark shows that B is statistically insignificant. There are lines near B which show values as small as 0.1% in the Self column, but in the fast version, B has no entry in the Self column. Yet in the slow version, B shows 44%.


The only code differences between the two are these:

The single-file version declares most functions static and uses few prototypes; instead functions are located in the file leaf-first so that functions are almost always defined before they are used.

The multi-file version breaks this into about 25 .c files and their corresponding headers. All functions have prototypes, and no functions are declared static.
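
To picture the two layouts, here is a minimal sketch. All names (`scale_static`, `scale_extern`, `sum_scaled_single_file`, `sum_scaled_multi_file`) are hypothetical and stand in for the real functions, which are not shown. In the first layout the callee is `static` and defined before its caller, so the compiler can see its body at the call site and is free to inline it. In the second, the caller sees only a prototype, and in the real project the definition would sit in a different .c file. Whether that difference explains the timings here is an assumption, not something the measurements establish.

```c
/* Illustration only: hypothetical names, not code from the app. */

/* Single-file style: static and defined leaf-first, so no prototype
   is needed and the body is visible at the call site below. */
static double scale_static(double x)
{
    return x * 0.5;
}

double sum_scaled_single_file(const double *v, int n)
{
    double total = 0.0;
    int i;

    for (i = 0; i < n; i++)
        total += scale_static(v[i]);
    return total;
}

/* Multi-file style: this prototype would live in a header... */
double scale_extern(double x);

double sum_scaled_multi_file(const double *v, int n)
{
    double total = 0.0;
    int i;

    for (i = 0; i < n; i++)
        total += scale_extern(v[i]);
    return total;
}

/* ...and this definition would live in a separate .c file. It is kept
   here only so the sketch compiles as a single unit. */
double scale_extern(double x)
{
    return x * 0.5;
}
```

Both loops compute the same result; the only thing that changes is how much the compiler knows about the callee when it compiles the caller.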

To guaran-damn-tee there are no other code differences in the hot loop, I copied and pasted the body of every function involved from the original single-file version into the multi-file version. There was no need to do this, they were already identical, but what the heck. Result: no change.

Now the app could no doubt benefit from some of Shark's suggestions. In fact I duplicated the projects and implemented a couple of the suggestions, just to see what would happen. In the fast, single-file version, they made a difference. In the slow version, their effect is dwarfed by the effect of the massively slower function "B".

When I first saw this problem, I figured some build setting had simply gotten flipped and that putting it back would fix things. Now I have no idea what the trouble is. If merely moving function definitions from file to file and removing their static declarations can have this kind of effect on performance, it'll be news to me. But at this point, anything that solves it will be news to me, because I'm thoroughly stumped.

Ideas?








