Re: optimizing compilers
- Subject: Re: optimizing compilers
- From: email@hidden
- Date: Sat, 2 Feb 2002 13:34:09 -0800
>> I imagine you have benchmarks that support these assertions; but
>> more to the point, your application is a very special case. I
>> stand by my assertion that for the typical user who perceives
>> slowness in most operations on OS X, Quartz is not a factor.
> On what do you base your assertion?
Well, I worked for Apple until about a year ago, and the Quartz guys
always used to say their code was really fast. :-> Seriously,
benchmarks were done that indicated this quite definitively, as I
recall. I can also testify from personal experience -- I've been
programming for OS X since it was called NeXTSTEP :->, and I have never
yet hit a case where Quartz itself produced a speed problem in any app
I've worked on (and I did a huge amount of performance-related work
while at Apple). Calling Quartz too much for no good reason, yes,
emphatically. Quartz itself, no.
> And if Quartz is no factor, what are the relevant factors?
That I couldn't tell you -- back when I was there it was still rather
a mystery, although clearly they've figured out a lot since then, as the
OS has gotten much faster. But the speculation on this list seems
pretty plausible to me -- compiler issues, view redisplay
inefficiencies, the fact that the system simply "does more" than it used
to, etc. But more than any one of those, I suspect it comes down to
many specific problems in many parts of the code, not to a single
thing. Some
things in OS X are remarkably fast -- as fast as they could reasonably
be. Right? And then some other things are remarkably slow. That
speaks to problems with algorithms in specific places, to me.
One direction I would point my finger is NSNotification. Not
that it is slow in and of itself, necessarily (although I get the
impression it may be), but that it is so *convenient* that it tends to
create a specific design problem in applications that use it. When
something changes in a more traditionally designed app, only the things
that need to be recalculated, invalidated, etc. generally are, and it
only happens once. But in working on complex OS X apps, I have noticed
that it is easy to get into a situation where a "notification storm"
develops. One thing changes, and so you send a "thing changed"
notification. Some other part of the code catches that, and invalidates
its cache. Later, you find that you need a "cache invalidated"
notification, and so you add that. Sometimes one object will respond to
a change notification in a way that causes a dozen or a hundred change
notifications to be sent as a result (an array or a tree needs to send a
change message on a per-element basis, for example). And maybe those
notifications each trigger something else. The end result is a huge
cascade of invalidation and cache-clearing (often really invalidating or
clearing the same cache over and over and over again) that can chew
through *huge* numbers of cycles. I've even seen cases where the
notification storm takes several seconds to die down, or *never* dies
down; because notifications can be delivered in a delayed fashion, this
doesn't manifest as a hang, merely as ongoing bad performance. It is
remarkably easy to fall into this pattern of coding, and quite hard to
fix the consequences later. I've seen this many times in Cocoa; it's an
issue to be aware of. I use NSNotification in my coding a lot less than
I used to; it's a dangerous habit. The same can be true of delayed
selector performs, of course.
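
To make the storm concrete, here's a minimal sketch of one way to damp
it. Everything here (the class, the notification name, the method
names) is made up for illustration; the point is just that enqueuing
through NSNotificationQueue with coalescing turned on collapses N
per-element changes into a single delivery:

    #import <Foundation/Foundation.h>

    // Hypothetical notification name and model class.
    static NSString *MyModelDidChangeNotification =
        @"MyModelDidChangeNotification";

    @interface MyModel : NSObject
    - (void)elementDidChangeNaive;
    - (void)elementDidChange;
    @end

    @implementation MyModel

    // Naive: posts immediately, so touching N elements delivers N
    // notifications, each of which may trigger further invalidations
    // downstream -- the seed of a storm.
    - (void)elementDidChangeNaive
    {
        [[NSNotificationCenter defaultCenter]
            postNotificationName:MyModelDidChangeNotification object:self];
    }

    // Coalesced: NSNotificationQueue folds queued notifications with
    // the same name and sender into one, delivered when the run loop
    // idles, so N element changes become one delivery.
    - (void)elementDidChange
    {
        NSNotification *note = [NSNotification
            notificationWithName:MyModelDidChangeNotification object:self];

        [[NSNotificationQueue defaultQueue]
            enqueueNotification:note
                   postingStyle:NSPostWhenIdle
                   coalesceMask:(NSNotificationCoalescingOnName |
                                 NSNotificationCoalescingOnSender)
                       forModes:nil];
    }

    @end

The tradeoff is that delivery is deferred; observers recompute once per
run loop pass instead of once per element, which is usually exactly
what you want when a storm is brewing.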
A similar area I'd implicate is NSArray and NSDictionary. Not
because they are slow in themselves (I think they're very fast,
actually), but simply because they're objects at all, and because it's
very easy to get lazy and use them more often than necessary. I would
be interested to see how many of the NSArrays and NSDictionaries alive
at any given time contain either zero or one item. Optimizing these
cases can pay off in spades. If your collection
contains no objects, it should not have been allocated at all. If it
contains only one object, and that's common in your app, you may want to
special-case that to avoid the object allocation.
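
As a rough illustration of that special-casing (the class and its
methods are hypothetical, just to show the shape of it), you can keep
the sole item in a plain ivar and only pay for the NSMutableArray once
a second item actually arrives:

    #import <Foundation/Foundation.h>

    @interface ItemHolder : NSObject
    {
        id _single;             // the sole item, while count <= 1
        NSMutableArray *_many;  // allocated lazily, once count >= 2
    }
    - (void)addItem:(id)item;
    - (unsigned)count;
    @end

    @implementation ItemHolder

    - (void)addItem:(id)item
    {
        if (_many)
            [_many addObject:item];
        else if (_single)
        {
            // Second item: only now do we allocate the array.
            _many = [[NSMutableArray alloc]
                initWithObjects:_single, item, nil];
            [_single release];
            _single = nil;
        }
        else
            _single = [item retain];
    }

    - (unsigned)count
    {
        if (_many)
            return [_many count];
        return _single ? 1 : 0;
    }

    - (void)dealloc
    {
        [_single release];
        [_many release];
        [super dealloc];
    }

    @end

Whether the extra branching is worth it depends entirely on how often
the zero- and one-item cases actually occur in your app; measure first.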
Another similar area is autorelease. It's very easy to get into the
habit of using autorelease; in fact, in some cases, it's hard to avoid.
But the speed penalty is *huge*. You really *never* want to use
autorelease unless good API design makes it necessary (and sometimes not
even then). Many Obj-C programmers don't realize this, though, and in
any case it's easy to fall into this pattern, and often hard to
eliminate it (especially when using Cocoa APIs geared around always
returning autoreleased objects, a design decision I would implicate
fairly heavily in the overall system performance). Learn to avoid it,
though, especially in cases where you're handling many objects (it
obviously doesn't matter when you're talking about autoreleasing a
document window or an app delegate or something :->).
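
For what it's worth, here's the difference in a tight loop, as a
sketch; ProcessString() is a hypothetical stand-in for whatever
per-item work you're doing:

    #import <Foundation/Foundation.h>

    // Hypothetical per-item work.
    extern void ProcessString(NSString *s);

    void BuildItemsAutoreleased(int count)
    {
        int i;

        for (i = 0; i < count; i++)
        {
            // Convenience constructor: the string is autoreleased and
            // sits in the pool until the end of the current event, so
            // all `count` temporaries are alive at once.
            NSString *s = [NSString stringWithFormat:@"item %d", i];
            ProcessString(s);
        }
    }

    void BuildItemsExplicit(int count)
    {
        int i;

        for (i = 0; i < count; i++)
        {
            // alloc/init plus an explicit release: each temporary dies
            // immediately, and the autorelease pool never grows.
            NSString *s = [[NSString alloc] initWithFormat:@"item %d", i];
            ProcessString(s);
            [s release];
        }
    }

When you can't avoid autoreleased objects (because a Cocoa API hands
them to you), the standard remedy is to create and release a local
NSAutoreleasePool inside the loop every so often, which bounds the
pool's growth even though it doesn't remove the per-object cost.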
A final direction I'd point is algorithmic complexity. In the old
days, accidentally introducing an O(N**2) (or worse) algorithm into your
code would be something you'd notice immediately. But nowadays,
machines are so bloody fast that this can be easily overlooked. When
testing code, always try really big data sets to see what the
algorithmic complexity for your code is in practice. Users always use
bigger data sets than app designers ever anticipate :->.
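
A classic way this sneaks into Cocoa code, sketched here with
hypothetical functions: de-duplicating an array with -containsObject:
scans the output array on every iteration, which is O(N**2); shadowing
it with an NSMutableSet makes the membership test a hash lookup
instead:

    #import <Foundation/Foundation.h>

    // Quadratic: -containsObject: walks `unique` every time through,
    // so this is fine at N = 100 and crawls at N = 100,000.
    NSArray *UniqueSlow(NSArray *input)
    {
        NSMutableArray *unique = [NSMutableArray array];
        NSEnumerator *e = [input objectEnumerator];
        id obj;

        while ((obj = [e nextObject]) != nil)
        {
            if (![unique containsObject:obj])
                [unique addObject:obj];
        }
        return unique;
    }

    // Near-linear: the NSMutableSet answers the membership question by
    // hashing instead of scanning, preserving the input order.
    NSArray *UniqueFast(NSArray *input)
    {
        NSMutableArray *unique = [NSMutableArray array];
        NSMutableSet *seen = [NSMutableSet set];
        NSEnumerator *e = [input objectEnumerator];
        id obj;

        while ((obj = [e nextObject]) != nil)
        {
            if (![seen containsObject:obj])
            {
                [seen addObject:obj];
                [unique addObject:obj];
            }
        }
        return unique;
    }

Both versions behave identically on small inputs, which is exactly why
the slow one survives testing; it's only the hundred-thousand-element
data set some user eventually feeds it that tells them apart.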
So what all this boils down to is: programming in Cocoa on a fast
machine is a walk in the park, between easy-to-use collection classes,
NSNotification and delayed performs, autorelease, and a CPU so fast that
algorithmic flaws don't necessarily bite you immediately. You need to
stay vigilant, and each time you use one of these tools that makes your
life easier, worry a little about what the consequences might be.
Ben Haller
Stick Software