On Thu, Sep 10, 2009 at 5:33 PM, Jens Alfke<email@hidden> wrote:
Off-list I suggested to an Apple engineer that the compiler should
stop optimizing away stack storage of local variables that point to Obj-C
objects, when GC is enabled. The response was that this had been
investigated but led to too much heap growth (i.e. too many actual
objects being held onto by unused local variables.)
I'd like to find out more about this: I'd have assumed that the
problem was shortage of registers rather than unused local variables.
Surely the heap growth is considerably less than with autorelease?
It was investigated, but not ad nauseam. It ends up leading to two
problems:
(1) stack growth & inefficient stack use
(2) Objects that stick around a lot longer than they should because of
the lack of the compiler reusing their slot on the stack.
(1) is a general problem and you can see it by simply comparing heap
sizes of debug vs. non-debug executables. There are certain
pathological patterns -- buffers on the stack used at the beginning of
a function only (common with, say, char buf[MAXPATHLEN]) -- that cause
the stack to grow sometimes a lot more than the optimized version.
(2) This is a bigger problem. Namely, a reference to some object
sticks around on the stack and that ends up rooting a potentially very
large graph of objects, sometimes pretty much all objects in the
This second problem was also why NSRunLoop and GCD will zero the stack
down to the lowest depth ever scanned. Because the stack is scanned
conservatively -- every pointer aligned slot is scanned as if it might
contain a pointer -- not zeroing the stack can lead to objects
sticking around for way too long.
A particularly unfortunate form of this problem is when you have a
bunch of pointers on the stack in func B, called by func A. B then
returns and A calls C which promptly does 'int bunchOfStuff', but
doesn't zero it. Since the stack is scanned conservatively the
pointers from A that are now unwittingly sitting in bunchOfStuff will
be rooted and will stick around.
Doesn't happen that often and typically isn't a big deal.
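
To make the A/B/C sequence above concrete, here is a minimal sketch in plain C. It is a toy model, not the real collector: the "stack" is an ordinary array, the "heap" is four fixed objects, and every name in it (toy_heap, toy_stack, conservative_scan, zero_dead_slots, rooted_during_c) is hypothetical. It also assumes that any slot value landing inside an object counts as a reference, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEAP_OBJECTS 4   /* hypothetical toy heap size */
#define STACK_SLOTS  8   /* hypothetical toy stack size, in pointer-sized slots */

typedef struct { char payload[32]; } Object;

static Object    toy_heap[HEAP_OBJECTS];
static uintptr_t toy_stack[STACK_SLOTS];

/* Index of the toy heap object whose storage contains `word`, or -1. */
static int object_containing(uintptr_t word) {
    uintptr_t base = (uintptr_t)&toy_heap[0];
    if (word < base || word >= base + sizeof toy_heap) return -1;
    return (int)((word - base) / sizeof(Object));
}

/* Conservative scan: every live slot is treated as if it might be a pointer.
   Returns the number of distinct objects that end up rooted. */
static int conservative_scan(size_t live_slots, bool rooted[HEAP_OBJECTS]) {
    assert(live_slots <= STACK_SLOTS);
    int count = 0;
    memset(rooted, 0, HEAP_OBJECTS * sizeof(bool));
    for (size_t i = 0; i < live_slots; i++) {
        int idx = object_containing(toy_stack[i]);
        if (idx >= 0 && !rooted[idx]) { rooted[idx] = true; count++; }
    }
    return count;
}

/* Clear every slot beyond the live depth, so later frames cannot inherit
   stale pointers from frames that have already returned. */
static void zero_dead_slots(size_t live_slots) {
    assert(live_slots <= STACK_SLOTS);
    memset(&toy_stack[live_slots], 0,
           (STACK_SLOTS - live_slots) * sizeof(uintptr_t));
}

/* Replays "A calls B, B returns, A calls C" and reports how many objects
   a scan roots while C is running. */
static int rooted_during_c(bool zero_between_calls) {
    bool rooted[HEAP_OBJECTS];
    memset(toy_stack, 0, sizeof toy_stack);

    size_t depth = 2;                        /* A's frame: slots 0..1, no pointers */

    toy_stack[2] = (uintptr_t)&toy_heap[1];  /* B's frame: slots 2..4 */
    toy_stack[3] = (uintptr_t)&toy_heap[2];
    /* B returns here; its slots are dead but still hold the old values. */

    if (zero_between_calls) zero_dead_slots(depth);

    toy_stack[5] = 42;                       /* C's frame: slots 2..5; only slot 5 is */
    depth = 6;                               /* written, 2..4 is an untouched buffer  */

    return conservative_scan(depth, rooted);
}
```

In this model rooted_during_c(false) reports two rooted objects even though nothing in A or C refers to them, while rooted_during_c(true) reports none. That is the effect the stack zeroing described above is meant to achieve.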
The lack of the compiler optimization that reuses stack slots can lead
to the same issue. It doesn't happen often, but it can happen in
critical spots. For example, if I recompile the AppKit as a non-
optimized build, there are certain objects at the beginning of the run
loop that'll stick around pretty much forever.
So, yes, apart from the interior pointer issue, that particular
compiler optimization actually makes the collector more efficient.
Obviously, it would be desirable to exactly scan the stack. However,
the compiler is pretty much free to do whatever it wants to the stack
however it sees fit [within the bounds of the level of spec support
desired -- I don't think any compiler exactly and completely
implements even the C99 spec, much less C++... could be wrong, though].
Not sure what else to suggest. It seems there needs to be a way to
explicitly mark that you're keeping a reference to an object, since
declaring a local variable pointing to it isn't sufficient. Maybe some sort
of "__dont_optimize_away" annotation on the local variable?
What about some sort of annotation on the return type of the -bytes
method which would cause the receiver to be added to a temporary root
// compiler inserts __add_to_temp_root_subset(myData) here
// on account of @associated method having been called
// .... lots of stuff, myData never referenced again
// compiler inserts __empty_temp_root_subset() here
// on account of this being the closing scope of
// the code which called the @associated method
Any object which vends an interior pointer, or indeed any resource
whose lifetime is tied to the object, could declare the vending method
this way. It's backwards compatible and prevents re-use of stack
storage exactly when needed rather than all or none of the time.
(Of course, the "optimizing p[i] into *p++" problem would still have
to be addressed, but it's a different issue, only conflated with the
NSData problem as a result of an implementation detail.)
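
The temporary-root proposal above could be sketched roughly as follows in plain C. This is only an illustration of the bookkeeping, not anything the compiler or runtime actually provides: add_to_temp_root_subset and empty_temp_root_subset mirror the names in the comments above (minus the reserved leading underscores), while ToyData, toy_bytes, temp_root_mark, is_temp_rooted and the fixed capacity are all hypothetical. The mark argument is my own addition so that nested scopes can each release only their own entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TEMP_ROOT_CAPACITY 16   /* hypothetical fixed capacity for the sketch */

static const void *temp_roots[TEMP_ROOT_CAPACITY];
static size_t      temp_root_count = 0;

/* Remember the current size so a scope can later release only its own roots. */
static size_t temp_root_mark(void) { return temp_root_count; }

/* What the compiler would insert right after a call to an annotated method. */
static void add_to_temp_root_subset(const void *object) {
    assert(temp_root_count < TEMP_ROOT_CAPACITY);
    temp_roots[temp_root_count++] = object;
}

/* What the compiler would insert at the end of the calling scope. */
static void empty_temp_root_subset(size_t mark) {
    assert(mark <= temp_root_count);
    temp_root_count = mark;
}

/* Stand-in for the collector asking "is this object explicitly rooted?". */
static bool is_temp_rooted(const void *object) {
    for (size_t i = 0; i < temp_root_count; i++)
        if (temp_roots[i] == object) return true;
    return false;
}

/* Toy stand-in for an object that vends an interior pointer. */
typedef struct { unsigned char storage[64]; } ToyData;

static const unsigned char *toy_bytes(const ToyData *data) {
    add_to_temp_root_subset(data);   /* owner stays rooted while the pointer is in use */
    return data->storage;
}

/* True if the owner was rooted inside the scope and released at its end. */
static bool demo_vend_and_release(void) {
    static ToyData data;
    size_t mark = temp_root_mark();

    const unsigned char *p = toy_bytes(&data);
    (void)p;                         /* ... lots of work with p, data never named again ... */
    bool rooted_inside = is_temp_rooted(&data);

    empty_temp_root_subset(mark);    /* closing scope of the caller */
    return rooted_inside && !is_temp_rooted(&data);
}
```

The point of the sketch is that the owner's lifetime is tied to the scope that asked for the pointer, not to whether the compiler happens to keep the owner's local variable on the stack.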
That is a possibility. There are really a couple of very distinct
(1) Deal with the handful of methods that return C pointers. This has
actually largely been dealt with via declarations like:
And implementations that use NSAllocateCollectable(), etc...
(2) Deal with the whole NSData/bytes debacle
Some kind of annotation may be required as per suggestions like the
(3) Deal with the *p++ type problem.
This is more difficult. In particular, it matters if p was allocated
in scanned or unscanned memory. Care also has to be taken when
potentially colluding runtime & compile time behavior. For example,
the compiler should not be taught about -bytes or NSData or