Re: Question about garbage collection
- Subject: Re: Question about garbage collection
- From: John Engelhart <email@hidden>
- Date: Mon, 11 Jan 2010 21:25:21 -0500
On Sun, Jan 3, 2010 at 1:40 PM, Bill Bumgarner <email@hidden> wrote:
>
> The stack has to be conservatively scanned because the C ABI makes no
> guarantee about stack layout and the compiler is free to put values wherever
> it sees fit,
Strictly speaking, the "C" ABI does not require that a function use "the
stack" to hold temporary values (or, in C terms, variables that have
automatic storage duration / the 'auto' storage class). In fact, the C
language does not even require the use of a stack: the word "stack" does not
appear anywhere in the standard(s).
> including overwriting values with other values as a part of optimization
> (which is the same reason why you sometimes can't print local variable's
> values when debugging optimized code).
>
This, on the other hand, is simply not true. While the "C" ABI does not
cover this, the C language standard is very specific about such things.
Section 5.1.2.3 from the C89 (ISO/IEC 9899:1990) standard states the
following:
"An instance of each object with automatic storage duration is associated
with each entry into its block. Such an object exists and retains its
last-stored value during the execution of the block and while the block is
suspended (by a call of a function or receipt of a signal)."
The C99 standard says essentially the same thing in section 6.2.4.
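To make the quoted guarantee concrete, here is a minimal C sketch. The
function names (scratch_work, value_after_call) are made up for illustration
and come from nowhere in particular. It only shows the abstract-machine rule:
an automatic object keeps its last-stored value while its block is suspended
by a function call, wherever the compiler physically keeps that value.

```c
#include <assert.h>

/* Hypothetical callee: any call that suspends the caller's block would do. */
static int scratch_work(int n)
{
    int local = n * 2;      /* the callee gets its own automatic object */
    return local + 1;
}

/* Returns the value of `kept` as observed after the call has returned. */
int value_after_call(int initial)
{
    int kept = initial;     /* automatic storage duration */
    int result = scratch_work(initial);

    assert(result == initial * 2 + 1);
    /* Per the quoted text, `kept` still holds its last-stored value here. */
    return kept;
}
```

Calling value_after_call(42) hands back 42 at any optimization level; the
standard says nothing about which register or stack slot held it in between.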
Optimization walks a fine line between "strict compliance with the abstract
machine semantics specified in the C language standard" and "the actual
semantics of the optimized code". Or, in the words of the C89 standard:
"An implementation might define a one-to-one correspondence between abstract
and actual semantics: at every sequence point, the values of the actual
objects would agree with those specified by the abstract semantics. The
keyword volatile would then be redundant.
Alternatively, an implementation might perform various optimizations within
each translation unit, such that the actual semantics would agree with the
abstract semantics only when making function calls across translation unit
boundaries. In such an implementation, at the time of each function entry
and function return where the calling function and the called function are
in different translation units, the values of all externally linked objects
and of all objects accessible via pointers therein would agree with the
abstract semantics. Furthermore, at the time of each such function entry the
values of the parameters of the called function and of all objects
accessible via pointers therein would agree with the abstract semantics."
It's fair to say that the majority of worthwhile optimizations result in
some amount of violation of "strict abstract machine semantics", and so an
awful lot of effort goes into determining whether or not such an
optimization can be done "safely". It's not uncommon for a potential
optimization to be culled from consideration because there exists an
extremely remote possibility that such an optimization could cause the
effects of the abstract machine semantics violation to become "visible" to
the program. That's why function calls are the usual demarcation point: the
compiler has complete control of the visible state (or, abstract machine
semantics compliance) up to that point, but often can't control it past the
function call.
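A rough sketch of the "demarcation point" idea, in plain C. Everything here
is hypothetical naming (progress, run_steps, record_progress), and the call
through a function pointer is only my stand-in for a call into another
translation unit that the compiler cannot see through. Inside the loop the
compiler is free to keep the counter in a register or fold the loop into a
single add; at the opaque call the externally linked object has to agree with
the abstract semantics again.

```c
#include <assert.h>
#include <stddef.h>

int progress = 0;               /* externally linked object */
static int last_seen = -1;      /* what the observer saw, kept for inspection */

typedef void (*observer_fn)(void);

/* Hypothetical observer: records the value of `progress` that it can see. */
void record_progress(void)
{
    last_seen = progress;
}

int last_seen_progress(void)
{
    return last_seen;
}

void run_steps(int steps, observer_fn observer)
{
    assert(observer != NULL);
    for (int i = 0; i < steps; i++)
        progress += 1;  /* may live in a register, or collapse to one add */
    observer();         /* opaque call: `progress` must be up to date here */
}
```

With progress reset to 0, run_steps(5, record_progress) leaves the observer
having seen 5, no matter what the loop was turned into.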
This is why the Cocoa GC system is broken when you turn the optimizer on.
The gcc optimizer has decades' worth of assumptions that it is "in complete
control" of when the violations of "strict abstract machine semantics"
become 'visible' to code that is out of its control (i.e., across a function
call boundary). In a nutshell, it has a mental model that the code it's
generating operates in something akin to "single threaded mode". The
collector, however, runs asynchronously (at least, for most uses of the
Cocoa collector) relative to the flow of execution of the code that the
compiler is generating / optimizing. The worst case scenario is that the
collector runs after every single instruction of the optimized code emitted
by the compiler. An optimization such as "reusing a stack slot" is therefore
safe only if sufficient analysis by the optimizer can /prove/ that the
abstract machine violation it causes is indistinguishable in execution from
the strictly conforming version of the code. Since the collector runs
completely asynchronously relative to the "current thread" that the
generated code will be executing in, this is essentially impossible, short
of inserting some kind of GC mutex barrier or "compiler assisted safe
points" (http://www.hpl.hp.com/personal/Hans_Boehm/gc/conservative.html ,
http://llvm.org/docs/GarbageCollection.html ).
For all practical purposes, this means that all __strong and __weak
qualified variables really should be "automagically promoted" to volatile
qualified (i.e., NSString * volatile myString = ...). In fact, footnote 116
of the C99 standard (WG14/N1256 ISO/IEC 9899:TC3) is particularly apropos
with its "accessed by an asynchronously interrupting function" wording:
'116: A volatile declaration may be used to describe an object
corresponding to a memory-mapped input/output port or an object accessed by
an asynchronously interrupting function. Actions on objects so declared
shall not be "optimized out" by an implementation or reordered except as
permitted by the rules for evaluating expressions.' [ed-note: The C99
definition of 'object' is different from the definition used in objc
programming.]
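Here is what the volatile-qualified pointer looks like in plain C, as a
sketch only. Blob and count_nonzero are hypothetical names, Blob is just a
stand-in for a collected object, and no real collector is involved. Whether
this is enough to satisfy any particular collector is an assumption on my
part; the only thing the code relies on is the footnote's rule that accesses
to a volatile object may not be "optimized out".

```c
#include <assert.h>
#include <stddef.h>

enum { BLOB_SIZE = 16 };

/* Hypothetical stand-in for a collected object that owns some bytes. */
typedef struct {
    char bytes[BLOB_SIZE];
} Blob;

size_t count_nonzero(Blob *blob)
{
    assert(blob != NULL);

    Blob * volatile owner = blob;       /* this store may not be optimized out */
    const char *cursor = owner->bytes;  /* pointer derived from the owner */
    size_t n = 0;

    for (size_t i = 0; i < BLOB_SIZE; i++)
        if (cursor[i] != 0)
            n++;

    /* A final volatile read: `owner` is still in use after the loop. */
    return owner != NULL ? n : 0;
}
```

The point of the last line is that the owner pointer is read again after the
loop that walks the derived pointer, so the compiler cannot treat it as dead
while `cursor` is still being used.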
Sorry to beat this dead horse, but the fact that the optimizer "reuses stack
slots" in a way that violates the semantics spelled out in the C language
standard makes it extremely difficult to write programs that are
"deterministically correct". At least, that's been my experience. I don't
mean correct in the "my code passes all the unit tests" sense, but in the
"my code would make it through some kind of super valgrind / klee
(http://klee.llvm.org/) like tool" sense: a "worst case scenario simulation"
where the collector is "run" after every single instruction.
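A toy version of that "worst case scenario simulation", to show the kind of
failure I mean. This is a sketch under loud assumptions: ToyObject, ToyStack,
toy_collect and survives_worst_case are all invented here, the "stack" is a
two-element array, and the toy collector only recognizes a pointer to the
start of the object. It models nothing about how the real Cocoa collector
scans memory. The "collector" is run after every step of a tiny function,
once with the pointer derived from the object placed in a fresh slot, and
once with it overwriting the slot that held the base pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SLOT_COUNT = 2, PAYLOAD = 8 };

typedef struct {
    bool live;              /* cleared once the toy collector reclaims it */
    char payload[PAYLOAD];  /* not the first member, so &payload != &object */
} ToyObject;

typedef struct {
    const void *slots[SLOT_COUNT];  /* models the thread's stack slots */
} ToyStack;

/* Toy scan: the object survives only if some slot holds its base address. */
static void toy_collect(const ToyStack *stack, ToyObject *obj)
{
    bool reachable = false;

    for (size_t i = 0; i < SLOT_COUNT; i++)
        if (stack->slots[i] == (const void *)obj)
            reachable = true;
    if (!reachable)
        obj->live = false;
}

/* Runs the "function" with a collection after every step. Returns true if
   the object was still live at the point where its payload is read. */
bool survives_worst_case(bool reuse_slot)
{
    ToyObject obj = { .live = true, .payload = "abc" };
    ToyStack stack = { { NULL, NULL } };
    const char *cursor;

    assert(offsetof(ToyObject, payload) != 0);

    stack.slots[0] = &obj;                      /* step 1: base pointer stored */
    toy_collect(&stack, &obj);

    cursor = obj.payload;                       /* step 2: derived pointer */
    stack.slots[reuse_slot ? 0 : 1] = cursor;   /* maybe reuse the base's slot */
    toy_collect(&stack, &obj);

    return obj.live && cursor[0] == 'a';        /* step 3: payload still in use */
}
```

survives_worst_case(false) returns true and survives_worst_case(true) returns
false: the only difference between the two runs is which slot the derived
pointer was written to, which is exactly the decision the optimizer makes
without knowing a collector is watching.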
> This can cause an object to live longer than expected. In practice, this
> rarely happens, though, as run loops zero the stack -- only to the depth
> needed -- on each loop (as does Grand Central).
>
> If you are writing your own looping construct, you can call
> objc_clear_stack(...) to clear the stack at appropriate times, typically
> when the stack is as shallow as possible. Prior to Snow Leopard, writing
> your own looping construct was relatively rare in Cocoa. With Snow
> Leopard's addition of Grand Central Dispatch, writing your own looping
> construct is actively discouraged (though, certainly, there are still
> reasons to do so).
>
Can you elaborate on this? I think my idea of a "looping construct" is
different from what you're discussing, because I use "looping constructs" all
the time. Heck, one of the big 10.5 features was fast enumeration. In fact,
I remember a quote about loops. Here it is
(http://www.cs.yale.edu/quotes.html , number 18): "A program without a loop
and a structured variable isn't worth writing."
_______________________________________________
Cocoa-dev mailing list (email@hidden)