Re: Garbage collector vs variable lifetime
- Subject: Re: Garbage collector vs variable lifetime
- From: John Engelhart <email@hidden>
- Date: Mon, 9 Jun 2008 03:56:03 -0400
On Jun 8, 2008, at 11:48 PM, Chris Hanson wrote:
On Jun 8, 2008, at 5:39 PM, John Engelhart wrote:
On Jun 7, 2008, at 7:11 PM, Chris Hanson wrote:
This won't happen because each message expression -- just as with
function-call expressions -- is a sequence point. The compiler
can't know what side-effects [data self] might have, so it can't
re-order the invocation to elsewhere.
This is not necessarily true. If GCC's __attribute__((const)) and
__attribute__((pure)) were extended to objc methods, then the compiler
would be free to perform common subexpression elimination and
loop-invariant code motion optimizations.
They can't be, while preserving the existing semantics of the
language. In the existing semantics, a message send always results
in a dynamic dispatch.
This is true for C as well. The semantics of the C language are such
that a function being called in the source always results in one
function call during execution.
Then consider the case of:
int square(int x) { return(x*x); }

int something(int y) {
    int r = 0;
    for(int z = 0; z < 25; z++) { r += square(y); }
    return(r);
}
An inter-procedural optimizing compiler will eliminate the function
call to square and replace it with (y*y). It will also determine that
(y*y) is loop invariant and hoist it out of the loop body.
The semantics are preserved and identical results are calculated (the
'meaning' is the same). The semantics do not require square() to
literally be called each time. In the same sense, there is no
requirement that a message literally be sent each time it appears in
the source code as long as it can be shown that the results would be
identical. Identical results imply identical meaning, which in turn
means semantics are preserved.
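To make the transformation concrete, here is a hand-written sketch of what the optimizer is allowed to turn the loop into. The name something_hoisted is mine, purely for illustration; a real compiler may well go further and fold the whole loop into 25*y*y.

```c
int square(int x) { return x * x; }

/* The loop as written in the source: one call per iteration. */
int something(int y) {
    int r = 0;
    for (int z = 0; z < 25; z++) { r += square(y); }
    return r;
}

/* Hand-applied version of what an optimizer may do: square() is inlined
   and the loop-invariant y*y is computed once, outside the loop. */
int something_hoisted(int y) {
    const int yy = y * y;
    int r = 0;
    for (int z = 0; z < 25; z++) { r += yy; }
    return r;
}
```

Both functions return the same value for every input that does not overflow, which is all that "preserving the semantics" asks for.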
In the case of [data self], this essentially becomes the function:
id self(id self, SEL _cmd) { return(self); }
A sufficiently aware optimizing compiler (think every single line of
source for everything as one multi-gigabyte translation unit) could
hypothetically trace message dispatches such that it could eliminate
the intermediate dynamic dispatch and be left with just the function,
such as:
{
    NSData *data = /* valid ptr */;
    /* do some work */
    self(data, "self");
    self(data, "self");
}
The compiler would be free to eliminate both function calls.
Semantics would be wholly preserved: the self function causes no
program-visible state changes, therefore by definition its execution,
or lack of execution, cannot have an effect.
The run time dynamic dispatch nature of objc makes such 'inter-message
dispatch optimizations' much, much harder, especially at compile
time. Ultimately, though, they are fundamentally the same in terms of
optimization.
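A toy model in plain C may help show why the two cases are the same once the dispatch is resolved. Nothing below is the real objc runtime: Object, MethodEntry, msg_send and send_self_devirtualized are hypothetical names, and selectors are just C strings as in the snippet above.

```c
#include <stddef.h>
#include <string.h>

typedef struct Object Object;
typedef Object *(*Method)(Object *self, const char *cmd);

/* Hypothetical method table entry: a selector name and its implementation. */
typedef struct { const char *selector; Method imp; } MethodEntry;

struct Object { const MethodEntry *methods; size_t method_count; };

/* Implementation of "self": returns the receiver, touches no other state. */
Object *object_self(Object *self, const char *cmd) { (void)cmd; return self; }

const MethodEntry object_methods[] = { { "self", object_self } };

/* Toy dynamic dispatch: look the selector up at run time, then call it. */
Object *msg_send(Object *receiver, const char *cmd) {
    for (size_t i = 0; i < receiver->method_count; i++) {
        if (strcmp(receiver->methods[i].selector, cmd) == 0)
            return receiver->methods[i].imp(receiver, cmd);
    }
    return NULL; /* unknown selector */
}

/* What remains once the lookup has been resolved statically: a direct
   call, which an optimizer can inline and, if the result is unused, drop. */
Object *send_self_devirtualized(Object *receiver) {
    return object_self(receiver, "self");
}
```

The hard part in practice is proving that the lookup always lands on object_self; once that is known, the rest is ordinary inlining and dead code elimination.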
The 'self' message would definitely fall under the domain of these
attributes, thus the original argument is apropos.
For source compatibility, you almost certainly could *not* suddenly
indicate that "[foo self]; [foo self];" results in only one
invocation of -self by the compiler, at least for subclasses of
NSObject or NSProxy.
After all, a subclass of NSObject may have overridden -self to do
something else, and the compilation unit containing the above two
invocations may have no idea what the class of "foo" is with which
to make that judgment.
Nonsense. Look, these kinds of attributes are a lot like
typecasting. You can typecast away const, volatile, or whatever and
even stuff a short into a pointer and vice versa. That doesn't make
it right. The typecast overrides the compiler's safeties; you
essentially certify that your typecast result is true and correct
despite what the rules say. If you end up shooting yourself in the
foot, you only have yourself to blame.
Besides, it's not that hard to come up with some simple additions for
when an attribute like 'const' or 'pure' can legitimately be applied
by the compiler within the context of objc objects. An obvious one
would be that the attribute only applies to the class it was declared
for and nothing else, even subclasses.
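For reference, this is roughly what the two attributes look like on plain C functions with GCC or Clang today. The functions square_const, length_of and twice are made up for this sketch; how such annotations would attach to objc methods is exactly the open question here, so none of this is existing objc syntax.

```c
#include <stddef.h>

/* const: the result depends only on the arguments; no memory is read
   or written. */
__attribute__((const)) int square_const(int x) { return x * x; }

/* pure: may read memory (here, the string) but has no side effects. */
__attribute__((pure)) size_t length_of(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') n++;
    return n;
}

/* With the annotations above, the compiler is permitted (not required)
   to evaluate each repeated call once and reuse the result. */
size_t twice(const char *s, int y) {
    size_t a = length_of(s) + length_of(s);
    int b = square_const(y) + square_const(y);
    return a + (size_t)b;
}
```

The attributes are a promise made by the programmer, not something the compiler verifies, which is the sense in which they resemble a typecast.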
Obvious candidates are immutable objects' 'length', 'count', etc.,
which would result in a pretty big win if these attributes were
available.
If I write
- (void)doSomething:(NSArray *)array {
    NSUInteger count1 = [array count];
    NSUInteger count2 = [array count];
    NSLog(@"%lu", (unsigned long)count1);
    NSLog(@"%lu", (unsigned long)count2);
}
the compiler can't collapse those into a single invocation of
-count. After all, it could be passed a subclass of NSArray for whom
-count has side-effects. Think about a "thread-safe array" (as bad
as the concept might be) for example.
Well, in the case of your example, you have a bug: You have statically
typed the class to NSArray, not your subclass. If one applies the
'attribute only applies to the class it was specified for' rule:
By statically typing the class to NSArray, you have certified to the
compiler that no other object type will be received as an argument.
When you passed it an object of a different type, even a subclass, you
broke your promise to the compiler.
If the declaration is switched to 'id', then by definition none of the
methods will be tagged with the attribute, and thus things would work
as expected (two calls to -count).
If the declaration is switched to 'MyThreadArray *' and one doesn't
supply a new method prototype, the attribute is lost because it's a
different class (result: two calls to -count). If one does supply a
new 'count' prototype without any __attribute__ qualifications, then
things will work as expected (result: two calls to -count). If there
is a new 'count' prototype and it is declared with
__attribute__((const)) when it isn't const, then you've only yourself
to blame really.
Again, the const and pure attributes are nothing that a 'hyper aware'
optimizing compiler wouldn't be able to figure out on its own. Even
if one takes the conservative stance that all existing objects and
methods won't be modified with const or pure attributes, I would still
be free to apply them to the objects I create. If I had a custom
class of an object, and altered its prototype for self so that it was
__attribute__((const)), then we've really only delayed the problem, not
fixed it.
An alternative, if the whole __attribute__ thing gives you the
willies, is to consider 'If the compiler had sufficient information at
its disposal, would it ultimately reach the same conclusion?' Even
run time dependencies are potentially fair game if one considers LLVM
taken to its logical conclusion, which is to say deferring final
compilation and optimization until run time.
The case of '[data self]' is kind of an oddball. It's pretty obvious
that it's unlikely to do anything 'useful'. Using our all powerful
'brain optimizer', it's trivial for us to trace the appropriate code
paths and come to the conclusion that in this particular case, nothing
is accomplished by calling '[data self]'. If we can do it, then
hypothetically a compiler could do it too.
I think what all of this is really trying to work around is a
fundamental deficiency in treating __strong pointers as equivalent to
generic void * pointers. What should really be done is to change the
visibility rules of __strong pointers such that 'they are guaranteed
to be visible to the GC system from the point of declaration to the
end of the enclosing block.' Machinations like sticking '[data self]'
near the end so the pointer stays visible up to that point, and the
possible effects of optimization on such visibility, become moot under
such a definition.
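As a rough plain C analogue of the '[data self]' trick, one can force an explicit use of the outer pointer at the end of the block. This is only a sketch under my own assumptions: keep_alive, Blob and sum_bytes are hypothetical names, the empty asm statement is a GCC/Clang extension, and nothing here is a documented contract with any garbage collector.

```c
#include <stddef.h>

/* Hypothetical helper: an empty asm statement that takes the pointer as
   an input. The compiler must treat it as a use, so the pointer has to
   stay in a register or on the stack at least until this point. */
static inline void keep_alive(const void *p) {
    __asm__ volatile("" : : "r"(p) : "memory");
}

/* Stand-in for an object that hands out an interior pointer. */
typedef struct { unsigned char bytes[4]; } Blob;

unsigned sum_bytes(const Blob *blob) {
    const unsigned char *inner = blob->bytes;  /* interior pointer */
    unsigned total = 0;
    for (size_t i = 0; i < sizeof blob->bytes; i++) total += inner[i];
    keep_alive(blob);  /* last use of `blob` is now here, after the loop */
    return total;
}
```

Under the proposed visibility rule the keep_alive line would be unnecessary, since the declaration itself would pin the pointer until the closing brace.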
Adding additional attributes to make any new API contracts stricter
is an interesting idea, but it's likely to result in breakage of
existing code either at runtime or during compilation.
-- Chris
_______________________________________________
Cocoa-dev mailing list (email@hidden)