Re: Garbage collector vs variable lifetime

Subject: Re: Garbage collector vs variable lifetime
From: Jean-Daniel Dupas <email@hidden>
Date: Mon, 9 Jun 2008 10:11:44 +0200


On 9 June 08, at 09:56, John Engelhart wrote:

On Jun 8, 2008, at 11:48 PM, Chris Hanson wrote:
On Jun 8, 2008, at 5:39 PM, John Engelhart wrote:
On Jun 7, 2008, at 7:11 PM, Chris Hanson wrote:
This won't happen because each message expression -- just as with function-call expressions -- is a sequence point. The compiler can't know what side-effects [data self] might have, so it can't re-order the invocation to elsewhere.
This is not necessarily true. If the const and pure GCC __attribute(())s were extended to objc methods then the compiler would be free to perform common subexpression and loop invariant code movement optimizations.
They can't be, while preserving the existing semantics of the language. In the existing semantics, a message send always results in a dynamic dispatch.
This is true for C as well. The semantics of the C language are such that a function being called in the source always results in one function call during execution.
Then consider the case of:
int square(int x) { return(x*x); }
int something(int y) {
   int r = 0;
   for(int z = 0; z < 25; z++) { r += square(y); }
   return(r);
}
An inter-procedural optimizing compiler will eliminate the function call to square and replace it with (y*y). It will also determine that (y*y) is loop invariant and hoist it out of the loop body.
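
For illustration, here is a hand-written approximation of that transformation. It is only a sketch of what an optimizer might effectively produce, not the output of any particular compiler; the names something_hoisted and versions_agree are made up for this example.

```c
#include <assert.h>

/* The function from the example above. */
static int square(int x) { return x * x; }

/* As written in the source: 25 calls to square(). */
int something(int y) {
    int r = 0;
    for (int z = 0; z < 25; z++) { r += square(y); }
    return r;
}

/* Hypothetical hand-optimized form: square() inlined and the
 * loop-invariant product y*y hoisted out of the loop body. */
int something_hoisted(int y) {
    const int yy = y * y;   /* computed once instead of 25 times */
    int r = 0;
    for (int z = 0; z < 25; z++) { r += yy; }
    return r;
}

/* Checks that both forms compute the same value over a small input range. */
int versions_agree(void) {
    for (int y = -100; y <= 100; y++) {
        assert(something(y) == something_hoisted(y));
    }
    return 1;
}
```

Both forms return the same value for every input in the tested range, which is the sense in which the transformation preserves meaning even though square() is no longer called.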

The semantics are preserved and identical results are calculated (the 'meaning' is the same). The semantics do not require square() to literally be called each time. In the same sense, there is no requirement that a message literally be sent each time it appears in the source code as long as it can be shown that the results would be identical. Identical results implies identical meaning, which in turn means semantics are preserved.
In the case of [data self], this essentially becomes the function:
id self(id self, SEL _cmd) { return(self); }

How do you know at compile time that this will always be the case? How do you know that there will not be some category that overrides -self and changes the implementation? Even without a category, it's perfectly possible to change the implementation pointer of the Method for -self using runtime functions.
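
The objection can be sketched in plain C with a function pointer that is read at every call. This is a deliberately simplified model, not the Objective-C runtime's actual layout (there the implementation belongs to the class, not the instance), and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object carrying its own "method" slot. */
typedef struct Object Object;
typedef Object *(*SelfImp)(Object *receiver);

struct Object {
    SelfImp self_imp;  /* looked up on every "message send" */
    int tag;
};

/* Default implementation: returns the receiver, no side effects. */
static Object *default_self(Object *receiver) { return receiver; }

/* Replacement installed later, as a category or a runtime call might do. */
static int replaced_calls = 0;
static Object *replaced_self(Object *receiver) {
    replaced_calls++;          /* observable side effect */
    return receiver;
}

/* Dynamic dispatch: the implementation is read at call time. */
Object *send_self(Object *receiver) { return receiver->self_imp(receiver); }

/* Returns how many times the replacement ran after the swap. */
int swap_demo(void) {
    Object obj = { default_self, 1 };
    replaced_calls = 0;
    assert(send_self(&obj) == &obj);
    assert(replaced_calls == 0);

    obj.self_imp = replaced_self;   /* implementation swapped at run time */
    send_self(&obj);
    send_self(&obj);
    return replaced_calls;          /* each send now has a visible effect */
}
```

Once the slot has been rewritten, dropping either send changes what the program observably does, which is why a compiler that sees only the call site cannot assume the default implementation.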

A sufficiently aware optimizing compiler (think every single line of source for everything as one multi-gigabyte translation unit) could hypothetically trace message dispatches such that it could eliminate the intermediate dynamic dispatch and be left with just the function, such as:

{ NSData *data = /* valid ptr */; /* do some work */ self(data, "self"); self(data, "self"); }

The compiler would be free to eliminate both function calls. Semantics would be wholly preserved: The self function causes no program visible state changes, therefore by definition its execution, or lack of execution, can not have an effect.

The run time dynamic dispatch nature of objc makes such 'inter-message dispatch optimizations' much, much harder, especially at compile time. Ultimately, though, they are fundamentally the same in terms of optimization.
The 'self' message would definitely fall under the domain of these attributes, thus the original argument is apropos.
For source compatibility, you almost certainly could *not* suddenly indicate that "[foo self]; [foo self];" results in only one invocation of -self by the compiler, at least for subclasses of NSObject or NSProxy.

After all, a subclass of NSObject may have overridden -self to do something else, and the compilation unit containing the above two invocations may have no idea what the class of "foo" is with which to make that judgment.
Nonsense. Look, these kinds of attributes are a lot like typecasting. You can typecast away const, volatile, or whatever and even stuff a short into a pointer and vice versa. That doesn't make it right. The typecast overrides the compiler's safeties; you essentially certify that your typecasted result is true and correct despite what the rules say. If you end up shooting yourself in the foot, you only have yourself to blame.

Besides, it's not that hard to come up with some simple additions for when an attribute like 'const' or 'pure' can legitimately be applied by the compiler within the context of objc objects. An obvious one would be that the attribute only applies to the class it was declared for and nothing else, even subclasses.
Obvious candidates are the 'length', 'count', etc. methods of immutable objects, which would result in a pretty big win if these attributes were available.
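
As a point of comparison, this is roughly how the existing GCC attributes read on plain C functions today. It is a minimal sketch assuming GCC or Clang; the ImmutableBuffer type and all function names are invented for the example, and nothing here implies the attributes are accepted on Objective-C methods.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical immutable buffer, standing in for an immutable object. */
typedef struct {
    const char *bytes;
    size_t length;
} ImmutableBuffer;

/* 'pure': no side effects; the result depends only on the arguments and
 * on memory the function reads, so repeated calls may be merged. */
__attribute__((pure))
size_t buffer_length(const ImmutableBuffer *buf) { return buf->length; }

/* 'const': stricter; the result depends only on the argument values. */
__attribute__((const))
int square_const(int x) { return x * x; }

size_t total_twice(const ImmutableBuffer *buf) {
    /* The compiler is permitted to evaluate buffer_length(buf) once here. */
    return buffer_length(buf) + buffer_length(buf);
}

/* Small self-check; returns 1 when the results are as expected. */
int attributes_demo(void) {
    ImmutableBuffer buf = { "hello", 5 };
    assert(total_twice(&buf) == 10);
    assert(square_const(4) + square_const(4) == 32);
    return 1;
}
```

The attributes are a promise from the programmer, not something the compiler verifies. Marking a function that does have side effects this way can silently change program behavior.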
If I write
- (void)doSomething:(NSArray *)array {
   NSUInteger count1 = [array count];
   NSUInteger count2 = [array count];
   NSLog(@"%lu", (unsigned long)count1);
   NSLog(@"%lu", (unsigned long)count2);
}
the compiler can't collapse those into a single invocation of -count. After all, it could be passed a subclass of NSArray for which -count has side-effects. Think about a "thread-safe array" (as bad as the concept might be) for example.
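
The same situation can be modeled in C with a "method" stored as a function pointer. This is only a sketch with hypothetical names; lock_acquisitions is a stand-in for whatever side effect a thread-safe subclass might have in -count.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Array Array;
struct Array {
    size_t (*count)(Array *self);  /* dynamically dispatched "method" */
    size_t n;
    int lock_acquisitions;         /* only changed by the locking variant */
};

/* Base behavior: no side effects. */
static size_t plain_count(Array *self) { return self->n; }

/* "Subclass" behavior: each call has an observable side effect. */
static size_t locking_count(Array *self) {
    self->lock_acquisitions++;     /* stand-in for taking a lock */
    return self->n;
}

/* Mirrors the method above: two sends through the base type. */
size_t count_twice(Array *array) {
    size_t count1 = array->count(array);
    size_t count2 = array->count(array);
    return count1 + count2;
}

/* Returns how many times the locking variant's side effect occurred. */
int override_demo(void) {
    Array plain = { plain_count, 3, 0 };
    Array locking = { locking_count, 3, 0 };
    assert(count_twice(&plain) == 6);
    assert(count_twice(&locking) == 6);
    assert(plain.lock_acquisitions == 0);
    return locking.lock_acquisitions;
}
```

Both variants return the same counts, but only the locking one records two acquisitions. Collapsing the two calls into one would leave the returned values unchanged while altering that observable state.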
Well, in the case of your example, you have a bug: You have statically typed the class to NSArray, not your subclass. If one applies the 'attribute only applies to the class it was specified for' rule:

By statically typing the class to NSArray, you have certified to the compiler that no other object type will be received as an argument. When you passed it an object of a different type, even a subclass, you broke your promise to the compiler.

If the declaration is switched to 'id', then by definition none of the methods will be tagged with the attribute, and thus things would work as expected (two calls to -count).

If the declaration is switched to 'MyThreadArray *' and one doesn't supply a new method prototype, the attribute is lost because it's a different class (result: two calls to -count). If one does supply a new 'count' prototype without any __attribute qualifications, then things will work as expected (result: two calls to -count). If there is a new 'count' prototype and it is defined with __attribute((const)) when it isn't, then you've only yourself to blame really.

Again, the const and pure attributes are nothing that a 'hyper aware' optimizing compiler wouldn't be able to figure out on its own. Even if one takes the conservative stance that all existing objects and methods won't be modified with const or pure attributes, I would still be free to apply them to the objects I create. If I had a custom class of an object, and altered its prototype for self so that it was __attribute((const)), then we've really only delayed the problem, not fixed it.

An alternative, if the whole __attribute__ thing gives you the willies, is to consider 'If the compiler had sufficient information at its disposal, would it ultimately reach the same conclusion?' Even run time dependencies are potentially free game if one considers LLVM taken to its logical conclusion, which is to say deferring final compilation and optimization until run time.

The case of '[data self]' is kind of an oddball. It's pretty obvious that it's unlikely to do anything 'useful'. Using our all-powerful 'brain optimizer', it's trivial for us to trace the appropriate code paths and come to the conclusion that in this particular case, nothing is accomplished by calling '[data self]'. If we can do it, then hypothetically a compiler could do it too.

I think what all of this is really trying to work around is a fundamental deficiency: __strong pointers are treated as equivalent to generic void * pointers. What should really be done is to change the visibility rules of __strong pointers such that 'they are guaranteed to be visible to the GC system from the point of declaration to the end of the enclosing block.' Under such a definition, machinations like sticking '[data self]' near the end so the pointer stays visible up to that point, and the possible effects of optimization on such visibility, become moot.
Adding additional attributes to make any new API contracts stricter is an interesting idea, but it's likely to result in breakage of existing code either at runtime or during compilation.
-- Chris
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden
