Re: Garbage collector vs variable lifetime
- Subject: Re: Garbage collector vs variable lifetime
- From: "Hamish Allan" <email@hidden>
- Date: Sat, 7 Jun 2008 23:24:22 +0100
On Sat, Jun 7, 2008 at 7:34 PM, Peter Duniho <email@hidden> wrote:
>> Date: Sat, 7 Jun 2008 15:31:43 +0100
>> From: "Hamish Allan" <email@hidden>
>>
>> We're just interpreting this promise differently. To my mind, while
>> the variable is still in scope, I *do* have a pointer to the object.
>
> All due respect, "to my mind" isn't a valid technical specification.
Whenever you write documentation in a natural language, there is scope
for ambiguity. The technical specification in question amounts to a
single sentence:
"The root set is comprised of all objects reachable from root objects
and all possible references found by examining the call stacks of
every Cocoa thread."
Are you seriously telling me that "all *possible* references found by
examining the call stacks" unambiguously means "all *actual*
references found by examining the rest of the code in the block"?
>> It is available for my use until the end of that scope. The fact that
>> I don't actually make use of it, and the optimising compiler thinks
>> I'm done with it: *that*'s an implementation detail.
>
> No, it's not. It's a _semantic_ detail. If you do in fact not use the
> reference, then it is in fact _unused_. Garbage collection is all about
> what's being used or not.
You'll notice that I was replying to a specific point Michael made:
"the promise that is being made is that objects which you're still
pointing to won't go away". I don't disagree that "have a pointer"
means something different to "use a pointer". Which part of the
documentation unambiguously specifies that "Garbage collection is all
about what's being used or not"?
> Even if the compiler did not optimize away the
> use of the variable past that point in the code, if the GC system could
> otherwise determine you weren't ever going to use it again, it would _still_
> be valid for the GC system to collect the object.
If you could write a garbage collector that could reliably detect what
was being used, you'd have no need to specify __strong or __weak any
more, and you'd have solved the halting problem to boot. We're not
quite at that stage yet :)
>> I agree with you that if it returned collected memory, it would be
>> more powerful than under retain/release/autorelease. But power and
>> straightforwardness do not always go hand in hand ;)
>
> Well, in this case I believe they do. The real issue here is that a garbage
> collection system is trying to co-exist peacefully with a non-GC system.
I disagree. The real issue here is that code optimisations change the
behaviour of the GC. If instead the GC behaved according to rules
based on the *semantics* of what the programmer writes (i.e. what
would happen if that variable really were on the stack, rather than
optimised away into a register), there wouldn't be a problem. The
optimisation could still happen, of course, but the compiler would
flag that reference as strong.
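To make that concrete, here's a minimal sketch of the failure mode
we're discussing (assume a garbage-collected build, i.e. -fobjc-gc;
the path and the processBytes() helper are purely illustrative):

    extern void processBytes(const char *bytes); // illustrative helper

    NSData *data = [NSData dataWithContentsOfFile:@"/tmp/example"];
    const char *bytes = [data bytes]; // interior pointer; not scanned by the GC
    // 'data' is never read again below, so the optimiser is free to reuse
    // its register or stack slot here. At that point nothing in the root
    // set refers to the NSData, and the collector may reclaim it...
    processBytes(bytes);              // ...leaving 'bytes' dangling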
> GC
> systems have been around for a long time, but in many of the examples, the
> entire environment is garbage-collected and this sort of thing just doesn't
> come up in normal code.
Sure. I'm not trying to say I could have designed a perfect GC system
for Objective-C; I'm just trying to point out how I would change this
one for the better.
> That said, even in .NET (for example), the GC system has a "GC.KeepAlive()"
> method for this very purpose. It doesn't actually do anything, but it
> interrupts optimizations the compiler might make that would otherwise allow
> an object to be collected (it would be used exactly as the proposed "[data
> self]" call would be). This is to allow for situations where the only
> reference to the object is in "unmanaged" code -- that is, it's been passed
> across the interface from .NET to the older non-GC'ed Windows API.
Again, I think that signalling to the GC that you don't want the
object collected by creating a stack variable reference to it --
whether or not that variable ever actually ends up on the stack due to
optimisations -- is quite enough. No need for GC.KeepAlive(), [data
self], CFRetain(), disableCollectionForPointer, or any other hack.
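For comparison, the [data self] idiom mentioned above would look
something like this (processBytes() is illustrative, as before):

    const char *bytes = [data bytes];
    processBytes(bytes);
    [data self]; // no-op message send whose only purpose is to keep
                 // 'data' reachable past the last use of 'bytes'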
> [snip]
>
> Not only is it not a bug for the compiler to not concern itself with the
> issue, the fact is that the extant example here is just the tip of the
> iceberg. While you might argue that the compiler _could_ prevent this
> particular problem, a) it requires the compiler to get into the business of
> memory management
The compiler is already in that business -- hence the modifiers
__strong and __weak.
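As a rough sketch of what those modifiers already mean to the
collector (shown as instance variables, where both are valid):

    @interface Example : NSObject
    {
        __strong CFMutableDictionaryRef table; // non-object pointer the
                                               // collector treats as strong
        __weak id delegate;                    // zeroed, not kept alive, when
                                               // its referent is collected
    }
    @end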
> and b) there will still be lots of other scenarios that
> are similar, but not solvable by the compiler.
Could you please elaborate on this?
>> Given that the programmer cannot possibly know the state of the call
>> stack any other way than by considering normal variable scoping rules,
>> either it's meaningless to consider any variable on the stack as a
>> root object, or the compiler needs to guarantee equivalence between
>> them.
>
> No. As Mike points out, this is only a problem because the object is
> returning a pointer that the object itself can make invalid due to garbage
> collection, even while that pointer isn't itself subject to the GC rules.
> This is a design problem, not a compiler error.
Sure, you could design NSData differently to mask a design problem in
GC. But GC won't be easier to use than retain/release/autorelease
without simple rules like "if you declare it on the stack, it's in the
root set, regardless of whether the compiler optimises it into a
register".
> There are a variety of solutions, none of them spectacular. It would be
> much better for all GC objects to themselves own only GC objects, and/or to
> never return a non-GC pointer that it might later make invalid (in other
> words, if it does pass back a non-GC object, management of that object
> should be handed off to the recipient). But inasmuch as this cannot be
> guaranteed 100%, there will always be a need for the programmer to be aware
> that GC-able objects are only guaranteed to live up to the very last point
> they are used.
Consider the following:
    if ([data length] > 2)
    {
        char *cs = (char *)[data bytes];
        char c = *cs;           // statement 1
        void *p = (void *)data; // statement 2
        NSLog(@"First character of data object at %p is %c", p, c);
    }
Now, who is to say that it wouldn't suit the optimising compiler to
swap statements 1 and 2, which it can plainly see have no dependency
on one another? So referencing "data" to ensure it remains alive after
my last use of its interior pointer doesn't necessarily work.
I agree with you that there are a variety of solutions. I'm just
proposing one that I think makes memory management more
straightforward for the programmer than any others I've heard so far.
If you have any specific objections to it, I'd like to hear them.
> And it seems to me that with GC being a relatively new addition to Obj-C,
> that the likelihood of running into such situations is going to be greater
> than in a more-evolved environment. It's just something that needs to be
> kept in mind, just as in the older retain/release paradigm there were a
> number of rules that needed to be kept in mind. IMHO, inasmuch as the need
> to keep this particular issue in mind comes up less frequently than the
> retain/release rules needed to, the GC system is more accommodating (though
> I suppose it also means it's harder to get used to keeping the rule in mind
> :) ).
I think it also means it'll be harder to track certain bugs down. And
unnecessarily so!
Hamish