Re: Garbage Collection woes...
- Subject: Re: Garbage Collection woes...
- From: "Stephen J. Butler" <email@hidden>
- Date: Sun, 29 Jun 2008 01:13:05 -0500
On Fri, Jun 27, 2008 at 2:31 PM, John Engelhart
<email@hidden> wrote:
> Lesson #2: Since there is so little documentation about the GC system, this
> involves a lot of speculation, but I think it summarizes what's really going
> on. This all started with an effort to keep a __weak reference to a passed
> in string that was used to initialize an element in a cache. When the cache
> was checked, if that weak reference was NULL, then the cache line is invalid
> and should be cleared. The cache consisted of a global array of elements,
> selection was done via KEY_STRING_HASH % CACHE_SIZE, and everything was
> under a mutex lock. An approximation of the cache is:
>
> typedef struct {
>     NSString *aString;
>     __weak NSString *aWeakString;
>     NSInteger anInteger;
> } MYStructType;
>
> MYStructType globalStructTypeArray[42]; // <-- Global!
>
> Simple, right? That's how it always starts out... The first problem
> encountered was:
>
> [johne@LAPTOP_10_5] /tmp% gcc -o Global_GC Global_GC.m -framework Foundation -fobjc-gc
> Global_GC.m:14: warning: __weak attribute cannot be specified on a field declaration
>
> (The attached file contains the full example demonstrating the problem.)
>
> I'm not really sure what this means, and I don't recall reading anything in
> the documentation that would suggest anything is amiss. I never actually
> managed to figure out what, if any, problem this causes because it quickly
> became apparent that there was a much bigger problem that needed dealing
> with:
Speculation: __weak needs a read barrier as well as a write barrier,
and people have a long history of reading struct fields directly
rather than through an accessor, which is probably why the compiler
warns. This generally isn't a problem for __strong and write barriers
because, for any of this to work, you need to make sure that the
memory holding MYStructType is scanned by the GC anyway.
> The pointer to 'aString' in the above (or any of my other __strong pointers
> in my actual code) were clearly not being treated as __strong, and the GC
> system was reclaiming them causing all sorts of fun and random crashes.
>
> The documentation states: The initial root set of objects is comprised of
> global variables, stack variables, and objects with external references.
> These objects are never considered as garbage.
This is kind of a lie, since not ALL global memory is treated as a
root by the collector. Hence the need for the special assign functions.
> Putting the pieces together, it became obvious what was really going on.
> The two commented out lines in the example that update the global variable
> are the key to the mystery and make everything work as expected.
>
> It turns out that when the documentation says that "root set of objects is
> comprised of global variables", it's true, but probably not in the way that
> you think it is.
>
> It would 'seem' that global variables are only __strong when the compiler
> can reason that you're referring to a global variable directly. In this
> particular case, that would be:
>
> globalStructTypeArray[23].aString = newString;
Speculation: another way to think of it is that global memory isn't
considered a root by the collector until you've first stored to it
directly. That is, on the first call to objc_assign_global, the
destination pointer is added to the list of roots. It appears to be a
lazy sort of system.
> They are not strong when you refer to them indirectly (even though write
> barriers are clearly being performed), such as:
>
> update(&globalStructTypeArray[23], newString);
>
> update(MYStructType *aStructType, NSString *string) {
>     aStructType->aString = string;
> }
>
> Looking at the assembly output, the reason becomes clear:
>
> The write barrier used by the first, direct reference is objc_assign_global,
> while the write barrier used by the indirect reference in update is
> objc_assign_strongCast.
>
> This is probably an important point that you should consider if you're
> depending on global variables being truly __strong. No doubt someone here
> will explain that this isn't a bug, it's just that you shouldn't reference a
> global variable via a pointer (this is sarcastic for the challenged).
You shouldn't reference a global variable via a pointer! Kidding.
The problem is essentially the same as the one in this code:
class Foo {
public:
    NSString *fieldA;
    int fieldB;
    Foo( NSString *_fieldA, int _fieldB ) : fieldA( _fieldA ), fieldB( _fieldB ) {}
};
Foo *f = new Foo( @"Something strong", 42 );
IIRC, you'll also find that here f->fieldA is collected way before you
expect. Only this time, there are plenty of emails about how to fix it.
The problem is that ::new returns a block of non-GC memory. So even
though the write barriers are set up properly, f->fieldA lives in a
non-scanned region. See here:
http://lists.apple.com/archives/Cocoa-dev/2008/Feb/msg00435.html
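In your case, globalStructTypeArray is also in a non-scanned region,
which is why the compiler uses the special objc_assign_global barrier
for direct stores. But by going through a pointer you've hidden the
global nature of the destination from the compiler, so it emits
objc_assign_strongCast instead and the store fails to keep the string
alive.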
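
To make the direct-versus-indirect difference concrete, here is a toy
model in plain C. It is only a sketch of the speculation above, not the
real runtime: every name in it (toy_assign_global,
toy_assign_strong_cast, toyRoots and so on) is made up for
illustration, and C strings stand in for NSString objects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names belong to the Objective-C runtime. */
#define TOY_MAX_ROOTS 16

typedef struct {
    const char *aString;   /* stands in for a __strong NSString * */
    long anInteger;
} ToyStruct;

ToyStruct toyGlobalArray[42];           /* "global" storage, unscanned by default */
const char **toyRoots[TOY_MAX_ROOTS];   /* slots the toy collector will scan */
size_t toyRootCount = 0;

/* Models the "global" barrier: register the slot as a root on first use,
   then perform the store. */
void toy_assign_global(const char **slot, const char *value) {
    bool known = false;
    for (size_t i = 0; i < toyRootCount; i++) {
        if (toyRoots[i] == slot) { known = true; break; }
    }
    if (!known && toyRootCount < TOY_MAX_ROOTS) {
        toyRoots[toyRootCount++] = slot;
    }
    *slot = value;
}

/* Models the "strong cast" barrier: store only. The slot is assumed to
   live in memory that is already being scanned. */
void toy_assign_strong_cast(const char **slot, const char *value) {
    *slot = value;
}

/* A value survives a toy collection only if a registered root slot holds it. */
bool toy_is_reachable(const char *value) {
    for (size_t i = 0; i < toyRootCount; i++) {
        if (*toyRoots[i] == value) return true;
    }
    return false;
}

/* Direct store: the destination is visibly a global. */
void toy_store_direct(size_t index, const char *value) {
    toy_assign_global(&toyGlobalArray[index].aString, value);
}

/* Indirect store: inside this function the destination is just a pointer. */
void toy_store_indirect(ToyStruct *element, const char *value) {
    toy_assign_strong_cast(&element->aString, value);
}
```

In this model a value stored only through toy_store_indirect() is never
reachable, while a single toy_store_direct() to the same slot registers
it, after which later indirect stores to that slot are seen too. That
matches the observation that the direct updates make everything work,
but whether the real collector behaves this way is still a guess.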
> I'll leave you to ponder the implications of the above. The next nut to
> crack after that one is: __weak pointers must be read via a wrapper
> function (objc_read_weak), and you can't tell if the pointer passed in is
> actually a __weak reference to, say, a NSString, then do you have to assume
> the worst that every pointer passed in may potentially be __weak and
> therefore for safety must be wrapped in a call to objc_read_weak()? Talk
> amongst yourselves.
>
> Since I can't arrange for my code to always use the GC variable directly,
> and I don't have an answer wrt/ to the "always assume __weak" question, I've
> pretty much abandoned GC for this particular use.
Maybe I misunderstand. If you have this code:
void foo() {
    __weak NSString *aResult = nil;
    aResult = getNextResult();
    bar( aResult );
}
void bar( NSString *bResult ) {
    doSomething( bResult );
}
ISTM that when you call bar(), the pointer passed in now has a strong
reference to it (bResult). bar() shouldn't care if its argument
originally came from a weakly held pointer; the call to
objc_read_weak() is made when forming the argument stack, not later
inside bar(). I don't know this for sure, but it would be insane to
have it work any other way.
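
Here is the same idea as a toy model in plain C, under the assumption
that the weak read really does happen in the caller. The names
(ToyWeakSlot, toy_read_weak, toy_collect, toy_foo, toy_bar) are
invented for this sketch and are not runtime functions; a C string
stands in for the NSString.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the Objective-C runtime API. */
typedef struct {
    const char *referent;  /* cleared by the toy collector when the object dies */
} ToyWeakSlot;

/* Models a read barrier: the sanctioned way to look inside a weak slot.
   The pointer it returns is an ordinary local value. */
const char *toy_read_weak(const ToyWeakSlot *slot) {
    return slot->referent;
}

/* Models the collector zeroing a weak slot once its referent is gone. */
void toy_collect(ToyWeakSlot *slot) {
    slot->referent = NULL;
}

/* Callee: takes a plain pointer and has no idea where it came from.
   Returns the string length, or 0 for a NULL argument. */
size_t toy_bar(const char *bResult) {
    size_t length = 0;
    while (bResult != NULL && bResult[length] != '\0') {
        length++;
    }
    return length;
}

/* Caller: the weak read happens here, while the argument is being formed. */
size_t toy_foo(const ToyWeakSlot *aResult) {
    const char *strongCopy = toy_read_weak(aResult);
    return toy_bar(strongCopy);
}
```

The point of the sketch is that toy_bar() never calls toy_read_weak():
only code that can see the weak slot itself needs the barrier, so a
callee would not have to assume the worst about every pointer it is
handed.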
GC is kind of cool, but I think in C it's more hassle than it's worth.
All these problems arise because some of the memory is GC and some
isn't. In Java and C# (IIRC) it's all GC. Does any language handle
this half-GC, half-not split well?
_______________________________________________
Cocoa-dev mailing list (email@hidden)