Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- Subject: Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- From: John Engelhart <email@hidden>
- Date: Mon, 4 Feb 2008 19:14:20 -0500
On Feb 4, 2008, at 8:11 AM, Alastair Houghton wrote:
or the excellent book "Garbage Collection: Algorithms for Automatic
Dynamic Memory Management" by Jones and Lins (Wiley 1997, ISBN
0-471-94148-4 <http://www.amazon.co.uk/dp/0471941484>).
You must have a later edition than mine, as the inside cover of my
copy says '96.
The GC docs actually explain what the write barrier is used for here:
<http://developer.apple.com/documentation/Cocoa/Conceptual/GarbageCollection/Articles/gcArchitecture.html#//apple_ref/doc/uid/TP40002451-SW4>
It makes no particular demands of the programmer or compiler, in
fact it can be used as a drop in replacement for malloc() and
free(), requiring no changes.
From what I've pieced together, Leopard's GC system is nothing like
this. While the Boehm GC system detects liveness passively by
scanning memory and looking for and tracing pointers, Leopard's GC
system does no scanning and requires /active/ notification of
changes to the heap. This, I believe, is what a 'write-barrier'
actually is: it is a function call to the GC system so that it can
update its internal state as to what memory is live. It relies, I
suspect exclusively, on these function calls to track memory
allocations.
The Boehm GC and the Leopard Cocoa GC have very different design
goals. In the case of Boehm's collector, it's a requirement that
the collector work without any assistance from the compiler; as a
result, it has to use "conservative" techniques, which may in
general result in leaks of arbitrary amounts of memory simply
because of a stray value that *looks like* a pointer to something.
The lack of compiler assistance means that it's almost impossible to
write a collector that will run in the background (the Boehm
collector has to stop *all* the other threads in your program every
so often if you run it in the background), and it's difficult to
implement generational behaviour without relying on platform-
specific features such as access to dirty bits from the system page
table... Even in that case, use of dirty bits is woefully
inefficient compared to compiler co-operation, since a single dirty
bit means you must re-scan an entire page of memory. The Boehm GC
is very clever, certainly, but it has to cope with these limitations
(and more besides).
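To make the "stray value that looks like a pointer" point concrete, here is a minimal sketch of conservative scanning in plain C. It is a toy model only, not the Boehm collector's code: ToyBlock, toy_conservative_scan and toy_stray_value_retains_block are hypothetical names invented for the illustration, and the "heap" is just two malloc() blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy heap entry: one block the toy "collector" knows about. */
typedef struct {
    void  *start;
    size_t size;
    int    marked;
} ToyBlock;

/* Conservatively scan `count` words: any word whose value falls inside a
 * known block is treated as a pointer to it, whether or not it really is. */
static void toy_conservative_scan(const uintptr_t *words, size_t count,
                                  ToyBlock *blocks, size_t nblocks)
{
    for (size_t i = 0; i < count; i++) {
        for (size_t b = 0; b < nblocks; b++) {
            uintptr_t lo = (uintptr_t)blocks[b].start;
            uintptr_t hi = lo + blocks[b].size;
            if (words[i] >= lo && words[i] < hi)
                blocks[b].marked = 1;
        }
    }
}

/* Returns 1 if an integer that merely looks like an address keeps an
 * otherwise unreferenced block marked, while the other block stays unmarked. */
static int toy_stray_value_retains_block(void)
{
    ToyBlock blocks[2];
    blocks[0].start = malloc(64); blocks[0].size = 64; blocks[0].marked = 0;
    blocks[1].start = malloc(64); blocks[1].size = 64; blocks[1].marked = 0;
    assert(blocks[0].start != NULL && blocks[1].start != NULL);

    /* "Root" memory holding integers only: zero, and a number that happens
     * to equal an address inside block 1. */
    uintptr_t roots[2];
    roots[0] = 0;
    roots[1] = (uintptr_t)blocks[1].start + 8;

    toy_conservative_scan(roots, 2, blocks, 2);

    int retained = (blocks[0].marked == 0) && (blocks[1].marked == 1);
    free(blocks[0].start);
    free(blocks[1].start);
    return retained;
}
```

Calling toy_stray_value_retains_block() reports whether the integer in roots[1] was enough to keep block 1 marked. A real conservative collector does the same kind of range test over stacks, registers and heap words, which is where the leak risk described above comes from.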
I've always enjoyed using the Boehm garbage collector. I've never had
a problem with its speed, and off the top of my head I can't think of
any issue where I had to work around the collector. It always just
'works', and not only works but has caught several pointer misuse or
off-by-one errors as well. It's a joy to use, you literally just
allocate and forget. I had such high hopes for Cocoa's GC system
because once you're spoiled by GC, it's hard to go back.
Unfortunately, the 4-5 months of time I've put in on Leopard's GC
system has not been nearly as pleasant. It has been outright
frustrating, and it's reached the point where I consider the system
untenable.
Cocoa GC, on the other hand, is able to co-operate with the
compiler, and that's what the write barriers are. You have mis-
interpreted their function; they exist to track inter-generational
pointers, not to enable some sort of behind-the-scenes reference
counting as I think you imply. They may also be used to help the
collector to obtain a consistent view of the mutator's objects in
spite of running in the background... I don't know whether the
Leopard GC does that or not.
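For readers unfamiliar with the term, the textbook use of a write barrier for inter-generational pointers can be sketched as follows. This is a generic illustration in plain C and makes no claim about how Leopard's collector is actually implemented; ToyObj, toy_store_with_barrier, toy_minor_mark and toy_barrier_demo are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object with a generation flag and one reference slot. */
typedef struct ToyObj {
    int            is_old;   /* 1 = old generation, 0 = young generation */
    int            marked;
    struct ToyObj *ref;
} ToyObj;

#define TOY_REMEMBERED_MAX 16
static ToyObj *toy_remembered[TOY_REMEMBERED_MAX];
static size_t  toy_remembered_count = 0;

/* Write barrier: perform the store, and remember any old object that now
 * points at a young one, so a minor collection need not scan the old heap. */
static void toy_store_with_barrier(ToyObj *holder, ToyObj *value)
{
    holder->ref = value;
    if (holder->is_old && value != NULL && !value->is_old) {
        assert(toy_remembered_count < TOY_REMEMBERED_MAX);
        toy_remembered[toy_remembered_count++] = holder;
    }
}

/* Mark a chain of young objects; old objects are not traversed. */
static void toy_mark_young(ToyObj *obj)
{
    while (obj != NULL && !obj->is_old && !obj->marked) {
        obj->marked = 1;
        obj = obj->ref;
    }
}

/* Minor collection mark phase: start from the roots and the remembered set. */
static void toy_minor_mark(ToyObj **roots, size_t nroots)
{
    for (size_t i = 0; i < nroots; i++)
        toy_mark_young(roots[i]);
    for (size_t i = 0; i < toy_remembered_count; i++)
        toy_mark_young(toy_remembered[i]->ref);
}

/* Returns whether a young object referenced only from an old object is
 * found by a minor collection, with or without the barrier on the store. */
static int toy_barrier_demo(int use_barrier)
{
    ToyObj old_obj   = { 1, 0, NULL };
    ToyObj young_obj = { 0, 0, NULL };
    toy_remembered_count = 0;

    if (use_barrier)
        toy_store_with_barrier(&old_obj, &young_obj);
    else
        old_obj.ref = &young_obj;   /* raw store, invisible to the collector */

    toy_minor_mark(NULL, 0);        /* no young roots in this demo */
    return young_obj.marked;
}
```

In this model toy_barrier_demo(1) finds the young object and toy_barrier_demo(0) misses it. The barrier is bookkeeping for the minor collection; it is not what decides which fields get traced.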
As I've stated, my opinions are formed from the publicly available
documentation and my (hair pulling) experiences over the last few
months. The quick and the short of it is that Leopard's GC system behaves
unlike any other GC system I've used.
(Incidentally, there is also a read barrier, which is used to help
implement zeroing weak references; the compiler only generates that
for variables marked __weak.)
I think, perhaps, that it would be worth your while reading through
the literature on garbage collection, as you might then understand
the various trade-offs involved better.
In order for Leopard's GC system to function properly, the compiler
must be aware of all pointers that have been allocated by the GC
system so that it can wrap all uses of the pointer with the
appropriate GC notification functions (objc_assign*).
Yep.
[snip]
Realistically, to properly add __strong to a pointer, you need to
know if that allocation came from the garbage collector. This
information is essentially impossible to know a priori, so the only
practical course of action is to defensively qualify all pointers
as __strong.
No. Cocoa GC mostly deals with objects (which may include Core
Foundation objects). That's why the default assumption, which is
that object pointers are strong, is enough for most situations.
There is nothing special about objects. I believe that this doesn't
quite hold true for the ObjC 2.0 64 bit API, but it still holds true
for the 32 bit API: Objects are nothing but pointers to structs.
Your typical
@interface MYObject : NSObject {
    void *ptr;
}
@end
essentially becomes:
typedef struct {
    #include "NSObject_struct_bits"  /* pseudo-code: NSObject's own ivars */
    void *ptr;
} MYObject;
The following is a working example of the key points of Objective-C,
and for all practical purposes, this is what your Objective-C object
gets turned into. It's literally possible to hack up a small perl
script that gets you 60-70% of the way to a full blown "Objective-C
Compiler":
#include <stdio.h>
#include <stdlib.h>

typedef struct { const char *title; } MYObject;

void       *alloc   (MYObject *self, const char *_cmd) { return(calloc(1, sizeof(MYObject))); }
void       *init    (MYObject *self, const char *_cmd) { return(self); }
void        setTitle(MYObject *self, const char *_cmd, const char *newTitle) { self->title = newTitle; }
const char *title   (MYObject *self, const char *_cmd) { return(self->title); }

int main(int argc, char *argv[]) {
  MYObject *testObject = NULL;

  testObject = init(alloc(NULL, "alloc"), "init");
  setTitle(testObject, "setTitle:", "Object Title");
  const char *theTitle = title(testObject, "title");
  printf("Object: %p title: %p, '%s'\n", testObject, theTitle, theTitle);
  return(0);
}
[johne@LAPTOP_10_5] /tmp% gcc -o obj obj.c
[johne@LAPTOP_10_5] /tmp% ./obj
Object: 0x100120 title: 0x1fcc, 'Object Title'
You can even have the compiler create your object's ivars with the
@defs directive. In fact, it's possible and perfectly legal to create
a C function inside your @implementation with a prototype like
myCFunction(MYObject *self, SEL _cmd) and access your object's ivars
with "self->ivar" inside the function.
That only changes if you have pointers of non-object types that
happen to point to things that were allocated with the GC, *and only
then* if they are stored in locations that are not scanned by
default. This is an unusual situation, since few methods return
things that are allocated by GC and that are not objects.
-UTF8String is probably the most common example, but since you tend
not to store the result of that method, there would rarely---if
ever---be a problem.
As the above example illustrates, the entire issue of "object" vs.
"non-object" is a red herring. There is nothing special about
objects, nor anything special about ivars. The fact that the GCC
compiler attempts to 'automagically' detect which pointers are
__strong behind your back only obscures the issues at hand. Because
of this automatic promotion, it's easy to fall into the trap of thinking
that objects are somehow magical. Nothing could be further from the truth.
The fact of the matter is that the DEFAULT behavior for pointers in
Leopard's GC system is that they are ignored and do not point to live
data. I challenge anyone to find another GC system in which the
default behavior for a pointer is to be ignored, and what it points to
to NOT be considered part of the live set. However, through some
unspecified logic, SOME pointers are elevated to 'Points to live GC
data'.
I mean, seriously, can anyone conjure up a compelling reason why the
default behavior of a pointer is that it does not point to live data?
I can think of some infrequent special cases when I would want to turn
it off, but off by default unless you qualify it with __strong?
The consequence of using a pointer that is not properly qualified
as __strong is that the GC system may determine that the allocation
is no longer live and reclaim it, even if there is still a valid
pointer out there.
Only if there is no copy of the pointer in any of the locations that
are scanned by default (e.g. the stack, in registers, in global
variables).
I'd put instantiated objects on that list.
It is also trivial to get wrong, and the only indication that
there's a problem is an occasional random error or crash.
In most cases, because GC'd things are objects, it's trivial to get
*right*.
It's only in special cases, where you're using C pointer types to
point to GC'd memory, that you need worry about this kind of thing.
Your choice of words makes me suspect that you consider an object
pointer, such as NSObject *, and a C pointer, à la void * or char *, to
be two distinctly different things. I believe I have shown that this
is not the case, and one can, in fact, consider them all to be 'void
*' pointers for the purposes of reasoning.
When considered from the 'void *' perspective, I believe your argument
highlights my point: Pointers are pointers, and knowing which ones to
treat as 'special' is non-trivial and easy to get wrong.
I believe I have a succinct example that illustrates these issues:
[snip]
I strongly suspect the pointer that UTF8String returns is a pointer
to an allocation from the garbage collector. In fact, changing
the 'title' ivar to include __strong 'solves' the problem.
Yes, that's your bug. It doesn't just 'solve' the problem, the lack
of __strong here *is* the problem, but only because this is an ivar
and not e.g. a function argument or a stack-based variable.
I don't disagree with you, 'technically' it's my bug, but this is my
point. Take a step back for a second and consider what you're saying:
The garbage collector has reclaimed the allocation that contains the
text for the string. From an object that the GC system considers to
be live. That contains a pointer, that isn't 'hidden' by xor or what
not from the GC system, it's a normal pointer to the allocation. That
the garbage collector just recycled because there are no references to
keep it live.
Can you name another garbage collector in which this is a
/programmer's/ error, and not a bug in the GC system? Reading over this
I almost have to chuckle at the absurdity of it. Yet when you get
right down to it, this is what is being advocated.
Now, trying to find 'bugs' such as this in running code is every
programmer's worst nightmare. The bug manifests itself only when the
GC system reclaims it, which is essentially at some completely random,
non-deterministic point in time in the future. There is essentially
nothing you can do to reproduce the bug.
Then take a look again at the method prototype:
- (const char *)UTF8String; // Convenience to return null-terminated UTF8 representation
Can you clearly and concisely articulate why the pointer returned from
this particular method requires a __strong qualifier? Remember, if
you get it wrong you sign yourself up for many long nights of trying
to track down some random, non-repeatable bug.
Then consider the following:
- (const char *)hexString;
- (const char *)hexString
{
    char *hexPtr = NULL;
    asprintf(&hexPtr, "0x%8.8x", myIvar);
    return(hexPtr);
}
Now what? And whatever you do, DON'T cross the streams, or you risk
total protonic reversal.
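The same ambiguity can be shown without any collector at all. The sketch below is plain C with hypothetical names (toy_static_string, toy_hex_string), and it uses malloc() plus snprintf() in place of asprintf() purely for portability. Both functions have the same 'const char *' return type, yet the caller's obligations are completely different:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pointer into static storage: the caller must NOT free it. */
static const char *toy_static_string(void)
{
    return "0x0000002a";
}

/* Returns a malloc()ed buffer: the caller MUST free it. Nothing in the
 * return type says so, and nothing says which allocator owns the memory. */
static const char *toy_hex_string(unsigned value)
{
    char *buf = malloc(11);               /* "0x" + 8 hex digits + NUL */
    if (buf != NULL)
        snprintf(buf, 11, "0x%8.8x", value);
    return buf;
}
```

toy_hex_string(42u) produces the same text as toy_static_string(), and the two results are indistinguishable by type. Add a third allocator, the garbage collector, and the prototype tells you even less about whether __strong, free(), or nothing at all is the right answer.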
But this points to a much bigger problem: anyone who has used
UTF8String and not qualified it as __strong has a race condition
just waiting to happen.
No, because stack variables and registers are included in the set of
GC roots.
Oddly, I don't find this reassuring. In fact, I think it might make
things worse. Why? This implies that the collector considers
anything on the stack that looks like a pointer to be a pointer that
should be treated as live and followed. If this is the case, this has
the effect of automatically promoting all pointers on the stack to
__strong, and that's a problem. This masks pointer declaration
errors, and pointers that are missing __strong will work as a side
effect of this behavior instead of causing crashes.
Besides which, not everything lives on the stack.
This is but one example. I don't think I need to point out that
there are others. A lot of others. And most of them are non-
obvious. A consequence of all of this is that you must not pass
pointers that may have been allocated by the garbage collector to
any C function in a library. For example,
printf("String: %s\n", [@"Hello, world!" UTF8String]);
That code is fine. The reference is on the stack (or, before that,
in the register that holds the return value of -UTF8String). It
will be followed, so the memory won't be released until the printf()
function has finished with it.
Again, you are correct in the sense that this is how the 10.5 GC
system works.
I still contend that this is a design flaw. If the pointer were
passed any other way, let's say via some mutex guarded inter-thread
queue, it would require a __strong qualification. This works due to a
side effect of the GC system promoting ALL pointers it sees on the
stack to __strong. It's another "exception to the rule" to keep track
of.
This is why I believe Leopard's GC system is fundamentally flawed.
There are a lot of these little rules and one-offs you need to keep
track of, and hope that whoever wrote the code you're calling got it
all right too. The UTF8String example highlights just how easy it is to
forget to add a __strong qualifier in front of a pointer, and that
most of the time things will work just fine. Until you hit that
oddball corner case, and then at some point long after the initial event
that caused the problem has passed, things crash.
Those of you out there that think that these are non-issues, or "might
happen rarely"... please, knock yourself out and have fun. You too
can add "Reflexively knows the default address for the stack of the
first four threads and can unwind said stack frames by hand!" to your
resume. I'm just saying that I've been down this road and I'll gladly
take tracking down multithreaded deadlocks and hunting that last
missing release over this any day.
passes a GC allocated pointer to a C library function, which almost
assuredly does not have the proper write barrier logic in place to
properly guard the pointer.
The write barrier is nothing to do with it. The write barrier is
for inter-generational pointers, and possibly also to help the
collector to scan in the background safely.
Well, you see... You'd think that, wouldn't you? But take a careful
look at the example I posted. Now, this is just a simple executable
built in the shell for example purposes, so none of the AppKit stuff
is fired up. The GC docs also say that the GC system is demand driven
under these conditions and that AppKit kicks the GC system to spawn a
background collector thread (objc_startCollectorThread())... so, I
think it's safe to say that things are 'quiet' and nothing fancy is
going on in the background.... and there are only a few lines, so we can
be reasonably sure there are no background, hidden mutation events that
a write barrier would normally catch... but follow the steps closely
(from the original example posted)
    [gcConstTitle setTitle:"Hello, world!"];
    [gcUTF8Title setTitle:[[NSString stringWithUTF8String:"Hello, world\xC2\xA1"] UTF8String]];
    [[NSGarbageCollector defaultCollector] collectExhaustively];
    NSLog(@"GC test");
    printf("gcConstTitle title: %p = '%s'\n", [gcConstTitle title], [gcConstTitle title]);
    printf("gcUTF8Title title: %p = '%s'\n", [gcUTF8Title title], [gcUTF8Title title]);
    return(0);
}
[johne@LAPTOP_10_5] GC% gcc -framework Foundation -fobjc-gc-only gc.m -o gc
[johne@LAPTOP_10_5] GC% ./gc
Setting title. Old title: 0x0, new title 0x1ed4 = 'Hello, world!'
Setting title. Old title: 0x0, new title 0x1011860 = 'Hello, world¡'
2008-02-03 19:07:58.911 gc[6191:807] GC test
gcConstTitle title: 0x1ed4 = 'Hello, world!'
gcUTF8Title title: 0x1011860 = '??0?" '
[johne@LAPTOP_10_5] GC%
We can be reasonably sure that the pointer to the UTF8String is
'visible' before we call the collector, and there are no race conditions
happening. There are no mutations that a write barrier needs to
intercept going on. The gcUTF8Title object is clearly still 'live'
according to the GC system. The object clearly has the same pointer
in its ivar before and after the collection, yet the GC system
reclaimed the allocation that contained the string.
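One way to model the behavior observed above is a mark phase that only follows slots flagged as strong. The sketch below is a toy in plain C and an assumption about the general mechanism, not Apple's implementation; ToyNode, strong_mask, toy_mark and toy_buffer_survives are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 2

/* Hypothetical toy node: pointer slots plus a layout mask saying which
 * slots the collector is allowed to trace. */
typedef struct ToyNode {
    int             marked;
    unsigned        strong_mask;          /* bit i set => slot i is traced */
    struct ToyNode *slots[TOY_SLOTS];
} ToyNode;

/* Mark phase: follow only the slots the layout declares as strong. */
static void toy_mark(ToyNode *node)
{
    if (node == NULL || node->marked)
        return;
    node->marked = 1;
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (node->strong_mask & (1u << i))
            toy_mark(node->slots[i]);
    }
}

/* A live holder stores a perfectly ordinary pointer to a buffer in slot 0.
 * Returns whether the buffer is marked after tracing from the holder. */
static int toy_buffer_survives(int slot_is_strong)
{
    ToyNode buffer = { 0, 0u, { NULL, NULL } };
    ToyNode holder = { 0, 0u, { NULL, NULL } };

    holder.slots[0] = &buffer;
    if (slot_is_strong)
        holder.strong_mask = 1u;

    toy_mark(&holder);
    assert(holder.marked == 1);           /* the holder itself is always live */
    return buffer.marked;
}
```

toy_buffer_survives(1) marks the buffer; toy_buffer_survives(0) leaves it unmarked even though the holder is live and its slot contains the buffer's address. That is the shape of what happened to the 'title' ivar in the example.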
This has got to be the only GC system in which an object is live and
traceable from the roots, contains a pointer (that's not hidden or
anything else fancy) to an allocation that contains the text of the
string, and the GC system considers the string buffer to be dead, and
it's the fault of the programmer. Because the default behavior for
pointers is not that they point to live data, but that they are
ignored and not considered when tracing the heap. Pointers don't
point to things that are needed that often, do they?