Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- Subject: Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- From: John Engelhart <email@hidden>
- Date: Wed, 6 Feb 2008 04:39:48 -0500
On Feb 5, 2008, at 7:40 AM, Alastair Houghton wrote:
On 5 Feb 2008, at 00:14, John Engelhart wrote:
I had such high hopes for Cocoa's GC system because once you're
spoiled by GC, it's hard to go back.
Unfortunately, the 4-5 months of time I've put in on Leopard's GC
system has not been nearly as pleasant. It has been outright
frustrating, and it's reached the point where I consider the system
untenable.
Honestly, this point has now been answered over and over.
I think it comes down to the fact that you have failed to appreciate
that Cocoa GC is designed for easy use with *OBJECTS*. If you're
using it with objects, it "just works".
You misunderstand what Objective-C is, and how it works. "Objects" is
synonymous with "Structs".
As far as objects "not being special", they *are* special, in that
the compiler generates layout information and method signatures for
them.
Read the above: "object" is synonymous with "struct". The "layout" of
an object is identical to the "layout" of a struct. This point is so
basic and fundamental to Objective-C and how things work at a low
level that it seriously brings into question the accuracy of the rest
of your conclusions. If you do not understand fundamentals such
as this, I do not see how you can possibly predict the effects and
implications of pointers in Leopard's GC system.
AFAIK the layout information (albeit in a slightly different format)
is used by the garbage collector when scanning objects, which is
another reason that you need to use __strong on instance variables
if they point to non-object garbage collected memory.
Again, this clearly indicates that you do not understand the
fundamentals at hand. Your reasoning is faulty (in fact, it's
outright wrong). You have, in essence, made my point: How these
things work, and their subtle interactions, are CRITICAL to the
correct operation of the GC system. If you do not understand them,
you /cannot/ possibly use the GC system correctly. You have,
LITERALLY, just signed yourself up to tracking down a GC-related bug
in your code.
You should review the relevant files from the GCC compiler,
specifically gcc-5465/gcc/objc/objc-act.c from the 'gcc-5465.tar.gz'
distribution.
Thus spoke the documentation
(documentation/Cocoa/Conceptual/GarbageCollection/Articles/gcAPI.html):
__strong essentially modifies all levels of indirection of a pointer
to use write-barriers, except when the final indirection produces a
non-pointer l-value.
For example:
@interface GCTest : NSObject {
    __strong void *ptr;
}
- (void)setPtr:(void *)newPtr;
@end

@implementation GCTest
- (void)setPtr:(void *)newPtr
{
    ptr = newPtr;
}
@end
__strong does not modify any layout information. At compile time,
when the compiler is working with a pointer that is qualified as
__strong, and the location that contains the pointer is written to /
updated / assigned (i.e., ptr = newPtr), the compiler rewrites the
assignment to, in effect:
- (void)setPtr:(void *)newPtr
{
    objc_assignIvar(newPtr, self, offsetof(ptr));
}
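To make the rewrite above concrete, here is a minimal sketch in plain C of what a write barrier amounts to. Everything in it is hypothetical and for illustration only: `toy_object`, `toy_assign_ivar`, `toy_set_ptr` and the `toy_barrier_hits` counter are invented names, not the Objective-C runtime's API, and a real barrier does collector bookkeeping rather than bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object with a single pointer field. */
struct toy_object {
    int   tag;
    void *ptr;
};

/* Counts stores that went through the barrier (illustration only). */
static int toy_barrier_hits = 0;

/* Hypothetical write barrier: store `value` at `offset` bytes into `dest`
 * and record that the store happened, so a collector could take note. */
static void *toy_assign_ivar(void *value, void *dest, size_t offset)
{
    void **slot = (void **)((char *)dest + offset);
    *slot = value;
    toy_barrier_hits++;
    return value;
}

/* What the programmer writes as `obj->ptr = newPtr;` becomes, in spirit,
 * a call through the barrier with the field's offset. */
static void toy_set_ptr(struct toy_object *obj, void *newPtr)
{
    toy_assign_ivar(newPtr, obj, offsetof(struct toy_object, ptr));
}
```

The point of the sketch is only that the barrier is emitted at the assignment site by the compiler, which is why the qualifier on the declaration matters: no qualifier, no call.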
As for your example:
- (const char *)hexString
{
    char *hexPtr = NULL;
    asprintf(&hexPtr, "0x%8.8x", myIvar);
    return(hexPtr);
}
[snip]
or if you absolutely must return a const char * pointer,
- (const char *)hexString
{
    return [[NSString stringWithFormat:@"0x%8.8x", myIvar] UTF8String];
}
which has the benefit of working exactly like -UTF8String. If you
*really* wanted, you could mess about with NSAllocateCollectable().
Oh... my...
I could not possibly have asked for a better example of just how easy
it is to get this wrong.
Let those of you who are considering using Leopard's GC system use this
as a warning of how dangerous its use is in practice.
First, understand that my original example, the one in which the GC
system snatched away the live allocation, did use
NSAllocateCollectable. Your example, using UTF8String, uses
NSAllocateCollectable as well. We can infer this by the behavior
exhibited by the GC system when qualifying pointers that store the
results from these methods as __strong, which prevents the collector
from reclaiming the allocation. Thus, by induction, the pointer to
the buffer that contains the string's text must ultimately come from
NSAllocateCollectable.
Let's start with NSAllocateCollectable(). The prototype for
NSAllocateCollectable, from NSZone.h, is as follows:
FOUNDATION_EXPORT void *__strong NSAllocateCollectable(NSUInteger
size, NSUInteger options) AVAILABLE_MAC_OS_X_VERSION_10_4_AND_LATER;
NOTE WELL: the __strong qualifier for the pointer.
Now, there is no formal grammar of Objective-C 2.0 published, but it
is a reasonable assumption that "__strong" is in the same group of
type qualifiers as "const" and "volatile". This makes __strong
subject to the same ANSI-C rules governing the use of type qualifiers,
including promotion and assignment rules. Returning to the method
definition of UTF8String, we find the following in NSString.h:
- (const char *)UTF8String; // Convenience to return null-terminated UTF8 representation
Since the pointer that UTF8String returns is provably from
NSAllocateCollectable, this prototype has /DISCARDED/ the type
qualifier of __strong.
In (brief, simplified) summary, ANSI-C says that pointer assignments
from a "lesser qualified" type can be made to a "more qualified" type,
but not the other way around. For example:
char *cp;
const char *ccp;
It's perfectly legal to do the following:
ccp = cp;
The reverse, however, is not necessarily true:
cp = ccp;
This will result in a warning issued by the compiler that the type
qualification has been discarded. Now, modifying the example slightly
(for brevity, I'm going to gloss over the details of why moving const
past the pointer changes things):
char *cp;
char * const cpc;
The following results in an error:
cpc = cp;
Specifically: "error: assignment of read-only variable 'cpc'", and the
following is legal:
cp = cpc;
While there are certain qualifier-specific semantics, I think we can
all reasonably agree that dropping the "__strong" qualifier should be
an error, as the consequence of discarding it is that the garbage
collector will lose its ability to determine the liveness of that
pointer, resulting in difficult-to-find bugs.
QED, assigning a __strong qualified pointer to a non-__strong qualified
pointer is an error, per ANSI-C standard type qualifier promotion rules.
Naturally, one can override this behavior by typecasting the pointer
assignment, but by doing so, you, the programmer, have explicitly told
the compiler that the type qualification "does not apply in this
case." This kind of type casting should only be done if you
understand all the consequences of the resulting type cast, and never
to just silence the compiler.
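The qualifier rules described above can be exercised with `const` in ordinary C. This is a small sketch, assuming nothing beyond the standard library; the function names `add_const`, `drop_const` and `qualifier_demo` are made up for the illustration. It shows that adding a qualifier is implicit, while removing one takes an explicit cast for which the programmer is responsible.

```c
#include <assert.h>
#include <string.h>

/* Adding a qualifier is implicit: char * converts to const char *. */
static const char *add_const(char *p)
{
    return p;
}

/* Removing a qualifier needs an explicit cast. The caller takes on the
 * responsibility that writing through the result is actually safe. */
static char *drop_const(const char *p)
{
    return (char *)p;
}

/* Returns 1 if the write through the cast pointer reached the buffer. */
static int qualifier_demo(void)
{
    char buffer[] = "abc";              /* writable storage */
    const char *ro = add_const(buffer); /* no cast needed */
    char *rw = drop_const(ro);          /* compiles only because of the cast */
    rw[0] = 'X';                        /* safe here: buffer is not const */
    return strcmp(buffer, "Xbc") == 0;
}
```

Without the cast in `drop_const`, a conforming compiler diagnoses the discarded qualifier, which is the behavior the argument here expects for `__strong` as well.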
Referring back to my original example in my original post, in which I
store the pointer from a call to UTF8String to a "const char *title"
ivar pointer that the garbage collector later considers dead and
recycles, it is provable that the "__strong" qualifier most certainly
does apply to the pointer returned.
Therefore, by ANSI-C language rules, my assignment of the pointer
returned by UTF8String is legal, and the declaration of UTF8String
implicitly states that "the GC problem will be taken care of because
the programmer of this function has EXPLICITLY typecasted the __strong
qualifier away."
QED, my use of the UTF8String pointer is bug free and legal by ANSI-C
rules. That this later causes the GC system to reclaim the memory
pointed to by this pointer is due to a bug in the prototype of UTF8String.
By type promotion rules, the prototype for UTF8String should be:
- (const char * __strong)UTF8String; // Convenience to return null-terminated UTF8 representation
And, following ANSI-C rules, my assignment to "const char *title;"
would result in a compiler error by the "no stronger qualified to
lesser qualified" doctrine. This is exactly as it should be, because
as I have demonstrated, dropping that qualifier results in the GC
system reclaiming live memory.
Because of the design of Leopard's GC system, it is the PROGRAMMER'S
responsibility to INFER when a pointer should be __strong qualified.
Correctly inferring, a priori, which pointers require __strong
qualification is an intractable problem, and certainly should not be
left up to the programmer to "guess" correctly. Getting this CRITICAL
qualifier wrong results in "race condition"-like problems. Programs
will appear to operate correctly under light load, but as they are
pushed harder and harder, the conditions that expose these problems
are guaranteed to occur, resulting in nearly impossible-to-find bugs
and crashes.
In practice, as I have found, this results in programs that operate
problem free during development, but fail in unexplainable ways "in
the real world", with all the symptoms of race condition induced bugs.
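The failure mode described above can be mimicked with a toy collector in plain C. This is a sketch under loud assumptions: the heap, the root table and every `toy_` name are invented for the illustration, and registering a variable with `toy_register_root` merely plays the role that `__strong` plays in the discussion. It is not how Leopard's collector is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCKS 4
#define TOY_ROOTS  4

struct toy_block { int in_use; int marked; char data[16]; };

static struct toy_block toy_heap[TOY_BLOCKS];
static void **toy_roots[TOY_ROOTS]; /* pointer variables the collector scans */
static int toy_root_count = 0;

/* Hand out the first free block, or NULL if the toy heap is full. */
static void *toy_alloc(void)
{
    int i;
    for (i = 0; i < TOY_BLOCKS; i++) {
        if (!toy_heap[i].in_use) {
            toy_heap[i].in_use = 1;
            return toy_heap[i].data;
        }
    }
    return NULL;
}

/* Stand-in for "__strong": tell the collector to scan this variable. */
static void toy_register_root(void **slot)
{
    if (toy_root_count < TOY_ROOTS)
        toy_roots[toy_root_count++] = slot;
}

/* Returns 1 if p points at a block the heap still considers allocated. */
static int toy_is_live(const void *p)
{
    int i;
    for (i = 0; i < TOY_BLOCKS; i++)
        if (p == (const void *)toy_heap[i].data)
            return toy_heap[i].in_use;
    return 0;
}

/* Mark blocks reachable from registered roots, then reclaim the rest. */
static void toy_collect(void)
{
    int r, b;
    for (b = 0; b < TOY_BLOCKS; b++)
        toy_heap[b].marked = 0;
    for (r = 0; r < toy_root_count; r++)
        for (b = 0; b < TOY_BLOCKS; b++)
            if (*toy_roots[r] == (void *)toy_heap[b].data)
                toy_heap[b].marked = 1;
    for (b = 0; b < TOY_BLOCKS; b++)
        if (toy_heap[b].in_use && !toy_heap[b].marked)
            toy_heap[b].in_use = 0;
}
```

A pointer held in a variable that was never registered looks perfectly valid to the program until `toy_collect()` happens to run, at which point its block is reclaimed. When a collection runs depends on load, which is why the symptoms resemble a race.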
And now I will show how Leopard's GC system is, in fact, FUNDAMENTALLY
and fatally flawed.
Since the design of Leopard's GC system has hoisted critical aspects of
its functioning into the compiler, and in turn the code emitted by
the compiler, this has the unfortunate effect of expanding the points
at which GC bugs can pop up. Contrast this with a dynamic shared
library: when a bug is fixed in a dynamic shared library, programs
using that shared library do not have to be recompiled to take
advantage of the fix. Leopard's GC system is akin to using
static libraries: if a bug is found in that library, every single
program that links to that library must be recompiled. The use of
static libraries is considered such poor practice that Sun no longer
supports their use, for obvious reasons.
As shown, proper application of ANSI-C type qualifier propagation
rules would have eliminated the storing of a __strong qualified
pointer to a non-__strong qualified pointer by refusing to compile the
program due to errors. Instead, the compiler has allowed the code to
compile, both warning and error free, even though the code will result
in obvious problems later on.
However, the problem does not lie just in the definition of the
UTF8String method, the problem is with the compiler itself. According
to ANSI-C standard, updating the definition of UTF8String to include
__strong should result in an error being generated when its result is
assigned to a "const char *" pointer declaration.
No such error is generated.
In fact, the compiler appears to be incapable of correctly following
ANSI-C type qualifier rules regarding __strong. For example:
---
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    char *cp;
    const char *ccp;
    char * const cpc;
    __strong char *scp;
    char * __strong cps;

    ccp = cp;
    cp = ccp; // Line 12

    cpc = cp; // Line 14
    cp = cpc;

    scp = cp;
    cp = scp;

    cps = cp;
    cp = cps;

    *cp = 'X';
    *ccp = 'X'; // Line 24
    *scp = 'X';
}
[johne@LAPTOP_10_5] /tmp% gcc -fobjc-gc-only -c test.m
test.m: In function 'main':
test.m:12: warning: assignment discards qualifiers from pointer target type
test.m:14: error: assignment of read-only variable 'cpc'
test.m:24: error: assignment of read-only location
---
Type qualifier rules are correctly followed for the 'const' qualifier,
but not a single warning or error is given for what are clearly type
qualification errors according to ANSI-C rules with regard to __strong
qualified pointers. It therefore follows that the compiler, by not
throwing an error, is in fact generating buggy code since such
assignments are not legal.
As demonstrated, and as should be obvious, improperly discarding
the __strong qualifier WILL result in the GC system reclaiming live
memory. In practice, this results in code that appears to run fine
during development, but due to the "race condition"-like nature of
improper reclamation of live memory by the GC system, "real world"
code tends to be buggy and unstable, and crashes for mysterious reasons.
Since the compiler gives no warning, nor error, for discarding the
__strong qualifier when it should, the compiler is emitting code that
will cause GC errors. It's impossible to gauge how frequent this is
in practice, but I don't think anyone will disagree that the sheer
volume of code involved in Cocoa effectively guarantees that these
errors are present.
Once one accepts that these errors are present, one must implicitly
accept that when using Leopard's GC system, it is just a matter of time
until the right conditions are present to cause accidental reclamation
and its associated problems. Therefore, by using Leopard's GC system,
you are guaranteeing that you will have random and difficult to find,
if not outright impossible, bugs.
QED, Leopard's GC system is fundamentally and fatally flawed, and its
use should be actively discouraged. Due to the nature and severity of
improperly dropping the __strong qualifier, all code generated by the
current compiler (in GC mode) must be considered suspect, and
pragmatically, discarded.
While I'm sure you thought your retort was clever, you have in fact
underlined my point that Leopard's GC system is, in fact, deceptively
difficult, hard to master, and trivially easy to get wrong.
Unfortunately, I was not very clear in my original message regarding
these problems. To those of you who have pointed out that, according
to the docs, it is an error on my part that I did not qualify my
pointer with __strong, please consider for a moment that I am writing
this after several months of using Leopard's GC system. You are all
right. In theory, by following the docs, this all works.
This is not the point I was trying to make.
In theory, this all works. In practice, it does not, and it's easy to
get wrong. That is my point.
As I believe I have shown in this message, the design of Leopard's GC
system is such that it essentially guarantees you're going to get it
wrong at some point. It gleefully allows you to compile code which
will create problems within your application, and does so without
warning or errors. The fact that the compiler is violating ANSI-C
rules in discarding the __strong qualifier guarantees that some code,
somewhere, is going to get it wrong, and compile anyway.
If one were to speculate as to what the behavior of the end result of
all of this is, one would figure that "things will mostly work, except
on the rare occasion when they don't". I'm here to tell you that
after four months of this, the "rare occasion" is much more frequent
than you think.
The consequence of such tight integration with the compiler for GC
support dooms code to the same problems that haunt static library
linked code. There will be no bug fixes for your compiled code. No
later version of anything will correct the problems forever frozen in
your binary. As stated, some vendors no longer support linking to
static libraries because of problems like this, for obvious reasons.
I'm a huge fan of GC. Those of you who have used the Boehm GC system
know how easy it is to get spoiled by GC in C and say "Never again!"
to manual memory management. Those of you reading this will have to
draw your own conclusions as to the validity of my claims. My
observations are not the result of some theoretical speculation from
glancing through the GC docs; they are rooted in attempting to use it in
the real world, on non-toy real problems, and I'm sharing with you the
pain I've experienced.
If you've read the examples and think "There's no way I will
ever slip up and miss a __strong qualification," then you're good to
go. Anyone else who thinks "Well, I might, but rarely.." should
understand the full ramifications of what happens when you slip up.
This class of bugs is orders of magnitude more difficult to find and
fix than any retain/release bug you've ever had to deal with. In
fact, I'll go so far as to say that multi-threaded locking heisenbugs are
easier: at least then you've got a pretty good idea of where the bug
originates and can concentrate on finding the rare corner case that
triggers it.
As non-trivial, real world examples, consider the following:
(From "Garbage Collection Programming Guide", Core Foundation section:)
o NULL, kCFAllocatorDefault, and kCFAllocatorSystemDefault specify
allocation from the garbage collection zone.
By default, all Core Foundation objects are allocated in the garbage
collection zone.
- (NSString *)someMethod
{
    NSUInteger finalStringLength = 1024; // Example only
    NSString *copyString = NULL;
    char * __strong restrict copyBuffer = NULL;
    copyBuffer = NSAllocateCollectable(finalStringLength, 0);
    /* Since this is just an example, the part that fills the contents of
       copyBuffer with text is omitted */
    copyString = NSMakeCollectable((id)CFStringCreateWithCStringNoCopy(
        kCFAllocatorDefault, copyBuffer, kCFStringEncodingUTF8,
        kCFAllocatorNull));
    /* kCFAllocatorNull = This allocator is useful as the
       bytesDeallocator in CFData or contentsDeallocator in CFString where
       the memory should not be freed. So.. Don't call free() on our
       NSAllocateCollectable buffer, which is an error. */
    return(copyString);
}
You see where the bug is, right? (For those wondering 'Why CFString..?':
it's much faster, with no dispatch overhead. You get the same effect
with initWithBytesNoCopy:length:encoding:freeWhenDone:)
How about this:
- (id *)copyOfObjectArray:(id *)originalObjects length:(NSUInteger)length
{
    id *newObjectArray = NULL;
    newObjectArray = NSAllocateCollectable(sizeof(id) * length,
                                           NSScannedOption);
    memcpy(newObjectArray, originalObjects, sizeof(id) * length);
    return(newObjectArray);
}
Does this contain a bug? And if so, where in "Garbage Collection
Programming Guide" or "NSGarbageCollector Class Reference" does it
indicate that this is a bug?
The "Garbage Collection Programming Guide" and "NSGarbageCollector
Class Reference" documentation say this is "no problem", or at least
don't say that you shouldn't do this? Wonder why it's crashing
randomly then. The allocation has the NSScannedOption, so the garbage
collector is obviously scanning the memory.. and it's dealing with
objects... which are 'special'.. so?
Hint: This is buggy as hell, which should be intuitively obvious to
anyone who's read the GC documentation listed above.
Actually, this is a pretty good test I think. If, after reading the
code snippet and two docs above on the GC system, you can't spot the
bug, you probably shouldn't be using Leopard's GC system, because I
guarantee you this is just one of many land mines just waiting for you
to discover.
If you really want to know the answer: nm -mg /usr/lib/libobjc.dylib |
grep mem (The 'obvious' in the above hint is satirical, obviously)
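One plausible reading of the second puzzle is that memcpy() moves object pointers into scanned memory without going through any write barrier. The sketch below illustrates that reading in plain C and nothing more: `toy_array`, its `dirty` flag, `toy_copy_plain` and `toy_copy_with_barrier` are hypothetical names, and the flag merely stands in for whatever bookkeeping a real collector's barrier would update.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical collector-visible block. `dirty` stands in for the
 * bookkeeping a write barrier would update on each pointer store. */
struct toy_array {
    int   dirty;
    void *slots[4];
};

/* Plain copy: the pointers arrive, but the bookkeeping is never touched,
 * so a collector relying on it would not learn about the new references. */
static void toy_copy_plain(struct toy_array *dst, void *const *src, size_t n)
{
    memcpy(dst->slots, src, n * sizeof(void *));
}

/* Barrier-aware copy: same bytes moved, plus the bookkeeping update. */
static void toy_copy_with_barrier(struct toy_array *dst, void *const *src,
                                  size_t n)
{
    memmove(dst->slots, src, n * sizeof(void *));
    dst->dirty = 1;
}
```

Both functions leave the destination holding identical pointers, which is why nothing looks wrong when the code is read or stepped through. Only the collector's view of the block differs.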