Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- Subject: Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
- From: John Engelhart <email@hidden>
- Date: Wed, 6 Feb 2008 09:59:43 -0500
On Feb 4, 2008, at 8:21 PM, Chris Hanson wrote:
On Feb 4, 2008, at 4:14 PM, John Engelhart wrote:
However, through some unspecified logic, SOME pointers are elevated
to 'Points to live GC data'.
The logic isn't unspecified.
If a variable is of an object type or is of a pointer type with the
qualifier "__strong", it refers to something allocated -- and
cleaned up -- by the collector. Otherwise, it refers to something
not allocated by the collector.
Exactly. So, in my example, since the buffer is allocated by the
collector (as demonstrated by the collector not reclaiming the
allocation when it is qualified as __strong), and since we can be
reasonably sure that allocation came from NSAllocateCollectable, which
by its prototype returns void * __strong, my example is 100% bug free.
Unless someone has explicitly typecast the __strong qualifier away
somehow, but that would be a bug on their part because, as you've
said, __strong refers to something that's allocated by the collector,
which the pointer from UTF8String clearly is.
Thankfully these rules are super simple and virtually impossible to
get wrong.
So, as you've said, __strong refers to allocations from the collector,
and pointers that can be traced back to the collector have a __strong
qualifier, which no one will casually discard because doing so leads to
the collector losing visibility and creating hard-to-find bugs. My
const char * pointer assignment is therefore either promoted to strong
or, by ANSI-C type qualification rules, causes a compiler error for
improperly discarding the __strong qualifier on assignment, because the
NSAllocateCollectable collector allocation that is returned by
UTF8String is marked __strong, as its method prototype clearly shows:
- (const char *)UTF8String; // Convenience to return null-terminated
UTF8 representation
... Oh... Well, at least the bug isn't mine, the bug is in the
prototype for UTF8String, which has erroneously dropped the __strong
qualifier to a NSAllocateCollectable collector allocation.
Do I file a bug at this point? Do I also file a bug with everyone
who's used the compiler in GC mode and called this method and let them
know that due to a buggy declaration in a header, they may experience
issues with the pointer returned by UTF8String that may lead to data
corruption because the compiler will not emit the write barriers
required for proper operation of allocations from the
collector?
I'm really glad this is just a hypothetical problem and doesn't happen
in practice, like this UTF8String example demonstrates.
I mean, seriously, can anyone conjure up a compelling reason why
the default behavior of a pointer is that it does not point to live
data?
Yes. Distinguishing between pointers to collector-allocated objects
and non-collector-allocated objects ensures that the collector has
far less work to do and can do the work it has more efficiently,
because it can have more exact information about what portions of
memory it needs to check for strong references to objects.
While you are undeniably right, your justification is specious.
Like security, the proper way to analyze the problem is not from the
perspective of "if everything goes right", but "what are the
consequences on failure."
From this perspective, it becomes a question of "What are the
consequences when someone forgets to add __strong? How often do
programmers make mistakes? How likely is someone to make this
mistake? Are there robust, automated checks in place to make sure
this doesn't happen?"
Inadvertently forgetting to add a __strong qualifier where one is
needed is likely to result in random data loss and mysterious crashes,
all of which are nearly impossible to trace back to the root cause
because the problem surfaces at some later point in time, far from the
source, due to the nature of how the collector works.
Do you see how ridiculous your justification is when stated from this
perspective? Can you think of a group of programmers who will take
"Fast, but buggy and unstable, and impossible to debug" over "Slow,
but rock solid"?
Personally, I don't care how f'ing fast the thing is if it essentially
guarantees instability. I've got better things to do with my time
than spend days trying to find the cause of some random,
non-repeatable crash due to some allocation problem. Ironically, the
very thing GC is supposed to prevent.
It also goes a long way towards preventing false roots, which can
help keep down the working set of an application that uses garbage
collection.
Your choice of words makes me suspect that you consider an object
pointer, such as NSObject *, and a C pointer, à la void * or char *,
to be two distinctly different things.
They can be; when running under GC, an object is assumed to be
allocated by the collector. An arbitrary buffer is not. If you
want to use an arbitrary "non-object pointer" type variable to refer
to something that is allocated by the collector (e.g. a buffer
returned from NSAllocateCollectable) you need to mark that variable
with the __strong type qualifier so it gets the same treatment as an
object type variable.
By ANSI-C type qualification rules, anything that's returning a
pointer from NSAllocateCollectable must be returning that pointer with
the __strong qualification. Considering that dropping that
qualification is likely to result in buggy, unstable code, overriding
the qualifier by typecasting it away should not be done lightly.
There's a reason why '-Wcast-qual' exists.
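The qualifier rule referred to above can be sketched in plain C. This is
only an analogy under stated assumptions: it uses the standard `const`
qualifier as a stand-in for __strong, and the function names are
hypothetical, not part of any Apple API. Implicitly dropping a
qualifier on assignment is something a conforming C compiler must
diagnose; an explicit cast silences that diagnostic, and '-Wcast-qual'
is the gcc option that reports such casts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an API whose return type carries a qualifier. */
const char *qualified_source(void) {
    return "Hello, world";
}

/* Keeps the qualifier on the receiving variable: nothing to diagnose. */
size_t length_keeping_qualifier(void) {
    const char *p = qualified_source();
    assert(p != NULL);
    return strlen(p);
}

/*
 * Drops the qualifier with an explicit cast. Compilers accept this by
 * default, but gcc reports it when -Wcast-qual is enabled. Writing
 *     char *p = qualified_source();
 * without the cast discards the qualifier implicitly, which the C
 * assignment constraints require the compiler to diagnose.
 */
size_t length_discarding_qualifier(void) {
    char *p = (char *)qualified_source();
    assert(p != NULL);
    return strlen(p);
}
```

Both functions behave identically at run time; the only difference is
whether the compiler is given a chance to complain about the lost
qualifier.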
Pointers are pointers, and knowing which ones to treat as 'special'
is non-trivial and easy to get wrong.
In Objective-C it is relatively straightforward to tell which
pointer type variables to treat as objects. You have pointers to
objects which you can send arbitrary messages to, and pointers to
things that aren't objects which you can't send arbitrary messages
to. The compiler itself has to be able to tell the difference
between them to generate correct code, warnings, and errors.
This is flat out wrong.
[johne@LAPTOP_10_5] /tmp% cat gc_str.m
#import <Foundation/Foundation.h>
int main(int argc, char *argv[]) {
NSString *aString = NULL;
void *ptr = NULL;
aString = [NSString stringWithString:@"Hello, world"];
ptr = aString;
NSLog(@"ptr '%@', description '%@'", ptr, [ptr description]);
}
[johne@LAPTOP_10_5] /tmp% gcc -framework Foundation -fobjc-gc-only -o
gc_str -g gc_str.m
gc_str.m: In function 'main':
gc_str.m:10: warning: invalid receiver type 'void *'
[johne@LAPTOP_10_5] /tmp% ./gc_str
2008-02-06 07:43:06.732 gc_str[17181:807] ptr 'Hello, world',
description 'Hello, world'
[johne@LAPTOP_10_5] /tmp%
Having the class type allows some extra compile time diagnostics to
take place, such as sending messages to a class that aren't defined in
its @implementation, but this is nothing but sugar for you and me.
Awfully useful sugar to you and me, no doubt about it, but sugar
nonetheless.
As the code above shows, your assertion that "The compiler itself has
to be able to tell the difference between them (pointers) to generate
correct code" is obviously wrong.
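The same point can be sketched in plain C, without Objective-C at all.
This is a minimal illustration with a hypothetical struct type standing
in for an object: the static type of a pointer variable exists only at
compile time, and the address survives a round trip through void *
unchanged.

```c
#include <assert.h>

/* Hypothetical stand-in for an object type. */
typedef struct {
    int length;
} ToyString;

/*
 * Round-trips a typed pointer through void * and returns nonzero when
 * the address and the data behind it are unchanged.
 */
int same_pointer_after_round_trip(void) {
    ToyString s = { 12 };
    ToyString *typed = &s;
    void *untyped = typed;      /* implicit conversion, no cast needed */
    ToyString *back = untyped;  /* and back again */
    assert(back == typed);
    return (void *)typed == untyped && back->length == 12;
}
```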
The syntax therefore really isn't that ambiguous: If there has
been an @interface or @class declaration for it, it's an instance
of a class, otherwise it's something else.
No! This is utterly wrong.
Look at the code above. By your logic, it is incapable of working
because the compiler doesn't know it's an object.
This is not some minor point to be glossed over. It forms a core part
of this entire argument.
If you do not understand the fact that "NSString *string" is equal to
"void *string", you cannot understand the complete ramifications of
what the type qualifier __strong is doing (or not doing).
Once you understand that "NSString *" and "void *" are the same thing,
just a pointer, you are forced to ask "Where is the __strong qualifier
magically coming from?"
And before you can answer that, you are forced to ask "Wait wait wait,
why did the compiler allow an unqualified void * pointer to be
assigned a more qualified void * pointer, against ANSI-C type
qualifier rules?"
This will be quickly followed by ".. and by silently dropping the
__strong qualifier, that means pointer assignments which are critical
to the proper operation of the collector are getting silently
discarded left and right."
You'll know it when you get it because this will be followed by a
quick mental estimation of just how often this error is occurring in
the entire code base, along with the sensation of the floor dropping
out from underneath you while simultaneously feeling what can only be
described as a baseball bat being cracked over the back of your head.
In that slow motion, car crash time dilation effect, you'll notice
yourself slowly uttering the words:
"Oh... Shit..."
In practice for many developers this isn't a significant issue. I
don't recall having seen any Cocoa code which used "char *" to store
an object pointer, for example. There are few places in idiomatic
Cocoa where you might commonly use a "void *" to store an object
pointer, and in those situations it's straightforward (and correct
under non-GC as well) to introduce a CFRetain of the object before
storing into the "void *" and a CFRelease of the object after the
"void *" is no longer relevant. (Under GC, CFRetain effectively
adds an extra root while CFRelease removes one.)
One of the places I do this is in code that presents a sheet,
because it returns control to the main run loop. If I have to pass
an object as the "(void *)context" parameter to the sheet
invocation, I CFRetain it first. Then in the sheet's did-dismiss
selector (if it has one) or did-end selector (if there's no did-
dismiss), I CFRelease the object. This ensures that "the sheet"
acts as a root for the object in case it's transient and just being
used to pass information around.
No, what you're obviously doing is compensating for a buggy and flawed
GC system which is randomly reclaiming live data, and you're hacking
around the root problem. One. Pointer. At. A. Time. In my
experience, this comes only after hours, usually days, of debugging to
find out why, every once in a while, displaying a sheet causes a
crash.
Your example pretty much epitomizes my experience with Cocoa's GC
system. I spend far, FAR more time debugging screwy problems like the
one described, only to have to come up with some god-awful hacky
kludge to get around the problem. The very problem GC is supposed to
be fixing and freeing me from dealing with so I can spend my time on
real problems.
[gcConstTitle setTitle:"Hello, world!"];
[gcUTF8Title setTitle:[[NSString stringWithUTF8String:"Hello, world
\xC2\xA1"] UTF8String]];
[[NSGarbageCollector defaultCollector] collectExhaustively];
NSLog(@"GC test");
printf("gcConstTitle title: %p = '%s'\n", [gcConstTitle title],
[gcConstTitle title]);
printf("gcUTF8Title title: %p = '%s'\n", [gcUTF8Title title],
[gcUTF8Title title]);
If you build this example non-GC, and you replace
[[NSGarbageCollector defaultCollector] collectExhaustively];
with
[pool drain];
it would be just as incorrect. The object backing that UTF-8 string
is no longer live, therefore you can't trust that the UTF-8 string
itself is valid.
Well, scratching the itch of curiosity, in the non-GC example, does
the pointer for UTF8String come from NSAllocateCollectable? That has
a prototype of void * __strong which indicates that it's from the
collector and therefore requires write barriers? No?
"I'll take 'Not relevant' for $200 and 'Misunderstands the
fundamentals' for the win, Alex."
Your example is flawed on the face of it. The retain/release allocation
documentation makes it pretty clear that such pointers are temporary
and are valid only up until the autorelease pool pops. You popped the
pool, therefore your use after that point in time is clearly an error.
A garbage collection system's sine qua non is to free the programmer
from having to deal with the issues of memory allocation. What good is a
garbage collection system that requires me to hand hold it every step
of the way, that causes me to spend MUCH more time having to deal with
memory allocation problems than if I'd never used it in the first
place? In the GC example, there is a live pointer to an allocation
that the GC system has reclaimed. That allocation comes from a
function that returns a __strong qualified pointer, and UTF8String has
silently discarded that qualifier and, as a consequence, caused a
perfectly legitimate and live pointer to become invisible to the
collector.
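The failure mode described above can be sketched with a toy model in
plain C. Everything here is an assumption made for illustration: the
names are hypothetical, and this is not how the Leopard collector is
implemented. The toy collector marks only from slots flagged as
scanned (modelling __strong variables), so a block referenced solely
from an unscanned slot is reclaimed even though a live pointer to it
still exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 8

/* One collector-owned allocation in the toy heap. */
typedef struct {
    bool in_use;
    bool marked;
} ToyBlock;

/* A pointer variable as the toy collector sees it. */
typedef struct {
    ToyBlock *target;  /* what the variable points at */
    bool scanned;      /* true models a __strong-qualified variable */
} ToySlot;

static ToyBlock toy_heap[TOY_MAX_BLOCKS];

/* Hands out the first free block, or NULL when the toy heap is full. */
ToyBlock *toy_alloc(void) {
    for (size_t i = 0; i < TOY_MAX_BLOCKS; i++) {
        if (!toy_heap[i].in_use) {
            toy_heap[i].in_use = true;
            toy_heap[i].marked = false;
            return &toy_heap[i];
        }
    }
    return NULL;
}

/* Marks from scanned slots only, sweeps the rest, returns the count freed. */
size_t toy_collect(const ToySlot *slots, size_t count) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < TOY_MAX_BLOCKS; i++) {
        toy_heap[i].marked = false;
    }
    for (size_t i = 0; i < count; i++) {
        if (slots[i].scanned && slots[i].target != NULL) {
            slots[i].target->marked = true;
        }
    }
    for (size_t i = 0; i < TOY_MAX_BLOCKS; i++) {
        if (toy_heap[i].in_use && !toy_heap[i].marked) {
            toy_heap[i].in_use = false;
            reclaimed++;
        }
    }
    return reclaimed;
}

/* True when the unscanned slot is left pointing at a reclaimed block. */
bool toy_demo_dangling(void) {
    ToySlot slots[2];
    slots[0].target = toy_alloc();
    slots[0].scanned = true;   /* visible to the toy collector */
    slots[1].target = toy_alloc();
    slots[1].scanned = false;  /* qualifier lost: invisible */
    assert(slots[0].target != NULL && slots[1].target != NULL);
    size_t reclaimed = toy_collect(slots, 2);
    return reclaimed == 1 && slots[0].target->in_use
        && !slots[1].target->in_use;
}
```

In this model the second slot still holds the address after the
collection, which is the dangling-pointer situation the UTF8String
example is meant to demonstrate.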
_______________________________________________
Cocoa-dev mailing list (email@hidden)