Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful


  • Subject: Re: Use of Mac OS X 10.5 / Leopards Garbage Collection Considered Harmful
  • From: John Engelhart <email@hidden>
  • Date: Wed, 6 Feb 2008 09:59:43 -0500


On Feb 4, 2008, at 8:21 PM, Chris Hanson wrote:

> On Feb 4, 2008, at 4:14 PM, John Engelhart wrote:
>
>> However, through some unspecified logic, SOME pointers are elevated to 'Points to live GC data'.
>
> The logic isn't unspecified.
>
> If a variable is of an object type or is of a pointer type with the qualifier "__strong", it refers to something allocated -- and cleaned up -- by the collector. Otherwise, it refers to something not allocated by the collector.

Exactly. So, in my example, since the buffer is allocated by the collector (as demonstrated by the collector not reclaiming the allocation when it is qualified as __strong), and since we can be reasonably sure that allocation came from NSAllocateCollectable, which by its prototype returns void * __strong, my example is 100% bug free.


Unless someone has explicitly cast the __strong qualifier away somehow, but that would be a bug on their part because, as you've said, __strong refers to something that's allocated by the collector, which the pointer from UTF8String clearly is.

Thankfully these rules are super simple and virtually impossible to get wrong.

So, as you've said, __strong refers to allocations from the collector, and pointers that can be traced back to the collector carry a __strong qualifier, which no one will casually discard because doing so makes the collector lose visibility and creates hard-to-find bugs. My const char * pointer assignment is therefore either promoted to __strong or, by ANSI-C type qualification rules, rejected by the compiler for improperly discarding the __strong qualifier on assignment, because the NSAllocateCollectable allocation returned by UTF8String is marked __strong, as its method prototype clearly shows:

- (const char *)UTF8String; // Convenience to return null-terminated UTF8 representation

... Oh... Well, at least the bug isn't mine. The bug is in the prototype for UTF8String, which has erroneously dropped the __strong qualifier from an NSAllocateCollectable collector allocation.

Do I file a bug at this point? Do I also file a bug with everyone who has used the compiler in GC mode and called this method, to let them know that, because of a buggy declaration in a header, the pointer returned by UTF8String may lead to data corruption, since the compiler will not emit the write barriers that collector allocations require?

I'm really glad this is just a hypothetical problem and doesn't happen in practice, as this UTF8String example demonstrates.


>> I mean, seriously, can anyone conjure up a compelling reason why the default behavior of a pointer is that it does not point to live data?
>
> Yes. Distinguishing between pointers to collector-allocated objects and non-collector-allocated objects ensures that the collector has far less work to do and can do the work it has more efficiently, because it can have more exact information about what portions of memory it needs to check for strong references to objects.

While you are undeniably right, your justification is specious.

Like security, the proper way to analyze the problem is not from the perspective of "if everything goes right", but "what are the consequences on failure."

From this perspective, it becomes a question of "What are the consequences when someone forgets to add __strong? How often do programmers make mistakes? How likely is someone to make this mistake? Are there robust, automated checks in place to make sure this doesn't happen?"

Forgetting to add a __strong qualifier where one is needed is likely to result in random data loss and mysterious crashes. These are nearly impossible to trace back to the root cause because, given how the collector works, the problem surfaces at some later point in time, far from the source.

Do you see how ridiculous your justification is when stated from this perspective? Can you think of a group of programmers who will take "Fast, but buggy and unstable, and impossible to debug" over "Slow, but rock solid"?

Personally, I don't care how f'ing fast the thing is if it essentially guarantees instability. I've got better things to do with my time than spend days trying to find the cause of some random, non-repeatable crash due to some allocation problem. Ironically, that is the very thing GC is supposed to prevent.

> It also goes a long way towards preventing false roots, which can help keep down the working set of an application that uses garbage collection.
>
>> Your choice of words makes me suspect that you consider an object pointer, such as NSObject *, and a C pointer, à la void * or char *, to be two distinctly different things.
>
> They can be; when running under GC, an object is assumed to be allocated by the collector. An arbitrary buffer is not. If you want to use an arbitrary "non-object pointer" type variable to refer to something that is allocated by the collector (e.g. a buffer returned from NSAllocateCollectable) you need to mark that variable with the __strong type qualifier so it gets the same treatment as an object type variable.

By ANSI-C type qualification rules, anything that returns a pointer obtained from NSAllocateCollectable must return that pointer with the __strong qualification. Since dropping that qualification is likely to result in buggy, unstable code, overriding the qualifier by casting it away should not be done lightly. There's a reason why '-Wcast-qual' exists.
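
A minimal sketch of the qualifier rule being appealed to here, in plain C. It uses const as a stand-in for __strong, which is an assumption made purely for illustration (the two qualifiers mean different things), and the function names are hypothetical. Keeping the qualifier compiles cleanly, dropping it on assignment draws a "discards qualifiers" diagnostic, and an explicit cast is accepted silently unless -Wcast-qual is enabled.

```c
#include <string.h>

/* Hypothetical stand-in for an allocator whose prototype carries a qualifier.
 * Here 'const' plays the role that '__strong' plays on NSAllocateCollectable. */
static const char *qualified_source(void) {
    return "Hello, world";
}

/* Qualifier preserved on assignment: nothing for the compiler to flag. */
static size_t length_keeping_qualifier(void) {
    const char *kept = qualified_source();
    return strlen(kept);
}

/* Qualifier cast away: accepted silently by default, flagged by -Wcast-qual.
 * Without the cast, 'char *dropped = qualified_source();' would draw a
 * "discards qualifiers" diagnostic. */
static size_t length_discarding_qualifier(void) {
    char *dropped = (char *)qualified_source();
    return strlen(dropped);
}
```

Both functions behave identically at run time, which is the uncomfortable part: once a qualifier has been cast away, nothing in the resulting program records that it was ever there.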



>> Pointers are pointers, and knowing which ones to treat as 'special' is non-trivial and easy to get wrong.
>
> In Objective-C it is relatively straightforward to tell which pointer type variables to treat as objects. You have pointers to objects which you can send arbitrary messages to, and pointers to things that aren't objects which you can't send arbitrary messages to. The compiler itself has to be able to tell the difference between them to generate correct code, warnings, and errors.

This is flat out wrong.

[johne@LAPTOP_10_5] /tmp% cat gc_str.m
#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
  NSString *aString = NULL;
  void *ptr = NULL;

  aString = [NSString stringWithString:@"Hello, world"];
  ptr = aString;

  NSLog(@"ptr '%@', description '%@'", ptr, [ptr description]);
}
[johne@LAPTOP_10_5] /tmp% gcc -framework Foundation -fobjc-gc-only -o gc_str -g gc_str.m
gc_str.m: In function 'main':
gc_str.m:10: warning: invalid receiver type 'void *'
[johne@LAPTOP_10_5] /tmp% ./gc_str
2008-02-06 07:43:06.732 gc_str[17181:807] ptr 'Hello, world', description 'Hello, world'
[johne@LAPTOP_10_5] /tmp%


Having the class type allows some extra compile-time diagnostics to take place, such as catching messages sent to a class that aren't declared in its @interface, but this is nothing but sugar for you and me. Awfully useful sugar, no doubt about it, but sugar nonetheless.

As the code above shows, your assertion that "The compiler itself has to be able to tell the difference between them (pointers) to generate correct code" is obviously wrong.
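
The same point can be sketched in plain C, with a hypothetical struct standing in for an object (the type and function names below are illustrative assumptions, not anything from Foundation). The static type exists only for the compiler's benefit; the address that reaches the callee is identical either way.

```c
/* Hypothetical stand-in for an object; the names are illustrative only. */
typedef struct {
    int value;
} Thing;

/* Receives an untyped pointer and uses it as a Thing anyway, much as the
 * runtime dispatches on whatever address it is handed. */
static int value_through_void(void *ptr) {
    Thing *typed = ptr;   /* void * converts back without a cast in C */
    return typed->value;
}

/* Returns 1 when a pointer survives a round trip through void * unchanged. */
static int same_address(Thing *typed) {
    void *untyped = typed;
    return (Thing *)untyped == typed;
}
```

Nothing about the pointer itself changes when its declared type does, so any special treatment has to come from what the compiler decides to do with the declaration.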

> The syntax therefore really isn't that ambiguous: If there has been an @interface or @class declaration for it, it's an instance of a class, otherwise it's something else.

No! This is utterly wrong.

Look at the code above. By your logic, it is incapable of working because the compiler doesn't know it's an object.

This is not some minor point to be glossed over. It forms a core part of this entire argument.

If you do not understand the fact that "NSString *string" is equal to "void *string", you cannot understand the complete ramifications of what the type qualifier __strong is doing (or not doing).

Once you understand that "NSString *" and "void *" are the same thing, just a pointer, you are forced to ask "Where is the __strong qualifier magically coming from?"

And before you can answer that, you are forced to ask "Wait wait wait, why did the compiler allow an unqualified void * pointer to be assigned a more qualified void * pointer, against ANSI-C type qualifier rules?"

This will be quickly followed by ".. and by silently dropping the __strong qualifier, that means pointer assignments which are critical to the proper operation of the collector are getting silently discarded left and right."

You'll know it when you get it because this will be followed by a quick mental estimation of just how often this error is occurring in the entire code base, along with the sensation of the floor dropping out from underneath you while simultaneously feeling what can only be described as a baseball bat being cracked over the back of your head. In that slow motion, car crash time dilation effect, you'll notice yourself slowly uttering the words:

"Oh... Shit..."


> In practice for many developers this isn't a significant issue. I don't recall having seen any Cocoa code which used "char *" to store an object pointer, for example. There are few places in idiomatic Cocoa where you might commonly use a "void *" to store an object pointer, and in those situations it's straightforward (and correct under non-GC as well) to introduce a CFRetain of the object before storing into the "void *" and a CFRelease of the object after the "void *" is no longer relevant. (Under GC, CFRetain effectively adds an extra root while CFRelease removes one.)
>
> One of the places I do this is in code that presents a sheet, because it returns control to the main run loop. If I have to pass an object as the "(void *)context" parameter to the sheet invocation, I CFRetain it first. Then in the sheet's did-dismiss selector (if it has one) or did-end selector (if there's no did-dismiss), I CFRelease the object. This ensures that "the sheet" acts as a root for the object in case it's transient and just being used to pass information around.

No, what you're obviously doing is compensating for a buggy and flawed GC system which is randomly reclaiming live data, and you're hacking around the root problem. one. pointer. at. a. time. In my experience, this comes only after hours, usually days, of debugging to find out why every once in a while displaying a sheet causes a crash.

Your example pretty much epitomizes my experience with Cocoa's GC system. I spend far, FAR more time debugging screwy problems like the one described, only to have to come up with some god-awful hacky kludge to get around the problem. This is the very problem GC is supposed to fix, freeing me to spend my time on real problems.


>> [gcConstTitle setTitle:"Hello, world!"];
>> [gcUTF8Title setTitle:[[NSString stringWithUTF8String:"Hello, world \xC2\xA1"] UTF8String]];
>>
>> [[NSGarbageCollector defaultCollector] collectExhaustively];
>> NSLog(@"GC test");
>>
>> printf("gcConstTitle title: %p = '%s'\n", [gcConstTitle title], [gcConstTitle title]);
>> printf("gcUTF8Title title: %p = '%s'\n", [gcUTF8Title title], [gcUTF8Title title]);
>
> If you build this example non-GC, and you replace
>
>     [[NSGarbageCollector defaultCollector] collectExhaustively];
>
> with
>
>     [pool drain];
>
> it would be just as incorrect. The object backing that UTF-8 string is no longer live, therefore you can't trust that the UTF-8 string itself is valid.

Well, just to scratch the itch of curiosity: in the non-GC example, does the pointer from UTF8String come from NSAllocateCollectable, which has a prototype of void * __strong, indicating that it's from the collector and therefore requires write barriers? No?


"I'll take 'Not relevant' for $200 and 'Misunderstands the fundamentals' for the win, Alex."

Your example is flawed on the face of it. The retain/release documentation makes it pretty clear that such pointers are temporary and valid only until the autorelease pool pops. You popped the pool, so your use of the pointer after that point is clearly an error.
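
Reduced to plain C, the retain/release contract looks roughly like the sketch below. The Owner type and its functions are hypothetical stand-ins, not Foundation API: a borrowed pointer is good only while its owner lives, and anything meant to outlast the owner has to be copied first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of a heap buffer, standing in for an autoreleased object. */
typedef struct {
    char *bytes;
} Owner;

static Owner *owner_create(const char *text) {
    Owner *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    size_t n = strlen(text) + 1;
    o->bytes = malloc(n);
    if (o->bytes == NULL) { free(o); return NULL; }
    memcpy(o->bytes, text, n);
    return o;
}

/* Borrowed pointer: valid only until owner_destroy() runs. */
static const char *owner_borrow(const Owner *o) {
    return o->bytes;
}

/* Stands in for the pool popping: borrowed pointers dangle after this. */
static void owner_destroy(Owner *o) {
    free(o->bytes);
    free(o);
}

/* Copies the borrowed bytes into storage the caller owns and must free(). */
static char *copy_before_release(const Owner *o) {
    const char *borrowed = owner_borrow(o);
    size_t n = strlen(borrowed) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, borrowed, n);
    return copy;
}
```

Under retain/release the point at which the borrowed pointer dies is written down and under the programmer's control, which is exactly what distinguishes it from the GC case discussed here.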

A garbage collection system's sine qua non is to free the programmer from having to deal with the issues of memory allocation. What good is a garbage collection system that requires me to hand-hold it every step of the way, and that causes me to spend MUCH more time dealing with memory allocation problems than if I'd never used it in the first place? In the GC example, there is a live pointer to an allocation that the GC system has reclaimed. That allocation comes from a function that returns a __strong qualified pointer, and UTF8String has silently discarded the qualifier and, as a consequence, caused a perfectly legitimate and live pointer to become invisible to the collector.
