An interior pointer is a pointer into the memory occupied by an object that does not point to the object's start location. It is also called a derived pointer when it's derived from a base pointer.
"Mac OS X" interior pointer:
struct MYPtrStruct { size_t length; char *charPointer; };
struct MYPtrStruct *newPtrStruct(size_t newLength) {
    struct MYPtrStruct *ptrStruct = calloc(1, sizeof(struct MYPtrStruct));
    ptrStruct->length = newLength;
    ptrStruct->charPointer = calloc(1, newLength);
    return(ptrStruct);
}
char *getCharPtr(struct MYPtrStruct *ptrStruct) { return(ptrStruct->charPointer); }
struct MYPtrStruct *ptrStruct = newPtrStruct(1024);
char *interiorPtr = getCharPtr(ptrStruct);
// The implication being that: If the pointer from ptrStruct->charPointer is found to be live THEN ptrStruct is live too.
// or, if ptrStruct->charPointer is found to be live, that pointer has an implicit strong reference to ptrStruct.
// IMHO, this is a completely inappropriate use of "interior pointer". A better choice would be something like: inverse pointer, (inverse?) associative pointer reference, etc
"Common" interior pointer:
char *p = calloc(1, 1024);
char *interiorPointer = &p[23];
Basically, the Mac OS X usage is "wrong" (for some value of wrong). In the above example, there is essentially a directed acyclic graph of pointers. Mac OS X uses the term "interior pointer" to describe the DAG's "one-way" nature, and that there is no inverse relationship present to keep the parent [allocation | object] alive based solely on the existence of the child pointer. Most of the time being able to reap ptrStruct while keeping charPointer live is the right thing. GC systems that support these kinds of inverse relationships usually require a GC API call to register that relationship (and that call would be placed in newPtrStruct() in the above example).
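To make the "one-way" nature concrete, here is a minimal sketch of a toy mark phase. It is not how the Mac OS X collector is implemented; `struct Node`, `MAX_REFS`, and `mark()` are hypothetical names made up for this illustration. Marking only follows outgoing references, so rooting the child never reaches the parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4  /* hypothetical limit, only for this sketch */

/* Hypothetical toy object: a mark bit plus its outgoing references. */
struct Node {
    bool marked;
    size_t refCount;
    struct Node *refs[MAX_REFS];
};

/* Mark everything reachable from node by following outgoing references only.
   Nothing here walks "backwards" from a child to whoever points at it. */
void mark(struct Node *node) {
    if (node == NULL || node->marked) return;
    assert(node->refCount <= MAX_REFS);
    node->marked = true;
    for (size_t i = 0; i < node->refCount; i++) {
        mark(node->refs[i]);
    }
}
```

If `parent` plays the role of ptrStruct and `child` the role of the charPointer allocation, calling `mark(&child)` leaves `parent` unmarked, while `mark(&parent)` marks both.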
Garbage collectors distinguish base pointers from interior pointers because a collector that only has to recognize base pointers is usually simpler and faster when scanning than one that must also recognize interior pointers. Essentially, recognizing a base pointer is a check for an exact match, while recognizing an interior pointer is a "range" match, which is less efficient. Hence, whether or not a garbage collector "supports" interior pointers is usually clearly stated, along with any limitations. For example, the degree of interior pointer support is tunable in the Boehm collector (i.e., no interior pointers will be found in the heap, but the stack and registers are always checked).
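The exact-match versus range-match difference can be sketched roughly as follows. This is a toy model, not the code of any real collector: `struct Block`, `findExact()`, and `findContaining()` are hypothetical names, and the sketch assumes the collector keeps its live allocations in an array sorted by base address. A real collector might use a hash table for the exact case, which is part of why it can be cheaper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one live allocation known to a toy collector. */
struct Block {
    uintptr_t base;  /* address of the first byte */
    size_t    size;  /* length in bytes */
};

/* Base-pointer-only recognition: the candidate must equal some block's base.
   blocks[] must be sorted by ascending base and must not overlap. */
const struct Block *findExact(const struct Block *blocks, size_t count, uintptr_t candidate) {
    assert(blocks != NULL || count == 0);
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (blocks[mid].base == candidate) return &blocks[mid];
        if (blocks[mid].base < candidate) lo = mid + 1; else hi = mid;
    }
    return NULL;
}

/* Interior-pointer recognition: base <= candidate < base + size. */
const struct Block *findContaining(const struct Block *blocks, size_t count, uintptr_t candidate) {
    assert(blocks != NULL || count == 0);
    size_t lo = 0, hi = count;
    /* Count the blocks whose base is <= candidate. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (blocks[mid].base <= candidate) lo = mid + 1; else hi = mid;
    }
    if (lo == 0) return NULL;  /* candidate lies below every block */
    const struct Block *b = &blocks[lo - 1];
    return (candidate - b->base < b->size) ? b : NULL;
}
```

With a 1024-byte block at a base address B, `findExact()` recognizes only B itself, whereas `findContaining()` also recognizes B + 23 (the `&p[23]` case above) and rejects B + 1024, which is one past the end.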
So, since the Mac OS X documentation uses the term "interior pointer" in a totally non-standard way, and I can't find anything regarding what I'm looking for, my question is:
What are the caveats, if any? For example, only supports interior pointers for some subset of areas that are scanned (registers, stack, heap).
Any complicated gotchas? Particularly around the usage of __weak pointers.
Consider the following example:
#import <Foundation/Foundation.h>
#include <objc/objc-auto.h>
__weak const void *p;
void checkGC(__weak const char cString[]) {
    char c = cString[1];
    NSLog(@"Did it work: > %c < '%s'", c, cString);
}
int main(int argc, char *argv[]) {
    p = [@"Hello, world!" UTF8String];
    __strong const char *cp = NULL;
    cp = &(((const char *)p)[1]); /* [0] == works, [1] == breaks */ // XXX
    p = [@"Goodbye, world!" UTF8String];
    objc_collect(OBJC_EXHAUSTIVE_COLLECTION | OBJC_WAIT_UNTIL_DONE);
    checkGC(cp);
    return(0);
}
When run under 10.6 (10A432) with XXX set to 0 (zero):
shell% gcc -arch i386 -o gc gc.m -framework Foundation -fobjc-gc-only
shell% ./gc
2009-09-06 19:11:59.926 gc[25196:903] Did it work: > e < 'Hello, world!'
When run under 10.6 (10A432) with XXX set to 1 (one):
shell% gcc -arch i386 -o gc gc.m -framework Foundation -fobjc-gc-only
shell% ./gc
2009-09-06 19:12:08.136 gc[25205:903] Did it work: > ˇ < 'ˇˇˇØ ˛˛'
When run under 10.6 (10A432) with XXX set to 1 (one) and objc_collect() commented out:
shell% gcc -arch i386 -o gc gc.m -framework Foundation -fobjc-gc-only
shell% ./gc
2009-09-06 19:21:55.978 gc[25245:903] Did it work: > l < 'ello, world!'
This would seem to be evidence that the Mac OS X garbage collector does not support interior pointers (using the "common" definition).
I certainly hope this isn't the case, because it essentially means that it is fundamentally impossible to write programs that execute in a deterministic fashion when using GC. It is basically impossible to write code that guarantees that the base pointer remains visible to the collector, particularly when __weak and/or the optimizer is used. Things work the vast majority of the time because there is (usually) only a very small window of time in which this could actually cause a problem, and one of two things is true: 1) When the compiler generates code that, as a side effect, only uses interior pointers (this happens much, much more frequently than you would think), there's "something" else that covers this fundamental error by holding a base pointer. This is invariably due to a happy set of coincidences and rarely the explicit, intentional result of the programmer. 2) The collector is not collecting during the window of vulnerability.
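As a small illustration of how ordinary code ends up holding only an interior pointer, consider this plain C sketch. `countChar()` is a hypothetical helper, not taken from any framework, and the sketch says nothing about what the Mac OS X collector actually does with it. After the first `s++`, the function itself no longer holds the base address of the buffer; whether a base pointer survives anywhere depends entirely on the caller and on what the optimizer keeps around.

```c
#include <assert.h>
#include <stddef.h>

/* Count occurrences of ch in a NUL-terminated string by walking the pointer. */
size_t countChar(const char *s, char ch) {
    assert(s != NULL);
    size_t n = 0;
    while (*s != '\0') {
        if (*s == ch) n++;
        s++;  /* from here on, s is an interior pointer into the buffer */
    }
    return n;
}
```

If the caller passes the result of an expression such as `[someString UTF8String]` directly and keeps no other copy, the walking pointer `s` may be the only reference to the buffer for the duration of the loop.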