Re: Help DTrace gurus: suggestions for capturing a mis-allocated NSData object on a customer's system
Re: Help DTrace gurus: suggestions for capturing a mis-allocated NSData object on a customer's system
- Subject: Re: Help DTrace gurus: suggestions for capturing a mis-allocated NSData object on a customer's system
- From: Christiaan Hofman <email@hidden>
- Date: Thu, 25 Nov 2010 10:57:49 +0100
On Nov 25, 2010, at 4:20, James Bucanek wrote:
> Ken Thomases <mailto:email@hidden> wrote (Sunday, November 21, 2010 6:00 PM -0600):
>
>> On Nov 21, 2010, at 12:56 PM, James Bucanek wrote:
>>
>>> Ken Thomases <mailto:email@hidden> wrote (Sunday, November 21, 2010 6:13 AM -0600):
>>>
>>>>> Note that valgrind is not saying that the data was being overwritten, but
>>>>> that NSConcreteData merely "depends on" (a.k.a. "reads") information from an
>>>>> address space that formerly belonged to the stack frame of
>>>>> -readNamePackagesOp, which should now be considered as invalid/uninitialized.
>>>
>>>> My point is that I suspect something is doing a copy of some uninitialized
>>>> bytes from a stack variable over top of NSConcreteData's internal state.
>>>> Later, when NSConcreteData accesses it, it's basing a jump or move on this
>>>> corrupted data, which Valgrind knows comes from uninitialized data.
>>>> (Valgrind knows that a copy from uninitialized data propagates the
>>>> uninitialized-ness, even though the destination bytes have been written to.)
>>>
>>> But what code could that be? The NSData object's internal variables are
>>> allocated on the heap by +alloc. The only reference my code has to that is
>>> the NSData* data variable.
>>>
>>> There are only three other references to this variable after it's created,
>>> and those are all used to send it a message. The only messages sent are, in
>>> order:
>>>
>>> -autorelease
>>> -retain
>>> -bytes
>>> -release
>>>
>>> That's total extent of interaction with the NSData object.
>>
>> Well, I had been thinking that a) NSConcreteData might store the data buffer
>> inline, using some obscure technique, like the extraBytes parameter of
>> NSAllocateObject;
>
> It does, and I think this is the source of the uninitialized value.
>
> I had some time to experiment again this evening, and found that for allocations less than 16K, the NSConcreteData class stores the data in-line. The -bytes property always returns an address 32 bytes offset from the address of the NSData* pointer. Details in a second...
>
CoreFoundation is (mostly) open source, and NSData is (normally) just an obj-c 'alias' for CFData (toll-free bridged). So you can easily check your hypothesis by looking into the code.
Christiaan
>> and b) overrunning or underrunning the buffer would thus
>> clobber the object's internal state. That would have required writing to the
>> read-only contents, which you've said is definitely not happening.
>
> I agree. It fits the symptoms so well, that I was unwilling to take this off the table; On the other hand, I just couldn't see how it could be getting corrupted.
>
>> So, it's
>> probably a dead end.
>
> Probably, but not for those reasons.
>
> I wrote some code to dump the address of the NSData objects, the -bytes property, the size of the calculated structure, and (using some crude introspection) dumped the internal structure of the NSData object to a log file.
>
> I found that for large (>16K) allocations, NSData malloc()'s a block to store the data and keep a reference to it.
>
> This is how the NSData object appears immediately after it's created:
> 2010-11-24 19:26:42.297 -0700 #debug# -[PackageNames readNamePackagesOp] batch=0x10077cbd0->0x111008000, data=0x10f30cc60, count=1024, length=16400, *data=NSConcreteData<0x10f30cc60>{_isInline=(b1)2,_shouldFree=(b1)2,_hasVM=(b1)2,_retainCount=(b29)2,_length=16400,_capacity=16400,_bytes=(v*)0x111008000,_space=type([12C])} [1.734100.6572.24]
>
> batch: the address of the RecordsBatch structure on the stack, and its new address returned by [data bytes]
> data: the address of the new NSData object
> count: the number of records in RecordsBatch
> length: the length of the RecordsBatch header structure + the number of records (this is all correct, I checked the calculations by hand)
> *data: is a dump of the NSData object using some Obj-C introspection. (The Obj-C Ivar stuff doesn't break apart bit fields, so that's why the first few values are all the same.)
>
> See that the _bytes field points to the allocated block of memory that holds the copy of the original buffer.
>
> Now, just before the object gets released, it looks like this:
> 2010-11-24 19:26:42.303 -0700 #debug# -[InsertPackageNamesOp dealloc] batch=0x111008000, data=0x10f30cc60, count=1024, length=16400, *data=NSConcreteData<0x10f30cc60>{_isInline=(b1)2,_shouldFree=(b1)2,_hasVM=(b1)2,_retainCount=(b29)2,_length=16400,_capacity=16400,_bytes=(v*)0x111008000,_space=type([12C])} [1.734100.6572.37]
>
> Basically, everything is the same.
>
> Similarly, when the allocation is small, this is what happens:
> 2010-11-24 19:26:42.295 -0700 #debug# -[PackageNames readNamePackagesOp] batch=0x10df28bd0->0x10152d7a0, data=0x10152d780, count=7, length=128, *data=NSConcreteData<0x10152d780>{_isInline=(b1)1,_shouldFree=(b1)1,_hasVM=(b1)1,_retainCount=(b29)1,_length=128,_capacity=128,_bytes=(v*)0x0,_space=type([12C])} [1.734100.6572.21]
>
> 2010-11-24 19:26:42.297 -0700 #debug# -[InsertPackageNamesOp dealloc] batch=0x10152d7a0, data=0x10152d780, count=7, length=128, *data=NSConcreteData<0x10152d780>{_isInline=(b1)1,_shouldFree=(b1)1,_hasVM=(b1)1,_retainCount=(b29)1,_length=128,_capacity=128,_bytes=(v*)0x0,_space=type([12C])} [1.734100.6572.23]
>
> As you can see, the internal _bytes value is zero, and the address returned by -bytes (0x10152d7a0) is 32 bytes after the address of the NSData object (0x10152d780).
>
> By dumping the internal structure of the NSData object immediately after it was created and again just before it was destroyed, I was able to convince myself that the NSData object is not getting corrupted.
>
> Now, here where things get weird, and where I was sure there was a problem. After a few of the inline NSData objects are created, I start to get ones that look like this:
> 2010-11-24 19:26:42.297 -0700 #debug# -[PackageNames readNamePackagesOp] batch=0x10df28bd0->0x10152b990, data=0x10152b970, count=38, length=320, *data=NSConcreteData<0x10152b970>{_isInline=(b1)1,_shouldFree=(b1)1,_hasVM=(b1)1,_retainCount=(b29)1,_length=320,_capacity=320,_bytes=(v*)0x1,_space=type([12C])} [1.734100.6572.22]
>
> 2010-11-24 19:26:42.298 -0700 #debug# -[InsertPackageNamesOp dealloc] batch=0x10152b990, data=0x10152b970, count=38, length=320, *data=NSConcreteData<0x10152b970>{_isInline=(b1)1,_shouldFree=(b1)1,_hasVM=(b1)1,_retainCount=(b29)1,_length=320,_capacity=320,_bytes=(v*)0x1,_space=type([12C])} [1.734100.6572.25]
>
> Same as before, but note that the internal _bytes field is 0x1 (not 0x0). A bad address for sure. And this seems to be the cause of the valgrind warnings, because I soon as my introspection code executes it spits out this error:
>
> ==5787== Thread 8:
> ==5787== Conditional jump or move depends on uninitialised value(s)
> ==5787== at 0x100EBD1F5: __vfprintf (in /usr/lib/libSystem.B.dylib)
> ==5787== by 0x100EE475F: snprintf_l (in /usr/lib/libSystem.B.dylib)
> ==5787== by 0x101132C76: _CFStringAppendFormatAndArgumentsAux (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
> ==5787== by 0x10113127C: _CFStringCreateWithFormatAndArgumentsAux (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
> ==5787== by 0x100745FC0: -[NSPlaceholderString initWithFormat:locale:arguments:] (in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation)
> ==5787== by 0x1007483A2: +[NSString stringWithFormat:] (in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation)
> ==5787== by 0x1000EC49A: ObjectDetail (PackageNames.m:2044)
> ==5787== by 0x1000E5244: -[PackageNames readNamePackagesOp] (PackageNames.m:693)
> ==5787== by 0x10118596B: __invoking___ (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
> ==5787== by 0x10118583C: -[NSInvocation invoke] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
> ==5787== by 0x10084DD93: -[NSInvocationOperation main] (in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation)
> ==5787== by 0x10076FDE3: -[__NSOperationInternal start] (in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation)
> ==5787== Uninitialised value was created by a stack allocation
> ==5787== at 0x1000E4A36: -[PackageNames readNamePackagesOp] (PackageNames.m:595)
>
> The same error it was complaining about -[NSData dealloc], but this time it's complaining about my ObjectDetails function, at the point where it is using object_getIvar() to read the value of the _bytes variable.
>
> So here's my guess as to what's going on: When +[NSData dataWithBytes:length:] is given a buffer that's less than 16K, it calls a special allocation routine that mallocs (not callocs) the space for the object and all of the data. It then copies the data into itself, but never sets the _bytes variable. When the object is released, the code probably reads the _bytes variable in some common dealloc code, but then decides that's there's no memory block to release. But valgrind doesn't know that. It just sees the code load the _bytes property and it knows that it's uninitliazed.
>
> The reason the problem went away when I switch to malloc blocks, or always copied the entire structure, is because that always forces NSData to create a reference to an external memory block.
>
> So it would appear that this entire exercise was chasing a red hering, although I did learn a lot about valgrind. :)
>
>> Have you considered Valgrind's Ptrcheck tool? I don't actually know if it's
>> available for Mac OS X, but if it is, it could provide a different view on
>> this problem.
>
> I will pursue that next.
>
> James
>
>
> James Bucanek
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Xcode-users mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden