Re: Help DTrace gurus: suggestions for capturing a mis-allocated NSData object on a customer's system
- Subject: Re: Help DTrace gurus: suggestions for capturing a mis-allocated NSData object on a customer's system
- From: Ken Thomases <email@hidden>
- Date: Sun, 21 Nov 2010 07:13:33 -0600
On Nov 20, 2010, at 11:58 AM, James Bucanek wrote:
> I think the term "uninitialized" in this context is a valgrind concept. Valgrind runs your application in a simulator. It associates a "valid-value" bit with every byte of memory. If your application ever tries to read a byte of data that hasn't been written yet, it catches it.
I was aware of that.
>> Which line of your real code corresponds to:
>>
>> ==78805== Uninitialised value was created by a stack allocation
>> ==78805== at 0x1000E4BA7: -[PackageNames readNamePackagesOp] (PackageNames.m:590)
>
> The line in question is:
>
> NSData* data = [NSData dataWithBytes:&batch length:offsetof(ReadBatch,set)+sizeof(PackageNameRecord)*batch.count];
>
> Valgrind knows that &batch refers to an address that was "created by a stack allocation".
My impression from the Valgrind report is that it's talking about a stack allocation that occurs on the mentioned line. The allocation of batch does not occur at that line.
What possible uninitialized value could be "created by a stack allocation at" that line?
> When the function returns, valgrind marks all the bytes in that stack frame as invalid. Any future attempt to read or write from those addresses is caught as an error:
>
> ==78805== Thread 9:
> ==78805== Conditional jump or move depends on uninitialised value(s)
> ==78805== at 0x1002AAD08: -[NSConcreteData dealloc] (in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation)
> ==78805== by 0x1000EA4B4: -[InsertPackageNamesOp dealloc] (PackageNames.m:1824)
> =
>
> Note that valgrind is not saying that the data was being overwritten, but that NSConcreteData merely "depends on" (a.k.a. "reads") information from an address space that formerly belonged to the stack frame of -readNamePackagesOp, which should now be considered as invalid/uninitialized.
My point is that I suspect something is copying uninitialized bytes from a stack variable over NSConcreteData's internal state. Later, when NSConcreteData reads that state, it bases a jump or move on the corrupted data, which Valgrind knows originated as uninitialized data. (Valgrind knows that a copy from uninitialized data propagates the uninitialized-ness, even though the destination bytes have been written to.)
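A minimal C sketch of that propagation, in case it helps to see it in isolation. The Batch type and both function names are hypothetical and are not taken from the posted code; the only claim is about how Memcheck treats the copy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a partially filled stack record. */
typedef struct {
    uint32_t count;      /* number of valid entries in slots[] */
    uint32_t slots[4];
} Batch;

/* Writes count and the first `count` slots only. The remaining slots
   keep whatever the caller's storage held, which Memcheck treats as
   undefined if that storage was a fresh stack variable. */
void fill_partial(Batch *b, uint32_t count)
{
    assert(count <= 4);
    b->count = count;
    for (uint32_t i = 0; i < count; i++)
        b->slots[i] = 100 + i;
}

/* Copies the whole struct, defined and undefined bytes alike. Memcheck
   does not complain about the copy itself; the undefined state travels
   with the bytes into the heap block. The caller frees the result. */
Batch *copy_batch(const Batch *src)
{
    Batch *dst = malloc(sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, sizeof *dst);
    return dst;
}
```

If later code branched on one of the unfilled slots of the copy, that branch is where Memcheck would report "Conditional jump or move depends on uninitialised value(s)", and with --track-origins=yes it would trace the value back to the stack allocation, even though the bytes now live in a heap block.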
> The body of InsertPackageNamesOp is:
>
> - (void)main
> {
> ...
> const RecordBatch* batch = (const RecordBatch*)[data bytes];
> SortedIndexSlowLock(namesIndex);
> [namesIndex insertCount:batch->count records:batch->set.names excludingDuplicates:YES];
"batch->set.names"? Isn't batch->set an array?
I suspect this is a confusion based on your simplification of your code. If batch->set is a struct, which contains an array field called "names", then this code makes sense. However, your calculation of the length for the NSData:
    offsetof(ReadBatch,set)+sizeof(PackageNameRecord)*batch.count
no longer does. I would think it should be something like:
    offsetof(ReadBatch,set.names)+sizeof(PackageNameRecord)*batch.count
or, my preference, to assure all of the types are and remain correct:
    offsetof(typeof(batch),set.names)+sizeof(batch.set.names[0])*batch.count
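To make the difference concrete, here is a small C sketch with made-up layouts. The field lists of ReadBatch and PackageNameRecord below are assumptions, since the real definitions were not posted; in particular, the "generation" field ahead of the array is invented to show what goes wrong. If names happens to be the first field of set, the two calculations agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts; the real structs were not posted in full. */
typedef struct {
    uint64_t hash;
    uint32_t nameOffset;
    uint32_t flags;
} PackageNameRecord;

typedef struct {
    uint32_t count;
    struct {
        uint64_t generation;          /* assumed field ahead of the array */
        PackageNameRecord names[32];
    } set;
} ReadBatch;

/* Length measured from the start of set: comes up short by whatever
   precedes names inside set. */
size_t length_from_set(uint32_t count)
{
    return offsetof(ReadBatch, set) + sizeof(PackageNameRecord) * count;
}

/* Length measured from the start of the array itself. */
size_t length_from_names(uint32_t count)
{
    assert(count <= 32);
    return offsetof(ReadBatch, set.names) + sizeof(PackageNameRecord) * count;
}
```

With a layout like this one, the first form would cut off the tail of the last record, so the reader on the other side would see fewer valid bytes than batch.count promises.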
> SortedIndexSlowUnlock(namesIndex);
> }
>
> 'records:' is a (const PackageNameRecord*) parameter (I simplified this to just Record* in the earlier code example). The contents of this array are never modified by -insertCount:records:excludingDuplicates:.
If the NSConcreteData were still referring to the 'batch' variable from the stack of -readNamePackagesOp, then you'd be getting Valgrind errors within -[InsertPackageNamesOp main], wouldn't you?
You can try adding logging of the bytes within the data object from within both methods. Also, log the object address in each, the address of the 'batch' variable in -readNamePackagesOp, and the -bytes address in -main.
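Something along these lines would do for the logging. It is only a sketch in plain C; format_hex and log_buffer are hypothetical helper names, and in the real code you would call them with [data bytes] and [data length] in each method, and with &batch in -readNamePackagesOp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats up to `len` bytes at `bytes` as lowercase hex pairs into
   `out`, stopping early if `out` is too small. Always terminates the
   string. Returns the number of bytes formatted. */
size_t format_hex(const void *bytes, size_t len, char *out, size_t outSize)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *p = bytes;
    size_t n = 0;

    assert(out != NULL && outSize > 0);
    while (n < len && 2 * n + 2 < outSize) {
        out[2 * n]     = digits[p[n] >> 4];
        out[2 * n + 1] = digits[p[n] & 0x0f];
        n++;
    }
    out[2 * n] = '\0';
    return n;
}

/* Logs a label, the buffer address, its length, and the leading bytes
   to stderr. */
void log_buffer(const char *label, const void *bytes, size_t len)
{
    char hex[2 * 32 + 1];    /* room for the first 32 bytes */
    size_t shown = format_hex(bytes, len, hex, sizeof hex);

    fprintf(stderr, "%s: addr=%p len=%zu first %zu bytes=%s\n",
            label, bytes, len, shown, hex);
}
```

Comparing the addresses and the hex output from the two methods should show whether the data object still points at the stack variable or holds its own copy, and whether the bytes changed in between.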
By the way, from which version of Mac OS X were your Valgrind results and the crash log obtained? The Valgrind results give a location within -[NSConcreteData dealloc] of 0x1002AAD08. The crash log gives an address of 0x00007fff8637dd48.
Both addresses look like they come from 64-bit processes. From my Mac OS X 10.6.4 (10F569) system, adjusting for relocation, the ...d48 address makes sense. It's right after the call from within -[NSConcreteData dealloc] to __NSGenericFree, which jumps to free() (and so disappears from the backtrace).
On the other hand, the ...d08 address doesn't correspond to an instruction. It's in the middle of an instruction within the function prologue. I suspect you have a different version of Mac OS X, with a different Foundation framework. I'd be curious to know exactly what -[NSConcreteData dealloc] is doing when Valgrind complains.
Lastly, in the modified code you posted for -readNamePackagesOp, there was a reference to "eof" and a placeholder for "<read a record from the file>". I wonder what happens if there's an incomplete record. Could the inner 'record' variable actually be incomplete or partially uninitialized?
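For what it's worth, here is one way to guard against that, sketched in C. I am assuming fixed-size records read with stdio; the PackageNameRecord layout, the ReadStatus enum, and read_record are all hypothetical, since the actual read code was elided from the post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record; the real on-disk format was not posted. */
typedef struct {
    uint64_t hash;
    uint32_t nameOffset;
    uint32_t flags;
} PackageNameRecord;

typedef enum { READ_OK, READ_EOF, READ_TRUNCATED, READ_ERROR } ReadStatus;

/* Reads exactly one record. The record is zeroed first so that a short
   read never leaves indeterminate stack bytes in it. */
ReadStatus read_record(FILE *fp, PackageNameRecord *rec)
{
    assert(fp != NULL && rec != NULL);
    memset(rec, 0, sizeof *rec);

    size_t got = fread(rec, 1, sizeof *rec, fp);
    if (got == sizeof *rec)
        return READ_OK;
    if (ferror(fp))
        return READ_ERROR;
    /* End of file: clean if nothing was read, truncated otherwise. */
    return got == 0 ? READ_EOF : READ_TRUNCATED;
}
```

The point of distinguishing READ_TRUNCATED from READ_EOF is that only READ_OK records should ever be appended to the batch; a loop that tests for end of file alone could append a partially filled record.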
Regards,
Ken