Race condition in libkern's OSAtomicDequeue()
Race condition in libkern's OSAtomicDequeue()
- Subject: Race condition in libkern's OSAtomicDequeue()
- From: Colin Hirsch <email@hidden>
- Date: Tue, 6 Dec 2005 14:51:34 +0100
Hi,
playing with all kinds of SMB-safe atomic operations I stumbled over the
OSAtomicDequeue() and ...Enqueue() operations in the kernel source code.
The reason I was looking there was that my first singly-linked-list kept
crashing immediately (in a test program that does nothing but enqueue()/
dequeue() in 3 threads on my Quad G5). However the libkern function has
the same problem, as explained below and as can be verified easily with
the source linked below.
But first the issue:
In xnu-792.6.22/libkern/gen/OSAtomicOperations.c the following code is
used for the dequeue. I believe the problem to be that the value in
newListHead is not necessarily current by the time the atomic CAS is
executed.
This is a pretty contrived case, as it requires one thread to be
preempted between the assignment to newListHead and the CAS, and
then two (or more) elements must be dequeued and at least the first
dequeued element be put back again [last] so that the CAS succeeds.
void * OSDequeueAtomic(void ** inList, SInt32 inOffset)
{
void * oldListHead;
void * newListHead;
do {
oldListHead = *inList;
if (oldListHead == NULL) {
break;
}
newListHead = *(void **) (((char *) oldListHead) + inOffset);
} while (! OSCompareAndSwap((UInt32)oldListHead,
(UInt32)newListHead, (UInt32 *)inList));
return oldListHead;
}
In practice, my test program
http://www.colin-hirsch.net/source/list_broken.cc
which uses the same list operations as libkern, crashes immediately, whereas
the "fixed" version
http://www.colin-hirsch.net/source/list_fixed.cc
has been running for several minutes now, so it might actually be correct ;-)
The crux is to temporarily take ownership of the complete list during dequeue()
to prevent other CPUs from doing the aforementioned sequence of dequeues()s and
enqueue()s, and then giving the rest of the list back (i.e. the list minus the
dequeued head).
I would be grateful if somebody (at Apple?) could double-check my analysis and
the fixed version and, if applicable, crate a patch for libkern... (My fixed
version is not fit for general purpose use since it does not detect an empty
list.)
Regards, Colin
PS: Yes, the commented "isync" in the fixed version is at the wrong place.
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden