Eric,
I think you've answered your own question. 8)
The general intention of malloc_nonblock interfaces is that they will not block indefinitely, not that they will never block (since the latter is frequently nigh impossible to guarantee in a preemptive system).
It's generally considered poor design to call out of a code block holding a spinlock; if nothing else it leads to very fragile interfaces.
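For what it's worth, the usual way around this is to allocate before taking the spinlock, so that only pointer updates happen while it is held. Below is a minimal userland sketch of that ordering, not kernel code. It assumes a C11 atomic_flag as a stand-in for lck_spin_t and malloc() as a stand-in for OSMalloc(). The names spin_t, struct list, list_init and list_push are hypothetical and exist only for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for a kernel spinlock (not lck_spin_t). */
typedef struct { atomic_flag held; } spin_t;

static void spin_lock(spin_t *l)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void spin_unlock(spin_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct node {
    struct node *next;
    int value;
};

struct list {
    spin_t lock;
    struct node *head;
};

static void list_init(struct list *ls)
{
    atomic_flag_clear(&ls->lock.held);
    ls->head = NULL;
}

/* Returns 0 on success, -1 if the allocation failed. */
static int list_push(struct list *ls, int value)
{
    /* Allocate BEFORE taking the lock: malloc() stands in for an
     * allocator that may block, so it must not run under the spinlock. */
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;

    /* Only pointer updates happen while the spinlock is held. */
    spin_lock(&ls->lock);
    n->next = ls->head;
    ls->head = n;
    spin_unlock(&ls->lock);
    return 0;
}
```

The cost is an occasional wasted allocation if, once the lock is held, the new node turns out not to be needed; in that case it would be freed after the lock is dropped.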
= Mike
On Aug 7, 2009, at 3:50 PM, Eric Ogren wrote:
Hello there -
I am working on a kernel extension that sometimes attempts to allocate memory while holding a spinlock (lck_spin_t). I've read in several postings to this list that doing so with OSMalloc() can cause a kernel panic, and that developers should instead use OSMalloc_noblock() and be prepared to deal with a NULL result. However, I am seeing kernel panics even when calling the noblock variant. The panic occurs inside zalloc_canblock() at the lock_zone() call, which is indeed trying to block. The problem is easily reproducible with a simple kext that spawns two threads with the following thread routine:
void alloc_thread(void *arg)
{
    lck_spin_t *mylock;
    // ... initialize lock

    lck_spin_lock(mylock);
    while (true) {
        // tag is a global variable initialized in the start function
        void *foo = OSMalloc_noblock(8, tag);
    }
}
Loading the kext panics the system almost immediately with the following stack, which is the same stack I see in the occasional panics:
#2 0x0012b4c6 in panic (str=0x1 <Address 0x1 out of bounds>) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/debug.c:275
#3 0x001368fd in thread_invoke (self=0x43018b8, thread=0x39a7c80, reason=0) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/sched_prim.c:1477
#4 0x00136e92 in thread_block_reason (continuation=0x1, parameter=0x0, reason=<value temporarily unavailable, due to optimizations>) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/sched_prim.c:1837
#5 0x00136f20 in thread_block (continuation=0x1) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/sched_prim.c:1854
#6 0x001318f1 in lck_mtx_lock_wait (lck=0x1a2a084, holder=0x3e40410) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/locks.c:601
#7 0x0019d8c1 in lck_mtx_lock () at pmap.h:176
#8 0x001433b0 in zalloc_canblock (zone=0x1a2a07c, canblock=0) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/zalloc.c:883
#9 0x0012fdc2 in kalloc_canblock (size=8, canblock=0) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/kalloc.c:289
#10 0x001301c1 in OSMalloc_noblock (size=8, tag=0x5ab8700) at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/kalloc.c:303
Am I missing something here, or is it unsafe to even call OSMalloc_noblock() while holding a spinlock? Looking at the source code for zalloc, lock_zone()/unlock_zone() is always called regardless of the canblock parameter, and that zone lock is indeed a mutex. It seems that the canblock parameter only means that the calling thread will not block for a long time (i.e., it will not try to refill or garbage collect the zone if it's full), not that it will never block at all.
Any help is appreciated.
thanks, Eric
-- Excellence in any department can be attained only by the labor of a lifetime; it is not to be purchased at a lesser price -- Samuel Johnson