Re: Kernel panics even in OSMalloc_noblock() while holding a spinlock
On Aug 7, 2009, at 11:03 PM, Brendan Creane <bcreane@gmail.com> wrote:

"...no allocation or free inside a spinlock..."

That's actually new information -- previous posts
(http://lists.apple.com/archives/darwin-kernel/2008/Jan/msg00007.html)
do indeed suggest OSMalloc_noblock() as a solution for allocating memory
from within a spinlock.

My reading of OSMalloc_noblock() is that it may block while obtaining the
zone mutex, but it won't wait for any other threads to complete a memory
allocation. So it's more like "OSMalloc_fast()." And yes, it may panic
inside a spinlock, just like OSMalloc().

For those of us porting existing code from operating systems that do
support memory allocation from within lock primitives (Linux, Windows),
we need to be aware that Darwin's design precludes this approach and may
consequently require a significant rewrite of existing drivers.

regards,
Brendan

That's not really an accurate representation. As long as no one else tries
to acquire the spinlock while you hold it, there's no chance of a panic;
you could hold the thing for years and there would be no panic. You want
that post to be true? Then quit reentering on the lock; magically, it's
now true. As soon as the lock is contended, though, the countdown clock
starts.

Correctly written, scalable code will minimize contention and minimize
lock hold times, no matter what OS it's initially written to run on. In
case it's not clear from that: locks in parallel systems need to protect
data, NOT critical sections of code. The more parallel the hardware
becomes, the more important that idea is. Doing a CLI, running a bunch of
code, then doing an STI is not the path to the future.

Portable code, on the other hand, will not rely on specific types of
synchronization primitives; the difference between a spinlock and a mutex
is not about what it does as a synchronization primitive, but about what
it costs you to do it (so long as you do it fast!). Picking a semantically
identical primitive (assuming multiple processors or kernel preemption)
based on cost rather than on a considered examination of the problem is
likely the wrong thing to do. That's like saying "I'll use vfork() because
it's just like fork(), only faster." Yeah, it doesn't copy the address
space, but there are other tradeoffs besides that one.

If you come into it with your eyes open, it's not going to be hard to port
a driver from any system to any other system; it's not like learning new
customs on an alien planet: hardware people (and the laws of physics they
have to obey) are just not that imaginative.

-- Terry
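A minimal sketch of the "protect data, minimize hold time" advice above,
assuming a hypothetical per-kext statistics structure; g_stats_lock,
g_stats, and account_packet are illustrative names, not code from this
thread. Any slow work happens before the lock is taken, and the spinlock
is held only for the two field updates it protects:

    #include <stdint.h>
    #include <kern/locks.h>

    struct stats {
        uint64_t packets;
        uint64_t bytes;
    };

    /* Assumed to be set up in the kext's start routine with
     * lck_spin_alloc_init(); the lock protects g_stats and nothing else. */
    extern lck_spin_t  *g_stats_lock;
    extern struct stats g_stats;

    static void
    account_packet(uint32_t len)
    {
        /* Any slow work (parsing, allocation, logging) goes here,
         * outside the lock. */

        lck_spin_lock(g_stats_lock);    /* held only for two stores */
        g_stats.packets += 1;
        g_stats.bytes   += len;
        lck_spin_unlock(g_stats_lock);
    }

Because the hold time is a handful of instructions, contention stays cheap
whether the primitive chosen is a spinlock or a mutex.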
On Fri, Aug 7, 2009 at 7:54 PM, Terry Lambert <tlambert@apple.com> wrote:

On Aug 7, 2009, at 3:50 PM, Eric Ogren wrote:

Hello there -

I am working on a kernel extension that sometimes attempts to allocate
memory while holding a spinlock (lck_spin_t). I've read in several
postings to this list that doing so with OSMalloc() can cause a kernel
panic, and that developers should instead use OSMalloc_noblock() and be
prepared to deal with a NULL result. However, I am seeing kernel panics
even when calling the noblock variant. The panic is occurring inside
zalloc_canblock(), at the lock_zone() call, which is indeed trying to
block.

This problem is very easily reproducible with a simple kext that spawns
two threads with the following thread routine:

    void alloc_thread(void *arg)
    {
        lck_spin_t* mylock;
        // ... initialize lock
        lck_spin_lock(mylock);
        while (true) {
            // tag is a global variable initialized in the start function
            void* foo = OSMalloc_noblock(8, tag);
        }
    }

Loading the kext panics the system almost immediately with the following
stack, which is the same stack I come across in the occasional panics:

    #2 0x0012b4c6 in panic (str=0x1 <Address 0x1 out of bounds>)
       at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/debug.c:275
    #3 0x001368fd in thread_invoke (self=0x43018b8, thread=0x39a7c80, reason=0)
       at /SourceCache/xnu/xnu-1228.12.14/osfmk/kern/sched_prim.c:1477

Am I missing something here, or is it unsafe to even call
OSMalloc_noblock() while holding a spinlock? If I look at the source code
for zalloc, lock_zone()/unlock_zone() is always called regardless of the
canblock parameter, and that zone lock is indeed a mutex. It seems almost
as if the canblock parameter just means that the calling thread will not
block for a long time (i.e., it will not try to refill or garbage-collect
the zone if it's full), not that it will never block at all.

Yes. You are missing several things...

(A) You're in a tight while/true loop allocating all of kernel memory a
tiny bit at a time until you exhaust it, which is known to cause a panic.

(B) An infinite loop is too long a time to hold a spinlock; holding a
spinlock too long is known to cause a panic.

(C) You're missing your paste of at least the lines above the portion of
the backtrace you quoted (which is cut off at frame #2), which would
include the actual panic message and frame #1, so that we could see
whether the panic was related to holding a spinlock, or to exhausting the
zone of zones, or to running the wrong version of Parallels, or to
dereferencing a NULL pointer, or to some other known -- or unknown --
cause of panics.

(D) You have already been told you can use Dijkstra's algorithm, in which
you speculatively do an allocation before taking the spinlock. If you end
up using it, fine: mark it as consumed at the point where you would have
done the allocation. If you don't, fine: free it after you have dropped
the spinlock. No harm, no foul, and no allocation or free inside a
spinlock.

-- Terry
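A minimal sketch of the pre-allocation pattern described in (D), assuming
a hypothetical singly linked list protected by a spinlock; node_t, g_lock,
g_head, g_tag, and enqueue_value are illustrative names, not code from
this thread:

    #include <sys/errno.h>
    #include <kern/locks.h>
    #include <libkern/OSMalloc.h>

    typedef struct node {
        struct node *next;
        int          value;
    } node_t;

    /* Assumed to be created in the kext's start routine with
     * lck_spin_alloc_init() and OSMalloc_Tagalloc(). */
    extern lck_spin_t  *g_lock;   /* protects g_head */
    extern node_t      *g_head;
    extern OSMallocTag  g_tag;

    static int
    enqueue_value(int value)
    {
        /* 1. Allocate speculatively, before taking the spinlock; the
         *    blocking OSMalloc() is fine here because no lock is held. */
        node_t *node = (node_t *)OSMalloc((uint32_t)sizeof(*node), g_tag);
        if (node == NULL)
            return ENOMEM;
        node->value = value;

        /* 2. Hold the spinlock only for the pointer manipulation. */
        lck_spin_lock(g_lock);
        node->next = g_head;
        g_head = node;
        lck_spin_unlock(g_lock);

        /* 3. If the node had turned out to be unneeded (say, a duplicate
         *    was already on the list), it would be OSFree()d here, after
         *    dropping the lock -- never while the lock is held. */
        return 0;
    }

This keeps the allocator's zone mutex entirely outside the spinlock's
critical section, which is the property the whole thread is about.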