Re: Kevent/Kqueue causing kernel panics
- Subject: Re: Kevent/Kqueue causing kernel panics
- From: Travis Athougies <email@hidden>
- Date: Fri, 5 Nov 2010 21:45:11 -0700
On Fri, Nov 5, 2010 at 8:20 PM, Terry Lambert <email@hidden> wrote:
> This is not a support channel, and as Shantonu said, you should file a bug. That said:
>
I will.
> 0x4edaf8 is in unix_syscall64 (/SourceCache/xnu/xnu-1504.7.4/bsd/dev/i386/systemcalls.c:433).
> 0x4701dc is in kevent (/SourceCache/xnu/xnu-1504.7.4/bsd/kern/kern_event.c:1225).
> 0x46fdcc is in kevent_internal (/SourceCache/xnu/xnu-1504.7.4/bsd/kern/kern_event.c:1288).
> 0x29d59a is in usimple_lock (/SourceCache/xnu/xnu-1504.7.4/osfmk/i386/locks_i386.c:352).
> 0x21b455 is in panic (/SourceCache/xnu/xnu-1504.7.4/osfmk/kern/debug.c:307).
>
Thank you very much for the stack trace; I couldn't find the debug
symbols for my kernel on the developer site. This will definitely
allow me to run my program while the bug is being fixed.
> You are getting a simple lock timeout. Simple locks spin for a very short time, and then if they can not be acquired in that time, they cause a panic.
>
Thanks for this
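
For anyone following along, here is a rough user-space sketch of the
behaviour described above: spin on a lock for a bounded time and report
failure when the deadline passes. This is not the xnu implementation
(the kernel panics where this sketch returns false), and the names
spin_t, spin_lock_timeout and spin_unlock are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical spin lock; initialise with { ATOMIC_FLAG_INIT }. */
typedef struct { atomic_flag held; } spin_t;

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Spin until the lock is acquired or timeout_ns has elapsed.
 * Returns true on success, false on timeout. A kernel simple lock
 * would panic at the point where this returns false. */
bool spin_lock_timeout(spin_t *l, uint64_t timeout_ns)
{
    uint64_t deadline = now_ns() + timeout_ns;
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (now_ns() >= deadline)
            return false;
    }
    return true;
}

/* Release the lock so another spinner can take it. */
void spin_unlock(spin_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

If the holder never calls spin_unlock(), or the clock behind now_ns()
misbehaves, every other caller times out, which matches the symptom in
the trace.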
> In this case, the simple lock in question is a call to kqlock(kq); at line 1287 of bsd/kern/kern_event.c in kevent_internal() called from kevent().
>
Wow, thanks.
> The lock is failing because it can't be acquired in the requisite time. Most probable causes, IMO, are attempting to run the non-server version of MacOS X under virtualization (virtualization can impact the absolute time locks are held, as well as the accuracy of the clocks by which the interval is measured), and a multithreaded program that is closing the kqueue in question out from under the kevent() system call. There are also other possibilities.
>
I'm running standard Mac OS X on a newly bought MacBook Pro, so no.
And I never close the kqueue. Do you mean closing a file descriptor
associated with the kqueue?
> If I had to guess (which I do, since the only way to get sufficient information to fix the issue in this case is to do two machine debugging and/or set up a core dump server), I would have to say that since this is being ported from Linux, the cause is that the queue is getting closed by another thread under the mistaken assumption that since epoll is a cancellation point (i.e. the close will abort the system call), kevent on Mac OS X is as well. It is not.
>
Could you point me somewhere that would tell me how to get this set
up? I have read the Apple Documentation. Is there anything else I
should know?
And no, this is definitely not being ported from Linux. It's the other
way around. I liked the BSD event system better, so I thought I'd do
my main development on my MacBook. However, that failed miserably, so
I had to move to Linux so I could actually run my program.
> You should file a bug on this, but if you do, without including either the output of "showallstacks" from a two machine debug session, or a kernel core dump, OR including the sources necessary to reproduce the problem locally (assuming it DOES reproduce locally), it will end up as "Can Not Reproduce". So include the necessary information, or be prepared to be asked for it.
>
Thanks for the tip.
> Either way, since it won't result in cancellation of the outstanding event, you should rethink your code to at least maintain a container structure in user space with a retain/release count on the kq to avoid closing the queue out from under it. If you need cancellation for the proper function of your program, you need to familiarize yourself with pthread_self() to save the kevent thread's thread id and pthread_kill() to allow a cancellation method on the container structure as a result of a desire to close the queue.
>
Like I said, I never release the kqueue, so I don't know how that
could cause it.
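
For reference, the retain/release container suggested above might look
roughly like the sketch below. It covers only the reference-counting
half of the suggestion, not the pthread_kill() cancellation path. The
names kq_box, kq_box_new, kq_box_retain and kq_box_release are
hypothetical, and the descriptor is assumed to come from kqueue().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical user-space wrapper that owns a kqueue descriptor. */
struct kq_box {
    pthread_mutex_t lock;
    int kq;         /* descriptor, assumed to come from kqueue() */
    unsigned refs;  /* number of outstanding holders */
};

/* Wrap a descriptor; the creator holds the first reference. */
struct kq_box *kq_box_new(int kq)
{
    struct kq_box *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    if (pthread_mutex_init(&b->lock, NULL) != 0) {
        free(b);
        return NULL;
    }
    b->kq = kq;
    b->refs = 1;
    return b;
}

/* Take a reference before using the descriptor. The caller must
 * already be covered by a live reference (for example the creator's).
 * Returns the descriptor. */
int kq_box_retain(struct kq_box *b)
{
    pthread_mutex_lock(&b->lock);
    assert(b->refs > 0);
    b->refs++;
    int kq = b->kq;
    pthread_mutex_unlock(&b->lock);
    return kq;
}

/* Drop a reference. The descriptor is closed only when the last
 * holder lets go. Returns 1 if this call closed it, 0 otherwise. */
int kq_box_release(struct kq_box *b)
{
    pthread_mutex_lock(&b->lock);
    assert(b->refs > 0);
    int last = (--b->refs == 0);
    pthread_mutex_unlock(&b->lock);
    if (last) {
        close(b->kq);
        pthread_mutex_destroy(&b->lock);
        free(b);
    }
    return last;
}
```

A thread would call kq_box_retain() before entering kevent() and
kq_box_release() after it returns. Code that wants the queue gone only
drops its own reference, so close() cannot run while a kevent() call is
still using the descriptor.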
> -- Terry
>
> On Nov 5, 2010, at 6:07 PM, Travis Athougies wrote:
>> I have an application that uses the kevent and kqueue API to provide
>> asynchronous events to my application. Essentially, my program is
>> completely event driven. I have my own event handling system which
>> uses a global event queue to distribute events out to a number of
>> threads (as many as the number of processors in the system).
>> Periodically, kevent and kqueue are called to collect events on a
>> number of sockets (I've only tested with one socket, since I can't do
>> any more without a kernel panic), and then add these events to the
>> global event queue. The event queue system works fine (it can handle
>> thousands of simultaneous connections on Linux using epoll*
>> functions), however on Mac, regardless of what I do, if I run my test
>> program I get a KERNEL PANIC. Application bugs are one thing, but they
>> should NEVER cause a kernel panic. Just wanted to know if anyone else
>> has gotten this error and if they have, how they went about resolving
>> it. The kernel panic message is below and I can provide source code if
>> necessary.
>>
>> I'm using the Boehm GC with pthreads, if that means anything to you.
>>
>> And this isn't a once in a while thing. This is every time I run the
>> application, within a few test client runs.
>>
>> I've attached a kernel panic. I can provide sources, but keep in mind
>> I'm not ready to make this software open-source.
>>
>> And the panic is always in the same place: locks_i386.c.
>>
>> --
>> Travis Athougies
>> <Kernel_2010-11-05-151425_Travis-Athougiess-MacBook-Pro.panic>
>> _______________________________________________
>> Do not post admin requests to the list. They will be ignored.
>> Darwin-kernel mailing list (email@hidden)
>> Help/Unsubscribe/Update your Subscription:
>>
>> This email sent to email@hidden
>
>
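
To make the architecture in my original message concrete: one poller
collects events and pushes them onto a global queue, and worker threads
pop from it. The sketch below is a simplified stand-in, not my actual
code. The names ev, evq, evq_init, evq_push and evq_pop and the fixed
capacity are assumptions for illustration, and the kevent() call that
would feed evq_push() is left out.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define EQ_CAP 64  /* illustrative fixed capacity */

/* Hypothetical event record: which descriptor, and what happened. */
struct ev { int fd; int kind; };

/* Ring buffer guarded by a mutex, with a condition variable that
 * wakes workers when an event arrives. */
struct evq {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct ev items[EQ_CAP];
    size_t head, count;
};

void evq_init(struct evq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->count = 0;
}

/* Called by the poller for each collected event.
 * Returns false if the queue is full. */
bool evq_push(struct evq *q, struct ev e)
{
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (q->count < EQ_CAP) {
        q->items[(q->head + q->count) % EQ_CAP] = e;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Called by worker threads; blocks until an event is available. */
struct ev evq_pop(struct evq *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    struct ev e = q->items[q->head];
    q->head = (q->head + 1) % EQ_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return e;
}
```

Events come out in the order they were pushed, and only the poller
thread ever touches the kqueue descriptor in this layout.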
--
Travis Athougies