Process Signal Bug On Intel Dual Core Machines?
- Subject: Process Signal Bug On Intel Dual Core Machines?
- From: Markus Hanauska <email@hidden>
- Date: Mon, 28 Aug 2006 16:46:29 +0200
Hello!
I have a daemon process with a signal handler that handles
SIGHUP, SIGINT, SIGTERM, SIGUSR1, SIGUSR2 and SIGPIPE. The problem is
that this process never reacts to SIGTERM, even though it handles it.
Oddly, it does not react to SIGQUIT either, although it does not block
it (so the default action should take place). It does, however, react
to SIGINT.
Don't get me wrong: the problem is that the signals are really *not*
delivered to the process! I can prove it with GDB. For example, I start
the process and it gets PID 927.
Now here's my GDB session:
~ root# gdb
GNU gdb 6.3.50-20050815 (Apple version gdb-609) (Fri Jul 28 05:21:24 UTC 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin".
(gdb) attach 927
Attaching to process 927.
Reading symbols for shared libraries . done
Reading symbols for shared libraries .............. done
0x900f9294 in __select ()
(gdb)
I'll now take a look at the signal table in GDB:
(gdb) info signals
Signal    Stop  Print  Pass to program  Description
SIGHUP    Yes   Yes    Yes              Hangup
SIGINT    Yes   Yes    No               Interrupt
SIGQUIT   Yes   Yes    Yes              Quit
SIGILL    Yes   Yes    Yes              Illegal instruction
SIGTRAP   Yes   Yes    No               Trace/breakpoint trap
SIGABRT   Yes   Yes    Yes              Aborted
SIGEMT    Yes   Yes    Yes              Emulation trap
SIGFPE    Yes   Yes    Yes              Arithmetic exception
SIGKILL   Yes   Yes    Yes              Killed
SIGBUS    Yes   Yes    Yes              Bus error
SIGSEGV   Yes   Yes    Yes              Segmentation fault
SIGSYS    Yes   Yes    Yes              Bad system call
SIGPIPE   Yes   Yes    Yes              Broken pipe
SIGALRM   No    No     Yes              Alarm clock
SIGTERM   Yes   Yes    Yes              Terminated
:
As you can see, GDB should stop at and print every received signal
except SIGALRM. It should also pass all signals to the app except
SIGINT and SIGTRAP.
Now I continue running the app:
(gdb) cont
Continuing.
In another shell I do the following:
~ root# kill -TERM 927
What happens in GDB? Nothing. Is the signal handler called? No. I
verified that in a previous test by setting a breakpoint in the signal
handler. Even if I had no signal handler for TERM, even if I blocked
or ignored that signal, GDB should still *stop* and *print* it. It
does not. Why not? Because _no signal_ is delivered. Why not? How can
this be?
OK, let's try QUIT. QUIT is not even handled by my signal handler, so
the default action should take place. Here we go:
~ root# kill -QUIT 927
Again, nothing!
Okay, but now, let's try SIGINT:
Program received signal SIGINT, Interrupt.
0x900f9294 in __select ()
(gdb)
Huh? How can this be? How can it be that SIGINT is delivered, but
SIGTERM and SIGQUIT are not?!? Wouldn't GDB show the signal regardless
of whether my app ignores or blocks it? (It does neither; I see
nothing like that anywhere in the code.)
Now you may ask why this might be a kernel bug. Very simple: I
can't reproduce it on any PPC machine. I also can't reproduce it
on my Mac Mini Core Solo, but I can reproduce it 100% of the time on
an Intel iMac with a Dual Core CPU.
This bug is driving me nuts and making me doubt my sanity. And
why only on Dual Core Intel? (It is not running in Rosetta; the daemon
is a universal binary.) Could this be a kernel-level bug in signal
delivery?
The workaround for me is to use SIGINT on all machines, which works
fine as far as I can see. But this daemon has existed since 10.2 and
has always worked, from 10.2 to 10.4, on any machine, always using
SIGTERM. Now, all of a sudden, it fails on iMacs and many MacBooks and
MacBook Pros with Intel Dual Core CPUs. It is not always 100%
reproducible; for some people it sometimes works and sometimes does
not, which makes me believe even more that this is a really nasty
kernel bug.
I can provide you with every debug output from GDB, Shark or any
other tool you like. I just can't post any source here. Any help is
appreciated.
--
Best Regards,
Markus Hanauska
_______________________________________________
Darwin-kernel mailing list (email@hidden)