Re: mach in signal handler
- Subject: Re: mach in signal handler
- From: Steve Checkoway <email@hidden>
- Date: Fri, 2 Feb 2007 03:22:55 -0800
On Feb 2, 2007, at 12:01 AM, Terry Lambert wrote:
OK, realize that this is an oversimplification that glosses over
some of the finer points....
Thank you for these detailed replies!
The call you want happens to be a raw Mach call.
There's no guarantee that this won't be wrappered in a later
release, though (i.e. there's no promise to not change it).
Understood.
If I remember my graduate OS class correctly, there's some sort of
UNIX server process that sits outside the kernel.
In MacOS X, no. The BSD server is integrated into the same address
space as Mach and the IOKit. All three are linked into the same
address space, and layered IOKit/BSD/Mach, with IOKit symbols
invisible to BSD and Mach, and BSD symbols invisible to Mach.
So MacOS X is not a "UNIX Single Server" style implementation; this
avoids the overhead that would normally occur as a result of
protection domain crossing in a traditional UNIX-on-Mach
implementation.
I had no idea that it was different. When reading some of the Mach
papers, I wondered about the overhead incurred as a result of the
domain crossing.
We also do not document (and are not prepared to tie ourselves down
to particular implementation details by documenting) interactions.
Understandable. "The behavior is undefined" is an acceptable level of
documentation.
The net effect is that you may not get instantaneous results, and
you may end up snapshotting the state of the process after the
event which raised the signal. The likelihood of this happening
goes way up if your application is multithreaded.
It is multithreaded. I'm not sure if it matters, but the threads are
not very tightly coupled.
So you'll get _an_ answer that's a snapshot of the state at the
time the signal handler fired, but whether or not that's what you
wanted depends on how timing sensitive you are when trying to pick
up the information you are trying to get.
I don't think we're all that timing sensitive. What's really going on
is that on PowerPC, grabbing a stack backtrace is pretty simple as
long as the frame pointer is around. On x86, the situation is quite a
bit harder. We have some code that works well on x86 Linux: it walks
the stack using the frame pointer if it exists and, if not, just
scans the stack for values that are valid return addresses, that is,
values that point to executable memory just after a CALL. Just an
idea of where the crash happened is useful.
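
The fallback heuristic described above (treat a stack word as a return address if it points into code just after a CALL) might be sketched as follows. This is an illustrative sketch, not the code discussed in this thread. The names preceded_by_call and scan_for_return_addresses are hypothetical, only two CALL encodings are recognized (E8 rel32 and the two-byte register-indirect FF D0..D7), and the "text" region is passed in as a plain byte buffer rather than discovered from the process image. Like any such scan, it can report false positives when the byte pattern occurs inside another instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the bytes just before code[ret_off] look like an x86 CALL.
 * Only two encodings are checked (an assumption made for brevity):
 *   E8 rel32      direct near call, 5 bytes
 *   FF D0..D7     indirect call through a register, 2 bytes */
int preceded_by_call(const uint8_t *code, size_t len, size_t ret_off)
{
    if (ret_off > len)
        return 0;
    if (ret_off >= 5 && code[ret_off - 5] == 0xE8)
        return 1;
    if (ret_off >= 2 && code[ret_off - 2] == 0xFF &&
        (code[ret_off - 1] & 0xF8) == 0xD0)
        return 1;
    return 0;
}

/* Scan stack words and collect those that point into the given text
 * region and immediately follow a CALL. Returns the number found. */
size_t scan_for_return_addresses(const uintptr_t *stack, size_t nwords,
                                 const uint8_t *text, size_t text_len,
                                 uintptr_t *out, size_t max_out)
{
    uintptr_t base = (uintptr_t)text;
    size_t found = 0;

    for (size_t i = 0; i < nwords && found < max_out; i++) {
        uintptr_t v = stack[i];
        if (v < base || v - base > text_len)
            continue;                       /* not inside the text region */
        if (preceded_by_call(text, text_len, (size_t)(v - base)))
            out[found++] = v;
    }
    return found;
}
```

Both functions only read memory they are handed and call nothing else, so they avoid the forbidden-function problem inside a handler; deciding which address ranges are executable is left to the caller.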
In general, I agree. What we do in this specific case is build a
crash log including a backtrace (being very careful not to call
forbidden functions), fork a new process, send the data to it and
have it do things like symbol lookups and throw up a dialog
informing the user of the crash, etc.
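
The fork-and-hand-off pattern described above can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the application's actual code: hand_off_crash_log is a hypothetical name, and the child here merely drains the pipe and checks the byte count where a real reporter would do symbol lookups and put up a dialog. The calls used (pipe, fork, read, write, close, waitpid, _exit) are, as far as I know, all on the POSIX.1-2008 async-signal-safe list; whether fork() from a handler is wise in a multithreaded process on a given OS release is a separate question.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and send it `len` bytes of crash
 * data over a pipe. Returns 0 if the child received all of it, else -1. */
int hand_off_crash_log(const char *data, size_t len)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: stand-in for the reporter. It only counts the bytes. */
        char buf[256];
        size_t total = 0;
        ssize_t n;
        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += (size_t)n;
        _exit(total == len ? 0 : 1);
    }

    /* Parent (the crashing process): write the log, retrying on EINTR. */
    close(fds[0]);
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fds[1], data + off, len - off);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;
        off += (size_t)n;
    }
    close(fds[1]);

    int status;
    pid_t w;
    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);
    if (w != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The design point is that everything needing malloc, stdio, or symbol tables happens in the child, so the crashing process only touches a short list of system calls.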
Then you are probably fine... however...
The absolute best way to do this is to do exactly what
CrashReporter, gdb, or Adobe products do, in order to do crash
reporting.
[...]
To get there from here, I can only tell you to look at the gdb
sources, and see what gdb does for signal management for processes
being debugged when it's compiled for Darwin/MacOS X.
Yikes! I don't know what Adobe products do (I don't think I use any),
but it seems like you're suggesting a second program to listen for
the crash condition.
Signals are persistent conditions, not events, which means that
if something happens that would cause a signal handler to fire
multiple times in rapid succession, then you will only see the
handler fire exactly once.
Once is all we need. We will never return from the handler. I
don't recall if we perform suicide or patricide, but one way or
another the process will never return from the handler. If the
handler is invoked again, we throw up our hands and just die.
It's safe as you've described it, so long as you're not
multithreaded. If you are multithreaded, then see above; another
thread reentering and a signal handler reentering unsafe code are
pretty much the same thing.
Looking at the code, it looks like we are using one other Mach call,
thread_suspend(), to suspend all threads other than the currently
executing one as its first act. I'm not totally sure that this is
safe either, but judging by the date I added it, it's been working
for quite some time.
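
For readers unfamiliar with the Mach side, suspending every thread except the caller might look roughly like the sketch below. It is an illustration, not the code referred to above: suspend_other_threads is a hypothetical name, the non-Apple branch is only a stub so the file compiles elsewhere, and nothing here settles whether these calls are safe from inside a signal handler, which is exactly the open question in this thread.

```c
#include <stddef.h>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

/* Hypothetical helper: suspend all threads in this task except the
 * calling one. Returns the number suspended, or -1 on error or when
 * not built for a Mach-based system. */
int suspend_other_threads(void)
{
#if defined(__APPLE__)
    thread_act_array_t threads;
    mach_msg_type_number_t count = 0;
    mach_port_t self;
    int suspended = 0;

    if (task_threads(mach_task_self(), &threads, &count) != KERN_SUCCESS)
        return -1;

    self = mach_thread_self();
    for (mach_msg_type_number_t i = 0; i < count; i++) {
        if (threads[i] != self &&
            thread_suspend(threads[i]) == KERN_SUCCESS)
            suspended++;
        /* Drop the send right that task_threads() handed us. */
        mach_port_deallocate(mach_task_self(), threads[i]);
    }
    mach_port_deallocate(mach_task_self(), self);

    /* The thread list itself is out-of-line memory we now own. */
    vm_deallocate(mach_task_self(), (vm_address_t)threads,
                  count * sizeof(threads[0]));
    return suspended;
#else
    return -1;
#endif
}
```

A thread suspended while holding a lock (for example inside malloc) stays suspended with that lock held, which is one more reason to avoid anything but raw system calls afterwards.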
You basically don't know what state your stack is in at the time
it fired, and the sigaltstack function doesn't work the standard
way in Tiger or earlier.
How does it work in Tiger? I know we use it.
It's not fully POSIX conformant (it wasn't intended to be).
The alternate signal stack is not settable/clearable per thread,
it's only per-process, so if you set up a small stack and end up
having a number of signals come in because you are calling system
calls from your signal handler, and a handler only masks the signal
it's currently handling, you could overflow the stack.
The longer the duration of a blocking call, the more likely you are
to run into this situation.
Of course, now that you know about it, you could make your
alternate signal stack, if you have one, much larger and not have a
problem.
If what you are trying to diagnose is a stack overflow, though, and
you call into your own code from the handler - well, obviously, you
could end up trying to use more of what you already don't have, at
which point you'll just crash-crash.
Got it. Thanks once again.
--
Steve Checkoway