Re: P_WEXIT
- Subject: Re: P_WEXIT
- From: Joseph Oreste Bruni <email@hidden>
- Date: Tue, 15 Aug 2006 07:20:34 -0700
On Aug 15, 2006, at 2:56 AM, Terry Lambert wrote:
On Aug 14, 2006, at 2:57 PM, Joseph Oreste Bruni wrote:
Hello all,
What is the purpose of the P_WEXIT flag? The <sys/proc.h> simply
says "Working on exit."
I have a process that keeps getting stuck in the "E" state (as
shown by the "ps" command) and I can't figure out what is
happening here. My process is being started by launchd and is
supposed to be kept running even if it fails. However, when it
gets stuck in the "E" state, launchd never sees the process
terminate and so won't start another instance. The process is
multi-threaded if that has any relevance.
The P_WEXIT flag is set when a process has explicitly called exit(), or
has had exit1() called on it, e.g. as a result of taking a fatal
signal (either a SIGKILL, or another signal whose default behaviour
is to terminate the process), or as a result of proc_shutdown(),
which is called on a reboot. exit1() can also be called on a process
that has protected itself from being traced, if you attempt to
attach a trace to it after it has made itself immune from tracing.
The process remains in this state until all active threads have
drained out of the process, and the thread that called the exit()
(or just the last thread, if exit1() was called on behalf of
someone else, or the process was signalled) drains out to the user/
kernel boundary, at which point, if the parent process is not
ignoring SIGCHLD, then a zombie structure is allocated, and the
contents filled in, and it hangs around until the parent process
reaps it by calling one of the wait() functions (e.g. wait4()). If
the parent process is ignoring SIGCHLD, then the process does not
create a zombie, and is immediately reaped.
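The zombie/reap behaviour described above can be sketched from user space roughly as follows. This is only an illustration, not kernel code: the names spawn_and_reap() and reap_with_sigchld_ignored() are made up for the example, and the second function assumes the usual POSIX behaviour that waitpid() fails with ECHILD when the parent is ignoring SIGCHLD.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for `pid`, retrying if a signal interrupts us. */
static pid_t wait_for(pid_t pid, int *status)
{
    pid_t rc;
    do {
        rc = waitpid(pid, status, 0);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

/* Fork a child that exits with `code`, then reap it. Between the child's
 * exit and the waitpid() call, the child exists only as a zombie.
 * Returns the child's exit status, or -1 on error. */
int spawn_and_reap(int code)
{
    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                    /* child: leave immediately */

    int status = 0;
    if (wait_for(pid, &status) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Same, but with SIGCHLD ignored in the parent. No zombie is kept, so
 * waitpid() has nothing to collect and is expected to fail with ECHILD.
 * Returns 1 if that is what happened, 0 if not, -1 on setup error. */
int reap_with_sigchld_ignored(void)
{
    struct sigaction ign, old;
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    if (sigaction(SIGCHLD, &ign, &old) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);

    int status = 0;
    pid_t rc = wait_for(pid, &status);
    int saved_errno = errno;

    sigaction(SIGCHLD, &old, NULL);     /* restore the previous disposition */
    return (rc < 0 && saved_errno == ECHILD) ? 1 : 0;
}
```

Calling spawn_and_reap(7) from a test program should hand back 7 once the child has been reaped. A process stuck in P_WEXIT never gets as far as the zombie stage, which is why the parent (launchd in this case) sees nothing to reap.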
When you have this situation happening to one of your programs,
usually it's because you have an uninterruptible thread in the
process which is unable to drain out (maybe it's blocked on a
network resource, or maybe it's in a blocking call that can't be
interrupted, or maybe it's stuck in a device driver for a device
you've powered down or unplugged and the driver didn't notice and
drain all pending requests out automatically because it has a bug
or isn't well behaved, etc.).
For multithreaded programs, it's generally best to have a clean
shutdown routine for each of the threads, and shut the process down
in an orderly fashion, rather than simply calling exit() (or taking
a SIGKILL or other fatal signal) and terminating things
abnormally. The normal way to deal with this is a pthread_kill() to
each thread, with a pthread_exit() in the signal handler, and a
pthread_join() in the main thread before it calls exit().
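A rough sketch of that kind of orderly shutdown is below. It is a variation on the recipe above, not a copy of it: instead of calling pthread_exit() from the handler, the handler does nothing and the worker returns on its own once it sees a stop flag, since pthread_exit() is not on the list of async-signal-safe functions. The pipe standing in for a blocking resource, the choice of SIGUSR1, and all of the names (worker, on_wakeup, orderly_shutdown_demo) are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static atomic_int stop_requested;   /* set by main to ask the worker to quit */
static atomic_int worker_done;      /* set by the worker just before it returns */

/* Empty handler: it exists only so that SIGUSR1 interrupts a blocking
 * call with EINTR instead of killing the process. */
static void on_wakeup(int signo) { (void)signo; }

/* Worker: blocks in read() on a pipe nobody writes to, standing in for
 * any interruptible blocking call. */
static void *worker(void *arg)
{
    assert(arg != NULL);
    int fd = *(int *)arg;
    char buf[64];

    while (!atomic_load(&stop_requested)) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0 && errno == EINTR)
            continue;               /* woken by the signal: recheck the flag */
        if (n <= 0)
            break;                  /* EOF or a real error */
    }
    atomic_store(&worker_done, 1);
    return NULL;
}

/* Start one worker, ask it to stop, and join it. Returns 0 on success. */
int orderly_shutdown_demo(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* no SA_RESTART, so read() returns EINTR */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    atomic_store(&stop_requested, 0);
    atomic_store(&worker_done, 0);

    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, &fds[0]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Request shutdown. The signal is resent until the worker reports it
     * is done, which covers the window where the signal lands between the
     * flag check and the read() call. */
    atomic_store(&stop_requested, 1);
    struct timespec pause = { 0, 1000000 };     /* 1 ms */
    while (!atomic_load(&worker_done)) {
        pthread_kill(tid, SIGUSR1);
        nanosleep(&pause, NULL);
    }

    int rc = pthread_join(tid, NULL);   /* only now is it safe to exit() */
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

The point of the join is that by the time main reaches exit(), no other thread is left inside the kernel, so nothing remains to drain while P_WEXIT is set. None of this helps with a thread in a truly uninterruptible wait, since the signal will not wake it.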
If there is a thread stuck in a driver or other uninterruptible
context, there's not really a lot you can do, other than "don't do
that"/"download a newer driver that doesn't have the bug"/etc.
You may want to ktrace the process and watch it in this situation
(assuming it hasn't disabled tracing on itself; you can always
recompile without that line and trace it anyway, to find out what's
going on). This will give you some idea of where it's stuck. You
can also use various "ps" options to get more info (for example,
"-M" will dump out information for individual threads; if the
options you choose include the "STAT" column and you see a "U" in
that column, that thread is in an uninterruptible wait).
-- Terry
Hi Terry,
Thank you very much for the explanation. I thought that I had
carefully "joined" all threads, but there might be one that I've
missed. I'll go over everything again.
The way I'm backing out of blocking calls that are not cancelation
points is to have another thread close() the socket out from under
the blocked thread. This results in an EINVALID error which I use to
indicate that the listener thread needs to terminate.
Joe
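For reference, a minimal sketch of unblocking a listener thread from another thread. Note that it swaps in shutdown() for the close() described above: whether a close() from another thread wakes a blocked call varies between systems, while shutdown() on a connected socket makes a blocked recv() return 0 on the systems I am aware of. The socketpair() standing in for a real connection and the names listener_main and unblock_listener_demo are assumptions for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-thread state: the socket to read from, and what
 * recv() returned once it unblocked. */
struct listener {
    int fd;
    ssize_t result;
};

/* Listener thread: blocks in recv() until data arrives or the socket
 * is shut down. */
static void *listener_main(void *arg)
{
    assert(arg != NULL);
    struct listener *l = arg;
    char buf[128];
    l->result = recv(l->fd, buf, sizeof buf, 0);
    return NULL;
}

/* Start a listener, unblock it with shutdown(), and join it.
 * Returns recv()'s result as seen by the listener (0 is expected),
 * or -2 on setup failure. */
ssize_t unblock_listener_demo(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -2;

    struct listener l = { sv[0], -2 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, listener_main, &l) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    /* Wake the listener: after this, recv() on sv[0] reports end of
     * stream, whether or not the thread had already blocked. */
    shutdown(sv[0], SHUT_RDWR);

    pthread_join(tid, NULL);
    close(sv[0]);       /* close only after no thread is using the fd */
    close(sv[1]);
    return l.result;
}
```

Deferring close() until after the join also avoids the descriptor number being reused by an unrelated open() while the listener is still holding it.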