Re: P_WEXIT

Subject: Re: P_WEXIT
From: Terry Lambert <email@hidden>
Date: Tue, 15 Aug 2006 02:56:47 -0700

On Aug 14, 2006, at 2:57 PM, Joseph Oreste Bruni wrote:

> Hello all,
> What is the purpose of the P_WEXIT flag? The <sys/proc.h> simply says "Working on exit."
>
> I have a process that keeps getting stuck in the "E" state (as shown by the "ps" command) and I can't figure out what is happening here. My process is being started by launchd and is supposed to be kept running even if it fails. However, when it gets stuck in the "E" state, launchd never sees the process terminate and so won't start another instance. The process is multi-threaded if that has any relevance.

P_WEXIT is set when a process has explicitly called exit(), or has had exit1() called on it, e.g. as a result of taking a fatal signal (either a SIGKILL, or another signal whose default behaviour is to terminate the process), or as a result of proc_shutdown(), which is called on a reboot. exit1() can also be called on a process that has protected itself from being traced, if you attempt to attach a trace to it after it has made itself immune from tracing.
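To make the two common paths concrete, here is a minimal user-space sketch of a child that ends either by an explicit exit or by a fatal signal, and what the parent sees in each case. The helper name child_termination() and its return convention are made up for illustration; this only shows the user-visible result and says nothing about the kernel internals.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that ends either by calling _exit(7)
 * or by taking SIGKILL, then report what the parent observes.
 * Returns the exit status (>= 0) for a normal exit, the negated signal
 * number for death by signal, or -1000 on error. */
int child_termination(int die_by_signal)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;

    if (pid == 0) {
        if (die_by_signal)
            kill(getpid(), SIGKILL);  /* fatal signal path */
        _exit(7);                     /* explicit exit path */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);   /* child called _exit() */
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);     /* child was killed by a signal */
    return -1000;
}
```

Calling child_termination(0) exercises the explicit exit path, and child_termination(1) the fatal signal path.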

The process remains in this state until all active threads have drained out of the process, and the thread that called exit() (or just the last thread, if exit1() was called on behalf of someone else, or the process was signalled) drains out to the user/kernel boundary. At that point, if the parent process is not ignoring SIGCHLD, a zombie structure is allocated and filled in, and it hangs around until the parent process reaps it by calling one of the wait() functions (e.g. wait4()). If the parent process is ignoring SIGCHLD, then the process does not create a zombie, and is immediately reaped.

When this happens to one of your programs, it's usually because you have an uninterruptible thread in the process which is unable to drain out. Maybe it's blocked on a network resource, or it's in a blocking call that can't be interrupted, or it's stuck in a device driver for a device you've powered down or unplugged, where the driver didn't notice and drain all pending requests automatically because it has a bug or isn't well behaved, etc.

For multithreaded programs, it's generally best to have a clean shutdown routine for each of the threads, and shut the process down in an orderly fashion, rather than simply calling exit() (or taking a SIGKILL or other fatal signal) and terminating things abnormally. The normal way to deal with this is a pthread_kill(), with a pthread_exit() in the handler for that signal, and a pthread_join() in the main thread that would otherwise have just called exit().

If there is a thread stuck in a driver or other uninterruptible context, there's not really a lot you can do, other than "don't do that"/"download a newer driver that doesn't have the bug"/etc.

You may want to ktrace the process and watch it in this situation (assuming it hasn't disabled tracing on itself; you can always recompile without that line and trace it anyway, to find out what's going on). This will give you some idea of where it's stuck. You can also use various "ps" options to get more info. For example, "-M" will dump out information for individual threads; if the options you choose include the "STAT" column and you see a "U" in that column, that thread is in an uninterruptible wait.

-- Terry