Delivered-To: darwin-kernel@lists.apple.com

In a previous email, written 15 August 2006 by Terry Lambert:

> The P_WEXIT flag is set when a process has explicitly called exit(), or has had exit1() called on it, e.g. as a result of taking a fatal signal (either a SIGKILL, or another signal whose default behaviour is to terminate the process), or as a result of proc_shutdown(), which is called on a reboot. It can also be called on a process that has protected itself from being traced, if you attempt to attach a trace to it after it has made itself immune from tracing.
>
> The process remains in this state until all active threads have drained out of the process, and the thread that called the exit() (or just the last thread, if exit1() was called on behalf of someone else, or the process was signalled) drains out to the user/kernel boundary, at which point, if the parent process is not ignoring SIGCHLD, then a zombie structure is allocated, and the contents filled in, and it hangs around until the parent process reaps it by calling one of the wait() functions (e.g. wait4()). If the parent process is ignoring SIGCHLD, then the process does not create a zombie, and is immediately reaped.

I've been reading and re-reading this to try to understand why my process is getting stuck. As I might have pointed out in my previous email, by the time my main thread calls exit() (implicitly, by returning from main()), there are NO other threads besides the main thread, since the main thread has joined with all other threads.

Is it possible that the kernel thinks there are still some threads remaining? Could it be related to the fact that my threads are exiting via pthread_testcancel() rather than by pthread_exit()? I know that thread cancellation on Darwin is a bit broken by default unless POSIX behaviour is requested via the DARWIN_ALIAS macro. Could I have stumbled onto some part of the kernel that just hasn't received a lot of testing?

On my test system, I can start and kill my process all day long and never reproduce this. In production, where the system runs for weeks, the condition occurs...

Frustrated,
Joseph

PS: Best Wishes for the Holidays everyone!
participants (1): Joseph Oreste Bruni