On Feb 28, 2010, at 11:48 AM, Chris Wilson wrote:

> Hi all,
>
> I'm developing open source software that runs on MacOS among other
> platforms. I recently discovered a problem where code that works fine
> on other platforms is hanging indefinitely on OSX 10.6.2. I'm sure it
> worked in 10.3 on PPC hardware.
>
> The code in question does this:
>
> void safe_sleep(int seconds)
> {
>     struct timespec ts, tr;
>     memset(&ts, 0, sizeof(ts));
>     ts.tv_sec = seconds;
>     ts.tv_nsec = 0;
>     while (nanosleep(&ts, &tr) == -1 && errno == EINTR)
>     {
>         BOX_TRACE("nanosleep interrupted with " <<
>             ts.tv_sec << "." << ts.tv_nsec <<
>             " secs remaining, sleeping again");
>         if (ts.tv_sec >= seconds)
>         {
>             BOX_WARNING("nanosleep returned with junk in " <<
>                 "struct: " << ts.tv_sec << "." << ts.tv_nsec);
>             return;
>         }
>         ts = tr; /* sleep again */
>     }
> }

You need to be looking at tr, not ts, or you need to do the structure assignment immediately after the EINTR, before your trace statements. The tr structure contains the remaining time; the contents of the ts structure are irrelevant after the nanosleep() call. Your BOX_ macros appear to indicate you wanted the remaining time.

This looks suspiciously like code one would use to implement a polling loop. That's generally a mistake.

I think if you are getting signals often enough for this to be an issue, you'd be better off passing the address of the same structure for both parameters and specifying SA_RESTART in the sa_flags field. Even then, this is probably a bad use of signals, since multiple signals being sent won't necessarily result in multiple notifications.
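A minimal sketch of the first point above (look at tr, not ts), assuming the goal is simply to sleep for the full interval despite interruptions. On EINTR the remainder is copied into the request before retrying, and nothing is read from the request structure in between. The names req and rem and the int return value are choices made for this illustration, not part of the original code, and the BOX_ logging macros are left out.

```c
#include <errno.h>
#include <time.h>

/*
 * Sleep for `seconds`, resuming after signal interruptions.
 * Returns 0 once the full interval has elapsed, or -1 with errno set
 * if the argument is negative or nanosleep() fails for a reason other
 * than EINTR.
 */
int safe_sleep(int seconds)
{
    struct timespec req, rem;

    if (seconds < 0) {
        errno = EINVAL;
        return -1;
    }
    req.tv_sec = seconds;
    req.tv_nsec = 0;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;  /* genuine failure, not an interruption */
        /* On EINTR, rem holds the unslept time; req itself is unchanged. */
        req = rem;
    }
    return 0;
}
```

Any trace output about the time remaining would read rem (or req after the assignment), never the original request.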
Signals are defined as persistent conditions, not events. This means, for example, that if multiple child processes die within a narrow time window, the last one to exit will overwrite the siginfo information (assuming you set the SA_SIGINFO flag in the sa_flags field, and used sigaction() rather than signal(), so that you are getting siginfo in the first place). This is typically why code that intends to reap child process exit status loops calling waitpid(-1, &statusvar, WNOHANG) until it returns -1 with an errno of ECHILD.

If this isn't the specific case, then you'd be a lot better off avoiding signals and using a reliable IPC mechanism instead.

-- Terry
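The reaping loop mentioned above can be sketched as follows. This is an illustration under assumptions, not code from the thread: the function name reap_children and its return value are hypothetical, and the loop also stops when waitpid() returns 0 (children exist but none has exited yet), which is what a non-blocking caller such as a SIGCHLD handler would typically want.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap every child that has already exited, without blocking.
 * Returns the number of children reaped by this call.
 */
int reap_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;   /* status now holds this child's exit status */
            continue;
        }
        /*
         * pid == 0: children remain, but none has exited yet.
         * pid == -1: an error; ECHILD means there are no children left.
         * Either way there is nothing more to collect right now.
         */
        break;
    }
    return reaped;
}
```

Because one call drains every exited child, it does not matter whether several exits were collapsed into a single SIGCHLD notification.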
participants (1)
-
Terry Lambert