Delivered-To: Darwin-kernel@lists.apple.com

Hi Terry,

On Tue, 2 Mar 2010, Terry Lambert wrote:

>> Any ideas why I'm seeing (1<<32)-1 in tr.tv_sec after the call, when the call finishes late? This is the actual problem that I'm having.
>
> The base problem causing the issue is the signal handler taking a very long time to run in the test code.
>
> Specifically, the 'remain' value in the libc nanosleep() implementation is computed from a call to clock_get_time(), which occurs after the SEMWAIT_SIGNAL(). If an EINTR happens there as the result of the signal, the semwait will not return EINTR until after the signal trampoline has been run, which will take however long the signal trampoline takes to run.
>
> Since the current time used to calculate the remainder is polled non-atomically with regard to the semwait, using the separate clock_get_time(), the remaining delta can be off by the amount of time that it took to run the handler. If this number is larger than the initial timespec time request, then you can "go backwards".
>
> So as your code is currently written, you:
>
> (A) Have to ask for a very short sleep
>
> (B) Have a registered signal handler for the interrupting signal which takes longer than the (remaining) very short sleep to run.
>
> Consequently, the SUB_MACH_TIMESPEC() subtracts 1 from the second as a borrow for the tv_nsec, and underflows.

I can't see anything in the spec that forbids tv_sec from having a negative value, and its type in Darwin is long, which is signed. So it's not necessarily a bug per se that it would contain a negative value in this case. The bug is that it does not; it contains (1<<32)-1, which is a very large positive value.

> The usual suggestion for (A) is to not loop so tightly, or, if you are doing the work in another thread than the one you are observing the timing on, by injecting a thread via the test harness, then block all signals on it.

I don't think I am using threads where this bug is triggered, but I do have a signal handler installed (which just sets a flag and does nothing else, so should not qualify for B). And the minimum sleep supported by my safe_sleep function is 1 second. But if the nanosleep call returned just before expiry, with a very small remaining time, then I would sleep again for a very short time.

Did you manage to reproduce the issue in the end, or do I need to continue work on a better, self-contained test case using nanosleep and signals (not just the integer overflow demonstration)?

> For now, I would suggest you look at the BAD_MACH_TIMESPEC() macro, and post-test the remainder from the EINTR. If it comes up bad, then it means that you've actually spent more time than requested in the nanosleep() plus the signal handler(s), and you can just pretend it returned normally...

I'd prefer not to do anything non-portable here, so I think I might just call gettimeofday() instead if that's OK.

> Meanwhile, you should file a bug, and include this conversation as part of your description. The component it needs to be filed against is Libc.

It's not clear to me where or how to file a bug with Apple, as I'm not a customer and I don't have a support contract. People are lending me remote access to their machines to test the software on and fix bugs, so I don't have the serial numbers of their machines. I tried to log into bugreport.apple.com with my Apple ID, but I get this message:

    You do not have access to the Apple Bug Reporter (aka RadarWeb).
    Please contact the IS&T HelpLine.
    Dial 4-7777 from any Cupertino campus phone.
    Dial 7777 from any Sacramento campus phone.
    Dial 4-HELP (4357) from any Austin campus phone.
    Dial 1-800-800-6272 from anywhere in the U.S.
    Dial 1-408-974-7777 from SCV or outside the U.S.

Cheers, Chris.
-- 
Chris Wilson <0000 at qwirx.com> - Cambs UK
Security/C/C++/Java/Perl/SQL/HTML Developer
We are GNU-free your mind-and your software
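
For reference, the borrow that Terry describes for SUB_MACH_TIMESPEC() can be sketched in portable C. This is not the Darwin macro itself: sketch_timespec, sketch_sub and sketch_remaining are hypothetical names, and the unsigned 32-bit seconds field is an assumption chosen because it would explain the (1<<32)-1 value reported in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for a timespec whose seconds field is unsigned
 * (an assumption, not taken from the Darwin headers). */
struct sketch_timespec {
    uint32_t tv_sec;
    int32_t  tv_nsec;
};

/* Subtract b from a in place, borrowing one second when the nanosecond
 * difference goes negative. */
static void sketch_sub(struct sketch_timespec *a,
                       const struct sketch_timespec *b)
{
    assert(a->tv_nsec >= 0 && a->tv_nsec < SKETCH_NSEC_PER_SEC);
    assert(b->tv_nsec >= 0 && b->tv_nsec < SKETCH_NSEC_PER_SEC);

    a->tv_nsec -= b->tv_nsec;
    if (a->tv_nsec < 0) {
        a->tv_nsec += SKETCH_NSEC_PER_SEC;
        a->tv_sec -= 1;          /* wraps to UINT32_MAX when tv_sec is 0 */
    }
    a->tv_sec -= b->tv_sec;
}

/* Remaining time after `elapsed` has passed out of a request `req`. */
static struct sketch_timespec sketch_remaining(struct sketch_timespec req,
                                               struct sketch_timespec elapsed)
{
    sketch_sub(&req, &elapsed);
    return req;
}
```

With a request of 0.5 s and an elapsed time of 0.75 s (sleep plus handler), the nanosecond field borrows and the seconds field wraps around to the largest 32-bit value instead of becoming -1.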
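
The suggestion to post-test the remainder after EINTR could also be done without the Mach macros. The following is only a sketch of one portable check, not BAD_MACH_TIMESPEC() itself; sketch_remainder_is_bad is a hypothetical name, and the rule "a remainder larger than the request is bad" is an assumption about what counts as implausible.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: returns true when `rem` cannot be a genuine
 * remainder of the request `req`, in which case the caller can treat
 * the sleep as having completed normally. */
static bool sketch_remainder_is_bad(const struct timespec *req,
                                    const struct timespec *rem)
{
    /* Nanoseconds must be in [0, 1e9). */
    if (rem->tv_nsec < 0 || rem->tv_nsec >= 1000000000L)
        return true;
    /* A negative remainder means the deadline has already passed. */
    if (rem->tv_sec < 0)
        return true;
    /* A remainder longer than the request is the wrapped-around case. */
    if (rem->tv_sec > req->tv_sec)
        return true;
    if (rem->tv_sec == req->tv_sec && rem->tv_nsec > req->tv_nsec)
        return true;
    return false;
}
```

A caller would run this on the second argument of nanosleep() only when the call returned -1 with errno set to EINTR, and stop looping if it returns true.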
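
The gettimeofday() alternative mentioned above might look roughly like this. It is a sketch under assumptions: sketch_safe_sleep and sketch_elapsed_us are hypothetical names (the real safe_sleep function is not shown in this thread), and the loop ignores the remainder reported by nanosleep() entirely, recomputing the time left from the wall clock with signed arithmetic.

```c
#include <errno.h>
#include <sys/time.h>
#include <time.h>

/* Microseconds between two gettimeofday() samples, as a signed value. */
static long long sketch_elapsed_us(const struct timeval *start,
                                   const struct timeval *now)
{
    return (long long)(now->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(now->tv_usec - start->tv_usec);
}

/* Sleep for at least `seconds`, restarting after EINTR.
 * Returns 0 on success, -1 on any error other than EINTR. */
static int sketch_safe_sleep(unsigned int seconds)
{
    struct timeval start, now;
    long long total_us = (long long)seconds * 1000000LL;

    if (gettimeofday(&start, NULL) != 0)
        return -1;

    for (;;) {
        if (gettimeofday(&now, NULL) != 0)
            return -1;

        long long left_us = total_us - sketch_elapsed_us(&start, &now);
        if (left_us <= 0)
            return 0;   /* deadline passed, possibly inside a handler */

        struct timespec req;
        req.tv_sec  = (time_t)(left_us / 1000000LL);
        req.tv_nsec = (long)(left_us % 1000000LL) * 1000L;

        /* The remainder argument is NULL on purpose: it is never trusted. */
        if (nanosleep(&req, NULL) != 0 && errno != EINTR)
            return -1;
    }
}
```

Because the time left is a signed long long, a late return simply yields a non-positive value and the loop ends. One caveat: gettimeofday() follows the wall clock, so a clock adjustment during the sleep would lengthen or shorten it.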