Re: Possible bug with nanosleep()?

2 Mar 2010


Hi Terry and all,

[ ... code elided ... ]
	int32_t secs = (int32_t) ts.tv_sec;

Very interesting issue.  8-).
(A)	Have to ask for a very short sleep
Hope that helps,
-- Terry
On Mar 2, 2010, at 12:26 AM, Chris Wilson wrote:

On Mon, 1 Mar 2010, Terry Lambert wrote:

You need to be looking at tr, not ts, or you need to do the
structure assign immediately after the EINTR, before your trace
statements.  The tr contains the remainder time, the ts structure
contents are irrelevant after the nanosleep() call.  Your BOX_
macros appear to indicate you wanted the remainder time.

Sorry, my mistake, my code does that but I edited it somewhat while
writing the email and forgot to update that part of the sample that
I pasted.
For reference, this is the code I have in the application now:

And if I change both int32_t to long (or __darwin_time_t, the type
used in the structure) on this line:
then it hangs forever on OSX.

OK, there is a possible scenario which will cause it to drift, and my
modification of your original test function to put it into a working
test harness didn't trigger it.  I've modified my initial test harness
to make the problem reproduce (moral: always include fully working
example code when noting a problem to a mailing list).

The comment in the Libc code about the layout isn't strictly correct.
Specifically, ADD_MACH_TIMESPEC() and SUB_MACH_TIMESPEC() are macros,
and the macros only really care about having corresponding field
names for tv_sec and tv_nsec in the structures, so the difference in
element size in the structure isn't going to matter, at least until
we hit Y2038.  By that time I expect that mach_timespec_t will be
using a 64 bit tv_sec value (unless you are running your clock forward
to do Y2038 compliance testing, in which case nanosleep() isn't
currently Y2038 compliant because clock_get_time() isn't Y2038
compliant, even for 64 bit programs).
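
As a rough illustration of why only the field names matter, here is a
minimal sketch in C.  The struct sketch_mach_timespec type and the
SKETCH_ADD_TIMESPEC() macro are hypothetical stand-ins written for this
example; they are modeled on the description above and are not copied
from the Mach headers, whose exact definitions may differ.  The macro
compiles against any pair of structures that have tv_sec and tv_nsec
members, whatever their widths.

```c
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for mach_timespec_t: 32 bit fields. */
struct sketch_mach_timespec {
    uint32_t tv_sec;
    int32_t  tv_nsec;
};

/* t1 += t2, carrying one second when the nanoseconds overflow.
 * Only the member names tv_sec and tv_nsec are required of t1 and t2. */
#define SKETCH_ADD_TIMESPEC(t1, t2)                                      \
    do {                                                                 \
        if (((t1)->tv_nsec += (t2)->tv_nsec) >= SKETCH_NSEC_PER_SEC) {   \
            (t1)->tv_nsec -= SKETCH_NSEC_PER_SEC;                        \
            (t1)->tv_sec += 1;                                           \
        }                                                                \
        (t1)->tv_sec += (t2)->tv_sec;                                    \
    } while (0)

/* Deadline = now + requested interval, mixing the two structure types. */
static struct sketch_mach_timespec
sketch_deadline(struct sketch_mach_timespec now, const struct timespec *req)
{
    SKETCH_ADD_TIMESPEC(&now, req);
    return now;
}
```

The mixed-type call works because the macro expands to plain member
accesses; the narrower tv_sec only becomes a problem once the seconds
value no longer fits in it.
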
Actually I originally wrote the code to pass the same structure for
both, and was wondering whether that was allowed (it's not
documented as supported or unsupported) so I wanted to separate the
two to make sure that people wouldn't claim that I was using the
function inappropriately and thus ignore the rest of my message.
Even then, this is probably a bad use of signals, since multiple
signals being sent won't necessarily result in multiple
notifications.

I'm not interested at all in signals in this code, I wish they
wouldn't happen, I just want to sleep for the appointed time and
nothing else.
Any ideas why I'm seeing (1<<32)-1 in tr.tv_sec after the call, when
the call finishes late? This is the actual problem that I'm having.

The base problem causing the issue is the signal handler taking a very
long time to run in the test code.
Specifically, the 'remain' value in the Libc nanosleep()
implementation is computed from a call to clock_get_time(), which
occurs after the SEMWAIT_SIGNAL().  If an EINTR happens there as the
result of the signal, the semwait will not return EINTR until after
the signal trampoline has been run, which will take however long the
signal trampoline takes to run.
Since the current time used to calculate the remainder time is polled
non-atomically with regard to the semwait, using the separate
clock_get_time, the remaining delta can be off by the amount of time
that it took to run the handler.  If this number is larger than the
initial timespec time request, then you can "go backwards".  So as
your code is currently written, you:
(A)	Have to ask for a very short sleep
(B)	Have a registered signal handler for the interrupting signal which
takes longer than the (remaining) very short sleep to run.
Consequently, the SUB_MACH_TIMESPEC() subtracts 1 from the second as a
borrow for the tv_nsec, and underflows.
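
The underflow can be reproduced without the kernel at all.  The
following is a sketch under assumptions: struct sketch_timespec is a
hypothetical stand-in with an unsigned 32 bit seconds field, and
sketch_sub() only mirrors the borrow behavior described above rather
than the actual macro text.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for a timespec with unsigned 32 bit seconds. */
struct sketch_timespec {
    uint32_t tv_sec;
    int32_t  tv_nsec;
};

/* t1 -= t2, borrowing one second when the nanoseconds go negative. */
static void sketch_sub(struct sketch_timespec *t1,
                       const struct sketch_timespec *t2)
{
    assert(t1->tv_nsec >= 0 && t1->tv_nsec < SKETCH_NSEC_PER_SEC);
    assert(t2->tv_nsec >= 0 && t2->tv_nsec < SKETCH_NSEC_PER_SEC);

    t1->tv_nsec -= t2->tv_nsec;
    if (t1->tv_nsec < 0) {
        t1->tv_nsec += SKETCH_NSEC_PER_SEC;
        t1->tv_sec -= 1;            /* the borrow */
    }
    t1->tv_sec -= t2->tv_sec;       /* wraps if now is past the deadline */
}

/* Remaining time = deadline - now, where now is sampled late. */
static struct sketch_timespec
sketch_remaining(struct sketch_timespec deadline, struct sketch_timespec now)
{
    sketch_sub(&deadline, &now);
    return deadline;
}
```

With a deadline of 100.001000000 s and a current time of 100.005000000 s
(the handler ran 4 ms past the deadline), sketch_remaining() yields
tv_sec == 4294967295, which is the (1<<32)-1 value reported, and
tv_nsec == 996000000.
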
The usual suggestion for (A) is to not loop so tightly, or, if you are
doing the work in another thread than the one you are observing the
timing on, by injecting a thread via the test harness, then block all
signals on it.  Asking for a really tight loop, though, would still
leave you racing up to the expiration time because of (B).  The usual
suggestion for (B) is to only set a volatile variable in any signal
handler, and then examine it in the main thread, instead of doing the
work on the signal handler.  It would be a very tiny window then, but
given the code, you'd still have one, since you're not going to hit
the trampoline in 0 instructions.  POSIX specifically states "The
suspension time may be longer than requested" for resolution and
scheduling reasons, so that's not entirely incorrect.
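
The suggestion for (B) might look something like the sketch below.  The
names got_signal, on_signal(), install_flag_handler() and
consume_signal_flag() are made up for this example; only sigaction()
and the sig_atomic_t type come from POSIX and the C standard.  The
handler does nothing except set a flag, so it returns as quickly as a
handler can.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler, examined later by the main thread. */
static volatile sig_atomic_t got_signal = 0;

/* The handler only records that the signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install on_signal() for signo.  Returns 0 on success, -1 on failure. */
static int install_flag_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if a signal was seen since the last call, and clears the flag. */
static int consume_signal_flag(void)
{
    if (got_signal) {
        got_signal = 0;
        return 1;
    }
    return 0;
}
```

The main thread would call consume_signal_flag() after nanosleep()
returns and do the real work there.  This narrows the window but, as
noted, does not close it.
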
For now, I would suggest you look at the BAD_MACH_TIMESPEC() macro,
and post-test the remainder from the EINTR.  If it comes up bad, then
it means that you've actually spent more time than requested in the
nanosleep() plus the signal handler(s), and you can just pretend it
returned normally, without the EINTR.  I'd caution you that if a
signal handler is taking long enough to run that it's causing you to
see this, then it's probably also throwing off any elapsed time you
are measuring.  If all you are doing is delaying "for at least this
long", then that's not an issue for you.
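
A post-test along those lines could be sketched as follows, for the
"delay for at least this long" case.  Both remainder_is_sane() and
sleep_at_least() are hypothetical helpers written for this example,
not part of Libc.  The check here is an assumption of mine and is
deliberately stricter than a nanosecond range test: after a borrow,
the tv_nsec of an underflowed remainder can still be in range, so the
sketch also rejects any remainder larger than the request.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep() under strict -std= modes */
#endif
#include <errno.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Returns 1 if rem is a plausible remainder of req, 0 otherwise. */
static int remainder_is_sane(const struct timespec *req,
                             const struct timespec *rem)
{
    if (rem->tv_nsec < 0 || rem->tv_nsec >= SKETCH_NSEC_PER_SEC)
        return 0;
    if (rem->tv_sec < 0 || rem->tv_sec > req->tv_sec)
        return 0;
    if (rem->tv_sec == req->tv_sec && rem->tv_nsec > req->tv_nsec)
        return 0;
    return 1;
}

/* Sleep for at least *req, resuming after each EINTR.
 * Returns 0 on success, -1 with errno set on any other failure. */
static int sleep_at_least(const struct timespec *req)
{
    struct timespec want = *req;
    struct timespec left;

    while (nanosleep(&want, &left) == -1) {
        if (errno != EINTR)
            return -1;
        if (!remainder_is_sane(&want, &left))
            return 0;   /* already overslept: treat as a normal return */
        want = left;    /* otherwise sleep for what is left */
    }
    return 0;
}
```

This only guarantees a minimum delay.  If the signal handlers are slow
enough to trigger the bad remainder, any elapsed time measured around
the call will be off by the same amount.
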
Meanwhile, you should file a bug, and include this conversation as
part of your description.  The component it needs to be filed against
is Libc.

Terry Lambert
