Re: Sleeping in nanos
- Subject: Re: Sleeping in nanos
- From: Terry Lambert <email@hidden>
- Date: Thu, 8 Feb 2007 19:52:59 -0800
On Feb 8, 2007, at 9:11 AM, Greg wrote:
> On Feb 8, 2007, at 12:03 PM, Graham J Lee wrote:
>> I think there's a missed point here; even if the above were false
>> you still wouldn't get the results I think you expect to get. Even
>> if nanosleep() did truly sleep your thread for exactly one
>> nanosecond, the time elapsed between the start of the sleep and the
>> scheduler running your thread again is down to the scheduler, has
>> nothing to do with your thread, and is pretty much guaranteed to be
>> equal to or (more likely) greater than the sleep time requested.
>> As it is, due to hardware limitations it currently takes *much*
>> longer to reawaken your thread than O(ns), so even a perfectly
>> accurate implementation of nanosleep() sucks royally ;-)
> Thanks for pointing that out. One has to wonder then, why provide
> such a function to developers and claim that it *can* possibly sleep
> for the amount of time requested? Like I said before, this function
> is mainly useful in *time-critical applications*, and in such
> applications, this unpredictable function is absolutely useless.
Everyone should think of all UNIX sleep or timer/alarm functions that
take interval parameters as being:
"sleep/wait/fire-after _at least_ this long"
Short of hard real time, there is no way to guarantee that when your
process passes this interval and becomes ready to run that it will be
the first thing scheduled onto a CPU.
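To make "at least this long" concrete, here is a minimal sketch for
Darwin (the 100 microsecond request is just an arbitrary example): it
asks nanosleep() for that interval and then uses mach_absolute_time()
to see how long the call really blocked. The measured figure comes
back at or above the request, usually above, and the overshoot varies
with system load.

#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <mach/mach_time.h>

int main(void)
{
    struct timespec req = { 0, 100000 };     /* ask for 100 us           */
    mach_timebase_info_data_t tb;

    mach_timebase_info(&tb);                 /* ticks -> ns conversion   */

    uint64_t t0 = mach_absolute_time();
    nanosleep(&req, NULL);                   /* blocks *at least* 100 us */
    uint64_t t1 = mach_absolute_time();

    uint64_t elapsed_ns = (t1 - t0) * tb.numer / tb.denom;
    printf("requested %ld ns, actually slept %llu ns\n",
           (long)req.tv_nsec, (unsigned long long)elapsed_ns);
    return 0;
}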
Even with hard RT, it's a sad fact that all software vendors believe
that THEIR software is the most important thing a computer can
possibly do, so if you have two such vendors' code running on the same
machine, you are competing over who is most important, and if they
both "really mean it", then they will both be at the highest RT
priority possible, and you will round-robin them off the head of the
scheduling queue.
Even if you're alone, and it's an RT thread (i.e. you went through all
the normal contortions to indicate this to the OS), there's no
guarantee that a top end interrupt thread, or a system thread, such as
a system timer event for the 2MSL socket retransmit clock, or the
NETISR for inbound TCP packet processing, won't run before a user
thread. Top end interrupt handling threads are system RT threads with
higher priority than user space RT threads, since you can run at as
high a user priority as you want, doing work every [interval shorter
than a system call latency] while waiting for a disk I/O, and if your
thread were higher priority, the top end of the disk driver would
never run.
NB: This is called a "starvation deadlock".
Likewise, the bottom end interrupt handlers run as a result of
hardware interrupts, some of which are non-maskable due to the
hardware design -- e.g. the HPET interrupt on Intel CPUs that fires
from the bridge chipset to trigger power management code, or the SMI
(System Management Interrupt) that fires to handle SMM events for ACPI
on PCs, etc. If these interrupts fire at a high enough frequency,
likewise, your process may be scheduled to run, but the CPU may spend
all its time in interrupt handling, preventing you from ever actually
running.
NB: This is called "receiver livelock".
For all these reasons (specifically, because the OS _must_ do things
to avoid the possible catastrophes described above, and others I
haven't described), you can only say "wait at least this long", at
least so long as you are running in user space, or even if you are
running in kernel space, unless you are running at interrupt level,
and your hardware interrupt never gets masked.
Now this _can_ still be very useful for writing most timing-sensitive
programs. For example, suppose you had a user space USB printer or
camera driver living on top of the kernel USB device driver, and if
you talked to the device too fast, it started dropping data because
its buffers weren't very deep, and the USB chipset was faster than
the processor in the device, so you could fill the buffer on its
chipset faster than the printer or camera could empty it. Most
timing-sensitive problems fall into the "I can only do it when I have
my ducks in a row" category, where your delay is intentional, because
you are waiting for the ducks to line up.
In the above cases, a delay of _at least_ enough time for the hardware
to empty the queue on the USB chipset before you sent more data across
the USB bus would let you write a working driver instead of a broken
driver.
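As a rough sketch of that kind of pacing: the buffer depth, drain
rate, and send_chunk() below are all hypothetical stand-ins for
whatever the real user space driver talks to, and the only point is
that the computed delay is a floor, not a promise.

#include <stdint.h>
#include <stddef.h>
#include <time.h>

#define DEV_BUF_BYTES       4096    /* hypothetical device-side buffer depth */
#define DEV_DRAIN_BYTES_SEC 250000  /* hypothetical device drain rate        */

/* Stand-in for the real user space USB write; pretends it all went out. */
static size_t send_chunk(const char *buf, size_t len)
{
    (void)buf;
    return len;
}

static void paced_send(const char *data, size_t len)
{
    while (len > 0) {
        size_t chunk = len < DEV_BUF_BYTES ? len : DEV_BUF_BYTES;
        size_t sent  = send_chunk(data, chunk);

        /* Sleep at least long enough for the device to drain what we
         * just sent; the scheduler may (and usually will) give us more. */
        uint64_t ns = (uint64_t)sent * 1000000000ULL / DEV_DRAIN_BYTES_SEC;
        struct timespec delay;
        delay.tv_sec  = (time_t)(ns / 1000000000ULL);
        delay.tv_nsec = (long)(ns % 1000000000ULL);
        nanosleep(&delay, NULL);

        data += sent;
        len  -= sent;
    }
}

int main(void)
{
    static const char data[16384];   /* dummy payload */
    paced_send(data, sizeof data);
    return 0;
}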
For something like a professional digital video console over firewire
operating live instead of recording, which horks up a hairball if you
drop a single frame, and therefore can only tolerate a single
retransmit of a FW packet over the interface before your HD Video
stream is considered corrupt, things are different. For this sort of
application, you need a kernel driver, and you (probably) need to
disable all the other stuff you are running (e.g. dashboard widgets,
menubar clock, Spotlight indexing, cron jobs, etc., etc.) that might
take CPU cycles away from it 1:45 into a 2 hour movie edit.
If your application is this sensitive to latency - i.e. you can't live
with "at least this long", and you need "at least this long, but no
longer than this long + n" - you need to consider writing a KEXT, and
you need to think about how you are going to avoid getting your
interrupt masked, etc.
The general solution for such a tightly constrained problem is to
either buy hardware that's overkill for the problem most of the
time, and sufficient for the problem in the rare case where everything
else goes wrong at once, so that it still works -- or to buy a purpose-
built hardware device.
-- Terry