Re: waiting queue
- Subject: Re: waiting queue
- From: Quinn <email@hidden>
- Date: Wed, 6 Dec 2006 09:37:01 +0000
At 11:11 -0800 5/12/06, Michael Smith wrote:
As a general rule, if you believe that your hardware is reliable,
use THREAD_UNINT to be safe.
There's one gotcha here. Because of the way user clients are
implemented, you shouldn't use THREAD_UNINT while blocking in a
user client if that user client has wired down any memory from the
client process. If you do so, there will be no way to kill the task.
I've included the gory details below. This was on 10.3.x, so YMMV on
later systems.
S+E
--
Quinn "The Eskimo!" <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
If you look at task_terminate_internal, you'll see that midway
through its implementation it calls thread_terminate_internal for
each thread in the task. This doesn't actually terminate the
threads. Rather, it schedules an AST for the thread. The AST is
like a secondary interrupt. The next time the thread leaves the
kernel, it will run the AST handler before it leaves. There are
two special cases:
o If the thread is currently running on another CPU, it will send an
inter-CPU interrupt to force the AST to happen promptly.
o If the thread is blocked, it will be woken up so that it can leave
the kernel and run the AST.
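To make that mechanism concrete, here's a highly simplified model of
it. To be clear, this is *not* the XNU source; every name below
(pending_asts, schedule_termination_ast, ast_taken, and so on) is
made up for illustration.

    #define AST_TERMINATE 0x1

    struct thread {
        volatile unsigned pending_asts; /* checked on the way out of
                                           the kernel */
        int blocked;                    /* 1 if waiting on an event */
        int interruptible;              /* 0 if blocked THREAD_UNINT */
    };

    /* Roughly what thread_terminate_internal does per thread:
       schedule, don't terminate. */
    static void schedule_termination_ast(struct thread *t)
    {
        t->pending_asts |= AST_TERMINATE;
        if (t->blocked && t->interruptible) {
            /* wake the thread so it can leave the kernel and run
               the AST; a THREAD_UNINT thread stays blocked */
        }
        /* if the thread is running on another CPU, an inter-CPU
           interrupt forces the AST to be noticed promptly */
    }

    /* Run as a thread leaves the kernel, before returning to user
       space. */
    static void ast_taken(struct thread *t)
    {
        if (t->pending_asts & AST_TERMINATE) {
            t->pending_asts &= ~AST_TERMINATE;
            /* thread_terminate_self() runs here; see below */
        }
    }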
Remember that the thread executing task_terminate_internal could be
part of the task, and thus is subject to this AST scheduling.
However, because that thread is still running inside the kernel,
scheduling an AST on the thread has no immediate impact. The AST
only runs as the thread leaves the kernel.
task_terminate_internal continues to run and eventually calls
ipc_space_destroy and vm_map_remove. Ultimately
task_terminate_internal returns and the thread winds its way out of
the kernel. As it leaves the kernel the thread executes any pending
ASTs. If this thread is part of the terminated task, it will
execute the AST that was scheduled when task_terminate_internal
called thread_terminate_internal.
The AST handler for task termination does the key work via a call to
thread_terminate_self (osfmk/kern/thread.c). So, as
task_terminate_internal has been doing its thing, other threads in
the task have been running, entering the kernel, and executing the
thread_terminate_self routine as they leave.
thread_terminate_self decrements the count of running threads in the
task and, if it hits 0, calls BSD (the proc_exit
routine) to clean up BSD constructs associated with the task. It
then goes on to clean up various aspects of the thread, mark the
thread as terminated (by setting TH_TERMINATE), and then blocks the
thread (by calling thread_block). When the thread switcher switches
away from a terminated thread, it schedules the thread to be reaped
(thread_reaper_enqueue) which, in turn, wakes up the reaper thread
(reaper_thread_continue) which finally disposes of the thread's data
structures.
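In the same spirit, here's a simplified model of that flow. Again,
the names and types are invented; the real code is in
osfmk/kern/thread.c.

    #define TH_TERMINATE 0x10

    struct task_model {
        int active_threads;         /* running threads left in task */
    };

    struct thread_model {
        unsigned state;
        struct task_model *task;
    };

    static void thread_terminate_self_model(struct thread_model *self)
    {
        /* the last thread out lets BSD tear down the proc */
        if (--self->task->active_threads == 0) {
            /* proc_exit() runs here */
        }

        /* clean up various per-thread state ... */

        self->state |= TH_TERMINATE; /* mark the thread terminated */

        /* thread_block() is called here and never returns: when the
           scheduler switches away from a TH_TERMINATE thread it calls
           thread_reaper_enqueue(), which wakes the reaper thread
           (reaper_thread_continue) to free the thread's structures */
    }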
Given the above background, it's now possible to show how your code
gets into trouble, and why moving away from THREAD_UNINT fixes it.
Here's the sequence of events when you use THREAD_UNINT.
1. The blocking thread (thread A) enters your user client. As it
enters the kernel it increments the number of send rights for the
user client's port.
2. Thread A now blocks uninterruptibly using IOCommandGate::commandSleep.
3. Much time passes.
4. One of the other threads in the task (thread B) decides that
it's time to die (it's the target of the force quit signal, or it
calls "exit", or it bus errors, or whatever). This results in a
call to task_terminate_internal.
5. task_terminate_internal calls thread_terminate_internal for each
thread in the task, which schedules an AST for those threads,
including threads A and B. Thread A is blocked uninterruptibly, and
so it does nothing in response to this AST request. Thread B is
still inside the kernel (running task_terminate_internal itself) and
so does not respond immediately to the AST.
6. Thread B, still running task_terminate_internal, now calls
ipc_space_destroy. This destroys all of the rights for the task.
However, the extra send right added in step 1 prevents your user
client port's send right count from going to 0, so no "no more
senders" notification is generated and your ::clientDied method is
not called.
7. Thread B now calls vm_map_remove. This ends up blocking because
one of the VM map entries is wired by your driver.
The task is now deadlocked. Thread B is blocked waiting for your
driver to unwire the memory, but that won't happen until
::clientDied is called, and that can't happen because thread A is
holding a send right for your user client's port and thread A is
blocked uninterruptibly.
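In driver terms, here's a minimal sketch of the pattern that gets
you into this state. The names (waitUninterruptibly, gate,
waitEvent) are hypothetical and error handling is pared down; the
point is the prepare-then-THREAD_UNINT combination.

    #include <IOKit/IOCommandGate.h>
    #include <IOKit/IOMemoryDescriptor.h>

    /* Must run on the user client's command gate (for example via
       IOCommandGate::runAction). */
    static IOReturn waitUninterruptibly(IOCommandGate *gate,
                                        void *waitEvent,
                                        IOMemoryDescriptor *buffer)
    {
        /* Wiring the client's memory creates the wired VM map entry
           that vm_map_remove blocks on in step 7. */
        IOReturn ret = buffer->prepare();
        if (ret != kIOReturnSuccess) {
            return ret;
        }

        /* Step 2: block uninterruptibly. The termination AST from
           step 5 cannot wake this thread, so a dying task deadlocks
           here. */
        gate->commandSleep(waitEvent, THREAD_UNINT);

        buffer->complete();
        return kIOReturnSuccess;
    }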
Now let's look at what happens if thread A blocks interruptibly.
Everything proceeds as above until step 5. At that point things
diverge with the following sequence.
5. task_terminate_internal calls thread_terminate_internal for each
thread in the task, which schedules an AST for those threads,
including threads A and B. As before, thread B is still inside the
kernel (running task_terminate_internal itself) and so does not
respond immediately to the AST. However, thread A is blocked in
IOCommandGate::commandSleep and the AST causes it to unblock. It
will eventually return from IOCommandGate::commandSleep with a
THREAD_INTERRUPTED error.
6. As above.
7. As above (thread B blocks in vm_map_remove).
8. At this point thread A is scheduled. It returns from
IOCommandGate::commandSleep with a THREAD_INTERRUPTED error.
Eventually this causes the thread to leave the user client and wind
its way out of the kernel. As thread A runs the return path of
ipc_kobject_server it decrements the send right count for the user
client port.
9. Because the send right count now hits 0, a "no more senders"
notification is generated for that port, which causes your
::clientDied method to be called. It's called by thread A.
10. Your ::clientDied method shuts down the user client, which in
turn destroys the IOMemoryDescriptor. This unwires the VM map
entry, which unblocks thread B.
11. After executing your ::clientDied method, thread A eventually
leaves the kernel and runs the AST, which results in the thread
being cleaned up (thread_terminate_self).
12. Thread B is now scheduled and returns from vm_map_remove. It
continues to execute task_terminate_internal, which eventually
returns and thread B leaves the kernel.
13. Thread B now runs its AST and cleans itself up (thread_terminate_self).
As you can see, making thread A interruptible is the key point in
resolving the deadlock.
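For completeness, here's the earlier hypothetical sketch reworked to
block interruptibly. The important parts are THREAD_ABORTSAFE and
the check for THREAD_INTERRUPTED.

    #include <IOKit/IOCommandGate.h>
    #include <IOKit/IOMemoryDescriptor.h>

    /* As before, must run on the user client's command gate. */
    static IOReturn waitInterruptibly(IOCommandGate *gate,
                                      void *waitEvent,
                                      IOMemoryDescriptor *buffer)
    {
        IOReturn ret = buffer->prepare();
        if (ret != kIOReturnSuccess) {
            return ret;
        }

        /* THREAD_ABORTSAFE lets the termination AST wake this thread
           (step 5 of the second sequence). */
        IOReturn sleepResult = gate->commandSleep(waitEvent,
                                                  THREAD_ABORTSAFE);

        buffer->complete();

        if (sleepResult == THREAD_INTERRUPTED) {
            /* Step 8: bail out so the thread can leave the kernel,
               drop its send right, and let ::clientDied run. */
            return kIOReturnAborted;
        }
        return kIOReturnSuccess;
    }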