On Oct 14, 2005, at 2:11 PM, Merle McClelland wrote:
On MacOS 10.4.2, a Startup Items daemon process that I am writing
for a client occasionally shows up in Activity Monitor tagged with
"not responding". I believe this also occurred on MacOS 10.3.9. The
details for the process show one "recent hang". However, while
Activity Monitor shows "not responding", the daemon is executing as
expected, and is not hung. Exiting Activity Monitor and restarting
it removes the "not responding" tag, so it looks like Activity
Monitor saw what it considered a hang at some point and even though
the process responded after that point, Activity Monitor did not
clear the "not responding" status.
Looking at the process using top and ps I see nothing out of the
ordinary. Client applications that talk to this daemon via TCP/IP
continue to function and report no problems, and the daemon itself,
which is monitoring devices on the network using TCP/IP, continues
to monitor them as expected. CPU usage is low (2 to 3%), there
don't appear to be any memory leaks, and the daemon process
responds to SIGTERM and SIGHUP signals which it uses for graceful
shutdown and restart.
The daemon is using some of my client's cross-platform libraries
that are based on ZThread-2.3.2, with a Carbon-based wrapper. The
wrapper invokes the low-level libraries, and also monitors system
power states using the IOManager and handles signals via standard
signal handlers. Depending on the number of network devices being
monitored by the daemon there can be 10 to 15 threads executing in
the process. The process also uses syslog every two seconds or so
for debugging messages.
Cocoa-based GUI client applications that talk to this daemon are
also using these cross-platform libraries and multiple ZThreads,
but have never been tagged in Activity Monitor as "not responding".
I've searched the mailing list archives and the Internet in general
for references to Activity Monitor and "not responding" and "recent
hangs", and have also searched man pages for some clue as to what
is going on. I haven't found a definition of what specific process
event or state is reflected by these labels in Activity Monitor.
My question is: what are Activity Monitor's definitions of "not
responding" and "recent hangs"?
"not responding" means that the application is currently "hung".
"recent hangs" is the count of times the application was not
responding, but then became responsive again.
Activity Monitor sends "ping" AppleEvents to processes to determine if
they are hung. If a process responds to the event, it is not hung. If
it fails to respond, it is considered to be in a "Not Responding"
state, and each transition from "Responding" to "Not Responding" that
Activity Monitor sees counts as a "hang" of the application.
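To make the counting rule concrete, here is a minimal sketch of the
bookkeeping described above. It is not Activity Monitor's code; the
watch_state type and the function names are hypothetical, and the
"replied" flag stands in for whatever the real ping mechanism reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process bookkeeping kept by a monitoring tool. */
typedef struct {
    bool responding;      /* outcome of the most recent ping */
    unsigned hang_count;  /* "Responding" -> "Not Responding" transitions */
} watch_state;

static void watch_init(watch_state *w)
{
    assert(w != NULL);
    w->responding = true;
    w->hang_count = 0;
}

/* Record the outcome of one ping. Returns true if the process should
 * now be shown as "Not Responding". */
static bool watch_record_ping(watch_state *w, bool replied)
{
    assert(w != NULL);
    if (w->responding && !replied)
        w->hang_count++;          /* count only the transition */
    w->responding = replied;
    return !replied;
}
```

Note that consecutive missed pings count as a single hang in this
sketch; the count goes up again only after the process has answered a
ping in between.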
Normally, an application will be written in such a way that it will
at least *periodically* check for AppleEvents, even if it tends to
call out from a CFRunLoop event callback into a long, involved
operation. You can get false positives on a "Not Responding" (the same
situation that gets the busy cursor on an application in the GUI,
FWIW) if an application does a huge amount of work in something called
from CFRunLoop.
The normal way to avoid this, if you are actually not hung, is to
either periodically save the state of your work in progress and
restart the work by sending yourself an event, or dispatch the work
to a thread, instead of performing it directly as the result of some
event.
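As a rough illustration of the first approach (save state, then resume
on a later event), the work can be written as a function that does a
bounded slice and reports whether more remains. This is only a sketch:
resumable_job and run_slice are hypothetical names, the summation is a
stand-in for real work, and the step of posting yourself another event
is left to whatever event mechanism the application uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical saved state for a long computation. */
typedef struct {
    size_t next;        /* index of the next unit of work to do */
    size_t total;       /* total number of units */
    unsigned long sum;  /* stand-in for the partial result */
} resumable_job;

/* Do at most max_units of work, then return to the event loop.
 * Returns true if work remains, in which case the caller would send
 * itself another event to run the next slice. */
static bool run_slice(resumable_job *j, size_t max_units)
{
    assert(j != NULL && j->next <= j->total);

    size_t remaining = j->total - j->next;
    size_t n = max_units < remaining ? max_units : remaining;

    for (size_t i = 0; i < n; i++) {
        j->sum += j->next;   /* one unit of (placeholder) work */
        j->next++;
    }
    return j->next < j->total;
}
```

Because each slice returns promptly, the loop gets a chance to service
pings between slices.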
If you take the second approach (which is the best), you have to deal
with the fact that you could get "false negatives", i.e. your
application looks like it's still alive when it's actually dead.
To avoid false negatives, you need to provide a barrier interlock in
your CFRunLoop; something like a semaphore or a mutex that you take
and drop on each pass through the loop will work.
When you dispatch the long duration task to a thread to work on,
*that* thread should immediately take the mutex, effectively
"hanging" the application. At intervals (best handled via an
interval timer or using a work counter), the thread drops and
reacquires the mutex, and when work is complete, it drops the mutex
and goes idle (or exits).
By doing this, you guarantee that if the thread actually becomes
hung, then the CFRunLoop will also become hung because the thread
holds the mutex it needs in order to complete processing the last
event, and process the next one (which might be a ping from Activity
Monitor, or the window server, or some other monitoring process).
That way you only end up with a busy cursor or a Not Responding
status (and an increment of the recent hangs count) if your
application actually becomes hung up.
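A minimal sketch of this interlock using plain pthreads follows. The
names (interlock, loop_barrier, long_task, long_task_thread) are
hypothetical, the "work" is a placeholder increment, and a work counter
is used to decide when to drop the mutex. In a real application
loop_barrier() would be called from the run loop on each pass, which is
not shown here.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical interlock shared by the event loop and the worker. */
static pthread_mutex_t interlock = PTHREAD_MUTEX_INITIALIZER;

/* Call once per pass through the event loop, before handling the next
 * event. Blocks for as long as a worker is holding the interlock. */
static void loop_barrier(void)
{
    pthread_mutex_lock(&interlock);
    pthread_mutex_unlock(&interlock);
}

/* Hypothetical description of one long-running task. */
typedef struct {
    int total_units;      /* amount of work to do */
    int units_per_slice;  /* drop the interlock after this many units */
    int done_units;       /* progress so far */
} long_task;

/* Thread entry point for the long-running task. */
static void *long_task_thread(void *arg)
{
    long_task *t = arg;
    assert(t != NULL && t->units_per_slice > 0);

    pthread_mutex_lock(&interlock);      /* the loop now waits on us */
    while (t->done_units < t->total_units) {
        t->done_units++;                 /* stand-in for one unit of work */

        if (t->done_units % t->units_per_slice == 0) {
            /* Still making progress: let the event loop through. */
            pthread_mutex_unlock(&interlock);
            sched_yield();
            pthread_mutex_lock(&interlock);
        }
    }
    pthread_mutex_unlock(&interlock);    /* work complete */
    return NULL;
}
```

If the worker stops advancing, it never reaches the unlock, so
loop_barrier() blocks and the process stops answering pings. One
caveat: pthread mutexes make no fairness guarantee, so the loop is not
certain to get the mutex on every drop; the sched_yield() call only
makes that more likely. A semaphore or condition-variable handoff would
be stricter.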
For more complicated applications, where you might have a lot of work
going on in parallel, you can establish a monitor thread and indicate
status to it. The monitor thread would have a common status area with
an entry for each outstanding work item in progress (e.g. a counter in
an array of volatile ints, as one example), which each work thread
updates as it makes progress. The monitor thread, rather than the work
threads, would take the mutex, and it would wake up periodically (e.g.
via an interval timer or some other mechanism). On each wakeup it
would compare the current counter for each outstanding work item with
the copy it saved on the previous wakeup; if they are the same, that
work item is hung.
At that point, it can decide to take some action, or it can decide
that it's not going to drop the mutex this time around. In general,
you probably only want to decide not to drop the mutex if you only
have one work thread outstanding and it's hung; even then, it's more
likely you'll want to put up a dialog box offering the user some way
to correct (or not) the errant thread that's not making progress on
its work item.
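The counter comparison at the heart of the monitor thread might look
like the sketch below. Everything here is an assumption for
illustration: the progress_board layout, the function names, and the
fixed MAX_ITEMS limit are hypothetical, and C11 atomics are used in
place of the volatile ints mentioned above. The periodic wakeup and the
decision about whether to drop the mutex are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS 16   /* hypothetical limit on outstanding work items */

typedef struct {
    atomic_uint progress[MAX_ITEMS];  /* bumped by the work threads */
    atomic_bool active[MAX_ITEMS];    /* slot has an outstanding item */
    unsigned last_seen[MAX_ITEMS];    /* monitor's copy from last wakeup */
} progress_board;

static void board_init(progress_board *b)
{
    for (size_t i = 0; i < MAX_ITEMS; i++) {
        atomic_init(&b->progress[i], 0u);
        atomic_init(&b->active[i], false);
        b->last_seen[i] = 0;
    }
}

/* Claim slot i for a new work item. The counter starts one ahead of
 * the monitor's copy, which gives the item one full interval before
 * it can be reported as stalled. */
static void start_item(progress_board *b, size_t i)
{
    assert(i < MAX_ITEMS);
    b->last_seen[i] = 0;
    atomic_store(&b->progress[i], 1u);
    atomic_store(&b->active[i], true);
}

/* Called by a work thread each time it makes progress on item i. */
static void report_progress(progress_board *b, size_t i)
{
    assert(i < MAX_ITEMS);
    atomic_fetch_add(&b->progress[i], 1u);
}

static void finish_item(progress_board *b, size_t i)
{
    assert(i < MAX_ITEMS);
    atomic_store(&b->active[i], false);
}

/* Called by the monitor thread on each wakeup. Writes the indices of
 * items whose counter has not moved since the previous wakeup into
 * stalled[] (room for MAX_ITEMS entries) and returns how many. */
static size_t find_stalled(progress_board *b, size_t *stalled)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_ITEMS; i++) {
        if (!atomic_load(&b->active[i]))
            continue;
        unsigned now = atomic_load(&b->progress[i]);
        if (now == b->last_seen[i])
            stalled[n++] = i;
        b->last_seen[i] = now;
    }
    return n;
}
```

The monitor would call find_stalled() on each wakeup while holding the
mutex, and use the result to decide whether to drop and retake the
mutex as usual, take some corrective action, or keep holding it so the
application shows up as hung. The wakeup interval has to be longer than
the slowest legitimate gap between report_progress() calls, or healthy
items will be flagged.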
Hope that helps!
-- Terry
Thanks,
Merle F. McClelland
Encad, Inc
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Unix-porting mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/unix-porting/tlambert%40apple.com
This email sent to email@hidden