Re: Activity Monitor - "recent hangs" and "not responding"




On Oct 14, 2005, at 3:02 PM, Terry Lambert wrote:

On Oct 14, 2005, at 2:11 PM, Merle McClelland wrote:

On MacOS 10.4.2, a Startup Items daemon process that I am writing for a client occasionally shows up in Activity Monitor tagged with "not responding". I believe this also occurred on MacOS 10.3.9. The details for the process show one "recent hang". However, while Activity Monitor shows "not responding", the daemon is executing as expected, and is not hung. Exiting Activity Monitor and restarting it removes the "not responding" tag, so it looks like Activity Monitor saw what it considered a hang at some point and even though the process responded after that point, Activity Monitor did not clear the "not responding" status.

Looking at the process using top and ps I see nothing out of the ordinary. Client applications that talk to this daemon via TCP/IP continue to function and report no problems, and the daemon itself, which is monitoring devices on the network using TCP/IP, continues to monitor them as expected. CPU usage is low (2 to 3%), there don't appear to be any memory leaks, and the daemon process responds to the SIGTERM and SIGHUP signals, which it uses for graceful shutdown and restart.

The daemon is using some of my client's cross-platform libraries that are based on ZThread-2.3.2, with a Carbon-based wrapper. The wrapper invokes the low-level libraries, and also monitors system power states using the IOManager and handles signals via standard signal handlers. Depending on the number of network devices being monitored by the daemon there can be 10 to 15 threads executing in the process. The process also uses syslog every two seconds or so for debugging messages.

Cocoa-based GUI client applications that talk to this daemon are also using these cross-platform libraries and multiple ZThreads, but have never been tagged in Activity Monitor as "not responding".

I've searched the mailing list archives and the Internet in general for references to Activity Monitor and "not responding" and "recent hangs", and have also searched man pages for some clue as to what is going on. I haven't found a definition of what specific process event or state is reflected by these labels in Activity Monitor.

My question is: what are Activity Monitor's definitions of "not responding" and "recent hangs"?



"not responding" means that the application is currently "hung".

"recent hangs" is the count of times the application was not responding, but then became responsive again.


Activity Monitor sends "ping" AppleEvents to processes to determine whether they are hung. A process that responds to the event is not hung. A process that fails to respond is considered to be in a "Not Responding" state, and each transition from "Responding" to "Not Responding" that Activity Monitor sees counts as one "hang" of the application.
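As a rough illustration of the bookkeeping described above, the following C sketch counts a "hang" only on the transition from responding to not responding. It is not Activity Monitor's actual code; the names `ping_state`, `ping_state_init` and `ping_state_update` are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Hypothetical per-process record; not Activity Monitor's real data structure. */
typedef struct {
    bool responding;      /* outcome of the most recent ping */
    unsigned hang_count;  /* Responding -> Not Responding transitions seen so far */
} ping_state;

void ping_state_init(ping_state *s) {
    s->responding = true;
    s->hang_count = 0;
}

/* Record the outcome of one ping. Repeated failures while already
   "not responding" do not add to the count; only the transition does. */
void ping_state_update(ping_state *s, bool answered) {
    if (s->responding && !answered) {
        s->hang_count++;
    }
    s->responding = answered;
}
```

Under this model, a process that misses several pings in a row and then recovers contributes a single hang to the count.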



Normally, an application will be written in such a way that it will at least *periodically* check for AppleEvents, even if it tends to call out from a CFRunLoop event handler into a long, involved operation. You can get false positives on "Not Responding" (the same situation that gets an application the busy cursor in the GUI, FWIW) if the application does a huge amount of work in something called from CFRunLoop.


The normal way to avoid this, if you are actually not hung, is to either periodically save the state of your work in progress and restart the work by sending yourself an event, or dispatch the work to a thread, instead of performing it directly as the result of some event.
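The first of these two approaches (save state, return to the loop, resume later) might look something like the sketch below. It is plain C with no CFRunLoop calls; `sum_job` and `sum_job_step` are hypothetical names, and summing an array merely stands in for the real long-running work. The assumption is that the event handler calls the step function once per event and, while it returns true, arranges for another event to be delivered to itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resumable job: the saved state is the position and the partial sum. */
typedef struct {
    const int *items;
    size_t count;
    size_t next;  /* index of the first item not yet processed */
    long sum;     /* partial result carried between slices */
} sum_job;

/* Process at most max_items items, then return to the caller (the event loop).
   Returns true if work remains, in which case the caller would post itself
   another event to continue later. */
bool sum_job_step(sum_job *job, size_t max_items) {
    size_t remaining = job->count - job->next;
    size_t n = max_items < remaining ? max_items : remaining;
    for (size_t i = 0; i < n; i++) {
        job->sum += job->items[job->next++];
    }
    return job->next < job->count;
}
```

Because each call does a bounded amount of work, the loop gets a chance to service other events, including pings, between slices.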

If you take the second approach (which is the best), you have to deal with the fact that you could get "false negatives", i.e. your application looks like it's still alive when it's actually dead.

To avoid false negatives, you need to provide a barrier interlock in your CFRunLoop; something like a semaphore or a mutex that you take and drop on each pass through the loop will work.

When you dispatch the long duration task to a thread to work on, *that* thread should immediately take the mutex, effectively "hanging" the application. At intervals (best handled via an interval timer or using a work counter), the thread drops and reacquires the mutex, and when work is complete, it drops the mutex and goes idle (or exits).

By doing this, you guarantee that if the thread actually becomes hung, the CFRunLoop will also become hung, because the thread holds the mutex the loop needs in order to finish processing the last event and process the next one (which might be a ping from Activity Monitor, the window server, or some other monitoring process).

That way you only end up with a busy cursor or a Not Responding status (and an increment of the recent hangs count) if your application actually becomes hung up.
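A minimal sketch of the interlock described above, using POSIX threads, might look like this. It leaves out CFRunLoop entirely: `event_loop_checkpoint` stands in for whatever the run loop would call once per pass, and all names here (`progress_gate`, `long_task`, `long_task_thread`) are hypothetical. A work counter is used rather than an interval timer to decide when to release the mutex.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical gate shared by the event loop and the worker thread. */
pthread_mutex_t progress_gate = PTHREAD_MUTEX_INITIALIZER;

/* Called once per event-loop pass, before handling the next event.
   Blocks for as long as a worker holds the gate without releasing it. */
void event_loop_checkpoint(void) {
    pthread_mutex_lock(&progress_gate);
    pthread_mutex_unlock(&progress_gate);
}

typedef struct {
    int total_units;        /* amount of work to do */
    int units_per_release;  /* work counter: release the gate this often */
    int done_units;         /* progress made so far */
} long_task;

/* Thread entry point for the long-running work. */
void *long_task_thread(void *arg) {
    long_task *t = arg;
    int every = t->units_per_release > 0 ? t->units_per_release : 1;

    pthread_mutex_lock(&progress_gate);       /* take the gate immediately */
    for (int i = 1; i <= t->total_units; i++) {
        t->done_units++;                      /* stand-in for one unit of real work */
        if (i % every == 0) {
            /* Drop and reacquire so a waiting event loop can get through. */
            pthread_mutex_unlock(&progress_gate);
            sched_yield();
            pthread_mutex_lock(&progress_gate);
        }
    }
    pthread_mutex_unlock(&progress_gate);     /* done: release and exit */
    return NULL;
}
```

If the worker stalls between releases, the loop blocks at its next checkpoint and stops answering events, which is the intended signal. Note that POSIX mutexes make no fairness guarantee, so the `sched_yield` call is only a hint that gives the waiting loop a chance to run.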


For more complicated applications, where you might have a lot of work going on in parallel, you can establish a monitor thread and have the work threads indicate status to it. There would be a common status area with an entry for each outstanding work item in progress (e.g. a counter in an array of volatile ints), which the work thread updates as it makes progress. The monitor thread, instead of the work threads, would take the mutex, and would wake up periodically (e.g. via an interval timer or some other mechanism). On each wakeup it would compare the current counter for each outstanding work item with the copy it saved on the previous wakeup; if the two are the same, that work item is hung.


At that point, it can decide to take some action, or it can decide that it's not going to drop the mutex this time around. In general, you probably only want to decide not to drop the mutex if you only have one work thread outstanding and it's hung; even then, it's more likely you'll want to put up a dialog box offering the user some way to correct (or not) the errant thread that's not making progress on its work item.
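The counter comparison at the heart of the monitor-thread scheme could be sketched as follows. This example swaps the array of volatile ints for C11 atomics, since `volatile` alone does not make cross-thread access well defined in standard C; that substitution is a choice made for this example only. The thread creation, periodic wakeup and mutex handling are omitted, and every name (`progress_table`, `work_item_tick`, `monitor_scan`, and so on) is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK_ITEMS 16  /* arbitrary limit chosen for this sketch */

typedef struct {
    atomic_uint progress[MAX_WORK_ITEMS];  /* bumped by the work threads */
    atomic_bool active[MAX_WORK_ITEMS];    /* is this slot an outstanding work item? */
    unsigned last_seen[MAX_WORK_ITEMS];    /* monitor's private copy from its last scan */
} progress_table;

void progress_table_init(progress_table *t) {
    for (size_t i = 0; i < MAX_WORK_ITEMS; i++) {
        atomic_init(&t->progress[i], 0);
        atomic_init(&t->active[i], false);
        t->last_seen[i] = 0;
    }
}

/* Work-thread side. Starting counts as progress, so a freshly started item
   is not reported as hung on the monitor's first scan. */
void work_item_start(progress_table *t, size_t i) {
    atomic_fetch_add(&t->progress[i], 1);
    atomic_store(&t->active[i], true);
}
void work_item_tick(progress_table *t, size_t i) {
    atomic_fetch_add(&t->progress[i], 1);
}
void work_item_finish(progress_table *t, size_t i) {
    atomic_store(&t->active[i], false);
}

/* Monitor side, called on each periodic wakeup. Marks stalled[i] for every
   active item whose counter has not moved since the previous scan and
   returns how many such items there are. */
size_t monitor_scan(progress_table *t, bool stalled[MAX_WORK_ITEMS]) {
    size_t n = 0;
    for (size_t i = 0; i < MAX_WORK_ITEMS; i++) {
        stalled[i] = false;
        if (!atomic_load(&t->active[i])) continue;
        unsigned now = atomic_load(&t->progress[i]);
        if (now == t->last_seen[i]) {
            stalled[i] = true;
            n++;
        }
        t->last_seen[i] = now;
    }
    return n;
}
```

The monitor's wakeup interval needs to be comfortably longer than the longest expected gap between ticks, otherwise a slow but healthy work item will be reported as hung. What to do with the result (log it, keep holding the mutex, or something else) is the policy decision discussed above.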

Hope that helps!

-- Terry


Thanks for the detailed response! That answers my questions. I wasn't aware of the AppleEvent pinging since my daemon isn't explicitly using AppleEvents.


Most of the work performed by the daemon is within separate threads invoked by the main process. Most of the time the main process is in the CFRunLoop, waiting for an event. However, periodically, the main process does do some work within the context of the timer and the IOManager callbacks. Based on your explanation, I suspect that it is this processing (which sometimes can cause the main process to wait for the other threads to do their stuff before exiting the callback) that is causing an occasional lack of response to the AppleEvent ping.

I'll have to evaluate moving these periodic functions to a background thread so that the main process doesn't get tied up performing the work. Sounds like I'll have to implement the interlock as you suggest since in this design the main process won't be doing much of anything. Given that my process is a daemon without a GUI (and is designed to run even when users aren't logged in), I can't directly display error dialogs. The cross-platform code using ZThread has a monitoring capability to handle hung worker threads. However, it appears I need to have one additional thread layer between the main process and the lower-level threads to off-load the main process from potentially long-term work that may occur periodically.

Is there a good example of writing a MacOS daemon that properly handles these scenarios? The "MyFirstDaemon" sample code doesn't cover these issues.

Thanks again,

- Merle

_______________________________________________
Unix-porting mailing list      (email@hidden)