Hello again,

I am aware that running the Mac OS X Server watchdog on plain Mac OS X to recover from hangs has been discussed here before, but that discussion did not get very far. In 10.4 this task is performed by wdticklerd, which does not appear to be in the open source tree.

Last week I wrote about a disk problem that is causing hangs on a server of mine. One thing I noticed gave me the idea to write something that could restart the machine when all processes are waiting on the disk subsystem.

I have written a small IOService KEXT (com_humph_kext_watchdog) with an IOUserClient to interface it to a userland tool. The KEXT can start and set a timer that, when it fires, will reboot the machine (still to do). The userland tool, watchdog_ctl, runs a loop that resets the timer to a value one minute longer than the loop interval. Before resetting the timer, though, it writes out a log line, so if the write blocks waiting on the disk, the timer is not reset and the timer in the kernel fires.

It remains to be seen what to call to restart the machine without attempting to write to disk at all. Will reboot(RB_AUTOBOOT | RB_NOSYNC) be enough?

If you would like to look at the code, it is in a Subversion repository at it.humph.com/versions/public/ (I omitted the http to avoid bots). The user and password are sguest and sub (read privileges only, if I did not get anything wrong).

Giuliano
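For readers who want the shape of the watchdog_ctl loop without checking out the repository, here is a minimal sketch in C. It is not the code from the repository: the names wd_arm_fn, wd_timeout_for, wd_tick, wd_run and the dry-run stub are all hypothetical, and the call into the KEXT is hidden behind a callback rather than a real IOUserClient method. The fsync() is also an assumption of this sketch, added so that the log write actually has to reach the disk before the timer is re-armed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical callback standing in for the IOUserClient call that
 * re-arms the kernel timer. Returns 0 on success. */
typedef int (*wd_arm_fn)(unsigned int timeout_secs, void *ctx);

/* The timer is set one minute longer than the loop interval. */
#define WD_MARGIN_SECS 60u

static unsigned int wd_timeout_for(unsigned int interval_secs)
{
    return interval_secs + WD_MARGIN_SECS;
}

/* One iteration: write a log line, push it to disk, then re-arm.
 * If the disk subsystem is hung, fsync() blocks here, the timer is
 * never re-armed, and the kernel side eventually fires.
 * Choice made in this sketch: a failed write also skips the re-arm. */
static int wd_tick(FILE *log, unsigned int interval_secs,
                   wd_arm_fn arm, void *ctx)
{
    if (fprintf(log, "watchdog_ctl: alive at %ld\n", (long)time(NULL)) < 0)
        return -1;
    if (fflush(log) != 0)
        return -1;
    if (fsync(fileno(log)) != 0)   /* assumption: force the write to disk */
        return -1;
    return arm(wd_timeout_for(interval_secs), ctx);
}

/* Run the loop. iterations == 0 means run until an error occurs. */
static int wd_run(FILE *log, unsigned int interval_secs,
                  unsigned int iterations, wd_arm_fn arm, void *ctx)
{
    for (unsigned int i = 0; iterations == 0 || i < iterations; i++) {
        if (wd_tick(log, interval_secs, arm, ctx) != 0)
            return -1;
        sleep(interval_secs);
    }
    return 0;
}

/* Dry-run stand-in for the KEXT: records what would have been sent. */
struct wd_dry_run {
    unsigned int calls;
    unsigned int last_timeout;
};

static int wd_dry_run_arm(unsigned int timeout_secs, void *ctx)
{
    struct wd_dry_run *state = ctx;
    state->calls++;
    state->last_timeout = timeout_secs;
    return 0;
}
```

In a real tool the callback would wrap whatever method the KEXT's user client exposes, and the log file would live on the volume whose hangs are being detected. The dry-run stub only makes it possible to exercise the loop without loading the KEXT.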