
Bad uptime on lightly loaded server



We've been having persistent uptime problems with one of our servers. Last uptime was 123 days.

I think automount crashing caused it:
Jan 28 10:27:05 academic crashdump[9768]: automount crashed
Jan 28 10:27:05 academic crashdump[9768]: crash report written to: /Library/Logs/CrashReporter/automount.crash.log
Jan 28 10:27:38 academic kernel[0]: nfs server automount -nsl [186]: not responding
Jan 28 10:27:38 academic KernelEventAgent[35]: tid 00000000 received VQ_NOTRESP event (1)
Jan 28 10:27:38 academic KernelEventAgent[35]: tid 00000000 type 'nfs', mounted on '/Network', from 'automount -nsl [186]', not responding
Jan 28 10:27:38 academic KernelEventAgent[35]: tid 00000000 found 1 filesystem(s) with problem(s)
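
For anyone who wants to pull entries like these out of a longer system log, here is a minimal sketch. It assumes one entry per line in the classic syslog layout ("Mon DD HH:MM:SS host process[pid]: message"); the function name find_events and the keyword list are illustrative choices, not part of any Apple tool.

```python
import re

# Classic BSD syslog layout: "Mon DD HH:MM:SS host process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def find_events(lines, keywords=("crashed", "not responding")):
    """Return (timestamp, process, message) for lines whose message
    contains any of the keywords. Lines that do not parse are skipped."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(k in m.group("msg") for k in keywords):
            events.append((m.group("ts"), m.group("proc"), m.group("msg")))
    return events


# Sample entries taken from the log excerpt in this post.
SAMPLE = [
    "Jan 28 10:27:05 academic crashdump[9768]: automount crashed",
    "Jan 28 10:27:05 academic crashdump[9768]: crash report written to: "
    "/Library/Logs/CrashReporter/automount.crash.log",
    "Jan 28 10:27:38 academic kernel[0]: nfs server automount -nsl [186]: "
    "not responding",
    "Jan 28 10:27:38 academic KernelEventAgent[35]: tid 00000000 found 1 "
    "filesystem(s) with problem(s)",
]

if __name__ == "__main__":
    for ts, proc, msg in find_events(SAMPLE):
        print(ts, proc, msg)
```

Run against the real log file (for example by passing an open file object as lines), this would show how closely each "not responding" message follows a crash.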



Here's the info on the server:
It's a dual 2.3GHz G5 Xserve with 2GB of RAM. It boots off an 80GB software RAID 1, and storage is a 1.5TB RAID 5 on an Xserve RAID. It generally has two FireWire drives mounted for boot backups, as well as an NFS* mount for storage backup.


When it crashes, the first thing that we normally hear is that students are no longer able to log in because AFP is dead. Today, the first thing reported was that user web shares were broken. Apache reported that it did not have search access along its path (which it did).

It was running 10.4.6. I took this opportunity to update to 10.4.8.

There are 1157 users with home directories on the box. Normal usage is about 50 users; peak usage tends to be around 150. sendmail runs on this machine, but only to send error messages; user mail is handled on a different box.

Stock Apache also runs, but 12 requests/s is about the heaviest load that it ever sees.

Automount crashed with the header of:
**********

Host Name:      academic
Date/Time:      2007-01-28 10:27:04.824 -0600
OS Version:     10.4.6 (Build 8I127)
Report Version: 4

Command: automount
Path:    /usr/sbin/automount
Parent:  launchd [1]

Version: ??? (???)

PID:    186
Thread: 5

Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_PROTECTION_FAILURE (0x0002) at 0x00000000
***********
To my untrained eye, this looks like a function that should have returned an address instead returned an error code that was never checked.
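
To triage several reports like this one, the exception fields can be pulled out of the header and any fault at address zero flagged, since that usually points to a NULL pointer dereference. This is a rough sketch, assuming the "Key: value" header layout shown above; parse_crash_header and fault_address are hypothetical helper names, not part of CrashReporter.

```python
import re


def parse_crash_header(text):
    """Collect 'Key: value' pairs from a crash report header.
    Lines without a colon or without a value are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def fault_address(fields):
    """Return the faulting address from the 'Codes' field as an int,
    or None if the field is missing or has no 'at 0x...' suffix."""
    m = re.search(r"at (0x[0-9a-fA-F]+)", fields.get("Codes", ""))
    return int(m.group(1), 16) if m else None


# Header lines copied from the report in this post.
HEADER = """\
Host Name:      academic
Date/Time:      2007-01-28 10:27:04.824 -0600
OS Version:     10.4.6 (Build 8I127)
Command: automount
Path:    /usr/sbin/automount
PID:    186
Thread: 5
Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_PROTECTION_FAILURE (0x0002) at 0x00000000
"""

if __name__ == "__main__":
    fields = parse_crash_header(HEADER)
    addr = fault_address(fields)
    print(fields["Command"], fields["Exception"])
    if addr == 0:
        print("fault at address 0: likely a NULL pointer dereference")
```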


The console had some errors that I should probably look into, but they are likely a side effect:
2007-01-28 14:25:13.745 SyndicationAgent[4924] WARNING: BestCalendarDateFromString - can't interpret: 'Sun 28 Jan 2007 11:42:46 -800


The AFP log shows no errors since September.

Anyone have any ideas as to how we can make this box more stable? I do not like that core daemons will just crash without warning. For one thing, they make me come to work on Sunday.


*Please, no flames. It is on a private VLAN. I may be stupid, but I'm not that stupid. Take your holy wars elsewhere.
Macos-x-server mailing list (email@hidden)