Re: Kernel panics on OSX 10.3.9 on multiple machines in Win2K network (third attempt)
On Mar 28, 2006, at 1:52 AM, Ochal Christophe wrote:

> On Mon, 2006-03-27 at 18:34 -0800, Terry Lambert wrote:
>> From the looks of things, at least one of the machines (looks like
>> the laptop) has bad memory.
>
> The machine has been tested at our office, no reproducible crashes
> occurred, even under heavy system load, and the hardware test also
> didn't pick up on any RAM issues.

This is potentially a thermal or power issue, where your test environment differs from the deployment environment. You can ignore this part of the response if the machine does not have mismatched memory speed, and the memory is not third party. It was only one of the failures that looked like that.

> The weird thing is that all 4 machines started experiencing the same
> kernel panics on the same day. The only change I know of is that the
> SANs connected to the Win2K server have been moved to a different
> location (physical move), and their network manager insists nothing
> else was changed, nor were there any updates applied to the server.
> Since this also happened on the 8th of March (the day the kernel
> panics started occurring on all four Macs), it leads me to think that
> it's really a software issue/glitch somewhere. I'm in the dark,
> however, when it comes to solving these.

Then whatever's in the middle that wasn't, or isn't in the middle that was, is likely the culprit.

It's also possible that there was data corruption on the server itself as a result of physical shock during the move, and that's now being passed on to the client machines, which are barfing on it. You didn't give the number of machines having the smbfs KEXT in the panic traceback, but if it's more than one, then this is likely the case. In general, clients try to be immune to the server sending them bad things, but you can't always catch everything.

>> The other machine looks to be running a third party KEXT, which is
>> leaking memory in one of the zones (my guess would be AntiVirus
>> software, since in the past they tried to replace system call entry
>> points with their own code derived from OpenDarwin; if we made a
>> change in one of those routines in a point release, they inevitably
>> led to memory leaks/panics because of the stale code not *exactly*
>> matching the updated version).
>
> I'd have to check the machine, but it was delivered without any AV
> software to the customer. I'll see what I can find out.

The only way would be two-machine debugging, and asking it what's loaded where, so you could resolve a couple of those addresses. Even then, you'd have to get a controlled reproduction of the problem (don't know how hard that would be).

>> It also looks like someone has tuned some of the administrative
>> limits on the number of open fd's up on one of the machines, far
>> past the amount of memory available for such things (a couple of the
>> panics are NULL pointer dereferences following an allocation failure
>> in the M_FILE zone, which would not occur unless the administrative
>> limits on the machine had been explicitly changed).
>
> Where would one change such a setting?

Typically, in the server administration settings on the machine. The normal place this is set for SMB in a Samba server, for example, is in its configuration file, or via a GUI configuration tool.

The specific limit you are looking for is the hard limit for RLIMIT_NOFILE, as set via setrlimit. It could be raised as high as 10240, but that number is generally excessive unless you have a large amount of memory in the machine to handle that many open files, and the level of client load (number of smbd processes) that that would entail. The typical failure mode is to try to set this to "unlimited", which is usually one of the allowable options.

There are a number of resources available on the web and on developer.apple.com that deal with MacOS X server tuning (e.g. put the last four words there into Google and look around a bit).

-- Terry
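As a rough illustration of the RLIMIT_NOFILE check described in the message, the sketch below reads the soft and hard limits of the current process through Python's standard `resource` module and flags a hard limit that is "unlimited" or above 10240. The helper names `fmt` and `check_nofile` are hypothetical, and the 10240 ceiling is simply the figure quoted in the message, not a value verified for any particular OS release. Limits are inherited per process, so this only reflects the environment it is run from, which may differ from that of the smbd processes.

```python
import resource

# 10240 is the figure quoted in the message; it is used here as an
# assumption for illustration, not a verified system constant.
SUGGESTED_CEILING = 10240


def fmt(value):
    """Render an rlimit value, spelling out RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def check_nofile(soft, hard, ceiling=SUGGESTED_CEILING):
    """Return a list of warnings for a (soft, hard) RLIMIT_NOFILE pair.

    Hypothetical helper: only the hard limit is examined, since that is
    the value the message singles out.
    """
    warnings = []
    if hard == resource.RLIM_INFINITY:
        warnings.append("hard limit is unlimited")
    elif hard > ceiling:
        warnings.append(f"hard limit {hard} exceeds {ceiling}")
    return warnings


if __name__ == "__main__":
    # getrlimit returns a (soft, hard) tuple for the calling process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE soft={fmt(soft)} hard={fmt(hard)}")
    for warning in check_nofile(soft, hard):
        print("warning:", warning)
```

Running it from the same account and startup context as the file server gives the closest approximation of what the server processes actually inherit.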
participants (1): Terry Lambert