Re: Kernel panics on OSX 10.3.9 on multiple machines in Win2K network (third attempt)
- Subject: Re: Kernel panics on OSX 10.3.9 on multiple machines in Win2K network (third attempt)
- From: Terry Lambert <email@hidden>
- Date: Tue, 28 Mar 2006 15:30:47 -0800
On Mar 28, 2006, at 1:52 AM, Ochal Christophe wrote:
On Mon, 2006-03-27 at 18:34 -0800, Terry Lambert wrote:
From the looks of things, at least one of the machines (looks like
the laptop) has bad memory.
The machine has been tested at our office, no reproducible crashes
occurred, even under heavy system load, and the hardware test also
didn't pick up on any RAM issues.
This is potentially a thermal or power issue, where your test
environment differs from the deployment environment. You can ignore
this part of the response if the machine does not have mismatched
memory speeds and the memory is not third party. Only one of the
failures looked like that.
The weird thing is that all 4 machines started experiencing the same
kernel panics on the same day. The only change I know of is that the
SANs connected to the Win2K server have been moved to a different
location (a physical move), and their network manager insists nothing
else was changed, nor were any updates applied to the server. Since
this also happened on the 8th of March (the day the kernel panics
started occurring on all four Macs), it leads me to think that it's
really a software issue/glitch somewhere. I'm in the dark, however,
when it comes to solving these.
Then whatever's in the middle that wasn't, or isn't in the middle that
was, is likely the culprit. It's also possible that there was data
corruption on the server itself as a result of physical shock during
the move, and that's now being passed on to the client machines, which
are barfing on it. You didn't give the number of machines having the
smbfs KEXT in the panic traceback, but if it's more than one, then
this is likely the case. In general clients try to be immune to the
server sending them bad things, but you can't always catch everything.
The other machine looks to be running a third party KEXT, which is
leaking memory in one of the zones (my guess would be AntiVirus
software, since in the past they tried to replace system call entry
points with their own code derived from OpenDarwin; if we made a
change in one of those routines in a point release, that inevitably
led to memory leaks/panics because of the stale code not *exactly*
matching the updated version).
I'd have to check the machine, but it was delivered to the customer
without any AV software. I'll see what I can find out.
The only way would be two-machine debugging, and asking it what's
loaded where, so you could resolve a couple of those addresses. Even
then, you'd have to get a controlled reproduction of the problem
(don't know how hard that would be).
It also looks like someone has tuned some of the administrative
limits on the number of open fd's up on one of the machines, far past
the amount of memory available for such things (a couple of the
panics are NULL pointer dereferences following an allocation failure
in the M_FILE zone, which would not occur unless the administrative
limits on the machine had been explicitly changed).
Where would one change such a setting?
Typically, in the server administration settings on the machine. The
normal place this is set for SMB in a Samba server, for example, is in
its configuration file, or via a GUI configuration tool. The specific
limit you are looking for is the hard limit for RLIMIT_NOFILE, as set
via setrlimit, which could be raised as high as 10240, but
that number is generally excessive unless you have a large amount of
memory in the machine to enable it to handle that many open files, and
the level of client load (number of smbd processes) that that would
entail. The typical failure mode is to try to set this to
"unlimited", which is usually one of the allowable options.
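
As a rough illustration of what to look for, here is a minimal Python
sketch that reads the RLIMIT_NOFILE soft and hard limits for the
current process through the standard "resource" module (a wrapper
around getrlimit). The helper names are made up for this example, and
treating anything at or above 10240 as "excessive" is only an
assumption drawn from the number mentioned above, not a documented
threshold. It reports the limits inherited by the process that runs
it, not what a particular server daemon was configured with.

```python
import resource


def describe_nofile_limit():
    """Return (soft, hard) RLIMIT_NOFILE for this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    def fmt(value):
        # RLIM_INFINITY is how the OS reports an "unlimited" setting.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return fmt(soft), fmt(hard)


def nofile_limit_is_excessive(limit, ceiling=10240):
    """Hypothetical check: flag 'unlimited' or anything >= ceiling.

    The default ceiling of 10240 is an assumption for illustration only.
    """
    if limit == resource.RLIM_INFINITY:
        return True
    return limit >= ceiling


if __name__ == "__main__":
    soft, hard = describe_nofile_limit()
    print("RLIMIT_NOFILE soft limit:", soft)
    print("RLIMIT_NOFILE hard limit:", hard)
    raw_hard = resource.getrlimit(resource.RLIMIT_NOFILE)[1]
    if nofile_limit_is_excessive(raw_hard):
        print("hard limit looks like it was raised; check server tuning")
```

Running this from a shell on the affected machine shows the limits
that shell hands to its children; a daemon started some other way may
have been given different values.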
There are a number of resources available on the web and on
developer.apple.com that deal with MacOS X server tuning (e.g. put the
last four words there into google and look around a bit).
-- Terry
Darwin-kernel mailing list (email@hidden)