Re: Questions about debugging kernel panics
- Subject: Re: Questions about debugging kernel panics
- From: William Kucharski <email@hidden>
- Date: Tue, 17 Jan 2006 17:40:08 -0700
On Tuesday, January 17, 2006, at 03:39PM, James Reynolds <email@hidden> wrote:
>I want to figure out why the panic occurs, and hopefully how to stop
>it. I manage a lab of about 300 Macs and I see this panic often
>("this" meaning where the PC values and sometimes backtraces are the
>same). It happens maybe 10 times a month?... I haven't counted yet.
>I would like to stop it.
Definitely sounds like a bug somewhere to me. I presume the problem hasn't gone
away with 10.4.4, or at least you haven't been running it long enough to know
conclusively one way or the other.
Are you running any custom kernel bits, or just a vanilla Mac OS X release?
>I see other panics that look similar to each other that occur often
>also (2-5 times a month) and so I'm hoping that I can learn how to
>debug them as well, without having to ask for step-by-step help from
>this list. ;)
I hate to do what so many others on Apple Discussions do, but you may want to
take a look here, if you haven't already:
<http://www.thexlab.com/faqs/kernelpanics.html>
>But I'm not sure if the bug is in a 3rd party extension or a kernel
>bug. I guess that is what I'm trying to figure out.
It's not always that easy. The traceback may be able to tell you more, but
you get out of it what you put into it. :D
The Apple Tech Note tells you exactly how to find out what your traceback is
at the time of the panic, and that in turn may help you to narrow things down.
For example, if the traceback tells you that you were in, say, "bcopy," that's not
going to help unless it always points back to one particular use of bcopy (say,
via a kext used to support one particular piece of hardware).
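To make the "does it always point back to the same kext" check concrete, here is
a minimal Python sketch. It assumes you have already pulled the backtrace
addresses and the kext load ranges out of each panic log by hand; the kext name,
the address ranges, and the helper names (owner_of, blame_counts) are all
hypothetical and only there for illustration.

```python
from collections import Counter


def owner_of(address, modules):
    """Return the name of the module whose [start, end) range holds address."""
    for name, (start, end) in modules.items():
        if start <= address < end:
            return name
    return "kernel or unknown"


def blame_counts(backtraces, modules):
    """Count, per module, how many panics have at least one frame inside it."""
    counts = Counter()
    for frames in backtraces:
        # Use a set so a module is counted once per panic, not once per frame.
        owners = {owner_of(addr, modules) for addr in frames}
        counts.update(owners)
    return counts


# Hypothetical load range for a made-up third-party kext.
modules = {
    "com.example.driver.Widget": (0x02A1C000, 0x02A20000),
}

# Hypothetical backtraces from three separate panics.
backtraces = [
    [0x00095718, 0x02A1D0AC, 0x0002683C],
    [0x00095718, 0x02A1D0AC, 0x00026900],
    [0x00095718, 0x000A1234],
]

counts = blame_counts(backtraces, modules)
for name, hits in counts.most_common():
    print(f"{name}: in {hits} of {len(backtraces)} panics")
```

If one kext shows up in nearly every panic, that is a hint worth following, not
proof; the kext may simply be a frequent caller of the code that is really at
fault.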
>And I'm curious how much the log tells me.
It tells you the call trace leading up to the panic, so you can usually determine
how you got to the faulting address; WHY is a much bigger question and generally
requires knowledge of both the kernel and the CPU architecture in question.
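With a lab that size, one thing the call trace does let you do cheaply is check
which panics are really the same panic. Below is a rough Python sketch that
groups logs by their backtrace addresses. It assumes a simplified layout in
which a line starting with "Backtrace" is followed by one or more lines of hex
addresses; the sample text and the function names are made up, so adjust the
parsing to whatever your logs actually look like.

```python
import re
from collections import defaultdict

HEX_WORD = re.compile(r"0x[0-9A-Fa-f]+")


def backtrace_signature(log_text):
    """Return the hex addresses on the lines following a 'Backtrace' line.

    Assumption: the addresses sit on the lines after the 'Backtrace' header,
    and the first line without any hex word ends the backtrace.
    """
    frames = []
    in_backtrace = False
    for line in log_text.splitlines():
        if not in_backtrace:
            if line.strip().startswith("Backtrace"):
                in_backtrace = True
            continue
        words = HEX_WORD.findall(line)
        if not words:
            break
        frames.extend(int(word, 16) for word in words)
    return tuple(frames)


def group_by_signature(logs):
    """Map each backtrace signature to the names of the logs that share it."""
    groups = defaultdict(list)
    for name, text in logs.items():
        groups[backtrace_signature(text)].append(name)
    return dict(groups)


# Made-up log excerpts; only the layout matters for this sketch.
SAMPLE_A = """panic: example one
Backtrace:
      0x00095718 0x00095C48
      0x0002683C
End of made-up backtrace
"""
SAMPLE_B = """panic: example two
Backtrace:
      0x00095718 0x000A1234
End of made-up backtrace
"""

logs = {"mac01.log": SAMPLE_A, "mac02.log": SAMPLE_B, "mac03.log": SAMPLE_A}
groups = group_by_signature(logs)
for signature, names in groups.items():
    print([hex(addr) for addr in signature], names)
```

Exact matching is deliberately strict. Addresses inside a kext change whenever
the kext loads at a different base, so two logs that land in different groups
may still be the same underlying bug.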
>I'm thinking now that I am going to set up a permanent panic dump
>server to store core dumps so I can get much more information and not
>need to rely on reading the Egyptian panic logs.
Really, if you can't decipher the panic logs, all the core dumps will do is eat
disk space.
>So to figure out what code is doing this, I should probably get a
>core dump, correct? I can't figure that out from the log, right?
>Oh, I remember why I had the question about the PC value. I remember
>reading somewhere that you can map some addresses in the log to code,
>and that webpage didn't have PC values, and the PC value seemed like
>the closest thing (I think it is, I'll have to re-read that page).
To reiterate, the traceback will tell you that; I can't explain it all any better than
Apple did:
<http://developer.apple.com/technotes/tn2002/tn2063.html>
>Has anyone noticed there is no Darwin kernel panic webpages that are
>targeted for people between developer and end user? Webpages are
>either in the camp that say swap RAM, or they are in the camp that
>says launch gdb.
Well, it's a little like saying web pages on medical conditions either say to
"see your doctor" or go into complex medical terminology for medical professionals;
there really isn't that much middle ground. A kernel panic is just the kernel's
equivalent of a "The Application Unexpectedly Quit" dialog box; if you're not
already familiar with the app in question and/or aren't a developer, the
application crash logs Mac OS X saves won't really help you all that much, either.
Using gdb to get the traceback from the addresses in the panic log is THE mechanism
by which you get ANY more information out of what was saved, and what it tells you
won't be of all that much use beyond the traceback without knowledge of the CPU you're
debugging, PPC or Intel. To use another analogy, when your car's "Check Engine" light
comes on you can get a code reader that will tell you it's because of a "Cylinder #3
misfire," but without knowledge of your engine and how it works, that's not
really going to help you any...
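If you do go the gdb route, the address-to-symbol step is mechanical enough to
script. The Python sketch below only writes out a list of gdb commands, one
"info symbol" and one "info line" per address, which you could then feed to gdb
(for example with its "source" command) once the right symbols are loaded.
Getting those symbols loaded is the part the Tech Note covers and is not shown
here. The function name and the sample addresses are hypothetical.

```python
def gdb_commands(addresses):
    """Build gdb commands that look up each address in the loaded symbols."""
    lines = []
    for addr in addresses:
        # "info symbol" names the symbol an address falls in;
        # "info line" maps it to a source line when line info is available.
        lines.append(f"info symbol 0x{addr:08x}")
        lines.append(f"info line *0x{addr:08x}")
    return "\n".join(lines) + "\n"


# Made-up addresses standing in for the ones in a panic log backtrace.
script = gdb_commands([0x00095718, 0x0002683C])
print(script, end="")
```

This only saves typing. Reading what gdb prints back still takes the kernel and
CPU knowledge mentioned above.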
William Kucharski
email@hidden
_______________________________________________
Darwin-kernel mailing list (email@hidden)