Re: strange panic, debugging help wanted..
- Subject: Re: strange panic, debugging help wanted..
- From: Godfrey van der Linden <email@hidden>
- Date: Mon, 10 Jan 2005 14:06:04 -0800
I'd like to see the assembly that this kernel is running at 0x2CAE8.
The faulting address (DAR=0xD4) is a small offset from zero, so I'd be
willing to bet an offset from a NULL pointer is being taken and that is
why you are panicking.
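To make the NULL-pointer reasoning concrete: a load through a structure pointer reads from base + field offset, so a NULL base faults at the field's offset. The sketch below uses Python's ctypes only to compute such an offset. The structure `HypotheticalObject` and its layout are invented for illustration and are not taken from the xnu sources.

```python
import ctypes


class HypotheticalObject(ctypes.Structure):
    # Invented layout (an assumption): 0xD4 bytes of other members,
    # followed by a 32-bit field that some routine tries to read.
    _fields_ = [
        ("other_members", ctypes.c_uint8 * 0xD4),
        ("field", ctypes.c_uint32),
    ]


def fault_address(base_pointer, field_offset):
    """Address touched when `field` is loaded from an object at base_pointer."""
    return base_pointer + field_offset


FIELD_OFFSET = HypotheticalObject.field.offset
NULL = 0

# With a NULL base the access lands at the bare field offset.
print(hex(fault_address(NULL, FIELD_OFFSET)))
```

If the disassembly at 0x2CAE8 turns out to be a load with a displacement of 0xD4 from some register, that register is the likely NULL pointer.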
I think the 'r1' panic is a red herring. The first exception state,
'PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000;
LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)',
indicates that r1 is valid at the time the panic is taken. Do
you have a symbolled kernel for the version that is taking the panic?
If you can find out which routine was passed a NULL pointer you may have
a suspect.
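As a rough aid for reading the exception state, here is a minimal sketch that splits the register dump into values and decodes a few MSR and DSISR bits. The bit masks are the usual 32-bit PowerPC definitions as I understand them, and only a handful are covered. The helper names (`parse_state`, `decode`) are made up for this example.

```python
import re

# A few architected bits only (assumed from the 32-bit PowerPC definitions).
MSR_BITS = {0x8000: "EE", 0x4000: "PR", 0x1000: "ME", 0x0020: "IR", 0x0010: "DR"}
DSISR_BITS = {
    0x40000000: "translation not found",
    0x08000000: "protection violation",
    0x02000000: "store",
}


def parse_state(text):
    """Return {register name: int} for every NAME=0xHEX pair in text."""
    pairs = re.findall(r"(\w+)=0x([0-9A-Fa-f]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def decode(value, table):
    """List the labels of all bits in `table` that are set in `value`."""
    return [label for mask, label in table.items() if value & mask]


STATE = ("PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000; "
         "LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)")

regs = parse_state(STATE)
print("MSR bits:  ", decode(regs["MSR"], MSR_BITS))
print("DSISR bits:", decode(regs["DSISR"], DSISR_BITS))
print("R1 nonzero:", regs["R1"] != 0)
```

Read this way, PR is clear (the fault was taken in supervisor mode), the DSISR store bit is clear (the access was a load), and R1 holds a plausible stack address, which fits a kernel routine reading a field through a NULL pointer.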
Finally, when you see this sort of crash, that is, a crash in the kernel
with none of your code in the backtrace, I'd suspect a teardown
race of some sort. You or somebody else may be zeroing a pointer
early, or perhaps you are using a freed data structure. When this
happens you often trash data that is being reused by some other
part of the system.
Hope this helps.
Godfrey
On Dec 22, , at 6:03, Andrew Gallatin wrote:
Our shop runs an automated "tinderbox" where a number of machines
re-load our driver and run a series of QA tests each time a developer
commits something to our source tree. In the last few days, some
changes (probably changes in our driver) have caused our G5s to start
crashing
randomly. Since the crashes are random, they are hard to track down
to a specific commit.
The crash behaviour itself is very strange. The guy on site (machines
are 3K miles from me) told me one crashed machine was scrolling in an
infinite loop (he watched it for 10 minutes) printing register dumps.
The latest panic.log I've gotten is very strange, with the stack trace
on the panic'ed CPU always in what appears to be trap handling code.
Here is the only panic log I've been able to get:
Tue Dec 21 18:08:48 2004
Unresolved kernel trap(cpu 1): 0x300 - Data access
DAR=0x00000000000000D4 PC=0x000000000002CAE8
Latest crash info for cpu 1:
Exception state (sv=0x1DD39000)
PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000;
LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x0002CAD8 0x0002C8A8 0x0002C870
Proceeding back via exception chain:
Exception state (sv=0x1DD39000)
previously dumped as "Latest" state. skipping...
Exception state (sv=0x009D8500)
PC=0x00000000; MSR=0x0000D030; DAR=0x00000000; DSISR=0x00000000;
LR=0x00000000; R1=0x00000000; XCP=0x00000000 (Unknown)
Kernel version:
Darwin Kernel Version 7.4.0:
Wed May 12 16:58:24 PDT 2004; root:xnu/xnu-517.7.7.obj~7/RELEASE_PPC
panic(cpu 1): 0x300 - Data access
Latest stack backtrace for cpu 1:
Backtrace:
0x0008391C 0x00083E00 0x0001EDA4 0x00090E60 0x0009426C
Proceeding back via exception chain:
Exception state (sv=0x1DD39000)
PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000;
LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x0002CAD8 0x0002C8A8 0x0002C870
Exception state (sv=0x009D8500)
PC=0x00000000; MSR=0x0000D030; DAR=0x00000000; DSISR=0x00000000;
LR=0x00000000; R1=0x00000000; XCP=0x00000000 (Unknown)
Kernel version:
Darwin Kernel Version 7.4.0:
Wed May 12 16:58:24 PDT 2004; root:xnu/xnu-517.7.7.obj~7/RELEASE_PPC
I have the feeling that this is more of a "collateral damage" sort of
panic, and the real problem has corrupted some state badly enough that
any information regarding the real panic or exception is lost. Am I
correct?
Any suggestions on how to debug this? I'm setting up kernel dumps
now, but I don't have much hope that I'll get one.
Thanks,
Drew
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden