Re: strange panic, debugging help wanted..
I'd like to see the assembly that this kernel is running at 0x2CAE8. I'd be willing to bet that an offset from a NULL pointer is being taken, and that is why you are panicking.

I think the 'r1' panic is a red herring. The first exception state is

    PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000; LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)

which indicates that r1 was valid at the time the panic was taken.

Do you have a symbolled kernel for the version that is taking the panic? If you can find out what routine was passed a NULL pointer, you may have a suspect.

Finally, when you see this sort of crash, that is, a crash in the kernel where none of your code is in the backtrace, I'd suspect a teardown race of some sort. You or somebody else may be zeroing a pointer early, or perhaps you are using a freed data structure. When this happens you often trash data that is being reused by some other part of the system.

Hope this helps.

Godfrey

On Dec 22, , at 6:03, Andrew Gallatin wrote:

Our shop runs an automated "tinderbox" where a number of machines re-load our driver and run a series of QA tests each time a developer commits something to our source tree. In the last few days, some changes (probably changes in our driver) have caused our G5s to start crashing randomly. Since the crashes are random, they are hard to track down to a specific commit.

The crash behaviour itself is very strange. The guy on site (machines are 3K miles from me) told me one crashed machine was scrolling in an infinite loop (he watched it for 10 minutes) printing register dumps. The latest panic.log I've gotten is very strange, with the stack trace on the panicked CPU always in what appears to be trap handling code.

Here is the only panic log I've been able to get:

Tue Dec 21 18:08:48 2004

Kernel version:
Darwin Kernel Version 7.4.0: Wed May 12 16:58:24 PDT 2004; root:xnu/xnu-517.7.7.obj~7/RELEASE_PPC
Kernel version:
Darwin Kernel Version 7.4.0: Wed May 12 16:58:24 PDT 2004; root:xnu/xnu-517.7.7.obj~7/RELEASE_PPC

Unresolved kernel trap(cpu 1): 0x300 - Data access DAR=0x00000000000000D4 PC=0x000000000002CAE8
Latest crash info for cpu 1:
   Exception state (sv=0x1DD39000)
      PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000; LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)
      Backtrace:
         0x0002CAD8 0x0002C8A8 0x0002C870
Proceeding back via exception chain:
   Exception state (sv=0x1DD39000)
      previously dumped as "Latest" state. skipping...
   Exception state (sv=0x009D8500)
      PC=0x00000000; MSR=0x0000D030; DAR=0x00000000; DSISR=0x00000000; LR=0x00000000; R1=0x00000000; XCP=0x00000000 (Unknown)

panic(cpu 1): 0x300 - Data access
Latest stack backtrace for cpu 1:
      Backtrace:
         0x0008391C 0x00083E00 0x0001EDA4 0x00090E60 0x0009426C
Proceeding back via exception chain:
   Exception state (sv=0x1DD39000)
      PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; DSISR=0x40000000; LR=0x0002CAD8; R1=0x0CC33DB0; XCP=0x0000000C (0x300 - Data access)
      Backtrace:
         0x0002CAD8 0x0002C8A8 0x0002C870
   Exception state (sv=0x009D8500)
      PC=0x00000000; MSR=0x0000D030; DAR=0x00000000; DSISR=0x00000000; LR=0x00000000; R1=0x00000000; XCP=0x00000000 (Unknown)

I have the feeling that this is more of a "collateral damage" sort of panic, and the real problem has corrupted some state badly enough that any information regarding the real panic or exception is lost. Am I correct? Any suggestions on how to debug this? I'm setting up kernel dumps now, but I don't have much hope that I'll get one.

Thanks,

Drew
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (Darwin-kernel@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/darwin-kernel/gvdl%40apple.com
This email sent to gvdl@apple.com
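
The reading of the first exception state given in the reply (a small DAR suggesting "NULL plus a field offset", and a non-zero r1) can be checked mechanically. The following is only a rough sketch, not a tool from the thread: the function names and the 4 KB "NULL page" threshold are assumptions made for illustration, and the bit masks are the usual 32-bit PowerPC MSR/DSISR definitions as I understand them. The register line is copied from the panic log, with the LR value that the mail wrap split rejoined.

```python
import re

# PowerPC register bits (32-bit view) that matter for this log.
MSR_EE = 0x8000                    # external interrupts enabled
MSR_PR = 0x4000                    # problem state (user mode) when set
DSISR_NO_TRANSLATION = 0x40000000  # no translation found for the address
DSISR_STORE = 0x02000000           # faulting access was a store

# Assumption for illustration: a fault address below one 4 KB page is
# treated as "NULL pointer plus a structure offset".
NULL_PAGE_LIMIT = 0x1000


def parse_exception_state(line):
    """Return {register name: value} for every NAME=0x... pair in the line."""
    pairs = re.findall(r"(\w+)=0x([0-9A-Fa-f]+)", line)
    return {name: int(value, 16) for name, value in pairs}


def describe(state):
    """Return a list of plain-language observations about one exception state."""
    notes = []
    if state["DSISR"] & DSISR_NO_TRANSLATION:
        kind = "store" if state["DSISR"] & DSISR_STORE else "load"
        notes.append(f"{kind} to/from unmapped address 0x{state['DAR']:X}")
    if state["DAR"] < NULL_PAGE_LIMIT:
        notes.append(f"likely NULL pointer plus offset 0x{state['DAR']:X}")
    notes.append("user mode" if state["MSR"] & MSR_PR else "kernel mode")
    notes.append("interrupts enabled" if state["MSR"] & MSR_EE
                 else "interrupts disabled")
    if state["R1"] != 0:
        notes.append(f"stack pointer r1 is non-zero (0x{state['R1']:08X})")
    return notes


# First exception state from the panic log (LR rejoined after the mail wrap).
FIRST_STATE = ("PC=0x0002CAE8; MSR=0x00001030; DAR=0x000000D4; "
               "DSISR=0x40000000; LR=0x0002CAD8; R1=0x0CC33DB0; "
               "XCP=0x0000000C (0x300 - Data access)")

state = parse_exception_state(FIRST_STATE)
for note in describe(state):
    print(note)
```

Note that LR (0x0002CAD8) equals the first backtrace entry, which is consistent with the fault happening shortly after a call made from that routine. Finding which structure has a field at offset 0xD4 still needs a symbolled kernel, as the reply says.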
participants (1)
-
Godfrey van der Linden