Re: Understanding cores...

Subject: Re: Understanding cores...
From: Michael Tuexen <email@hidden>
Date: Mon, 8 Jan 2007 22:05:31 +0100

Hi Derek,

comments in-line.

Best regards
Michael

On Jan 8, 2007, at 7:39 PM, Derek Kumar wrote:

(gdb) paniclog
panic(cpu 2 caller 0x001A3135): Unresolved kernel trap (CPU 2, Type 14=page fault), registers:
CR0: 0x8001003b, CR2: 0xffffffe0, CR3: 0x011b0000, CR4: 0x000006e0
EAX: 0x00000000, EBX: 0x352c4038, ECX: 0x352c4000, EDX: 0x352dd008
CR2: 0xffffffe0, EBP: 0x00000000, ESI: 0x352f4010, EDI: 0x004ae85c
EFL: 0x00010046, EIP: 0x00135f3b, CS: 0x00000008, DS: 0x03820010

Backtrace, Format - Frame : Return Address (4 potential args on stack)
0x252a3d68 : 0x128d1f (0x3c9540 0x252a3d8c 0x131df4 0x0)
0x252a3da8 : 0x1a3135 (0x3cf1f4 0x2 0xe 0x3cea24)
0x252a3eb8 : 0x19a8d4 (0x252a3ed0 0x352f4000 0x252a3f28 0x352f4000)
Backtrace terminated-invalid frame pointer 0x0

Kernel version: Darwin Kernel Version 8.8.1: Mon Sep 25 19:42:00 PDT 2006; root:xnu-792.13.8.obj~1/RELEASE_I386

task        vm_map      ipc_space   #acts  pid  proc        command
0x03808da0  0x013e3f3c  0x037d2ef0  54     0    0x004d2200  kernel_task
    activation  thread      pri  state  wait_queue  wait_event
    0x03822e68  0x03822e68  0    IR
        reserved_stack=0x24ff8000
        kernel_stack=0x252a0000
        stacktop=0x252a3d68
        0x252a3d68  0x128d1f <panic+382>
        0x252a3da8  0x1a3135 <kernel_trap+1538>
        0x252a3eb8  0x19a8d4 <trap_from_kernel+19>
        stackbottom=0x252a3eb8

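
The backtrace above is produced by following the chain of saved frame pointers. A minimal sketch of that walk follows, assuming the usual i386 frame layout (saved EBP at [ebp], return address at [ebp+4]). The STACK dictionary is a hypothetical reconstruction from the frame and return addresses printed in the backtrace, not a dump of real memory, and walk_frames is an illustrative name. The saved EBP of 0 in the last frame is inferred from the "invalid frame pointer 0x0" message.

```python
# Hypothetical stack words, reconstructed from the backtrace under the
# assumption of the standard i386 frame layout:
#   [ebp]     -> saved EBP of the caller
#   [ebp + 4] -> return address
STACK = {
    0x252A3D68: 0x252A3DA8, 0x252A3D6C: 0x128D1F,
    0x252A3DA8: 0x252A3EB8, 0x252A3DAC: 0x1A3135,
    0x252A3EB8: 0x00000000, 0x252A3EBC: 0x19A8D4,
}


def walk_frames(memory, frame_pointer, max_frames=64):
    """Follow saved frame pointers until a null or unmapped frame is hit."""
    frames = []
    while frame_pointer != 0 and len(frames) < max_frames:
        if frame_pointer not in memory or frame_pointer + 4 not in memory:
            break  # frame pointer leads outside the known stack words
        frames.append((frame_pointer, memory[frame_pointer + 4]))
        frame_pointer = memory[frame_pointer]  # step to the caller's frame
    return frames


if __name__ == "__main__":
    for frame, return_address in walk_frames(STACK, 0x252A3D68):
        print(f"0x{frame:08x} : 0x{return_address:x}")
```

With these assumed stack words the walk visits the same three frames as the backtrace and stops when the saved frame pointer is 0.
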
As Brian noted, this is a page fault that could not be satisfied, and on the kernel version noted in the panic log, the faulting instruction corresponds to the idle_thread() function. This is the machine independent entry to the processor idle code, which essentially looks for threads to dispatch and invokes the machine specific power management state machine/idle power state as appropriate. The fault was triggered whilst trying to load a local (using the base+displacement mode), but the framepointer (EBP) was 0, leading to the fault. Nothing directly points to your NKE as far as I can tell, but one possibility is memory corruption leading to the zeroing of the EBP record in the register state stored at the base of the kernel stack (which is reloaded at context switch time). You say your system "crashes a lot"--does it crash consistently in this manner, and only when your NKE is loaded?
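
The address arithmetic behind this fault can be checked from the paniclog alone. The sketch below assumes a flat 32-bit address space and takes EBP and CR2 from the register dump; the helper names are illustrative. It recovers the displacement the faulting load must have used and confirms that a base of 0 plus that displacement wraps around to the address recorded in CR2.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, displacement):
    """Base+displacement address, wrapped to 32 bits as on i386."""
    return (base + displacement) & MASK32


def signed32(value):
    """Interpret a 32-bit word as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


# Register values taken from the paniclog.
EBP = 0x00000000
CR2 = 0xFFFFFFE0  # faulting address recorded by the CPU

displacement = signed32(CR2 - EBP)

if __name__ == "__main__":
    print(displacement)                                # -32
    print(hex(effective_address(EBP, displacement)))   # 0xffffffe0
```

A displacement of -32 from a null frame pointer lands at the very top of the 32-bit address space, which is why the access cannot be satisfied.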

Well, on that system the NKE is always loaded, because it is required by the application running on that Mac Pro. In the meantime another system was set up (on different hardware), and it also crashes a lot. BTW: "a lot" means a couple of times per day. And yes, the cores we get (using a core dump server) are all like this. Some cores in the past pointed to bugs in the NKE, but those could be fixed, and in those cases SCTP.kext was explicitly mentioned in the paniclog.

Any idea how to narrow down the problem?

Derek

(gdb) x/i 0x135f3b
0x135f3b <idle_thread+120>: mov -32(%ebp),%eax

which corresponds to

2544    while ( (*threadp == THREAD_NULL) &&
2545            (*gcount == 0) && (*lcount == 0) )

and "threadp" is a local, and EBP, the framepointer, happens to be 0, leading to a page fault
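
The <symbol+offset> annotations that gdb prints can be turned back into function start addresses, which helps when comparing several cores taken from the same kernel build. The sketch below is an illustration only: the regular expression and the function_starts helper are assumptions, the sample text is copied from the gdb output in this thread, and the offsets are taken to be decimal, as gdb prints them.

```python
import re

# Matches "0xADDR <symbol+offset>" as printed by gdb; offset is decimal.
ANNOTATION = re.compile(r"(0x[0-9a-fA-F]+)\s+<(\w+)\+(\d+)>")


def function_starts(gdb_text):
    """Map each annotated symbol to its start address (address - offset)."""
    starts = {}
    for address, symbol, offset in ANNOTATION.findall(gdb_text):
        starts[symbol] = int(address, 16) - int(offset)
    return starts


SAMPLE = """
0x135f3b <idle_thread+120>
0x252a3d68 0x128d1f <panic+382>
0x252a3da8 0x1a3135 <kernel_trap+1538>
0x252a3eb8 0x19a8d4 <trap_from_kernel+19>
"""

if __name__ == "__main__":
    for symbol, start in function_starts(SAMPLE).items():
        print(f"{symbol}: 0x{start:x}")
```

For this kernel build the faulting EIP 0x135f3b sits 120 bytes past the start of idle_thread, which this calculation places at 0x135ec3.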

