Re: kernel lockup
- Subject: Re: kernel lockup
- From: Steven Bytnar <email@hidden>
- Date: Fri, 19 Apr 2013 15:22:00 -0500
Hi,
Instead of a full core dump, how about a summary of the core dump?
This requires the kernel debug kit, but this used to be a
pretty good summary of what the machine was doing at the
time of a panic. I used this with 10.5 to troubleshoot some
third party software. It might need to be updated for 10.8.
$ cat pd.sh
#!/bin/sh
echo Start:
date
echo "Working on $1"
gdb -c "$1" -x pd.gdb > "$1.txt"
echo End:
date
$ cat pd.gdb
add-symbol-file /Volumes/KernelDebugKit/mach_kernel
source /Volumes/KernelDebugKit/kgmacros
showallstacks
showallthreads
showalltasks
showcurrentthreads
showcurrentstacks
showallvm
zprint
quit
$ ./pd.sh {core-file-name}
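
A gdb run over a multi-gigabyte core can take a long time, so it may be worth checking the inputs before starting. The following is only a sketch of the pd.sh idea above with such a check added. The function names check_inputs and summarize_core and the KDK variable are made up for illustration; the default path is the one used in pd.gdb.

```shell
#!/bin/sh
# Hypothetical variant of pd.sh: verify the inputs, then run the gdb macros.
# KDK is an assumed variable name; its default is the path used in pd.gdb.
KDK="${KDK:-/Volumes/KernelDebugKit}"

# Return 0 if every named file is readable; otherwise report the first
# missing one on stderr and return 1.
check_inputs() {
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
    return 0
}

# Run pd.gdb against one core file and write the summary to <core>.txt.
summarize_core() {
    core="$1"
    check_inputs "$core" pd.gdb "$KDK/mach_kernel" "$KDK/kgmacros" || return 1
    echo "Start: $(date)"
    gdb -c "$core" -x pd.gdb > "$core.txt"
    echo "End: $(date)"
}

# Only do work when a core file name is given on the command line.
if [ "$#" -gt 0 ]; then
    summarize_core "$1"
fi
```

If the debug kit is mounted somewhere else, KDK can be set in the environment, but the paths inside pd.gdb would then have to be changed to match.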
--Steve
On Fri, Apr 19, 2013 at 10:02:39PM +0200, Andreas Fink wrote:
> did that. [1]radar://13696346
> Unfortunately the kernel coredump is too big to upload (several
> gigabytes).
> And now it dumps even after the reboot sometimes.
> On 18.04.2013, at 17:51, Shantonu Sen <[2]email@hidden> wrote:
>
> You can use FireWire KDP if the Ethernet interfaces stop working (see
> fwkdp(1) or the tech note on this) to attach to the kernel debugger and
> take a core dump. Depending on the exact issue, Ethernet may work for
> KDP even if the OS IP stack gets sad. The core dump should indicate the
> culprit, especially if you start with a proximal symptom such as a
> hanging process and trace the dependency chain of resources or locks.
> Please file a Radar with the coredump.
> Shantonu
> On Apr 18, 2013, at 7:20 AM, Andreas Fink <[3]email@hidden>
> wrote:
>
> Hi Folks,
>
> I'm running into some kernel-related deadlocks here under 10.8.3, and
> I can't figure out where to look further.
> We have the following setup:
>
> Xserve with two Ethernet interfaces:
> en0: private IPs
> en1: public IPs
>
> On en1 we have several hundred open TCP sessions at times, and that's
> where all traffic comes in (it's the SMPP protocol).
> The traffic is answered and processed inside our application and put
> into a MySQL database (which is connected over en0).
> A couple of hours later, the system "locks up". What really
> happens is the following:
>
> a) you can no longer ping en1, nor do any sockets still work on it.
> b) you can still ping en0
> c) on en0, established sessions still work, however opening a new ssh
> session for example doesn't work.
> d) typing commands in a still-working session locks up the system
> most of the time. For example, "killall myapp" does nothing and just
> stalls.
> e) syslog doesn't show anything suspicious.
> f) my app is still in memory and runs fine
> g) "top" was showing little CPU load, plenty of free memory. All looks
> normal.
> h) netstat -m was not showing any dangerous buffer overflow.
> i) an established remote desktop session gets killed
> j) the application doesn't crash
> k) the kernel doesn't panic
>
> I was able to run a tcpdump on the interface while this was happening,
> and what I see towards the end is that all of a sudden TCP
> retransmissions start to pile up. We see lots and lots of them out of
> the blue.
> In other words, the kernel seems to stop processing the packets
> somehow and no longer acknowledges them to the remote side. Also,
> incoming acknowledgments don't get processed.
> A few seconds later you can't do anything with the machine anymore and
> you have to force-reboot it over LOM (I praise Apple for implementing
> LOM in their Xserves, even though it has its issues too).
>
> It is obvious that the application/traffic somehow manages to saturate
> some kernel resource, which locks up that specific Ethernet interface
> with a side effect on the whole kernel (such as not being
> able to load any binaries that are not already in memory).
>
> I'm a bit lost as to where to look further to analyze this issue.
> Does anyone on this list have a hint about what could be happening here?
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Darwin-kernel mailing list ([4]email@hidden)
> Help/Unsubscribe/Update your Subscription:
>
> This email sent to [6]email@hidden
>
> Links:
> 1. file:///var/folders/Jw/JwJJw00g2Ra53k+1Ynt6pU+++TM/-Tmp-//radar://
> 2. mailto:email@hidden/
> 3. mailto:email@hidden/
> 4. mailto:email@hidden/
> 6. mailto:email@hidden/