Re: panic.log on Intel?
- Subject: Re: panic.log on Intel?
- From: Derek Kumar <email@hidden>
- Date: Wed, 18 Oct 2006 15:18:17 -0400
The 10.4.8/Intel kernel (and, I believe, some PPC server versions,
but I haven't verified this) performs ARP discovery. It also adds a
boot-arg to specify the IP of the router to forward to (used if
discovery fails, and useful if the default gateway doesn't forward to
the specified server) and another to specify the crashdump server's
port. There's also a boot-arg that specifies the interface the kernel
debugger (and crashdumps) are bound to. These are, respectively,
"_router_ip", "_panicd_port" and "kdp_match_name".
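As a rough sketch, the three boot-args could be combined with the debug and _panicd_ip settings quoted further down in this thread. The router address, port value and interface name below are placeholders, not values from this thread, and the port boot-arg is spelled with a leading underscore on the assumption that it follows the same pattern as _panicd_ip and _router_ip.

```shell
#!/bin/sh
# Placeholder values: substitute your own network details.
PANICD_IP=10.0.40.2    # crashdump server (address used in the example later in this thread)
ROUTER_IP=10.0.40.1    # assumed gateway to forward through if ARP discovery fails
PANICD_PORT=1069       # assumed to match the port the dump server listens on
KDP_IF=en0             # assumed interface for the debugger and crashdumps

BOOT_ARGS="debug=0xd44 _panicd_ip=$PANICD_IP _router_ip=$ROUTER_IP _panicd_port=$PANICD_PORT kdp_match_name=$KDP_IF"

# Print the command rather than running it, so it can be reviewed first.
echo "sudo nvram boot-args=\"$BOOT_ARGS\""
```

Run the printed command on the panic-prone machine and reboot for it to take effect.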
Derek
On Oct 18, 2006, at 2:33 PM, Peter Lovell wrote:
A small note on the kernel dump server.
In order for this to work (at least as recently as 10.4.6, haven't
checked 10.4.7 and there's no source available yet for 10.4.8) you
should plan to have your dump server on a different subnet.
This is because the kernel dump code on the crashed machine does
not ARP for the Ethernet address of the dump server. Instead it
sends the dump packets to the default gateway and expects the gateway
to forward them.
Some do, many do not. Even on routers that support same-subnet
forwarding, sysadmins not infrequently disable it for various
reasons.
Your choices are to put the dump server on a different subnet (the
router will obviously forward in this case, as that's what it must
do) or write your own ARP code in a modified kernel. That's what we
did, since we already had kernel mods. Most folks won't go that way :)
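Whether this limitation matters depends on whether the client and the dump server share a subnet. A minimal sketch of that check in POSIX shell, assuming IPv4 dotted-quad addresses and a netmask you supply yourself; the function names and the sample addresses are made up for illustration.

```shell
#!/bin/sh
# Pack a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split on "." into $1..$4 (local to this function)
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet CLIENT_IP SERVER_IP NETMASK
# Succeeds (exit 0) when both addresses have the same network part.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

if same_subnet 10.0.40.2 10.0.40.77 255.255.255.0; then
    echo "same subnet: dumps rely on the gateway forwarding back onto the subnet"
else
    echo "different subnets: the gateway has to route the dump anyway"
fi
```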
Regards.....Peter
On Oct 13, 2006, at 5:26 PM, Brian Bechtel wrote:
On 10/13/06, Andrew Gallatin <email@hidden> wrote:
Derek Kumar writes:
> As to why the backtrace doesn't show any more, it's because the
> framepointer linkage was terminated prematurely (zeroed)...could be
> stack corruption. If you can connect to the machine or examine a
> crashdump, you can try trawling the stack for traces. I'd also
> suggest verifying that you were, in fact, on the appropriate thread
> stack at the time of the panic.
Thank you. That was quite helpful. I imagine we're
doing something to induce random memory corruption,
which is not a good sign. Unfortunately, this customer
is pretty clueless, and cannot set up dumps.
Here's a simple thing. Set up a kernel coredump server on one of your
machines. It doesn't have to be fast, but it helps to have enough
hard disk space. On the client (i.e. the panic-prone machine):
sudo nvram boot-args="debug=0xd44 _panicd_ip=10.0.40.2"
{and reboot for this to take effect.}
where you replace 10.0.40.2 with the IP address of your coredump
server. You must restart to enable this setting. Next time that
machine panics, you'll get a core dump.
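The debug=0xd44 value is a bitmask. The sketch below splits it into its individual bits. The arithmetic is exact, but the flag names in the comments are as I recall them from TN2118 and should be checked against the technote before relying on them. The function name is hypothetical.

```shell
#!/bin/sh
# Flag names as recalled from TN2118 (verify against the technote):
#   0x004  DB_NMI                  enter the debugger on NMI
#   0x040  DB_ARP                  let the debugger ARP
#   0x100  DB_LOG_PI_SCRN          suppress the on-screen panic dialog
#   0x400  DB_KERN_DUMP_ON_PANIC   send a core dump on panic
#   0x800  DB_KERN_DUMP_ON_NMI     send a core dump on NMI

# Print every bit that is set in the given value, lowest first.
decode_debug_flags() {
    flags=$(( $1 ))
    bit=1
    out=""
    while [ "$bit" -le "$flags" ]; do
        if [ $(( flags & bit )) -ne 0 ]; then
            out="$out $(printf '0x%x' "$bit")"
        fi
        bit=$(( bit << 1 ))
    done
    echo "${out# }"
}

decode_debug_flags 0xd44
```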
To set up your coredump server (from
<http://developer.apple.com/technotes/tn2004/tn2118.html>) do the
following:
sudo mkdir /PanicDumps
sudo chmod ugo+wx /PanicDumps
cat > /tmp/macosxkdump <<EOF
service macosxkdump
{
    disable     = no
    type        = UNLISTED
    socket_type = dgram
    protocol    = udp
    port        = 1069
    user        = nobody
    groups      = yes
    server      = /usr/libexec/kdumpd
    server_args = /PanicDumps
    wait        = yes
}
EOF
sudo cp /tmp/macosxkdump /etc/xinetd.d
sudo kill -HUP `cat /var/run/xinetd.pid`
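Once xinetd has reloaded, it may be worth confirming that something is listening on UDP port 1069. A small sketch, assuming netstat prints UDP sockets with the local address as "*.1069" (BSD style) or "0.0.0.0:1069". The helper name is hypothetical and it only inspects text on stdin.

```shell
#!/bin/sh
# udp_port_open PORT
# Reads `netstat -an` style output on stdin and succeeds if a UDP line
# has a local address ending in ".PORT" or ":PORT".
udp_port_open() {
    grep -E "^udp.*[.:]$1[[:space:]]" >/dev/null
}

# On the dump server you would run something like:
#   netstat -an | udp_port_open 1069 && echo "UDP 1069 is open"
```

If the check fails, look at the system log for xinetd complaints about the macosxkdump service file.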
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden