On Thursday, March 20, 2003, at 10:50 AM, Stockhoff, Cory wrote:
Hello Everybody,
I apologize for the length of this post, but I hoped to provide enough
info to give somebody an "aha!" moment.
Aha! :)
Our product consists of user and kernel components implementing a VPN
client. The kernel component is a KEXT that processes inbound and
outbound IP packets intercepted by a DLIL. The DLIL is pushed onto
every network device that becomes active.
The crash we are seeing is a kernel panic that occurs only on
multiple-CPU hosts, and is infrequent. Our KEXT tries to allocate some
kernel memory by calling "kmem_alloc()". The panic message is always
"panic(cpu [0 or 1]): thread_invoke: preemption_level 1". The most
reliable way to duplicate the problem is to engage the software (i.e.
connect to a VPN switch via an encrypted VPN tunnel), mount a remote
shared volume (AppleTalk over IP), and do heavy reads and writes to the
volume.
Our customers have seen the problem on both Jaguar (10.2.x) and Puma
(10.1.x) OS's. Our testing and debugging has been on Jaguar hosts,
10.2.3 (Darwin 6.3). Our application is built on Puma.
Various checks have been tried, none successful:
1) Don't call kmem_alloc() if preemption level is 1. It turns out that
   the preemption level is often 1 during normal operation, and is
   usually not a problem. However, not returning allocated memory
   every time preemption level is 1 does cause problems for our
   software.
If you are holding a spinlock, or you are allocating from a network
callback thread, then you may see this crash when allocating memory.
kmem_alloc can block, and blocking is a big no-no when holding a
spinlock or when interrupts are disabled. If the allocation does block,
you will get this panic; if it doesn't block, you won't. Unfortunately,
there is no easy way to tell in advance whether an allocation will
block.

To get around this problem, I allocated a reserve pool at kext load
time. At allocation time, if interrupts are disabled and
kalloc_noblock() returns nil, I grab a block from this reserve. This
worked for me in this particular case, because we always disable
interrupts when grabbing a spinlock. Here is some pseudo code:

    if (interrupts enabled) {
        mem = kalloc(size);
    } else {
        mem = kalloc_noblock(size);
        if (!mem)
            mem = my_reserve_alloc();
    }
    return (mem);
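To make the reserve idea a little more concrete, here is a minimal sketch in C of how such a pool might be laid out. It is a user-space simulation, not kext code: blocking_alloc() and nonblocking_alloc() are hypothetical stand-ins for kalloc() and kalloc_noblock(), interrupts_enabled() is a stub driven by a test flag, and the pool sizes and the names reserve_init(), reserve_alloc(), vpn_alloc() and vpn_free() are all made up for illustration. In a real kext the free list would also need its own protection against concurrent access, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sizes; a real kext would size these from its packet workload. */
#define RESERVE_BLOCKS     8
#define RESERVE_BLOCK_SIZE 2048

/* Hypothetical stand-ins so the sketch builds and runs in user space. */
static bool sim_interrupts_enabled = true;  /* simulates the interrupt state */
static bool sim_noblock_fails = false;      /* simulates memory pressure     */

static bool interrupts_enabled(void) { return sim_interrupts_enabled; }

/* Stands in for kalloc(): may block in the real kernel. */
static void *blocking_alloc(size_t size) { return malloc(size); }

/* Stands in for kalloc_noblock(): returns NULL instead of blocking. */
static void *nonblocking_alloc(size_t size)
{
    return sim_noblock_fails ? NULL : malloc(size);
}

/* The reserve: fixed-size blocks set aside once, at load time. */
static unsigned char reserve_mem[RESERVE_BLOCKS][RESERVE_BLOCK_SIZE];
static void *reserve_free_list[RESERVE_BLOCKS];
static size_t reserve_free_count;

static void reserve_init(void)
{
    for (size_t i = 0; i < RESERVE_BLOCKS; i++)
        reserve_free_list[i] = reserve_mem[i];
    reserve_free_count = RESERVE_BLOCKS;
}

/* Never blocks: pops a block, or returns NULL if empty or too large. */
static void *reserve_alloc(size_t size)
{
    if (size > RESERVE_BLOCK_SIZE || reserve_free_count == 0)
        return NULL;
    return reserve_free_list[--reserve_free_count];
}

/* True if p points into the reserve array. */
static bool reserve_owns(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&reserve_mem[0][0];
    return a >= lo && a < lo + sizeof reserve_mem;
}

/* Same shape as the pseudo code: block only when it is safe to do so. */
static void *vpn_alloc(size_t size)
{
    if (interrupts_enabled())
        return blocking_alloc(size);

    void *mem = nonblocking_alloc(size);
    if (mem == NULL)
        mem = reserve_alloc(size);
    return mem;
}

/* Reserve blocks go back on the free list; everything else is freed. */
static void vpn_free(void *mem)
{
    if (mem == NULL)
        return;
    if (reserve_owns(mem)) {
        assert(reserve_free_count < RESERVE_BLOCKS);
        reserve_free_list[reserve_free_count++] = mem;
    } else {
        free(mem);
    }
}
```

The caller would run reserve_init() once at load time and must still handle a NULL return from vpn_alloc(), since the reserve can run dry or the request can exceed the block size. The free path has to know which allocator a block came from, which is why the sketch checks the address range before releasing it.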
The stack trace can vary a bit, but this one is typical (and from
Jaguar):
0x000856cc print_backtrace+176
0x00085afc Debugger+108
0x000287a8 panic+488
0x00033eec thread_invoke+72
0x000344d0 thread_block_reason+212
0x0008d51c mlInUse+16
0x00060b60 vm_fault_wire_fast+284
0x00064f88 vm_map_wire_nested+2988
0x000651a0 vm_map_wire+120
0x00061a78 kernel_memory_allocate+600
You can see the problem in the backtrace: the thread tries to block
(thread_block_reason) because the allocation could not be filled
immediately. BTW, this problem can happen on single-CPU machines too,
but it is much more likely to happen on dualies.

HTH.

Brian Bergstrand <http://www.classicalguitar.net/brian/>
PGP Key ID: 0xB6C7B6A2

It is easy to convert a bug to a feature; document it.
- The Microsoft Employee Handbook, p.215, section B, article VII

_______________________________________________
darwin-kernel mailing list | darwin-kernel@lists.apple.com