Re: deadlock in 10.12, maybe?
- Subject: Re: deadlock in 10.12, maybe?
- From: Jorgen Lundman <email@hidden>
- Date: Wed, 07 Dec 2016 14:52:19 +0900
OK, that took forever to happen again, but it just did.
-rw-rw---- 1 nobody wheel 279319078 Dec 7 12:51
core-xnu-3789.21.4-172.16.248.129-4328ad84.gz
In an attempt to assist, I'll try to find the vnode in question so we can look at its flags.
(lldb) switchtoact 0xffffff8040e72580
(lldb) bt
* thread #881: tid = 0x14d49b, 0xffffff8024f87997
kernel.development`machine_switch_context(old=0xffffff8040e72580,
continuation=0x0000000000000000, new=0xffffff8030d900d0) + 247 at
pcb.c:454, name = '0xffffff8040e72580', queue = '0x0'
* frame #0: 0xffffff8024f87997
kernel.development`machine_switch_context(old=0xffffff8040e72580,
continuation=0x0000000000000000, new=0xffffff8030d900d0) + 247 at pcb.c:454
[opt]
frame #1: 0xffffff8024e58284
kernel.development`thread_invoke(self=0xffffff8040e72580,
thread=0xffffff8030d900d0, reason=0) + 1540 at sched_prim.c:2316 [opt]
frame #2: 0xffffff8024e571ce
kernel.development`thread_block_reason(continuation=<unavailable>,
parameter=<unavailable>, reason=<unavailable>) + 286 at sched_prim.c:2795 [opt]
frame #3: 0xffffff8024e4c1e4 kernel.development`lck_mtx_sleep [inlined]
thread_block(continuation=<unavailable>) + 11 at sched_prim.c:2811 [opt]
frame #4: 0xffffff8024e4c1d9
kernel.development`lck_mtx_sleep(lck=0xffffff8032b159b0,
lck_sleep_action=0, event=0xffffff8032b15a14, interruptible=<unavailable>)
+ 121 at locks.c:800 [opt]
frame #5: 0xffffff802531d893 kernel.development`_sleep(chan="\x02",
pri=20, wmsg="vnode_drain", abstime=0, continuation=<unavailable>,
mtx=0xffffff8032b159b0) + 467 at kern_synch.c:201 [opt]
frame #6: 0xffffff80250a7358 kernel.development`vnode_reclaim_internal
[inlined] msleep(pri=20, wmsg="vnode_drain") + 184 at kern_synch.c:348 [opt]
frame #7: 0xffffff80250a7333 kernel.development`vnode_reclaim_internal
[inlined] vnode_drain + 39 at vfs_subr.c:4643 [opt]
frame #8: 0xffffff80250a730c
kernel.development`vnode_reclaim_internal(vp=<unavailable>, locked=1,
reuse=1, flags=<unavailable>) + 108 at vfs_subr.c:4804 [opt]
frame #9: 0xffffff80250ab772
kernel.development`vflush(mp=<unavailable>, skipvp=0x0000000000000000,
flags=<unavailable>) + 978 at vfs_subr.c:2154 [opt]
frame #10: 0xffffff80250b6b8b
kernel.development`dounmount(mp=<unavailable>, flags=<unavailable>,
withref=1, ctx=<unavailable>) + 923 at vfs_syscalls.c:1963 [opt]
frame #11: 0xffffff80250b665a
kernel.development`unmount(p=<unavailable>, uap=0xffffff803184b550,
retval=<unavailable>) + 410 at vfs_syscalls.c:1785 [opt]
frame #12: 0xffffff80253e607f
kernel.development`unix_syscall64(state=<unavailable>) + 719 at
systemcalls.c:380 [opt]
frame #13: 0xffffff8024de6f46 kernel.development`hndl_unix_scall64 + 22
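For anyone reading along, the wait above is the drain loop in vnode_drain() in bsd/vfs/vfs_subr.c. Paraphrased from memory of the xnu-3789-era sources (so a sketch only, please check it against the exact tree), it looks roughly like this:

    /*
     * Sketch of vnode_drain(), paraphrased from memory of xnu
     * bsd/vfs/vfs_subr.c (xnu-3789-era); verify against the real
     * source.  Called with the vnode's v_lock held.
     */
    static void
    vnode_drain(vnode_t vp)
    {
            if (vp->v_lflag & VL_DRAIN)
                    panic("vnode_drain: recursive drain");

            vp->v_lflag |= VL_DRAIN;
            vp->v_owner = current_thread();

            /*
             * Sleep until we hold the only remaining iocount.  The sleep
             * channel is &vp->v_iocount and the mutex is &vp->v_lock,
             * which matches the lck/event pair in frames #4/#5 above
             * (and pri=20 in frame #5 is PVFS).
             */
            while (vp->v_iocount > 1)
                    msleep(&vp->v_iocount, &vp->v_lock, PVFS, "vnode_drain", NULL);
    }

So the unmount thread is parked here, holding VL_DRAIN and waiting for the vnode's v_iocount to drop to 1.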
Since the 'vp' variable isn't directly available, I took some guesses:
frame #8: 0xffffff80250a730c
kernel.development`vnode_reclaim_internal(vp=<unavailable>, locked=1,
reuse=1, flags=<unavailable>) + 108 at vfs_subr.c:4804 [opt]
(lldb) register read
GPR:
rbx = 0xffffff8032b159b0
rbp = 0xffffff90d0553c50
rsp = 0xffffff90d0553c20
r12 = 0xffffff8032b15a08
(lldb) showvnode 0xffffff8032b159b0
vnode usecount iocount v_data vtype parent
mapped cs_version name
0xffffff8032b159b0 2 2 0xffffff90cd702d88 VDIR
0xffffff8033b6bf80 - - 3C836C29-F9D8-48EB-801E-817468FE3B07
(Confirmed it is valid by following the parents:)
3C836C29-F9D8-48EB-801E-817468FE3B07 -> Store-V2 -> .Spotlight-V100 -> 0
(Oh I see, you use the vnode as the condvar ptr too, so I could just have
taken it from the lck_mtx_sleep call.)
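As a sanity check on that guess: v_lock is the first member of struct vnode, so the lck argument in frame #4 (0xffffff8032b159b0) should be the vnode itself, and the event pointer (0xffffff8032b15a14) lands 0x64 bytes into it, which I believe is v_iocount in this layout, i.e. exactly the channel the drain loop sleeps on. The field order in the little user-space check below is copied from memory of bsd/sys/vnode_internal.h, so treat the offsets as an assumption:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * Mock of the leading fields of xnu's struct vnode (from memory of
     * bsd/sys/vnode_internal.h, xnu-3789-era), just to see where the
     * sleep channel lands relative to the vnode base.  Compare with the
     * real header before trusting the offsets.
     */
    struct vnode_head {
            uintptr_t  v_lock_opaque[2];      /* lck_mtx_t v_lock (first member) */
            void      *v_freelist[2];         /* TAILQ_ENTRY(vnode) */
            void      *v_mntvnodes[2];        /* TAILQ_ENTRY(vnode) */
            void      *v_ncchildren[2];       /* TAILQ_HEAD */
            void      *v_nclinks;             /* LIST_HEAD */
            void      *v_defer_reclaimlist;
            uint32_t   v_listflag;
            uint32_t   v_flag;
            uint16_t   v_lflag;
            uint8_t    v_iterblkflags;
            uint8_t    v_references;
            int32_t    v_kusecount;
            int32_t    v_usecount;
            int32_t    v_iocount;
    };

    int
    main(void)
    {
            /* lck and event as passed to lck_mtx_sleep in frame #4 */
            uint64_t lck   = 0xffffff8032b159b0ULL;
            uint64_t event = 0xffffff8032b15a14ULL;

            printf("event - lck              = 0x%llx\n",
                (unsigned long long)(event - lck));
            printf("offsetof(..., v_iocount) = 0x%zx\n",
                offsetof(struct vnode_head, v_iocount));
            return 0;
    }

Both come out as 0x64 with this layout, which supports the guess that the lck pointer really is the vnode and that the thread is sleeping on its v_iocount.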
That vnode is:
(lldb) p/x *((struct vnode *)0xffffff8032b159b0)
(struct vnode) $20 = {
v_lock = {
opaque = ([0] = 0x0000000000000000, [1] = 0xffffffff00000000)
}
v_freelist = {
tqe_next = 0x0000000000000000
tqe_prev = 0x00000000000deadb
}
v_mntvnodes = {
tqe_next = 0x0000000000000000
tqe_prev = 0xffffff80370e5080
}
v_ncchildren = {
tqh_first = 0x0000000000000000
tqh_last = 0xffffff8032b159e0
}
v_nclinks = {
lh_first = 0x0000000000000000
}
v_defer_reclaimlist = 0x0000000000000000
v_listflag = 0x00000000
v_flag = 0x00084800
v_lflag = 0xc00e
v_iterblkflags = 0x00
v_references = 0x14
v_kusecount = 0x00000001
v_usecount = 0x00000002
v_iocount = 0x00000002
v_owner = 0xffffff8040e72580
v_type = 0x0002
v_tag = 0x0011
v_id = 0x0623e334
v_un = {
vu_mountedhere = 0x0000000000000000
vu_socket = 0x0000000000000000
vu_specinfo = 0x0000000000000000
vu_fifoinfo = 0x0000000000000000
vu_ubcinfo = 0x0000000000000000
}
v_cleanblkhd = {
lh_first = 0x0000000000000000
}
v_dirtyblkhd = {
lh_first = 0x0000000000000000
}
v_knotes = {
slh_first = 0x0000000000000000
}
v_cred = 0xffffff802f74e7f0
v_authorized_actions = 0x00000880
v_cred_timestamp = 0x00000000
v_nc_generation = 0x00000006
v_numoutput = 0x00000000
v_writecount = 0x00000000
v_name = 0xffffff802f2236b0 "3C836C29-F9D8-48EB-801E-817468FE3B07"
v_parent = 0xffffff8033b6bf80
v_lockf = 0x0000000000000000
v_op = 0xffffff8036b22000
v_mount = 0xffffff80370e5040
v_data = 0xffffff90cd702d88
v_label = 0x0000000000000000
v_resolve = 0x0000000000000000
}
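While I'm at it, here is my attempt at decoding v_lflag = 0xc00e, since that is what I went digging for. The VL_*/VNAMED_* values below are from memory of bsd/sys/vnode_internal.h, so please check them against the header that matches this kernel before reading too much into the result:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * v_lflag bits, assumed from xnu bsd/sys/vnode_internal.h
     * (xnu-3789-era); verify against the exact header.
     */
    #define VL_SUSPENDED    0x0001
    #define VL_DRAIN        0x0002
    #define VL_TERMINATE    0x0004
    #define VL_TERMWANT     0x0008
    #define VL_DEAD         0x0010
    #define VL_MARKTERM     0x0020
    #define VL_NEEDINACTIVE 0x0080
    #define VL_LABEL        0x0100
    #define VL_LABELWAIT    0x0200
    #define VL_LABELED      0x0400
    #define VL_LWARNED      0x0800
    #define VL_HASSTREAMS   0x1000
    #define VNAMED_UBC      0x2000
    #define VNAMED_MOUNT    0x4000
    #define VNAMED_FSHASH   0x8000

    int
    main(void)
    {
            static const struct { uint16_t bit; const char *name; } bits[] = {
                    { VL_SUSPENDED, "VL_SUSPENDED" },   { VL_DRAIN, "VL_DRAIN" },
                    { VL_TERMINATE, "VL_TERMINATE" },   { VL_TERMWANT, "VL_TERMWANT" },
                    { VL_DEAD, "VL_DEAD" },             { VL_MARKTERM, "VL_MARKTERM" },
                    { VL_NEEDINACTIVE, "VL_NEEDINACTIVE" },
                    { VL_LABEL, "VL_LABEL" },           { VL_LABELWAIT, "VL_LABELWAIT" },
                    { VL_LABELED, "VL_LABELED" },       { VL_LWARNED, "VL_LWARNED" },
                    { VL_HASSTREAMS, "VL_HASSTREAMS" },
                    { VNAMED_UBC, "VNAMED_UBC" },       { VNAMED_MOUNT, "VNAMED_MOUNT" },
                    { VNAMED_FSHASH, "VNAMED_FSHASH" },
            };
            uint16_t lflag = 0xc00e;   /* v_lflag from the dump above */

            /* Print the name of every bit that is set. */
            for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++)
                    if (lflag & bits[i].bit)
                            printf("%s\n", bits[i].name);
            return 0;
    }

If those values are right, 0xc00e reads as VL_DRAIN | VL_TERMINATE | VL_TERMWANT | VNAMED_MOUNT | VNAMED_FSHASH, which at least is consistent with the thread above sitting in vnode_reclaim_internal/vnode_drain.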
Although, vflush() has:
if (((vp->v_usecount == 0) ||
((vp->v_usecount - vp->v_kusecount) == 0))) {
[snip]
vnode_reclaim_internal(vp, 1, 1, 0);
So that is confusing: the dump shows v_usecount = 2 and v_kusecount = 1, so
neither of those conditions holds now, yet here we are inside
vnode_reclaim_internal(). But I will bow out here.
If the panic dump is still wanted, say the word and I'll go file a radar, or
just make it available for download.
Lund
Vivek Verma wrote:
> Can you file a radar for this (with preferably the kernel core dump attached) ? I would like see some vnode state which isn't in just the stack trace.
>
>
--
Jorgen Lundman | <email@hidden>
Unix Administrator | +81 (0)90-5578-8500
Shibuya-ku, Tokyo | Japan