Re: Kernel bug in handling signals (bug 15615281)
https://code.google.com/p/virgil/

The binary in question is generated by my compiler from the test program in test/execute/rtex_divzero06.v3

% bin/v3c -target=x86-darwin-test -output=/tmp/ test/execute/rtex_divzero06.v3

There is some assembly language linkage that is generated by the compiler itself that sets up the default signal handlers. That assembly is generated in

https://code.google.com/p/virgil/source/browse/aeneas/src/x86/X86Darwin.v3

in genSignalHandlerInstall()

That code is basically:

    // generate code that installs a signal handler
    def genSigHandlerInstall(asm: X86Assembler, signo: int, handler: Addr) {
        asm.push_i(0); // sa_flags
        asm.push_i(0); // sa_mask
        asm.push_i(X86Addrs.ABS_CONST); // sa_handler: handler address
        recordPatch(asm, handler);
        asm.push_i(2); // TODO: why a nonzero value here?
        asm.movd_rm_r(X86Regs.EBX, X86Regs.ESP);
        asm.push_i(0); // sigaction *oact
        asm.push(X86Regs.EBX); // sigaction *act
        asm.push_i(signo); // signal number
        asm.push_i(0); // "dummy" value
        asm.movd_rm_i(X86Regs.EAX, 46); // sigaction
        asm.intK(0x80);
        asm.add.rm_i(X86Regs.ESP, 32); // pop params off stack
    }

As you can see there is some hackery going on there. I wasn't exactly sure what I should be passing to the kernel, since it has been a pain to trace through exactly how all the structs are encoded. This all works fine on 10.6.

I use the signal handlers to catch access violations (e.g. a null pointer deref by the program generates a SIGSEGV, by design of my address layout), and then generate a stacktrace.

Perhaps I am misusing the syscall. But still, I should *not* be able to crash the kernel, and the behavior certainly shouldn't depend on the 64-bitness of the forking shell (!). As noted above, when forked from a 32-bit shell, then it all works as expected.

Summary:
My compiler generates programs that handle signals directly from the kernel without going through any of libc. Thus these programs use the kernel system calls to set up signal handlers.
There appears to be a bug on 10.8 and later (64 bit kernels), where such programs hang when trying to handle any signals. This only appears to be the case when the program handling the signal is 32bit and is forked from a 64bit shell. See steps below for test case reproduction.

Steps to Reproduce:
1. Download the attached executable.

2. Run the executable from a 32-bit shell:

% arch -arch i386 sh -c './rtex_divzero06 1'

This should produce the expected output:

!DivideByZeroException

3. Run the executable from a 64-bit shell:

% arch -arch x86_64 sh -c './rtex_divzero06 1'

This unfortunately hangs on 10.8 and 10.9. Worse, on 10.9, it causes a kernel panic when trying to kill the program when it is launched directly from the shell:

% ./rtex_divzero06 1
<hangs>
<CTRL+C> causes a kernel panic

Expected Results:
See above. Program should print !DivideByZeroException.

Actual Results:
Program hangs on 10.8 and 10.9. When CTRL+C'ing the program, causes a kernel panic on 10.9.

Version:
10.8
10.9

% uname -a
Darwin <machine> 13.0.0 Darwin Kernel Version 13.0.0: Thu Sep 19 22:22:27 PDT 2013; root:xnu-2422.1.72~6/RELEASE_X86_64 x86_64

Notes:
The program is a compiled test case from the Virgil programming language: https://code.google.com/p/virgil/

Configuration:
This always occurs on 10.8 and 10.9. It never occurs on 10.6. Have not tested on 10.7.
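
For comparison, here is a minimal C sketch of roughly the same handler setup done through libc's sigaction() rather than the raw system call. It is not the code my compiler emits. The names on_signal, install_handler, last_signal and demo are made up for illustration, and the message the handler prints is simply copied from the expected output above. The sketch delivers SIGFPE with raise() instead of an actual division by zero, so that returning from the handler is well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: records the most recent signal seen by the handler. */
volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: uses only async-signal-safe calls (write). */
void on_signal(int signo) {
    static const char msg[] = "!DivideByZeroException\n";
    last_signal = signo;
    ssize_t n = write(STDOUT_FILENO, msg, sizeof msg - 1);
    (void)n; /* nothing useful to do on a short write inside a handler */
}

/* Install on_signal for signo through libc; returns 0 on success, -1 on error. */
int install_handler(int signo) {
    struct sigaction act;
    memset(&act, 0, sizeof act);          /* sa_flags = 0 */
    sigemptyset(&act.sa_mask);            /* sa_mask = empty set */
    act.sa_handler = on_signal;           /* sa_handler = handler address */
    return sigaction(signo, &act, NULL);  /* oact = NULL: old action not needed */
}

/* Hypothetical driver: install the handler, deliver SIGFPE, report what arrived. */
int demo(void) {
    if (install_handler(SIGFPE) != 0) {
        return -1;
    }
    raise(SIGFPE); /* the handler runs before raise() returns */
    return (int)last_signal;
}
```

Calling demo() from main() should print the message once and return SIGFPE. Going through libc this way lets the library fill in whatever the kernel-facing struct needs, which is exactly the part the raw-syscall version above has to get right by hand.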
participants (1)
-
Ben L. Titzer