Re: Getting last data from child process in Leopard
On Mar 26, 2008, at 8:36 AM, Ingemar Ragnemalm wrote:
Steve Checkoway wrote:

I suppose poll() can save some processing, at least if you are polling a whole bunch of fds, but calling poll() in a tight loop should be little or no better than read().

That is what select(2) is for.

Rather than spam the list with my version of your code, you can get it here: <http://pahtak.org/~steve/child.c>. I would have just posted a diff like Jordan did, but I couldn't handle your inconsistent spacing.

Thanks, but... you do the fflush() trick in the child, which sure helps if you have control over the child, but that is not the case, so it is one of the things to avoid.

I did that because stdio functions like fwrite() and printf() do not flush. Actually, printf() flushes when it encounters \n as it turns out, but I don't know if that is behavior you want to depend on. I could have used write() instead, which wouldn't have been buffered.

So:
- You can't play the "read as mad" trick that Jordan K. Hubbard suggested, because it will kill performance. It will usually help though.
- You can't add either an fflush() or a sleep() in the child. If the child is a single printf, it should work. And if the child happens to be GCC (which it will be), it should work without recompiling GCC with a ton of fflush() calls.

If the child is using buffered IO, it's going to be buffered no matter what you do.

It seems to work correctly for me written that way. The signal handler in my version does nothing but set a flag, and the while loop ends not when the handler fires, but when read() returns 0, i.e., end of file.

It seems to me that the signal doesn't do anything meaningful in your code?

That's right, it merely notes when the child dies and prints out a notice to that effect. I didn't manage to get the signal to interrupt the read(), but that's probably because of the nonblocking IO.

I'm not sure why you think you need unbuffered IO to avoid deadlocks or to deliver data quickly.

Here is a citation (Stevens & Rago, page 680):

"In the coprocess example... we couldn't invoke a coprocess... because when we talked to the coprocess across a pipe, the standard I/O library fully buffered the standard input and standard output, leading to a deadlock. If the coprocess is a compiled program for which we don't have the source, we can't add fflush()... What we need to do is to place a pseudo terminal between the two processes..."

In my testing (go ahead, remove your \n in the printf statements), it was being flushed regardless.

I actually removed the \n first because it was screwing up your nicely planned <%s> output format. Then I was a bit surprised at first to notice that I didn't get any data at all until it slept.

So I know pretty well why I "think" that I need this. And in Tiger, this is 100% true. I was helpless until I found the ptys, and then it worked, smooth and reliable. Until now.

As you probably noticed, printf() on the child side was buffering anyway.

With a pty, it shouldn't buffer the pipe to a deadlock. That's the point. But maybe that is the problem, that ptys in Leopard buffer data when Tiger does not.

Could be, I no longer have Tiger around to try. Then again, Leopard and the version of Linux I was using before both buffer the printf() if I remove the fflush(). The difference is that Linux reports that the child quit and then reads 16 bytes, "hello!there!BYE!", whereas Leopard loses the data.
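The approach described above (a SIGCHLD handler that only sets a flag, a loop that waits in select(2) rather than spinning, and termination on read() returning 0) might be sketched roughly as follows. This is not the code from child.c. It uses a plain pipe instead of a pty so that it builds without -lutil, and the names collect_child_output, on_sigchld and child_exited are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std=c99 */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler; informational only, never used to stop reading. */
static volatile sig_atomic_t child_exited = 0;

/* Async-signal-safe: no stdio, just record that SIGCHLD arrived. */
static void on_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

/* Hypothetical helper: forks a child that writes msg to a pipe and exits
 * immediately, then reads until EOF. Returns the number of bytes collected
 * in buf (at most cap), or -1 if setup fails. */
static ssize_t collect_child_output(const char *msg, char *buf, size_t cap)
{
    assert(msg != NULL && buf != NULL);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* Install the handler before fork() so an early exit is not missed. */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       /* child: write once, exit at once */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }
    close(fds[1]);                        /* parent keeps only the read end */

    size_t total = 0;
    while (total < cap) {
        fd_set rset;
        FD_ZERO(&rset);
        FD_SET(fds[0], &rset);
        /* Block until the descriptor is readable instead of spinning. */
        if (select(fds[0] + 1, &rset, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;                 /* SIGCHLD interrupted select() */
            break;
        }
        ssize_t n = read(fds[0], buf + total, cap - total);
        if (n > 0)
            total += (size_t)n;
        else if (n == -1 && (errno == EINTR || errno == EAGAIN))
            continue;
        else
            break;                        /* 0 is EOF; other errors also stop */
    }
    close(fds[0]);

    /* Reap the child so it does not linger as a zombie. */
    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        ;
    return (ssize_t)total;
}
```

Calling collect_child_output("hello!there!BYE!", buf, sizeof buf) should return 16 even though the child exits right after writing, because the loop stops on end of file and not on the signal. With a pty in place of the pipe, the Linux variant mentioned in this thread would also have to treat -1 with errno set to EIO as the end of the stream.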
This must be the loss you were describing. Maybe that is a bug, I'm not in a position to say. Jordan or Terry know far more than I.

But there is one interesting detail here: Your output is indeed buffered, so it arrives all at one time in the end (in Tiger), while my code is delivered unbuffered. That is a pretty significant difference that I can't explain yet.

Why are you waiting for the child with wait()? When the signal works (which it does if put after the forkpty()) then there is nothing to wait for. And when you get the EIO, isn't that a safe sign too?

I'm waiting because my child terminated and I wanted to clean it up rather than leave the process hanging around until the parent process terminates, at which time the child gets cleaned up by something else. The signal does work for me. I put the signal first because otherwise there is a race condition. What happens if the child terminates before you've set up the handler? You'd get no signal.

As a side note, this runs on (at least this particular flavor of) Linux as well with the small changes (included in the file) to include <pty.h> instead of <util.h> and to break out of the loop when read() returns -1 and errno is EIO.

It is all built from common Unix calls so it should work. Didn't link in RHEL though, and that's the only Linux I have easily accessible.

I don't have Red Hat, but it worked for me on both Ubuntu (ppc) and Fedora (x86). Did you remember to pass -lutil to gcc?

Here is an even shorter and simpler version, no read() in the signal, same problems: http://www.ragnemalm.se/stuff/ptytrouble-smaller.c

You're still using buffered IO in the signal handler. You simply cannot do that. Read the sigaction(2) man page.

I still use signals. I can't really see the checks for EAGAIN, EINTR and EIO making much of a difference, but since our programs do behave differently, I will pursue that a bit more, and see what termination on EIO gives me. Anyway, they don't remove the main problem, the lost data.
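To make the EAGAIN, EINTR and EIO checks discussed above concrete, here is a small sketch of how the result of one read() call could be sorted into cases. The enum and the function name classify_read are invented for this example and do not come from either program in the thread. Treating EIO as "pty closed" follows the Linux behavior described above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical outcome labels for one read() call on the pty master. */
enum read_outcome { GOT_DATA, END_OF_FILE, TRY_AGAIN, PTY_CLOSED, HARD_ERROR };

/* n is the value read() returned; err is errno sampled right after the call.
 * err is only consulted when n is -1, since errno is meaningless otherwise. */
static enum read_outcome classify_read(ssize_t n, int err)
{
    assert(n >= -1);                 /* read() never returns less than -1 */
    if (n > 0)
        return GOT_DATA;             /* n bytes arrived */
    if (n == 0)
        return END_OF_FILE;          /* writer closed its end */
    if (err == EINTR || err == EAGAIN)
        return TRY_AGAIN;            /* interrupted, or nothing ready yet */
    if (err == EIO)
        return PTY_CLOSED;           /* slave side gone (seen on Linux) */
    return HARD_ERROR;               /* anything else is a real failure */
}
```

A read loop would keep going on GOT_DATA and TRY_AGAIN and stop on the other three. The point is that -1 and 0 are never handled by the same branch.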
You're treating an error return (-1) the same as end of file (0), and in both cases you do nothing but sleep. You're still using the signal handler as an indication that there is no more data to read. At least with Linux, that is simply not true: when I remove the fflush(), I get SIGCHLD before read() returns the data.

I tried rewriting this to use pipe(2) and select(2). Everything seems to be working except that dup2(2) and close(2) are returning strange values on the child side of things. dup2() is failing with errno set to 0 and yet seeming to succeed. close() is failing with errno set to 22, but the argument looks correct to me. At the very least, dup2() shouldn't be returning -1, since I'm printing out the error messages to stdout and the parent is reading the error messages. The code is here: <http://pahtak.org/~steve/child2.c>. I'm probably just doing something wrong and I'm not noticing it.
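The thread does not show how child2.c checks these calls, so what follows is only a guess at one common cause of such symptoms, not a diagnosis. On success dup2() returns the new descriptor number (1 when redirecting standard output), not 0, and errno is only meaningful after a call has actually returned -1. A test such as "!= 0" would therefore report a failure while errno is still 0. The helper name redirect_fd is made up for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: make newfd refer to the same file as oldfd, then
 * close oldfd. Returns 0 on success, -1 on failure with errno set. */
static int redirect_fd(int oldfd, int newfd)
{
    assert(newfd >= 0);
    /* Compare against -1 only: a successful dup2() returns newfd itself. */
    if (dup2(oldfd, newfd) == -1)
        return -1;
    /* The original descriptor is no longer needed once it is duplicated. */
    if (oldfd != newfd && close(oldfd) == -1)
        return -1;
    return 0;
}
```

In a child process this would be called as redirect_fd(pipe_write_end, STDOUT_FILENO) before exec, and errno would be printed only when the helper returns -1.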
participants (1)
-
Steve Checkoway