Re: NSTask/NSPipe STDIN hangs on large data... (+ code question)
- Subject: Re: NSTask/NSPipe STDIN hangs on large data... (+ code question)
- From: "Joe Pezzillo" <email@hidden>
- Date: Mon, 20 Jan 2003 21:16:49 -0700
Daryn-
Thanks for your insightful reply.
Based on your info regarding the pipe buffer size, I tried yet another
approach to the problem: chunking the STDIN data into smaller pieces and
doing a writeData + close after each chunk to flush the input pipe. However,
even if I get the Pipe and its fileHandleForWriting again before each write,
once I've closed the handle I can't get it back, so the first chunk gets
written and then an exception is raised. Do you have another idea for how I
could work around this 8KB buffer so I can send more data to (certain) UNIX
commands via an NSTask/NSPipe? And how would I do the output polling you
mention (quoted below) other than via readInBackgroundAndNotify or a blocking
read loop? If the pipe hangs during the write itself, I'll never get a chance
to poll for any output on STDOUT.
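
For reference, here is one way the write-side hang can be sidestepped. It is
a minimal sketch in Python rather than Cocoa, on the assumption that the hang
comes from the UNIX pipes underneath NSTask/NSPipe: the child's output pipe
fills up while the parent is still blocked writing input. The function name
filter_through is hypothetical. The input is written on a second thread so
that the calling thread is free to drain STDOUT at the same time.

```python
import subprocess
import threading
from contextlib import suppress


def filter_through(argv, data):
    """Run argv, feed it data on stdin, and return everything it writes to stdout."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs on its own thread, so a full pipe blocks only this thread.
        with suppress(BrokenPipeError):  # the child may exit before reading everything
            proc.stdin.write(data)
        with suppress(BrokenPipeError):
            proc.stdin.close()  # closing stdin is what signals end-of-input

    writer = threading.Thread(target=feed)
    writer.start()
    output = proc.stdout.read()  # drain stdout while the writer is still feeding
    writer.join()
    proc.stdout.close()
    proc.wait()
    return output
```

A call such as filter_through(["grep", "GET"], data) would then return grep's
full output regardless of how large data is. The same shape (write on a
secondary thread, read on the other) may carry over to NSTask/NSPipe, but
that is untested here.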
Luckily, I've also discovered another clue, but I haven't quite figured out
how to use it to my advantage yet. Get this: if the data sent to the
Task/Pipe comes from certain files, it works even when those files are large.
I suspect something related to the difference between Mac and UNIX line
endings, but I haven't been able to confirm it. (I've tried tacking a
trailing 0A or 0D0A onto the STDIN data just before writing if it wasn't
already there, but it still hangs. I also tried looking for a trailing
0/NULL, but it wasn't there on the successful runs.) I discovered this while
building a test app to share, when I wanted to offer the option of testing
against a known data source instead of just the random data I had been
using. If I feed grep "/usr/share/dict/words" (about 2.4MB), loaded via
NSString's stringWithContentsOfFile: using my Task/Pipe handler, it works! If
I feed it "/var/log/httpd/access_log" (about 1.5MB), loaded the same way, it
hangs. My original random data tests use a randomized NSString assemblage
with CRLFs in it every 80 chars or so.
So, that said, I'm also glad to be able to report that I've written a
workaround, at least for grep, that appears to function (and tests OK).
Instead of chunking the STDIN after the task is created, I chunk the STDIN
before creating the task, and then run as many tasks as it takes to handle
the entire STDIN in the smaller chunks (instantiating and releasing each one
as I go). [This may have been what you were implying below with the "new
process" solution instead of what I tried (above), and in any event, I credit
you with forcing me to think about how I could craft a workaround using a new
NSTask each time. THANKS!]
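
The one-task-per-chunk idea can be sketched roughly as follows. This is
Python, not the Cocoa code from the project stub, and the names
split_on_lines and filter_in_chunks plus the 4096-byte default are
assumptions made for illustration. Each chunk is cut at the last newline
before the size limit, and a fresh process handles each chunk with a plain
write, close, then read. That sequence is only safe on the assumption that
both the chunk and the command's output for it fit in the pipe buffers.

```python
import subprocess
from contextlib import suppress


def split_on_lines(data, limit):
    """Yield pieces of data of at most limit bytes, cut after a newline where possible."""
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        if end < len(data):
            newline = data.rfind(b"\n", start, end)
            if newline != -1:
                end = newline + 1  # cut just after the last newline before the limit
        yield data[start:end]
        start = end


def filter_in_chunks(argv, data, limit=4096):
    """Run a new process per chunk and concatenate the outputs in order."""
    results = []
    for chunk in split_on_lines(data, limit):
        proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        with suppress(BrokenPipeError):  # the child may exit without reading its input
            proc.stdin.write(chunk)
        with suppress(BrokenPipeError):
            proc.stdin.close()  # end-of-input lets the command flush its output
        results.append(proc.stdout.read())
        proc.stdout.close()
        proc.wait()
    return b"".join(results)
```

For a line-by-line filter such as grep, filter_in_chunks(["grep", "7$"], data)
should give the same result as one run over all of data, at the cost of
starting a process per chunk.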
One reason it may only be good for grep, and not for another command like
uniq, is that I'd need to bridge the "uniq" function across chunk
boundaries, whereas the current workaround simply splits at the last line
boundary before the chunk size, without regard for the contents of that
chunk or its neighbors.
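
One possible way to bridge plain uniq across chunk boundaries is sketched
below, again in Python and purely as an illustration. The name
merge_uniq_outputs is hypothetical, and it assumes that every chunk ends on a
line boundary and that uniq is run without options such as -c, -d or -u.
Since each chunk's output already has no adjacent duplicates, the only place
a duplicate can survive is where one output ends and the next begins.

```python
def merge_uniq_outputs(outputs):
    """Join per-chunk `uniq` outputs, dropping a line repeated across a chunk boundary."""
    merged = []
    for out in outputs:
        lines = out.splitlines(keepends=True)
        if merged and lines and lines[0] == merged[-1]:
            lines = lines[1:]  # the same line ended the previous chunk's output
        merged.extend(lines)
    return b"".join(merged)
```

Given the outputs b"a\nb\n" and b"b\nc\n" from two consecutive chunks, the
repeated "b" at the boundary is kept only once.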
I also agree with your assessment that PTYs don't seem like they'd be worth
the effort, plus they seem a little kludgy to me when NSTask/NSPipe is the
recommended method and supposed to work, presumably consistently.
And, perhaps best of all for the list, I have a project stub of this problem
(including my workaround and the two chunking write methods and some fully
working examples of other commands) ready to post to the list. I think it
might also be yet another useful introduction for others on how to use
NSTask/NSPipe both synchronously and asynchronously (of course, I learned it all
from the existing books and sample code, and I think anyone else can, too).
However, before I do so, this being my first time posting code or software
here, what is the policy (if any) on such a post? I'm presuming I'm going to
include a license to protect myself should this code crash anyone's machine
or otherwise run afoul. Is that uncouth here? I could restrict it to a
smaller set of routines and excerpts that I post in e-mail, but that doesn't
seem as useful.
Thanks!
Joe
email@hidden
PS - I also finally ran many of the same tests on this same program on the
10.2.3 machine with the same results.
On 1/18/03 7:41 PM, "Daryn" <email@hidden> wrote:
> Using pipes to write to and read from a command often produces deadlock
> unless great care is taken. The reason that some commands works on the
> cmdline but not in your program is that many commands will line buffer
> output to a tty (interactive), but block buffer their output to a pipe.
> This is why grep and uniq are causing problems.
>
> In particular, grep will use an 8KB buffer for pipe output. Until that
> buffer fills, it won't be flushed unless the input pipe is explicitly
> closed. Thus a single instantiation of a grep process is often not
> useful as a general filtering mechanism. One solution is to
> instantiate a new process each time filtering is required, send the
> data on the input pipe, close the input pipe. Even then deadlock may
> result unless the output pipe is polled after each input line is sent.
>
> Psuedo-terminals (ptys) are another approach that can trick commands
> like grep into using line buffered output. Feel free to google for
> details and then whether it's worth the effort. The unix command set
> often tries to be ultra-efficient with its buffering, sometimes to the
> point of crippling its own usefulness.
>
> On Friday, January 17, 2003, at 12:22 PM, Joe Pezzillo wrote:
>
> > Bill-
> >
> > Thanks for the prompt reply, that looks like some very useful code,
> > too...I hadn't yet considered (or desired) porting, but it's good to
> > know that it can be done!
> >
> > Sadly, yes, I am already doing readInBackgroundAndNotify, at least on
> > the asynch version. The synchronous version uses readDataToEndOfFile.
> >
...
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.