Re: Piping the result of a Unix command
- Subject: Re: Piping the result of a Unix command
- From: "Alastair J.Houghton" <email@hidden>
- Date: Fri, 3 Oct 2003 14:39:31 +0100
On Friday, October 3, 2003, at 01:06 pm, Lorenzo wrote:
> I have just put the unix command "pdftotext" on the root of my boot
> disk.
You really should use your home directory rather than the root of your
disk ;-)
> So I try:
>
> NSArray *args = [NSArray arrayWithObjects:filePath, @"-", nil];
> NSTask *task = [[NSTask alloc] init];
> NSPipe *thePipe = [NSPipe pipe];
> [task setStandardOutput:thePipe];
> [task setLaunchPath:@"/pdftotext"];
> [task setArguments:args];
> [task launch];
> [task waitUntilExit];
> NSData *dataOut = [[[task standardOutput] fileHandleForReading]
>     availableData];
>
> and my program freezes.
Your program will freeze on the call to -waitUntilExit, because the
pipe will fill up (at least, it will if your PDF has a non-trivial
amount of text in it), at which point the kernel suspends pdftotext
until your program reads data from the pipe. Unfortunately, your
program never gets that far: it is waiting for pdftotext to exit, which
it won't do because it hasn't finished writing its output yet.
Try doing
...
[task launch];
NSData *dataOut = [[thePipe fileHandleForReading] readDataToEndOfFile];
[task waitUntilExit];
...
instead.
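If it helps to see why the order matters, here is roughly the same
thing one level down, using pipe() and fork() directly. This is only a
sketch in plain C (which also compiles as Objective-C): the function
name drain_then_wait and the child that just writes filler bytes are
made up for illustration, standing in for pdftotext, and I am not
claiming this is how NSTask is implemented internally.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawn a child that writes `total` bytes into a
   pipe, drain the pipe to end-of-file, and only then wait for the child.
   Returns the number of bytes read, or -1 on error. */
long drain_then_wait(size_t total)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: stands in for pdftotext. write() blocks whenever the
           pipe is full and nobody is reading. */
        char chunk[4096];
        memset(chunk, 'x', sizeof chunk);
        close(fds[0]);
        while (total > 0) {
            size_t n = total < sizeof chunk ? total : sizeof chunk;
            ssize_t w = write(fds[1], chunk, n);
            if (w <= 0)
                _exit(1);
            total -= (size_t)w;
        }
        close(fds[1]);
        _exit(0);
    }

    /* Parent: close our copy of the write end, otherwise read() would
       never report end-of-file. */
    close(fds[1]);

    long count = 0;
    char buf[4096];
    ssize_t r;
    while ((r = read(fds[0], buf, sizeof buf)) > 0)
        count += r;               /* read first ... */
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)   /* ... wait afterwards */
        return -1;
    if (r < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return count;
}
```

Moving the waitpid() call above the read loop gives you the same hang
as in your code as soon as the child writes more than the pipe can
buffer.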
BTW, although it probably isn't worth it in this case, it is sometimes
worth considering dealing with the data a bit at a time, rather than
all at once... for example, rather than doing
NSData *dataOut = [[thePipe fileHandleForReading] readDataToEndOfFile];
[task waitUntilExit];
/* Process data here */
you could do
NSFileHandle *theHandle = [thePipe fileHandleForReading];

for (;;) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  NSData *dataOut = [theHandle availableData];

  if (![dataOut length]) {
    /* End of file, so exit */
    [pool release];
    break;
  }

  /* Process the data in "dataOut" */

  [pool release];
}

[task waitUntilExit];
The code is slightly more complicated, and your processing might be a
little more awkward because you might have to keep some state (e.g. if
you were searching for a string, you'd have to keep track of the state
of your search so that you would detect strings if they got split over
more than one NSData). The advantage is that the latter technique will
work with very big files (even those many GB in size) without
allocating lots of memory and causing unnecessary swapping; the former
technique is only really appropriate for small- to medium-sized things
(perhaps a few hundred KB in size), things that you need to hold in
memory for some other reason (an image, for example), or for test
programs (where you aren't so bothered).
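To give an idea of what "keeping some state" might look like for the
string search case, here is one possible sketch in plain C (usable from
Objective-C as-is). It uses the Knuth-Morris-Pratt failure table so
that a partial match carries over from one chunk to the next. The
stream_search type, the function names and the 64-byte needle limit are
all invented for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NEEDLE 64   /* arbitrary limit for this sketch */

/* Hypothetical search state that survives between chunks. */
typedef struct {
    char   needle[MAX_NEEDLE];
    size_t fail[MAX_NEEDLE];  /* KMP failure table */
    size_t len;               /* length of the needle */
    size_t matched;           /* needle bytes matched so far */
    size_t hits;              /* complete matches seen so far */
} stream_search;

/* Returns 0 on success, -1 if the needle is empty or too long. */
int stream_search_init(stream_search *s, const char *needle)
{
    size_t len = strlen(needle);
    if (len == 0 || len > MAX_NEEDLE)
        return -1;

    memcpy(s->needle, needle, len);
    s->len = len;
    s->matched = 0;
    s->hits = 0;

    /* fail[i] = length of the longest proper prefix of the needle that
       is also a suffix of needle[0..i]. */
    s->fail[0] = 0;
    size_t k = 0;
    for (size_t i = 1; i < len; i++) {
        while (k > 0 && needle[i] != needle[k])
            k = s->fail[k - 1];
        if (needle[i] == needle[k])
            k++;
        s->fail[i] = k;
    }
    return 0;
}

/* Feed one chunk; matches may start in one chunk and end in another. */
void stream_search_feed(stream_search *s, const char *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        while (s->matched > 0 && data[i] != s->needle[s->matched])
            s->matched = s->fail[s->matched - 1];
        if (data[i] == s->needle[s->matched])
            s->matched++;
        if (s->matched == s->len) {
            s->hits++;                          /* overlapping matches count */
            s->matched = s->fail[s->len - 1];
        }
    }
}
```

In the loop above you would set up the state once before the for (;;)
and then call something like
stream_search_feed(&s, [dataOut bytes], [dataOut length]) where the
"Process the data" comment is.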
Kind regards,
Alastair.
_______________________________________________
cocoa-dev mailing list | email@hidden