Re: limits of objects
- Subject: Re: limits of objects
- From: Chris Hanson <email@hidden>
- Date: Wed, 20 Aug 2003 18:39:02 -0500
There are two separate issues here.
First, I don't know anything about AGRegex, but it appears that you're
asking it to find all matches at once. Don't do that. Iterate over the
matches one at a time instead. Your code will scale much better, since
it never has to hold every match in memory, *and* you'll be able to
present reasonably accurate progress information while it's going on.
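
A rough sketch of the difference, in Python rather than with AGRegex
(whose API is not known here). The standard `re.finditer` stands in for
whatever match enumerator your regex library offers; `iter_matches`,
`report_progress`, and the sample pattern are made up for illustration.

```python
import re

def iter_matches(pattern, text, report_progress=None):
    """Yield matches one at a time instead of building a full list."""
    total = len(text) or 1  # avoid dividing by zero on empty input
    for match in re.finditer(pattern, text):
        if report_progress is not None:
            # Fraction of the input scanned so far.
            report_progress(match.end() / total)
        yield match

# Handle each match as it arrives; nothing is accumulated by the iterator.
progress = []
words = []
for m in iter_matches(r"[a-z]+", "one two three", progress.append):
    words.append(m.group(0))
```

Because each match's end offset is known as soon as it is found, the
progress figure comes for free.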
Second, and more importantly, heed the warnings of those who say you
cannot do PDF parsing with regular expressions. I believe parsing PDF
requires state to be maintained, such as how deeply you are nested; a
regular expression is just a form of finite state machine and hence
can't keep additional state (unless you're using a form of extended
regular expression that implements a Turing machine, like those
available in Perl).
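
To make the state point concrete: as far as I know, PDF literal strings
may contain balanced, unescaped parentheses, so finding where one ends
means counting nesting depth. A minimal Python sketch, with a
hypothetical function name, that treats a backslash as escaping exactly
one character and ignores octal escapes and line continuations:

```python
def literal_string_end(data, start):
    """Return the index just past the ')' that closes the string opening at data[start]."""
    if data[start] != "(":
        raise ValueError("expected '('")
    depth = 0  # the extra state a plain finite-state pattern lacks
    i = start
    while i < len(data):
        ch = data[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unterminated string")
```

The `depth` counter can grow without bound, which is exactly what a
fixed set of states cannot represent.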
To parse PDF you need to write an actual parser, most likely a
recursive-descent parser, and use that to turn a PDF into a collection
of objects you can interrogate and manipulate. Fortunately PDF isn't
actually very difficult to parse this way; it's not as trivial as Lisp
S-expressions, but it's not as difficult as C.
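
A minimal sketch of what such a recursive-descent parser might look
like, in Python, for a small subset of PDF object syntax only: numbers,
names, booleans, null, arrays, and dictionaries. Strings, streams,
comments, and indirect references are left out, and every name here
(`Name`, `tokenize`, `parse_object`, and so on) is invented for the
example.

```python
import re

# Lexical tokens only; characters outside this subset are silently skipped.
TOKEN = re.compile(r"<<|>>|\[|\]|/[^\s/\[\]<>()]*|[+-]?(?:\d+\.?\d*|\.\d+)|[A-Za-z]+")

class Name(str):
    """A PDF name such as /Type, kept distinct from ordinary strings."""

def tokenize(text):
    return TOKEN.findall(text)

def parse_object(tokens, pos):
    """Parse one object at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "<<":
        return parse_dict(tokens, pos + 1)
    if tok == "[":
        return parse_array(tokens, pos + 1)
    if tok.startswith("/"):
        return Name(tok[1:]), pos + 1
    if tok in ("true", "false"):
        return tok == "true", pos + 1
    if tok == "null":
        return None, pos + 1
    try:
        return (float(tok) if "." in tok else int(tok)), pos + 1
    except ValueError:
        raise ValueError(f"unexpected token {tok!r}")

def parse_array(tokens, pos):
    items = []
    while pos < len(tokens) and tokens[pos] != "]":
        value, pos = parse_object(tokens, pos)  # recursion handles nesting
        items.append(value)
    if pos >= len(tokens):
        raise ValueError("unterminated array")
    return items, pos + 1

def parse_dict(tokens, pos):
    result = {}
    while pos < len(tokens) and tokens[pos] != ">>":
        key, pos = parse_object(tokens, pos)
        if not isinstance(key, Name):
            raise ValueError("dictionary keys must be names")
        value, pos = parse_object(tokens, pos)
        result[key] = value
    if pos >= len(tokens):
        raise ValueError("unterminated dictionary")
    return result, pos + 1

def parse(text):
    value, _ = parse_object(tokenize(text), 0)
    return value

page = parse("<< /Type /Page /MediaBox [0 0 612 792] /Kids [ << /Rotate 90 >> ] >>")
```

The result is an ordinary tree of dictionaries and lists, so
`page["MediaBox"]` or `page["Kids"][0]["Rotate"]` can be interrogated
directly. Note that the regular expression is used only to split the
input into tokens; the nesting is handled by the functions calling one
another.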
-- Chris
--
Chris Hanson, bDistributed.com, Inc. | Email: email@hidden
Custom Mac OS X Development | Phone: +1-847-372-3955
http://bdistributed.com/ | Fax: +1-847-589-3738
http://bdistributed.com/Articles/ | Personal Email: email@hidden
_______________________________________________
cocoa-dev mailing list | email@hidden