Re: limits of objects


  • Subject: Re: limits of objects
  • From: Chris Hanson <email@hidden>
  • Date: Wed, 20 Aug 2003 18:39:02 -0500

There are two separate issues here.

First, though I don't know anything about AGRegex, it appears that you're asking it to find all matches at once. Don't do that. Iterate over the matches one at a time instead. Your code will scale much better *and* you'll be able to present nice, reasonably accurate progress information while it's going on.
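As a rough illustration of iterating rather than collecting everything up front, here is a minimal sketch in Python. It uses Python's standard `re` module as a stand-in, since I can't speak to AGRegex's API; the names `iter_matches`, `report_every`, and `on_progress` are made up for this example.

```python
import re


def iter_matches(pattern, text, report_every=1000, on_progress=None):
    """Yield matches one at a time instead of building a full list.

    on_progress (hypothetical callback) receives a fraction between 0 and 1,
    estimated from how far into the text the latest match ends.
    """
    regex = re.compile(pattern)
    total = len(text) or 1  # avoid dividing by zero on empty input
    for count, match in enumerate(regex.finditer(text), start=1):
        if on_progress is not None and count % report_every == 0:
            on_progress(match.end() / total)
        yield match


# Usage: process each match as it arrives and record progress as we go.
text = "obj 1 obj 22 obj 333"
progress = []
numbers = [
    m.group(1)
    for m in iter_matches(r"obj (\d+)", text, report_every=1,
                          on_progress=progress.append)
]
```

Because the generator only holds one match at a time, memory use stays flat no matter how many matches the input contains, and the position of the latest match gives a cheap progress estimate.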

Second, and more importantly, heed the warnings of those who say you cannot do PDF parsing with regular expressions. I believe parsing PDF requires state to be maintained: its arrays and dictionaries can nest inside one another to any depth, for example. Regular expressions are just a form of finite state machine and hence can't track that additional state (unless you're using a form of extended regular expression that is more powerful than a finite state machine, like those available in Perl).

To parse PDF you need to actually write a parser, most likely a recursive-descent one, and use that to turn a PDF into a collection of objects you can interrogate and manipulate. Fortunately PDF isn't actually very difficult to parse this way; it's not as trivial as Lisp S-expressions, but it's not as difficult as C.
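To give a feel for the shape of such a parser, here is a minimal sketch in Python. It handles only a small subset of PDF object syntax (numbers, names, booleans, null, arrays, and dictionaries) and is nowhere near a real PDF parser: strings, streams, indirect references, and the cross-reference table are all left out. The `Name`, `Parser`, `tokenize`, and `parse` names are made up for this example.

```python
import re

# Tokens for a small subset of PDF object syntax (an assumption of this sketch).
TOKEN = re.compile(
    r"\s*(<<|>>|\[|\]|/[^\s/\[\]<>()]*|[+-]?(?:\d+\.?\d*|\.\d+)|true|false|null)"
)


class Name(str):
    """A PDF name such as /Type, kept distinct from other values."""


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return token

    def parse_object(self):
        token = self.next()
        if token == "<<":
            return self.parse_dict()
        if token == "[":
            return self.parse_array()
        if token in (">>", "]"):
            raise ValueError(f"unexpected {token!r}")
        if token.startswith("/"):
            return Name(token[1:])
        if token in ("true", "false"):
            return token == "true"
        if token == "null":
            return None
        return float(token) if "." in token else int(token)

    def parse_array(self):
        items = []
        while self.peek() != "]":
            items.append(self.parse_object())  # recursion handles nesting
        self.next()  # consume "]"
        return items

    def parse_dict(self):
        result = {}
        while self.peek() != ">>":
            key = self.parse_object()
            if not isinstance(key, Name):
                raise ValueError("dictionary keys must be names")
            result[key] = self.parse_object()  # values may nest arbitrarily
        self.next()  # consume ">>"
        return result


def parse(text):
    parser = Parser(tokenize(text))
    obj = parser.parse_object()
    if parser.peek() is not None:
        raise ValueError("trailing input after object")
    return obj


# Usage: nested arrays and dictionaries come back as ordinary Python objects.
page = parse("<< /Type /Page /MediaBox [0 0 612 792] /Kids [ << /Count 2 >> ] >>")
```

The call stack is the extra state a plain regular expression lacks: each nested `[` or `<<` becomes one more recursive call, so the parser always knows which container it is inside and when that container has been closed.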

-- Chris

--
Chris Hanson, bDistributed.com, Inc. | Email: email@hidden
Custom Mac OS X Development | Phone: +1-847-372-3955
http://bdistributed.com/ | Fax: +1-847-589-3738
http://bdistributed.com/Articles/ | Personal Email: email@hidden
