Re: is there a way to extract text from pdfs?

  • Subject: Re: is there a way to extract text from pdfs?
  • From: Marcel Weiher <email@hidden>
  • Date: Fri, 14 Mar 2003 21:19:25 +0100

On Thursday, March 13, 2003, at 11:09, Ben Dougall wrote:

> Is there any way to give your Cocoa app the capability to extract text from already existing PDFs? Strip out all the PDF-related/embedded info and just get the human-readable text out?

Install TextLightning (http://www.metaobject.com/). It installs as a filter service that automagically converts PDF to RTF for you. In your code, you just have to use one of the "from RTF" methods in NSAttributedString. When given a PDF file, the OS X services system will invoke the filter service and hand you the converted RTF.
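For illustration, here is a minimal sketch of one way to ask the filter-services machinery for RTF explicitly and then read it with a "from RTF" initializer. It assumes a PDF-to-RTF filter service (such as TextLightning) is installed; without one, no RTF data will be offered and the function returns nil. The helper name `plainText(fromFilteredFile:)` and the file path are made up for this example and are not from the post above.

```swift
import AppKit

// Hypothetical helper (an assumption for this sketch): asks the pasteboard
// filter services to supply the file in every type they can convert it to,
// then reads the RTF flavor, if any service offered one.
func plainText(fromFilteredFile path: String) -> String? {
    // Creates a uniquely named pasteboard backed by the available filter services.
    let pasteboard = NSPasteboard(byFilteringFile: path)

    // No installed PDF-to-RTF filter means no RTF data here.
    guard let rtfData = pasteboard.data(forType: .rtf),
          let attributed = NSAttributedString(rtf: rtfData, documentAttributes: nil)
    else {
        return nil
    }

    // Drop the attributes and keep only the characters.
    return attributed.string
}

// Usage with a placeholder path:
if let text = plainText(fromFilteredFile: "/path/to/document.pdf") {
    print(text)
} else {
    print("No filter service produced RTF for this file.")
}
```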

Incidentally, it's not really a matter of "stripping out" unneeded PDF info; it is more a task of reconstructing a text flow from clues left in the PDF.
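To make "reconstructing a text flow from clues" a little more concrete, here is a toy sketch. It is not how TextLightning works internally; it only assumes that a PDF hands you positioned fragments of text in arbitrary order, and that line breaks and word spaces have to be inferred from coordinates. The `TextRun` type, the function name, and the tolerance values are all invented for the example.

```swift
// A positioned fragment of text, roughly as a PDF page might place it.
// Assumed coordinates: origin at bottom-left, y grows upward.
struct TextRun {
    let x: Double      // left edge of the fragment
    let y: Double      // baseline position
    let width: Double  // horizontal advance of the fragment
    let text: String
}

// Hypothetical reconstruction: group fragments into lines by baseline,
// order each line left to right, and insert a space where the horizontal
// gap between neighbouring fragments is larger than `spaceGap`.
func reconstructText(from runs: [TextRun],
                     lineTolerance: Double = 2.0,
                     spaceGap: Double = 1.5) -> String {
    // Highest baseline first, i.e. top of the page first.
    let byHeight = runs.sorted { $0.y > $1.y }

    // Fragments whose baseline is close to the line's first fragment share a line.
    var lines: [[TextRun]] = []
    for run in byHeight {
        if let last = lines.last, let ref = last.first,
           abs(ref.y - run.y) <= lineTolerance {
            lines[lines.count - 1].append(run)
        } else {
            lines.append([run])
        }
    }

    return lines.map { line -> String in
        var out = ""
        var previousEnd: Double? = nil
        for run in line.sorted(by: { $0.x < $1.x }) {
            // A visible gap is the only clue that a word space was intended.
            if let end = previousEnd, run.x - end > spaceGap {
                out += " "
            }
            out += run.text
            previousEnd = run.x + run.width
        }
        return out
    }.joined(separator: "\n")
}

// Fragments arrive out of order and split mid-word, as often happens in PDFs.
let sampleRuns = [
    TextRun(x: 100, y: 700.0, width: 30, text: "world"),
    TextRun(x: 50,  y: 700.5, width: 20, text: "Hel"),
    TextRun(x: 70,  y: 700.0, width: 12, text: "lo"),
    TextRun(x: 50,  y: 680.0, width: 40, text: "again"),
]
print(reconstructText(from: sampleRuns))
```

With the sample fragments above, "Hel" and "lo" touch and are joined into one word, "world" follows after a gap, and "again" lands on its own line. Real documents add columns, rotated text, kerning, and custom encodings, which is why the heuristics in an actual converter are far more involved.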

Marcel (creator of TextLightning)
--
Marcel Weiher Metaobject Software Technologies
email@hidden www.metaobject.com
Metaprogramming for the Graphic Arts. HOM, IDEAs, MetaAd etc.