Re: Extracting Text from PDF
  • Subject: Re: Extracting Text from PDF
  • From: Philip Aker <email@hidden>
  • Date: Wed, 29 Oct 2008 11:05:57 -0700

On 2008-10-29, at 10:57:28, Bill Janssen wrote:

To extract text, pdftotext is just the right tool.

Ok. But not my question. :-(


Bill

Philip Aker <email@hidden> wrote:

On 2008-10-29, at 10:28:33, Bill Janssen wrote:

Dan Doughtie <email@hidden> wrote:

Scripting Acrobat is pretty messy. Are there any command-line apps that can
extract text from a PDF, similar to extracting text from a Word document
with textutil?

I use xpdf to do that; it includes a command-line app, 'pdftotext', which
will spit out the text of the document. I've written patches for it, as
well, to spit out wordbox info and to spit out links in the doc. Those
patches are included in the doceng-toolkit project on SourceForge.
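
For anyone who wants to drive pdftotext from a script rather than by hand, here is a minimal sketch in Python. It assumes pdftotext from xpdf is installed and on the PATH; the helper names build_pdftotext_cmd and extract_text are hypothetical and not part of xpdf or of the patches mentioned above. The flags used (-enc, -layout, -f, -l) and the "-" output name for stdout are standard pdftotext options.

```python
import subprocess
from typing import List, Optional


def build_pdftotext_cmd(pdf_path: str,
                        out_path: str = "-",
                        layout: bool = False,
                        first_page: Optional[int] = None,
                        last_page: Optional[int] = None) -> List[str]:
    """Build an argument list for pdftotext (hypothetical helper).

    out_path "-" tells pdftotext to write the text to stdout.
    """
    # Ask for UTF-8 output so the caller can decode it predictably.
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if layout:
        # Try to preserve the physical page layout of the text.
        cmd.append("-layout")
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    cmd += [pdf_path, out_path]
    return cmd


def extract_text(pdf_path: str, layout: bool = False) -> str:
    """Run pdftotext on pdf_path and return the extracted text.

    Raises subprocess.CalledProcessError if pdftotext exits with an error,
    and FileNotFoundError if pdftotext is not installed.
    """
    cmd = build_pdftotext_cmd(pdf_path, "-", layout=layout)
    result = subprocess.run(cmd, capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.pdf") would return the document text as a string, where report.pdf is a placeholder file name. Keeping the command builder separate from the call that runs it makes the argument list easy to inspect without having pdftotext installed.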

Hey Bill,

I compiled xpdf3.0.2 out of the box just fine. However, I'm not able to
get the extra pdftohtml-0.40a to install because of some problem that
I can't easily deduce, since I'm not really familiar with the ins and
outs of autoconf, make, etc.


Do you know how to get that module to integrate with xpdf on Mac OS X?
Sounds like it would be a solution with better options than pdftotext.




Philip Aker
echo email@hidden@nl | tr a-z@. p-za-o.@

Democracy: Two wolves and a sheep voting on lunch.
