Re: Extracting Text from PDF
- Subject: Re: Extracting Text from PDF
- From: Philip Aker <email@hidden>
- Date: Wed, 29 Oct 2008 11:05:57 -0700
On 2008-10-29, at 10:57:28, Bill Janssen wrote:
To extract text, pdftotext is just the right tool.
Ok. But not my question. :-(
Bill
Philip Aker <email@hidden> wrote:
On 2008-10-29, at 10:28:33, Bill Janssen wrote:
Dan Doughtie <email@hidden> wrote:
Scripting Acrobat is pretty messy. Are there any command line apps that can extract text from a PDF, similar to extracting text from a Word document with textutil?
I use xpdf to do that; it includes a command-line app 'pdftotext' which will spit out the text of the document. I've written patches for it, as well, to spit out wordbox info and to spit out links in the doc. Those patches are included in the doceng-toolkit project on SourceForge.
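For what it's worth, here is a minimal sketch of driving pdftotext from a script, assuming the xpdf tools are installed and on the PATH. The helper names (build_pdftotext_command, extract_text) and the file name are made up for illustration and are not part of xpdf. The options used (-enc, -layout, -f, -l) are stock pdftotext options, and giving '-' as the output file sends the text to stdout. The patched wordbox and link output mentioned above is not covered here.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, first_page=None, last_page=None, layout=False):
    """Build the argument list for pdftotext (hypothetical helper).

    The trailing '-' tells pdftotext to write the text to stdout
    instead of to a .txt file next to the PDF.
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if layout:
        # Try to keep the physical layout of the page (columns, tables).
        cmd.append("-layout")
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, **options):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, **options),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # 'report.pdf' is a placeholder file name.
    print(build_pdftotext_command("report.pdf", first_page=1, last_page=2))
```

The same command line can also be run from AppleScript with do shell script, which avoids scripting Acrobat at all.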
Hey Bill,
I compiled xpdf3.0.2 out of the box just fine. However, I'm not able to get the extra pdftohtml-0.40a to install because of some problem which I can't deduce easily, since I'm not real familiar with the ins and outs of autoconf, make, etc.
Do you know how to get that module to integrate with xpdf on Mac OS X?
Sounds like it would be a solution with better options than pdftotext.
Philip Aker
echo email@hidden@nl | tr a-z@. p-za-o.@
Democracy: Two wolves and a sheep voting on lunch.
AppleScript-Users mailing list (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users