Re: Any scriptable apps to get text from pdfs under X 10.1?
- Subject: Re: Any scriptable apps to get text from pdfs under X 10.1?
- From: Simon Coles <email@hidden>
- Date: Sat, 10 Nov 2001 19:37:21 -0000
I would like to be able to grab a line of text from a pdf file (in order
to change the name of a file to something reflecting its content).
While X imaging is closely linked to pdf, the TextEdit app can't open
pdfs, and the Preview app has no dictionary. Anyone know of a scriptable
app that allows access to pdf text (e.g., "get word 3 of line 1 of <my
pdf> ")?
There are a number of Open Source libraries which can extract text from
PDF; for example, I found pdftotext on my Linux box.
http://freshmeat.net
is also a good place to search; there's a lot of PDF manipulation stuff
about.
Of course, installing this on a MacOS X box may require a fair amount of
Unix knowledge, and I don't know how to get AppleScript to call a Unix
command line - but I assume that's possible.
Getting more Unix'y and less Mac-like, there are libraries for
Perl/Python/Java (Python's my preference) for generating & manipulating
PDFs.
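To make the pdftotext route a bit more concrete, here is a minimal Python
sketch of the "get word 3 of line 1" idea, followed by the rename. It
assumes pdftotext is installed and on the PATH, and that passing "-" as the
output file makes it write the text to standard output. The helper names
(pdf_to_text, word_of_line, safe_name, rename_by_content) and the file-name
cleaning rule are my own inventions for illustration, not part of any
library.

```python
import re
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path):
    """Run the pdftotext command-line tool and return the extracted text.

    Assumes pdftotext is installed and on the PATH; the "-" argument asks
    it to write to standard output instead of a .txt file.
    """
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout


def word_of_line(text, line_no, word_no):
    """Return word `word_no` of non-blank line `line_no` (both 1-based)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return lines[line_no - 1].split()[word_no - 1]


def safe_name(word):
    """Drop characters that are awkward in file names (hypothetical rule)."""
    return re.sub(r"[^A-Za-z0-9_-]", "", word) or "untitled"


def rename_by_content(pdf_path, line_no=1, word_no=3):
    """Rename a PDF after one word of its text and return the new path."""
    pdf_path = Path(pdf_path)
    word = word_of_line(pdf_to_text(pdf_path), line_no, word_no)
    new_path = pdf_path.with_name(safe_name(word) + ".pdf")
    pdf_path.rename(new_path)
    return new_path
```

Calling rename_by_content("scan.pdf") would rename the file after the third
word of its first non-blank line. Blank lines are skipped because extracted
text often starts with some. Where AppleScript provides "do shell script",
a script like this could presumably be run from there, though I have not
tried it.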
Just wanted to point out a solution, even if it's a painful one :-(
Simon
----------------------------------------------------------------------
Simon J. Coles Email: email@hidden
CEO Phone: +44 1344 753703
Amphora Research Systems Ltd
http://www.amphora-research.com/
=============== Life is too precious to take seriously ===============