Re: pdftotext
- Subject: Re: pdftotext
- From: Shane Stanley <email@hidden>
- Date: Sat, 21 Dec 2013 09:28:46 +1100
On 21 Dec 2013, at 8:08 AM, Christopher Stone <email@hidden> wrote:
> Is there a better way to script the extraction of text from an unlocked pdf?
That depends on what you mean by "better". If it includes doing it without recourse to third-party software, you could use AppleScript:
-- put in ASObjC-based script library
use framework "Foundation"
use framework "Quartz"

on textInPDF:thePath
	set theText to current application's NSMutableString's |string|()
	set anNSURL to current application's NSURL's fileURLWithPath:thePath
	set theDoc to current application's PDFDocument's alloc()'s initWithURL:anNSURL
	set theCount to theDoc's pageCount() as integer
	repeat with i from 1 to theCount
		set thePage to (theDoc's pageAtIndex:(i - 1))
		(theText's appendString:(thePage's |string|()))
	end repeat
	return theText as text
end textInPDF:
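For anyone wondering how the handler gets called once it lives in a library, here is a rough sketch of a calling script. The library name "PDF Text Lib" is purely hypothetical, and this assumes the library has been saved in ~/Library/Script Libraries/ so that the use statement can find it:

```applescript
-- calling script; "PDF Text Lib" is a made-up library name (assumption)
use theLib : script "PDF Text Lib"
use scripting additions

-- ask for a PDF and convert the alias to the POSIX path the handler expects
set thePath to POSIX path of (choose file of type {"com.adobe.pdf"})

-- call the library handler; the result is ordinary AppleScript text
set theText to theLib's textInPDF:thePath
```

The handler takes a POSIX path rather than an alias or HFS path because it passes the value straight to fileURLWithPath:.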
> Of course I'm using the Satimage.osax's regex engine to do the heavy lifting.
<scratched_record> Or you could use AppleScript...