Re: Extracting Text from PDF

  • Subject: Re: Extracting Text from PDF
  • From: Bill Janssen <email@hidden>
  • Date: Wed, 29 Oct 2008 10:57:28 PDT
  • Comments: In-reply-to Philip Aker <email@hidden> message dated "Wed, 29 Oct 2008 10:46:17 -0700."

To extract text, pdftotext is just the right tool.

Bill
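
For anyone who wants to drive pdftotext from a script rather than typing it by hand, here is a minimal sketch, not the poster's own code. It assumes the xpdf tools are installed and that pdftotext is on the PATH. The helper names build_pdftotext_cmd and extract_text are made up for illustration. The only pdftotext options used are -layout and -enc, plus "-" as the output file to send the text to standard output.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, txt_path="-", keep_layout=False):
    """Build the argument list for a pdftotext call (hypothetical helper).

    A txt_path of "-" asks pdftotext to write to standard output.
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]  # request UTF-8 output
    if keep_layout:
        cmd.append("-layout")  # try to keep the physical page layout
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def extract_text(pdf_path, keep_layout=False):
    """Return the text of a PDF as a string (hypothetical helper)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH; install xpdf first")
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, "-", keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.pdf") on a hypothetical file should be roughly equivalent to running `pdftotext -enc UTF-8 report.pdf -` in Terminal and capturing what it prints. From AppleScript the same command line could be passed to `do shell script`.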

Philip Aker <email@hidden> wrote:

> On 2008-10-29, at 10:28:33, Bill Janssen wrote:
>
> > Dan Doughtie <email@hidden> wrote:
> >
> >> Scripting Acrobat is pretty messy. Are there any command line apps
> >> that can extract text from a PDF, similar to extracting text from a
> >> Word document with textutil?
>
> > I use xpdf to do that; it includes a command-line app 'pdftotext'
> > which will spit out the text of the document.  I've written patches
> > for it, as well, to spit out wordbox info and to spit out links in
> > the doc.  Those patches are included in the doceng-toolkit project
> > on SourceForge.
>
> Hey Bill,
>
> I compiled xpdf3.0.2 out of the box just fine. However, I'm not able
> to get the extra pdftohtml-0.40a to install because of a problem that
> I can't easily diagnose, since I'm not very familiar with the ins and
> outs of autoconf, make, etc.
>
> Do you know how to get that module to integrate with xpdf on Mac OS X?
> Sounds like it would be a solution with better options than pdftotext.
>
>
> Philip Aker
> echo email@hidden@nl | tr a-z@. p-za-o.@
>
> Democracy: Two wolves and a sheep voting on lunch.
>