Re: collectdata

  • Subject: Re: collectdata
  • From: Christopher Stone <email@hidden>
  • Date: Mon, 06 Mar 2017 14:45:35 -0600

On Mar 06, 2017, at 06:41, Yvan KOENIG <email@hidden> wrote:
No need for a third-party tool to extract text from a PDF. This script delivered by Shane STANLEY does the job.


That is indeed useful – IF you only need the RAW output.

The ASObjC code returns raw output, so the layout of the PDF is significantly mangled.

It's the same as what pdftotext produces with its -raw option:

pdftotext -raw "/path/to/your/File.pdf" -

The primary reason to use pdftotext instead of other tools is its ability to preserve the PDF file's layout. (It's the only tool I personally know of that does this.)

It's not perfect, but it can make parsing a PDF document's text relatively easy instead of difficult to impossible.

pdftotext -layout "/path/to/your/File.pdf" -
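
If you switch between the two modes often, a small wrapper can save some typing. The following is only a sketch: the function name pdf_text and its argument order are my own invention for illustration, not part of pdftotext, and it assumes pdftotext is installed and on your PATH. It passes either -raw or -layout through and writes the text to standard output, exactly as the commands above do.

```shell
#!/bin/sh
# pdf_text MODE FILE
# Hypothetical wrapper (not from the original post) around pdftotext.
# MODE is "raw" (the mangled, content-order output) or
# "layout" (text positioned to follow the page layout).
pdf_text() {
    mode=$1
    file=$2
    case $mode in
        raw|layout) ;;
        *)
            echo "pdf_text: mode must be 'raw' or 'layout'" >&2
            return 2
            ;;
    esac
    if [ ! -r "$file" ]; then
        echo "pdf_text: cannot read '$file'" >&2
        return 1
    fi
    # The trailing "-" sends the extracted text to standard output.
    pdftotext "-$mode" "$file" -
}

# Example calls (same placeholder path as above):
#   pdf_text raw    "/path/to/your/File.pdf"
#   pdf_text layout "/path/to/your/File.pdf"
```

Because the text goes to standard output, the result can be piped straight into whatever parsing step follows.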

--
Best Regards,
Chris

AppleScript-Users mailing list
Archives: http://lists.apple.com/archives/applescript-users
