Re: Parsing PDF; looking to count number of markup annotations per reviewer (user) in Mac OS 9.2.2.
- Subject: Re: Parsing PDF; looking to count number of markup annotations per reviewer (user) in Mac OS 9.2.2.
- From: Shane Stanley <email@hidden>
- Date: Fri, 19 Dec 2003 08:43:32 +1100
On Dec 19, 2003, at 1:59 AM, Fox, Christopher B wrote:
I've taken a gander at the PDF 1.5 spec from Adobe's website, and I
came up with the following string of Unix shell commands that seem to
work great in Mac OS X 10.3.2:
strings test.pdf | awk '/\/Type \/Annot/, /endobj/' | grep '/T ' | sort | uniq -c
Essentially what this does is:
1) Eliminate binary data, and put the resulting strings on separate
lines.
2) Filter out lines not between "/Type /Annot" and "endobj" pairs. This
eliminates anything not an Annotation object.
3) Filter once again, selecting only lines that contain '/T ', which
is the markup annotation key that indicates the user or reviewer who
made the annotation.
4) Sort the remaining lines.
5) Count the number of times each unique line occurs, and print the
results.
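For readers without the Unix tools, the five steps above can be
mirrored roughly in Python. This is only a sketch and not part of the
original post: the function name count_reviewers is made up for
illustration, "strings" is approximated as runs of four or more
printable ASCII characters, and the annotation objects are assumed to
be stored uncompressed, as the pipeline above also assumes.

```python
import re
from collections import Counter


def count_reviewers(data: bytes) -> Counter:
    """Count '/T ' lines found inside '/Type /Annot' ... 'endobj' ranges."""
    # Step 1: approximate `strings` by keeping runs of 4+ printable
    # ASCII characters (or tabs), one run per line.
    lines = [m.decode("ascii") for m in re.findall(rb"[\x20-\x7e\t]{4,}", data)]

    counts = Counter()
    in_annot = False
    for line in lines:
        # Step 2: like awk '/start/,/end/', a range opens on a line
        # containing "/Type /Annot" and closes on one containing "endobj".
        if not in_annot and "/Type /Annot" in line:
            in_annot = True
        if in_annot:
            # Step 3: like grep '/T ', keep lines that contain "/T ".
            if "/T " in line:
                # Step 5: tally each distinct line (what `uniq -c` reports).
                counts[line] += 1
            if "endobj" in line:
                in_annot = False
    return counts


def print_report(counts: Counter) -> None:
    # Step 4: sort the distinct lines before printing them with counts.
    for line, n in sorted(counts.items()):
        print(n, line)
```

Calling print_report(count_reviewers(open("test.pdf", "rb").read()))
would print one count per distinct '/T ' line, in the same spirit as
the pipeline's output.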
Unfortunately, our user population is still on Mac OS 9.2.2, so Unix
solutions aren't an option. I know AppleScript has direct file access
capability (using open for access, read, write, close access, etc.),
but I can't seem to get the equivalent of "strings" or the awk pattern
recognition out of AppleScript.
Any thoughts would be most appreciated.
You'll probably need to take multiple passes at it using text item
delimiters. It's a bit tedious, but reasonably fast (I did something
similar to parse out the number of pages in a PDF).
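The multiple-pass idea can be sketched in Python, with str.split
standing in for AppleScript's text item delimiters. This is an
illustration under assumptions, not the script referred to above: the
function name count_by_splitting is hypothetical, the reviewer name is
assumed to be stored as a literal string of the form "/T (Name)"
without nested or escaped parentheses, "/T" is assumed to follow
"/Type /Annot" within the same object, and the objects are assumed to
be uncompressed.

```python
def count_by_splitting(text: str) -> dict:
    """Count annotations per reviewer using repeated splits on fixed delimiters."""
    counts = {}
    # Pass 1: split on the annotation marker. The first piece comes
    # before any annotation, so it is skipped.
    for chunk in text.split("/Type /Annot")[1:]:
        # Pass 2: keep only the text up to the end of this object.
        body = chunk.split("endobj")[0]
        # Pass 3: split on the "/T (" key. If there is only one piece,
        # this annotation has no /T entry.
        pieces = body.split("/T (")
        if len(pieces) < 2:
            continue
        # Pass 4: the name runs up to the closing parenthesis.
        name = pieces[1].split(")")[0]
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Each pass corresponds to setting the delimiter to one fixed string and
taking the resulting text items, so the same sequence of passes should
translate fairly directly to a script that reads the file as text.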
--
Shane Stanley <email@hidden>
_______________________________________________
applescript-users mailing list | email@hidden