Re: Need collective genius

Subject: Re: Need collective genius
From: Paul Abney <email@hidden>
Date: Tue, 22 Mar 2011 08:38:59 -0500

Ray,

"Page 1 of ?" really is a text string. I can readily identify the pages that contain it and put those page numbers in a list.

In fact, I do have a working script, but it seems to be a bit of a kludge. As Christian mentioned, it also involves Acrobat. Here are the basic steps of the working script (tested on a small 500-page document):

Open the PDF in Skim

Get the text of each page and look for "Page 1 of 1"

If found, add the page number to single_pg_list

If not, add the page number to multiple_pg_list

Close the PDF and open it in Acrobat

Delete all pages in multiple_pg_list (in reverse order, so that deleting a page does not change the numbers of the pages still to be deleted)

Save the document as single.pdf

Then do the same with single_pg_list to get multiple.pdf
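
The classification part of these steps can be sketched in plain AppleScript. This is only an illustration under assumptions: classifyPages and its parameters are made-up names, and pageTexts is a hypothetical list of strings standing in for whatever Skim returns as the text of each page.

```applescript
-- Sketch: split page numbers into two lists by searching each page's text.
-- pageTexts is a hypothetical list of strings, one per page (page 1 first).
on classifyPages(pageTexts, searchText)
	set single_pg_list to {}
	set multiple_pg_list to {}
	repeat with i from 1 to (count of pageTexts)
		if (item i of pageTexts) contains searchText then
			-- The page carries the marker text, so record its page number
			set end of single_pg_list to i
		else
			set end of multiple_pg_list to i
		end if
	end repeat
	return {single_pg_list, multiple_pg_list}
end classifyPages

-- Made-up sample data for five pages
set sampleTexts to {"Invoice A Page 1 of 1", "Invoice B Page 1 of 3", "Invoice B Page 2 of 3", "Invoice B Page 3 of 3", "Invoice C Page 1 of 1"}
set {single_pg_list, multiple_pg_list} to classifyPages(sampleTexts, "Page 1 of 1")
-- single_pg_list is {1, 5}; multiple_pg_list is {2, 3, 4}
```

One thing to watch: a plain contains test for "Page 1 of 1" would also match "Page 1 of 10", "Page 1 of 12", and so on, so documents with ten or more pages may need a stricter check.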

Again, this does work, but I need to address a couple of things. Some of these PDFs could be as long as 50,000 pages, so the page list would be very long. Is there a limit to the size of an AppleScript list?
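
On the list-size question, a commonly used idiom for long lists is to keep the list in a script object property and refer to it through that object, which tends to make appending and item access much faster. The sketch below only illustrates that idiom; pageStore and pages_list are made-up names, and it does not establish where any size limit might lie.

```applescript
-- Sketch: build a 50,000-item page list through a script object property.
-- Referring to the list via the script object is a common speed idiom
-- for large AppleScript lists.
script pageStore
	property pages_list : {}
end script

repeat with i from 1 to 50000
	set end of pageStore's pages_list to i
end repeat

-- Number of items now held in the list
set pageCount to count of pageStore's pages_list
```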

Also, this is very slow. The Acrobat command to delete pages is

delete pages reference first integer last integer

Since my list is of individual pages, it deletes pages one at a time, which is very time-consuming. In my testing, deleting 250 pages from a 500-page document takes about 17 seconds on my machine, whereas deleting a 250-page range takes less than a second. So my next problem is: how do I take a list of page numbers and sort them into start and end page numbers?

If I have the following list

set thelist to {1, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24, 25, 26, 27, 28, 29, 30}

and I could sort them into two lists, one of starting pages and one of ending pages, I could delete ranges when possible. So I would end up with:

the_starting_pages {1, 4, 12, 22}

the_ending_pages {1, 10, 19, 30}
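
One possible way to get from the page list to those two lists is a single pass that starts a new range whenever a page number is not exactly one more than the previous one. This is a minimal sketch, assuming the list is already in ascending order with no duplicates; collapseToRanges is a made-up handler name.

```applescript
-- Sketch: collapse an ascending list of page numbers into range starts and ends.
-- Assumes pageList is sorted ascending and contains no duplicates.
on collapseToRanges(pageList)
	set the_starting_pages to {}
	set the_ending_pages to {}
	if pageList is {} then return {the_starting_pages, the_ending_pages}
	set rangeStart to item 1 of pageList
	set previousPage to rangeStart
	repeat with i from 2 to (count of pageList)
		set currentPage to item i of pageList
		if currentPage is not (previousPage + 1) then
			-- Gap found: close the current range and start a new one
			set end of the_starting_pages to rangeStart
			set end of the_ending_pages to previousPage
			set rangeStart to currentPage
		end if
		set previousPage to currentPage
	end repeat
	-- Close the final range
	set end of the_starting_pages to rangeStart
	set end of the_ending_pages to previousPage
	return {the_starting_pages, the_ending_pages}
end collapseToRanges

set thelist to {1, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24, 25, 26, 27, 28, 29, 30}
set {the_starting_pages, the_ending_pages} to collapseToRanges(thelist)
-- the_starting_pages is {1, 4, 12, 22}
-- the_ending_pages is {1, 10, 19, 30}
```

A single page simply becomes a range whose start and end are the same, as with page 1 here.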

I'm not sure how to do this, whether I should do this, or even whether I am barking up the wrong tree.
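
If the range approach is taken, the deletes would still need to run from the last range back to the first, for the same reason the single-page deletes run in reverse. A rough sketch follows; reversedRanges and theDocument are made-up names, and the Acrobat call appears only as a comment that follows the command signature quoted above, untested here.

```applescript
-- Sketch: pair up the start and end lists, last range first, so that deleting
-- one range never shifts the page numbers of the ranges still to be deleted.
on reversedRanges(startPages, endPages)
	set rangePairs to {}
	repeat with i from (count of startPages) to 1 by -1
		set end of rangePairs to {item i of startPages, item i of endPages}
	end repeat
	return rangePairs
end reversedRanges

repeat with aPair in reversedRanges({1, 4, 12, 22}, {1, 10, 19, 30})
	set {firstPage, lastPage} to contents of aPair
	-- Assumed Acrobat call, based on the signature quoted above (theDocument is hypothetical):
	-- delete pages theDocument first firstPage last lastPage
	log {firstPage, lastPage}
end repeat
```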

Thanks for the help guys.

Paul

On Mar 21, 2011, at 6:57 PM, Ray Gonzalez wrote:

Paul;

Can we look at this problem from a different view? I'm not sure that "Page 1 of ?" really represents text or strings in your document.

Aren't they really dynamic underlying code fragments which are constantly responding to changes the User makes in the total number of pages and which page has the focus?

It would seem that any attempt you make to grab certain pages will immediately activate the paging code, and it will begin trying to adjust the very numbers your script depends on.

I'm not familiar with Adobe coding, but somehow there must be a way to first convert those 4,000 pages of paging code to hard text. Then, and only then, whatever Search and Copy routine you desire should be straightforward.

Ray
