Re: Parsing Large Text Files

  • Subject: Re: Parsing Large Text Files
  • From: "Mark J. Reed" <email@hidden>
  • Date: Sat, 3 May 2008 08:29:25 -0400

So what'd you use to write out the Perl that didn't include linefeeds?
 (CRLF works, too, btw, just not bare CR.)
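
The line-ending problem described below (a .pl file saved with classic-Mac bare-CR endings, which perl reads as one long line) can be fixed by normalizing CR and CRLF to LF before running the script. A minimal sketch of that conversion in Python (the helper name is illustrative, not from the thread):

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF.

    Replace CRLF first so each Windows line ending becomes a single
    LF, then convert any remaining bare (classic Mac) CRs.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

# A two-line perl script saved with bare-CR endings:
script = b'#!/usr/bin/perl\rprint "hi\\n";\r'
fixed = normalize_line_endings(script)
print(fixed.count(b"\n"))  # -> 2: perl now sees two lines, not one
```

The same normalization is why CRLF-ended files still work: perl strips the trailing CR as whitespace on each line, whereas a file with only CRs contains no line breaks at all as far as perl is concerned.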



On 5/3/08, Nigel Garvey <email@hidden> wrote:
> Bruce Robertson wrote on Fri, 02 May 2008 08:21:36 -0700:
>
> >>[From me]
> >> Perl's obviously the way to go in your case, but I can't resist tinkering
> >> with the AS performance.  :)  This uses File Read/Write instead of the
> >> shell script, 'set' instead of 'copy', referenced list variables, and a
> >> more efficient way of inserting returns into the reversed text:
>
> >Yes, that's nice, gets down to about 22MB/minute.
> >
> >The perl script processes at about 800MB/minute by my rough test.
>
> I managed to knock another 25% off the running time of mine, but it's
> still no match for the perl. And somewhere between my 6MB test file size
> and the 80MB target, Script Editor crashes.
>
> I couldn't get the perl script to work yesterday: the output files were
> always empty. (No wonder it was fast!) The problem turned out this
> morning to be the line endings in the pl file. They _have_ to be line feeds.
>
> [OT] While optimising the AS version on a Jaguar machine, I used this
> handler to get round the old 4000-element extraction limit:
>
>   -- This uses a delimiter set elsewhere.
>   on getTextItems(theText)
>     set ti to {}
>     set n to (count theText's text items)
>     repeat with i from 1 to n by 3900
>       set j to i + 3899
>       if (j > n) then set j to n
>       set ti to ti & text items i thru j of theText
>     end repeat
>
>     return ti
>   end getTextItems
>
> On my Tiger machine, of course, this isn't necessary. But it turns out
> (with 30004 text items, at least) that it's about 2.5 times as fast on
> that machine as a single 'text items of ...' extraction!  :\
>
> NG
>
>  _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> AppleScript-Users mailing list      (email@hidden)
> Help/Unsubscribe/Update your Subscription:
> Archives: http://lists.apple.com/archives/applescript-users
>
> This email sent to email@hidden
>
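
The chunked-extraction workaround in the quoted handler translates roughly to this Python sketch. The function name and the 3900 batch size are taken from the AppleScript above for illustration only; Python's `str.split` has no 4000-element limit, so the batching here merely mirrors the shape of the workaround:

```python
def get_text_items(text: str, delimiter: str, batch: int = 3900) -> list[str]:
    """Split text on a delimiter, collecting the items in fixed-size
    batches -- a rough analog of the AppleScript handler that dodged
    the old 4000-element 'text items i thru j' extraction limit."""
    items = text.split(delimiter)      # all text items at once
    out: list[str] = []
    for i in range(0, len(items), batch):
        out.extend(items[i:i + batch])  # slice plays the role of
    return out                          # 'text items i thru j'
```

The interesting point in the quoted message is that on Tiger the batched form was reported to be about 2.5x faster than a single extraction for ~30,000 items, even though the limit no longer applied.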

--
Sent from Gmail for mobile | mobile.google.com

Mark J. Reed <email@hidden>
