Re: Parsing Large Text Files
  • Subject: Re: Parsing Large Text Files
  • From: Bruce Robertson <email@hidden>
  • Date: Fri, 02 May 2008 10:55:26 -0700

Thanks.

Your version isn't quite correct: it leaves out the title line. It is also
quite slow, processing about 2 MB/minute.

My recent AppleScript version is about 10 times as fast, at about 22 MB/minute.

Also, for some reason the font size of your post is enormous.

>
>> Yes, that's nice, gets down to about 22MB/minute.
>>
>> The perl script processes at about 800MB/minute by my rough test.
>>  
>
> Here's another pure applescript version, probably not as fast as either of
> those, but what the heck, is speed the only consideration?
>
> Will you ever need to repurpose or tweak the script? If so do you have time to
> master another language or do you want to depend on the kindness of strangers?
>
> ES
>
> set StartTics to the ticks
> set AppleScript's text item delimiters to ">"
> set proteinFile to alias "Macintosh HD:Users:edstockly:Desktop:Archive:parseme2.txt"
> set readInfo to read proteinFile
>
> set allProteinInfo to text items of readInfo
> set newData to {}
> set AppleScript's text item delimiters to ""
> repeat with thisProteinInfo in the rest of allProteinInfo
>   set newInfo to FixProtein(thisProteinInfo)
>   set the end of newData to the newInfo
> end repeat
> set AppleScript's text item delimiters to return
> set newData to newData as text
> set resultFile to ((path to desktop as Unicode text) & "esPro.txt")
> try
>   set finalFile to (open for access file resultFile with write permission)
> on error
>   close access resultFile
>   set finalFile to (open for access file resultFile with write permission)
> end try
> set eof of finalFile to 0
> write newData to finalFile
> close access finalFile
> set endTicks to the ticks
> tell application "TextEdit"
>   activate (open file resultFile)
> end tell
> return the endTicks - StartTics
> on FixProtein(thisInfo)
>   set thisProteinInfo to paragraphs of thisInfo
>   set thisProteinInfo to the rest of thisProteinInfo as text
>   set thisProteinInfo to the reverse of every item of thisProteinInfo
>   set stringSize to count of thisProteinInfo
>   set segEnd to stringSize
>   set segStart to 1
>   set newString to {}
>   repeat
>     if stringSize < 50 then
>       set the end of newString to items segStart thru segEnd of thisProteinInfo
>       exit repeat
>     else
>       set the end of newString to items segStart thru (segStart + 54) of thisProteinInfo
>       set the end of newString to return
>       set segStart to segStart + 50
>       set stringSize to stringSize - 50
>     end if
>   end repeat
>   set AppleScript's text item delimiters to ""
>   return thisProteinInfo as string
> end FixProtein
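
The quoted handler drops the title line because "the rest of" discards the first paragraph of each record. A minimal sketch of a handler that keeps it follows. It assumes each record is a title paragraph followed by sequence lines, that the sequence should be rewrapped at 50 characters per line, and that the ">" consumed by the text item split should be put back. The name fixProteinKeepingTitle is hypothetical, the reversal step from the quoted script is left out, and this is not the version timed above.

```applescript
-- Hypothetical handler: keeps the title line and rewraps the sequence
-- at 50 characters per line. thisInfo is one record without its leading ">".
on fixProteinKeepingTitle(thisInfo)
  set theLines to paragraphs of thisInfo
  set titleLine to item 1 of theLines

  -- Join the remaining paragraphs into one unbroken sequence string.
  set AppleScript's text item delimiters to ""
  set sequenceText to (rest of theLines) as text

  -- Start the output with the title, restoring the ">" (an assumption
  -- about the wanted output format).
  set outLines to {">" & titleLine}

  -- Cut the sequence into non-overlapping 50-character segments.
  set seqLength to count of sequenceText
  set segStart to 1
  repeat while segStart is less than or equal to seqLength
    set segEnd to segStart + 49
    if segEnd > seqLength then set segEnd to seqLength
    set end of outLines to text segStart thru segEnd of sequenceText
    set segStart to segStart + 50
  end repeat

  -- Join title and segments with returns, then reset the delimiters.
  set AppleScript's text item delimiters to return
  set resultText to outLines as text
  set AppleScript's text item delimiters to ""
  return resultText
end fixProteinKeepingTitle
```

It could be called from the quoted repeat loop in place of FixProtein, for example: set newInfo to fixProteinKeepingTitle(contents of thisProteinInfo). Using "text segStart thru segEnd" returns a string directly, so no list of characters has to be coerced back to text for each segment.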


References: 
 >Re: Parsing Large Text Files (From: Ed Stockly <email@hidden>)
