Re: Parsing Large Text Files


  • Subject: Re: Parsing Large Text Files
  • From: Ed Stockly <email@hidden>
  • Date: Fri, 2 May 2008 09:24:40 -0700


Yes, that's nice, gets down to about 22MB/minute.

The perl script processes at about 800MB/minute by my rough test.
 

Here's another pure AppleScript version, probably not as fast as either of those. But what the heck, is speed the only consideration?

Will you ever need to repurpose or tweak the script? If so, do you have time to master another language, or do you want to depend on the kindness of strangers?

ES

-- "the ticks" is not built into AppleScript; this assumes a scripting
-- addition that provides it is installed.
set StartTics to the ticks
-- Split the file into records on ">".
set AppleScript's text item delimiters to ">"
set proteinFile to alias "Macintosh HD:Users:edstockly:Desktop:Archive:parseme2.txt"
set readInfo to read proteinFile
set allProteinInfo to text items of readInfo
set newData to {}
set AppleScript's text item delimiters to ""
-- Skip the first text item (whatever precedes the first ">").
repeat with thisProteinInfo in the rest of allProteinInfo
    set newInfo to FixProtein(thisProteinInfo)
    set the end of newData to the newInfo
end repeat
-- Join the processed records with returns.
set AppleScript's text item delimiters to return
set newData to newData as text
set resultFile to ((path to desktop as Unicode text) & "esPro.txt")
try
    set finalFile to (open for access file resultFile with write permission)
on error
    -- The file was probably left open by an earlier run; close it and retry.
    close access file resultFile
    set finalFile to (open for access file resultFile with write permission)
end try
-- Truncate any existing contents before writing.
set eof of finalFile to 0
write newData to finalFile
close access finalFile
set endTicks to the ticks
tell application "TextEdit"
    activate
    open file resultFile
end tell
return the endTicks - StartTics

on FixProtein(thisInfo)
    -- Drop the first line of the record and join the remaining lines into one string.
    set AppleScript's text item delimiters to ""
    set thisProteinInfo to paragraphs of thisInfo
    set thisProteinInfo to the rest of thisProteinInfo as text
    -- Reverse the string; the result is a list of single characters.
    set thisProteinInfo to the reverse of every item of thisProteinInfo
    set stringSize to count of thisProteinInfo
    if stringSize is 0 then return ""
    set segEnd to stringSize
    set segStart to 1
    set newString to {}
    -- Re-wrap the reversed characters into lines of 50.
    repeat
        if stringSize <= 50 then
            set the end of newString to items segStart thru segEnd of thisProteinInfo
            exit repeat
        else
            set the end of newString to items segStart thru (segStart + 49) of thisProteinInfo
            set the end of newString to return
            set segStart to segStart + 50
            set stringSize to stringSize - 50
        end if
    end repeat
    set AppleScript's text item delimiters to ""
    -- Return the wrapped result rather than the unwrapped character list.
    return newString as string
end FixProtein
