Re: Parsing Large Text Files
  • Subject: Re: Parsing Large Text Files
  • From: "Mark J. Reed" <email@hidden>
  • Date: Thu, 1 May 2008 22:31:03 -0400

If my assumptions are correct, this Perl script should do the trick:

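#!/usr/bin/perl
use strict;
use warnings;

# Read one record at a time: a record ends where the next ">" header begins.
$/ = "\r\n>";
while (my $record = <>)
{
    chomp $record;                      # drop the trailing "\r\n>", if present
    $record =~ s/^>//;                  # only the first record keeps its ">"
    # The first line is the name; everything after it is the sequence.
    my ($name, $seq) = split /\r\n/, $record, 2;
    next unless defined $seq;
    $seq =~ s/\r\n//g;                  # join the sequence into one line
    $seq = reverse $seq;                # scalar context reverses the characters
    $seq =~ s/(.{50})(?=.)/$1\r\n/g;    # re-wrap at 50 columns
    print ">$name reversed:\r\n$seq\r\n";
}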

Save it as e.g. proteins.pl and run it thus:

$ perl proteins.pl inputfile >outputfile
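
To show how the record separator does the splitting, here is a minimal
sketch that reads the same kind of CRLF-delimited data from an in-memory
string.  The helper name read_records and the sample data are made up for
illustration; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_records is a hypothetical helper, not part of the original script.
# It splits CRLF-delimited ">name" records into [name, sequence] pairs by
# setting the input record separator to "\r\n>".
sub read_records
{
    my ($text) = @_;
    my @records;
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    local $/ = "\r\n>";             # a record ends where the next header starts
    while (my $record = <$fh>)
    {
        chomp $record;              # strips a trailing "\r\n>" if present
        $record =~ s/^>//;          # the first record keeps its leading ">"
        my ($name, $seq) = split /\r\n/, $record, 2;
        $seq = '' unless defined $seq;
        $seq =~ s/\r\n//g;          # join wrapped sequence lines
        push @records, [$name, $seq];
    }
    close $fh;
    return @records;
}

# Made-up sample data with two records.
my @demo = read_records(">one\r\nABC\r\nDEF\r\n>two\r\nGHI\r\n");
print "$_->[0]: $_->[1]\n" for @demo;
# prints:
# one: ABCDEF
# two: GHI
```

Note that the last record in a file has no following ">", so chomp leaves
it alone and the trailing CRLF is removed by the substitution instead.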

On my machine, the script takes about 12 seconds to process a
98,172,928-byte file I created by repeating your snippet.  Sample output:

>2007006285 Uroporphyrinogen-III decarboxylase [LWCv2] reversed:
AHNAYTEFSLNYMQKIVNGLIKTNDGRKVPSIVFTKGGGLWLEAQAEIGS
DALGLDWTVDIGSARKRVGDKVALQGNLDPAILLSTPEAIEKEVISVLAS
YGKGDGHVFNLGHGITQWTPPENAAAMLTAIRAHSQQYHV
>2007006286 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligas es II [LWCv2] reversed:
VVEGQAQDWTTLCRTIDARAAAFRQQGLAAGDCVALRGRNSVELVLAYLA
ALQLGARVLPLNPQLPDAQLQPLLPALDIDWGWSEAGDHWPGPVRPLTSD
VAVATPVPTNPAVTWQPGAPATLTLTSGSSGLPKGVLHCAANHLASAAGL
LAALPFTAGDGWLLSLPLFHVSGQGIVWRWLLRGARLLLVAEGDLAQALA
GCSHASLVPTQLQRLLAQNASLPALQHVLLGGAAIPVALTQRAEQAGIHC
WCGYRLTEMASTVTAK
>2007006287  [LWCv2] reversed:
DPAQAHALQRLEVRAWGCLLEELLACCPPDADTALAPLAALARACQQEEV
GARPLFDEIEQRLRTLAGDL
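
For anyone who wants to try a similar timing run, one possible way to build
a large test file is to repeat a small snippet until a target size is
reached.  This is only a sketch: the helper name build_test_file and its
arguments are made up, and it is not necessarily how the file above was
created.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# build_test_file is a hypothetical helper.  It writes $snippet to $path
# repeatedly until at least $target_bytes have been written, and returns
# the number of bytes written.  It assumes $snippet is a plain byte string.
sub build_test_file
{
    my ($path, $snippet, $target_bytes) = @_;
    die "snippet must not be empty\n" unless length $snippet;
    # :raw keeps the CRLF line endings exactly as they are in the snippet.
    open(my $out, '>:raw', $path) or die "cannot write $path: $!";
    my $written = 0;
    while ($written < $target_bytes)
    {
        print {$out} $snippet;
        $written += length $snippet;
    }
    close $out or die "cannot close $path: $!";
    return $written;
}

# Usage: perl mktest.pl snippetfile outputfile
# (mktest.pl is an assumed file name; the 100,000,000-byte target is arbitrary.)
if (@ARGV == 2)
{
    my ($snippet_path, $out_path) = @ARGV;
    open(my $in, '<:raw', $snippet_path) or die "cannot read $snippet_path: $!";
    my $snippet = do { local $/; <$in> };   # slurp the whole snippet
    close $in;
    my $bytes = build_test_file($out_path, $snippet, 100_000_000);
    print "wrote $bytes bytes to $out_path\n";
}
```

The resulting file can then be passed to the script in the usual way,
for example under the shell's time command.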
 _______________________________________________
AppleScript-Users mailing list      (email@hidden)