Re: How to match data in two different text files as fast as possible?

  • Subject: Re: How to match data in two different text files as fast as possible?
  • From: has <email@hidden>
  • Date: Sun, 13 Aug 2006 01:23:52 +0100

Richard Rönnbäck wrote:

I need to match data from two different text files, but none of the
techniques I know are fast enough,

AppleScript's not the best language for text crunching, and if speed is an issue you'd be better off using Perl, Python, etc. Below is a Python version, which will hopefully be fast enough. (Perl would be quicker still, but my Perl's rubbish. :p)


#!/usr/bin/python

import re, sys

maintablepath, idtablepath, outtablepath = sys.argv[1:]

# make a lookup table for unique ids by file path
idtablepattern = re.compile('^(.+?)\t(.+?)$', re.MULTILINE)

with open(idtablepath) as idfile:
    idtable = dict(idtablepattern.findall(idfile.read()))

# write each line in the main table to the out table, appending either unique id or 'N/A'
with open(maintablepath) as infile, open(outtablepath, 'w') as outfile:
    for line in infile:
        # strip the line ending first, so a line with no tab still yields a clean path
        line = line.rstrip('\n\r')
        path = line.split('\t', 1)[0]
        outfile.write('%s\t%s\n' % (line, idtable.get(path, 'N/A')))



Save the above script as 'append.py', then run from the command line using:


python /path/to/append.py /path/to/maintable.txt /path/to/idtable.txt /path/to/outtable.txt

The above script assumes that both files use the same text encoding and, if the paths are Unicode, that they're decomposed in the same way. Also, path comparisons are case-sensitive. If these assumptions are unsafe, it's not hard to modify the script to suit, but you'll need to specify your requirements in more detail.
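
If the decomposition or case assumptions don't hold, one possible approach is to normalize every path before using it as a lookup key. The following is only a rough sketch, assuming Python 3 and tab-separated 'path<TAB>id' lines; the helper names (normalize_key, build_idtable, lookup) are made up for illustration and aren't part of the script above. It composes each path to NFC and, optionally, folds case, so that composed and decomposed spellings of the same path compare equal.

```python
import unicodedata


def normalize_key(path, case_sensitive=False):
    """Return a comparison key for a path (hypothetical helper).

    Composes the text to NFC so that precomposed and decomposed
    spellings match, then optionally folds case.
    """
    key = unicodedata.normalize('NFC', path)
    if not case_sensitive:
        key = key.casefold()
    return key


def build_idtable(lines):
    """Build {normalized path: id} from 'path<TAB>id' lines."""
    table = {}
    for line in lines:
        line = line.rstrip('\r\n')
        if '\t' not in line:
            continue  # skip lines that have no id column
        path, uid = line.split('\t', 1)
        table[normalize_key(path)] = uid
    return table


def lookup(table, path, default='N/A'):
    """Look up a path using the same normalization as the table keys."""
    return table.get(normalize_key(path), default)
```

To use it, both files would have to be decoded with their actual encodings when opened (e.g. open(path, encoding='utf-8'), where UTF-8 is just an assumption), and the main loop would call lookup(idtable, path) in place of idtable.get(path, 'N/A').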

HTH

has
--
http://freespace.virgin.net/hamish.sanderson/