Re: How to match data in two different text files as fast as possible?

Subject: Re: How to match data in two different text files as fast as possible?
From: has <email@hidden>
Date: Sun, 13 Aug 2006 01:23:52 +0100

Richard Rönnbäck wrote:

I need to match data from two different text files, but none of the
techniques I know are fast enough,

AppleScript's not the best language for text crunching, and if speed is an issue you'd be better off using Perl, Python, etc. Below is a Python version, which will hopefully be fast enough. (Perl would be quicker still, but my Perl's rubbish. :p)

#!/usr/bin/python

import re, sys

maintablepath, idtablepath, outtablepath = sys.argv[1:]

# make a lookup table for unique ids by file path
idtablepattern = re.compile('^(.+?)\t(.+?)$', re.MULTILINE)
idtable = dict(idtablepattern.findall(open(idtablepath).read()))

# write each line in the main table to the out table,
# appending either unique id or 'N/A'
infile = open(maintablepath)
outfile = open(outtablepath, 'w')

line = infile.readline()
while line:
    # the file path is the first tab-delimited field on each line
    path = line.split('\t', 1)[0]
    outfile.write('%s\t%s\n' % (line.rstrip('\n\r'), idtable.get(path, 'N/A')))
    line = infile.readline()

infile.close()
outfile.close()

Save the above script as 'append.py', then run from the command line using:

python /path/to/append.py /path/to/maintable.txt /path/to/idtable.txt /path/to/outtable.txt

The above script assumes that both files use the same text encoding and, if the paths are Unicode, that they're decomposed in the same way. Also, path comparisons are case-sensitive. If these assumptions are unsafe, it's not hard to modify the script to suit, but you'll need to specify your requirements in more detail.
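If those assumptions do turn out to be unsafe, one possible adjustment is to normalize each path before using it as a lookup key. The following is only a rough sketch, written for Python 3 rather than as a patch to the script above. The helper names (normalize_key, load_id_table, append_ids) are made up for illustration, and the choice of NFC normalization plus case folding is an assumption about what "the same" should mean for your paths.

```python
import unicodedata


def normalize_key(path, case_sensitive=False):
    """Build a comparison key (hypothetical helper).

    Composes the path to Unicode NFC so that differently decomposed
    forms compare equal, and case-folds it unless case_sensitive is True.
    """
    key = unicodedata.normalize('NFC', path)
    return key if case_sensitive else key.casefold()


def load_id_table(lines, case_sensitive=False):
    """Map normalized path -> unique id from 'path<TAB>id' lines."""
    table = {}
    for line in lines:
        line = line.rstrip('\r\n')
        if '\t' not in line:
            continue  # skip lines that have no id column
        path, unique_id = line.split('\t', 1)
        table[normalize_key(path, case_sensitive)] = unique_id
    return table


def append_ids(main_lines, id_table, case_sensitive=False, missing='N/A'):
    """Yield each main-table line with its id (or `missing`) appended."""
    for line in main_lines:
        line = line.rstrip('\r\n')
        path = line.split('\t', 1)[0]
        key = normalize_key(path, case_sensitive)
        yield '%s\t%s' % (line, id_table.get(key, missing))
```

To use it on real files, open both with an explicit encoding (for example open(idtablepath, encoding='utf-8')), pass the file objects to load_id_table and append_ids, and write each yielded line to the output file followed by a newline. Only the lookup key is normalized; the original line text is written out unchanged.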

HTH

has
--
http://freespace.virgin.net/hamish.sanderson/



