Re: Working with big lists
- Subject: Re: Working with big lists
- From: has <email@hidden>
- Date: Sun, 22 May 2005 21:40:15 +0100
Rob Stott wrote:
>I have a list in a text (.txt) file. I want to find out how many
>times each line occurs in the text file.
[...]
>I was wondering whether anyone had any clever ideas for doing
>this a bit more quickly.
Use a more efficient algorithm. You should only need to scan each line in the file once, whereas your current routine scans each line N times (where N is the number of lines in the file), so the work grows with N squared - horribly inefficient.
The trick is to read the file one line at a time, using a dictionary object (aka 'hash' or 'associative array') to keep count of the number of times a given string is found. Example:
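#!/usr/bin/python
f = open('/Users/has/test.txt') # [your path here]
d = {} # maps each distinct line to the number of times it occurs
# Iterate over the file directly so each line is read exactly once
for line in f:
    s = line.rstrip('\r\n') # drop the line ending, keep other whitespace
    d[s] = d.get(s, 0) + 1
f.close()
# Sort the (line, count) pairs by count, most frequent first
lst = sorted(d.items(), key=lambda item: item[1], reverse=True)
print('\n'.join(['%s\t%s' % (a, b) for a, b in lst]))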
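If you are on a newer Python (2.7 or 3.x), the same single-pass counting can probably be written more compactly with collections.Counter from the standard library. The following is only a rough sketch and not part of the script above: the function name count_lines and the sample data are made up for illustration.

```python
from collections import Counter
import io


def count_lines(lines):
    """Count how often each line occurs, scanning the input once."""
    # Strip only the line terminators, as the script above does.
    return Counter(line.rstrip('\r\n') for line in lines)


# Hypothetical sample data standing in for the text file.
sample = io.StringIO(u'apple\nbanana\napple\ncherry\napple\nbanana\n')
counts = count_lines(sample)

# most_common() returns (line, count) pairs, highest count first.
report = '\n'.join('%s\t%d' % (text, n) for text, n in counts.most_common())
print(report)
```

To run it against a real file, pass an open file object to count_lines() in place of the sample. Counter is a dictionary subclass, so this is the same technique as the d.get(s, 0) + 1 idiom, just with the bookkeeping handled for you.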
HTH
has
--
http://freespace.virgin.net/hamish.sanderson/