Re: Unique Items in a text file
- Subject: Re: Unique Items in a text file
- From: has <email@hidden>
- Date: Mon, 8 Apr 2002 23:45:08 +0100
Steve Thompson wrote:
> On a daily basis I receive a text file that contains a large number of tab
> delimited records. In each record, field 8 contains a product code.
>
> I have written a script that loops through each line of data, looks at field
> 8, compares it with a list of product codes and, if the product code isn't
> in the list, it adds it.

Purely as an exercise in seeing how well vanilla code can perform, I found
the following ran through a 10,000 line, 4.4MB test string in 15 secs on my
G3/300.
======================================================================
on _extract(theString, columnIndex, resultList) --private stuff
    repeat with y from 1 to count theString's paragraphs
        theString's paragraph y's text item columnIndex
        if result is not in resultList then
            set resultList's end to result
        end if
    end repeat
end _extract
--
on extractFromColumn(theString, columnIndex) --public call
    set resultList to {}
    set oldTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to tab
    set x to -99 --so that x + 100 is 1 if the loop below never runs
    --full chunks of 100 paragraphs
    repeat with x from 1 to ((count theString's paragraphs) div 100) * 100 by 100
        _extract(theString's text (paragraph x) thru (paragraph (x + 99)), columnIndex, resultList)
    end repeat
    --leftover paragraphs, if any
    if (count theString's paragraphs) mod 100 is not 0 then
        _extract(theString's text (paragraph (x + 100)) thru -1, columnIndex, resultList)
    end if
    set AppleScript's text item delimiters to oldTID
    resultList
end extractFromColumn
======================================================================
[formatted using ScriptToEmail - gentle relief for mailing list pains]
[http://files.macscripter.net/ScriptBuilders/ScriptTools/ScriptToEmail.hqx]
Running time seems to grow linearly with line count, at least as far as I
tested. The trick seems to be working on smaller (100-line) chunks rather than
the entire string at once; otherwise the _extract() routine bogs down badly as
string size/paragraph count (?) increases.
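For anyone checking the chunk boundaries, here is a minimal sketch of the same arithmetic on its own. chunkRanges is a hypothetical helper, not part of the script above. It takes the chunk size as a parameter and works out the start of the leftover chunk explicitly, rather than reading x after the loop.

```applescript
-- Hypothetical helper (not in the original script): returns the
-- {first, last} paragraph numbers of each chunk for a given paragraph count.
on chunkRanges(paragraphCount, chunkSize)
	set rangeList to {}
	-- last paragraph covered by full-sized chunks (0 if there are none)
	set fullEnd to (paragraphCount div chunkSize) * chunkSize
	repeat with x from 1 to fullEnd by chunkSize
		set end of rangeList to {x, x + chunkSize - 1}
	end repeat
	-- any leftover paragraphs form one final, shorter chunk
	if paragraphCount mod chunkSize is not 0 then
		set end of rangeList to {fullEnd + 1, paragraphCount}
	end if
	return rangeList
end chunkRanges

chunkRanges(250, 100)
--> {{1, 100}, {101, 200}, {201, 250}}
```

With 250 paragraphs and a chunk size of 100 this gives two full chunks and one short one; with fewer than 100 paragraphs only the short one is produced.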
I'd be interested to hear how osaxen/application-based alternatives do by
comparison.
has
--
http://www.barple.connectfree.co.uk/ -- The Little Page of Beta AppleScripts