Re: Unique Items in a text file
- Subject: Re: Unique Items in a text file
- From: Caleb Strockbine <email@hidden>
- Date: Wed, 10 Apr 2002 17:19:09 -0400
On Monday, April 8, 2002, at 08:27 AM, Steve Thompson wrote:
Does anyone know of a better way to achieve this list of unique values
either by script or using an OSAX or external application? As the number
of records increases in the text file, the script takes longer and longer.
In Mac OS X, you can use awk to filter the file. Here's an awk filter
that will print only the distinct values of field 8 from a tab-delimited
file:
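# awk program for finding distinct values of field 8 in a tab-delimited file
BEGIN { FS = "\t" }
{
    # "in" tests whether the key exists, so the stored value is irrelevant;
    # 1 is used because awk has no built-in "true" constant.
    if (!($8 in distinctCodes)) {
        distinctCodes[$8] = 1
        print $0
    }
}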
If you put the above text into a file called 'filter.awk', you can run it
on a data file called 'sample.data' with the following shell command:
awk -f filter.awk sample.data
You can redirect the output of the filter into a file, and then read the
file and do whatever additional processing is necessary from your
AppleScript or from some shell script.
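
As a rough sketch of the redirect step, here is the whole round trip in one shell script. The sample rows, the scratch directory, and the output name 'distinct.txt' are all made up for illustration; only 'filter.awk' and 'sample.data' come from the text above.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The filter from this message, saved as filter.awk.
cat > filter.awk <<'EOF'
BEGIN { FS = "\t" }
{
    if (!($8 in distinctCodes)) {
        distinctCodes[$8] = 1
        print $0
    }
}
EOF

# Made-up sample data: three tab-delimited records, two distinct codes in field 8.
printf 'a\tb\tc\td\te\tf\tg\tRED\n'  >  sample.data
printf 'h\ti\tj\tk\tl\tm\tn\tBLUE\n' >> sample.data
printf 'o\tp\tq\tr\ts\tt\tu\tRED\n'  >> sample.data

# Redirect the filter's output into a file (distinct.txt is a hypothetical name).
awk -f filter.awk sample.data > distinct.txt

# Any later step can read that file; here we just count the surviving records.
distinct_count=$(wc -l < distinct.txt | tr -d ' ')
echo "$distinct_count distinct values"
```

The third record is dropped because RED was already seen in the first one, so distinct.txt ends up holding the first and second records only.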
I ran the filter above on a sample file with 15000 lines and a dozen
distinct values in field 8, and it finished in 0.5 seconds.
Note: I'm using the word "distinct" here instead of unique, because
the filter will print one line for each value that occurs in field 8, no
matter how many times that value occurs in the file. To my thinking,
"unique" would imply that the filter printed those values which
occur exactly once.
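
For comparison, a filter for "unique" in that stricter sense might look something like the sketch below. This is not the filter above, just one way to do it: count every value of field 8 first, then print the ones whose count is exactly 1. The array name 'count' and the sample rows are made up.

```shell
#!/bin/sh
# Made-up sample: field 8 holds RED twice, BLUE once, GREEN once.
sample=$(mktemp)
printf '1\t2\t3\t4\t5\t6\t7\tRED\n'   >  "$sample"
printf '1\t2\t3\t4\t5\t6\t7\tBLUE\n'  >> "$sample"
printf '1\t2\t3\t4\t5\t6\t7\tRED\n'   >> "$sample"
printf '1\t2\t3\t4\t5\t6\t7\tGREEN\n' >> "$sample"

# Tally each value of field 8, then print those that occurred exactly once.
# "for (code in count)" visits keys in no guaranteed order, hence the sort.
unique_values=$(awk -F'\t' '
    { count[$8]++ }
    END { for (code in count) if (count[code] == 1) print code }
' "$sample" | sort)

echo "$unique_values"
rm -f "$sample"
```

Here RED is left out because it occurs twice, whereas the "distinct" filter would have printed it once.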
If you'd like to print *only* the values of field 8 and not the entire
line of the first occurrence of each value, change the "print $0"
statement to "print $8".
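
If you don't want a separate program file, the values-only variant can probably be squeezed into a one-liner. This is a sketch using a common awk idiom rather than the filter as written above; the array name 'seen' and the sample rows are made up.

```shell
#!/bin/sh
# Made-up sample with three records and two distinct codes in field 8.
sample=$(mktemp)
printf 'a\tb\tc\td\te\tf\tg\tRED\n'  >  "$sample"
printf 'a\tb\tc\td\te\tf\tg\tBLUE\n' >> "$sample"
printf 'a\tb\tc\td\te\tf\tg\tRED\n'  >> "$sample"

# seen[$8]++ yields 0 (false) the first time a value appears, so !seen[$8]++
# is true only for the first occurrence; print just field 8 in that case.
distinct_values=$(awk -F'\t' '!seen[$8]++ { print $8 }' "$sample")

echo "$distinct_values"
rm -f "$sample"
```

Values come out in order of first appearance, the same as with the longer filter.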
Hope that helps.
_______________________________________________
applescript-users mailing list | email@hidden