Re: What is Best Method To Determine Duplicate Items in a Large List?
- Subject: Re: What is Best Method To Determine Duplicate Items in a Large List?
- From: Christopher Stone <email@hidden>
- Date: Fri, 3 Nov 2017 18:27:44 -0500
Hey Folks,
As some of you know I was laid up all summer with back problems and illness, so
I'm quite late to this particular party.
I'm just about caught up with all my email now, so this message should be the
last non sequitur (at least on the ASUL).
One solution that hasn't been mentioned is the shell.
Since we're not dealing with any UTF-8 characters, it's by far the simplest
solution.
Here it is running on a 30,000-line test file of 30-digit number strings:
sort -n Test_List.txt | uniq -d
Time test:
Minerva:Downloads chris$ time sort -n Test_List.txt | uniq -d
000000000000000000000000000001
000000000000000000000000029681
000000000000000000000000030000
real 0m0.108s
user 0m0.103s
sys 0m0.011s
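For anyone who wants to try the pipeline without the original test file, here is a minimal sketch on a tiny made-up list. The sample values and the variable names (sample_file, dupes) are assumptions for illustration only. sort puts identical lines next to each other, and uniq -d then prints one copy of each line that occurs more than once.

```shell
# Build a small sample list (hypothetical data, not the original Test_List.txt).
sample_file="$(mktemp)"
printf '%s\n' \
  000000000000000000000000000003 \
  000000000000000000000000000001 \
  000000000000000000000000000002 \
  000000000000000000000000000001 \
  000000000000000000000000000003 \
  000000000000000000000000000001 > "$sample_file"

# sort -n orders the lines numerically, so equal lines end up adjacent;
# uniq -d prints one copy of every line that is repeated.
dupes="$(sort -n "$sample_file" | uniq -d)"
printf '%s\n' "$dupes"

# Clean up the temporary file.
rm -f "$sample_file"
```

Note that uniq only compares adjacent lines, which is why the sort step has to come first.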
------------------------------------------------------------------------------
set testFilePath to "~/Downloads/Test_List.txt"
tell application "System Events" to set testFilePath to POSIX path of disk item testFilePath
set shCMD to "sort -n " & quoted form of testFilePath & " | uniq -d"
set listOfDupes to do shell script shCMD
------------------------------------------------------------------------------
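The quoted form of step in the script above matters because the shell splits unquoted paths on spaces. As a rough shell-side sketch of the same idea, the function below wraps the pipeline and quotes its argument. The name list_dupes and the demo path "Test List.txt" are hypothetical, chosen only to show a path containing a space.

```shell
# list_dupes is a hypothetical helper: it prints each duplicated line once.
list_dupes() {
  # Double quotes keep a path containing spaces as a single argument,
  # which is what "quoted form of" achieves on the AppleScript side.
  sort -n "$1" | uniq -d
}

# Demo with a made-up file whose name contains a space.
demo_dir="$(mktemp -d)"
demo_file="$demo_dir/Test List.txt"
printf '%s\n' 10 7 10 3 7 10 > "$demo_file"

result="$(list_dupes "$demo_file")"
printf '%s\n' "$result"

# Clean up the temporary directory.
rm -rf "$demo_dir"
```

Without the quotes around "$1", sort would be handed two separate arguments and fail to find the file.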
The AppleScript runs from FastScripts in just over 0.09 seconds on my old
mid-2010 i7 MacBook Pro.
--
Best Regards,
Chris
_______________________________________________
AppleScript-Users mailing list (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users