Re: What is Best Method To Determine Duplicate Items in a Large List?

  • Subject: Re: What is Best Method To Determine Duplicate Items in a Large List?
  • From: 2551phil <email@hidden>
  • Date: Sat, 4 Nov 2017 12:17:54 +0700

Hope that you’re feeling better, Chris. :)

If the list is in a text file, then another option for BBEdit users:

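tell application "BBEdit"
        -- Sort the lines of the frontmost document
        set theLines to sort lines of its front document
        -- Then run the duplicate-lines command on the sorted lines
        set theResult to process duplicate lines theLines
end tell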


That’s the simplest form, but the dictionary gives you a ton of other options
for the sort and process duplicate lines commands, both in terms of what you
can match and what you can do with the results.


Best


Phil
@sqwarq






> On 4 Nov 2017, at 06:27, Christopher Stone <email@hidden>
> wrote:
>
> Hey Folks,
>
> As some of you know I was laid up all summer with back problems and illness,
> so I'm quite late to this particular party.
>
> I'm just about caught up with all my email now, so this message should be the
> last non sequitur (at least on the ASUL).
>
>
> One solution that hasn't been mentioned is the shell.
>
> Since we're not dealing with any non-ASCII (multibyte UTF-8) characters, it's
> by far the simplest solution.
>
> I ran it on a 30,000-line test file of 30-digit number strings:
>
> sort -n Test_List.txt | uniq -d
>
> Time test:
>
> Minerva:Downloads chris$ time sort -n Test_List.txt | uniq -d
>
> 000000000000000000000000000001
> 000000000000000000000000029681
> 000000000000000000000000030000
>
> real  0m0.108s
> user  0m0.103s
> sys   0m0.011s
>
>
> ------------------------------------------------------------------------------
>
> set testFilePath to "~/Downloads/Test_List.txt"
> tell application "System Events" to set testFilePath to POSIX path of disk item testFilePath
>
> set shCMD to "sort -n " & quoted form of testFilePath & " | uniq -d"
> set listOfDupes to do shell script shCMD
>
> ------------------------------------------------------------------------------
>
> The AppleScript runs from FastScripts in just over 0.09 seconds on my old
> Mid 2010 i7 MacBook Pro.
>
> --
> Best Regards,
> Chris
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> AppleScript-Users mailing list      (email@hidden)
> Help/Unsubscribe/Update your Subscription:
> Archives: http://lists.apple.com/archives/applescript-users
>
> This email sent to email@hidden
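
For anyone who wants to try the quoted shell pipeline without a 30,000-line file, here is a minimal sketch on a tiny sample. The sample data and the variable names (tmpfile, dupes, dupes_awk) are made up for illustration and are not from the original test. It also shows a common one-pass awk idiom as an alternative; that variant is not something proposed in the thread, just a well-known way to get the same duplicates without sorting first.

```shell
#!/bin/sh
# Build a small sample file (hypothetical data, not the original Test_List.txt).
tmpfile=$(mktemp)
printf '%s\n' 0003 0001 0002 0003 0001 0003 > "$tmpfile"

# Sort numerically so identical lines end up adjacent, then let uniq -d
# print one copy of each line that occurs more than once.
dupes=$(sort -n "$tmpfile" | uniq -d)
printf 'sort | uniq -d:\n%s\n' "$dupes"

# One-pass alternative: awk counts each line it has seen and prints a line
# the second time it appears. No sort is needed, and the output follows
# the order in which duplicates turn up in the input, not sorted order.
dupes_awk=$(awk 'seen[$0]++ == 1' "$tmpfile")
printf 'awk:\n%s\n' "$dupes_awk"

rm -f "$tmpfile"
```

On this sample the sort pipeline reports 0001 and 0003, and the awk version reports the same two values as 0003 then 0001, because it prints them in the order their second occurrence is reached. If counts are wanted as well, uniq -c in place of uniq -d prefixes every distinct line with the number of times it occurs.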


