Re: What is Best Method To Determine Duplicate Items in a Large List?

  • Subject: Re: What is Best Method To Determine Duplicate Items in a Large List?
  • From: Jim Weisbin <email@hidden>
  • Date: Sun, 13 Aug 2017 08:59:26 -0400

Jim Underwood <email@hidden> wrote:

> Anyone have any bright ideas and/or tools to speed up identification of dup
> text items in a large list (~30,000)?

I assume you are looking for consecutive repeated words?

It’s easy with egrep:

egrep "\b([a-zA-Z0-9]+) \1\b" test.txt

However, my attempts to do this with ASObjC using NSRegularExpression have so
far failed. I think the backslashes in \b (word boundary) and \1 (backreference
to the repeated word) have to be escaped, but I'm not sure.

One caveat is that this will also flag legitimately repeated words, as in "It’s
true that that is the case…"
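
Here is a small sketch of the egrep approach that also shows the caveat. The
file contents are made up for illustration, and I'm using grep -E (the current
spelling of egrep) with -n to print line numbers. This assumes a grep that
accepts \b and backreferences in extended patterns, which GNU grep does.

```shell
# Hypothetical sample data, one phrase per line.
printf '%s\n' \
  'the quick brown fox' \
  'it is is a typo' \
  'true that that is the case' \
  'nothing repeated here' > test.txt

# Print every line containing a word immediately followed by itself.
grep -nE '\b([a-zA-Z0-9]+) \1\b' test.txt
# 2:it is is a typo
# 3:true that that is the case
```

Line 2 is a real typo, but line 3 is a legitimate repeat, so the matches still
need a human look.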




