Re: Need a faster Find Duplicates Routine
- Subject: Re: Need a faster Find Duplicates Routine
- From: "Johnny AppleScript" <email@hidden>
- Date: Wed, 28 May 2003 17:25:47 -0600
On 03/05/28 03:58 PM, "Christopher Nebel" <email@hidden> wrote:
> If you can assume that you're looking
> for duplicates in a particular playlist, you can improve the speed
> quite a bit:

LOL! I never know whether I learn more by struggling for two days,
inventing all kinds of obfuscated workarounds based on my lack of knowledge,
or by waiting for a real expert to show me what I needed using a fraction
of the code.
Here's what I ended up with to date, which isn't too awfully slow (the
whole thing is slow because it runs over a networked library), as long as
fixed indexing is applied. There's some unneeded repetition that slows it
down a bit, which I was in the process of refining (please don't laugh too
hard):
[Presort the lists into 27 sublists using alphanumeric sorting: ~1 minute
for 3K tracks; <30 seconds for 500 tracks]
set dupesList to {}
repeat with listIndex from 1 to number of items in allLists
    set targetList to item listIndex of allLists
    if targetList is not {} then
        set checkList to {} -- track names seen so far in this sublist
        set locationCheckList to {} -- file locations already reported
        -- Announce the sublist by the first character of its first track name
        set setID to name of item 1 of targetList
        set setID to character 1 of setID
        try
            say "Searching in tracks beginning with the character " & setID
        on error
            say "Searching"
        end try
        repeat with trackIndex from 1 to number of items in targetList
            set theTrackName to the name of item trackIndex of targetList
            set trackC2 to character -1 of theTrackName
            if theTrackName is in checkList then
                -- Name seen before: rescan from the previous item onward
                repeat with scanIndex from trackIndex - 1 to number of items in targetList
                    set dupeTrack to item scanIndex of targetList
                    set dupeName to the name of dupeTrack as string
                    set dupeC2 to character -1 of dupeName
                    -- Stop scanning once the last character no longer matches
                    if dupeC2 is not trackC2 then exit repeat
                    if dupeName is theTrackName then
                        set dupLoc to location of dupeTrack
                        -- Compare with the constant, not the string "missing value"
                        if dupLoc is missing value then
                            my missingLink()
                        else
                            if dupLoc is not in locationCheckList then
                                set the end of locationCheckList to dupLoc
                                set the end of dupesList to dupeTrack
                                log (dupeTrack)
                                log (current date)
                                beep
                            end if
                        end if
                    end if
                end repeat
            else
                copy theTrackName to the end of checkList
            end if
        end repeat
        set targetList to {}
    end if
end repeat
-- ~5 minutes for 3K tracks / 300+ duplicates on a G3-533 / 1.5 GB over a
-- 10/100 network
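
For comparison, the same bookkeeping can be sketched as a single pass that needs no presorting. This is only a rough illustration, not Chris's method and not tested against iTunes: it works on plain records, and the labels trackName and trackLocation, the handler name findDuplicates, and the script object are all made up for the example. Unlike the routine above, it reports only the second and later copies of a name, and it still uses a linear "contains" test, so any speed gain from holding the lists in a script object is an assumption to measure rather than a promise.

```applescript
-- Hypothetical handler: returns the records whose trackName was already
-- seen earlier in the list and whose trackLocation has not been seen yet.
on findDuplicates(trackRecords)
	-- Holding the lists in a script object is a common AppleScript idiom
	-- for faster access to large lists (assumed to help here).
	script store
		property allTracks : trackRecords
		property seenNames : {}
		property seenLocations : {}
		property dupes : {}
	end script
	repeat with k from 1 to (count of store's allTracks)
		set thisTrack to item k of store's allTracks
		set thisName to trackName of thisTrack
		set thisLocation to trackLocation of thisTrack
		if store's seenNames contains thisName then
			-- Repeated name: report it unless the file is missing or
			-- the same file location was already recorded.
			if thisLocation is not missing value then
				if store's seenLocations does not contain thisLocation then
					set end of store's seenLocations to thisLocation
					set end of store's dupes to thisTrack
				end if
			end if
		else
			-- First time this name appears: remember it and its location.
			set end of store's seenNames to thisName
			if thisLocation is not missing value then
				set end of store's seenLocations to thisLocation
			end if
		end if
	end repeat
	return store's dupes
end findDuplicates

-- Made-up sample data standing in for iTunes tracks.
set sampleTracks to {}
set end of sampleTracks to {trackName:"Song A", trackLocation:"/Music/a1.mp3"}
set end of sampleTracks to {trackName:"Song B", trackLocation:"/Music/b1.mp3"}
set end of sampleTracks to {trackName:"Song A", trackLocation:"/Music/a2.mp3"}
set foundDupes to findDuplicates(sampleTracks)
```

Note that AppleScript compares strings without regard to case by default, so "song a" and "Song A" count as the same name unless the comparison is wrapped in a "considering case" block.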
Your fastest AS-only example still takes considerably longer (nearly 5x) on
my machine using only 500 tracks / 102 duplicates, but I assume that is due
not only to the CPU, but also to having more duplicates and running over a
network.
As for the broken 'Get Every', I locked that in my brain from a note in a
sample script from iTunes 2.03 (by Sal?), which stated it was broken, had
the 'Get Every' string commented out, and provided a workaround similar to
mine. I was obviously just formatting my 'get every' syntax wrong, and
assumed it was still broken per an expert's prior comments.
May I incorporate your (credited) code into a verbose version for uploading
to share points?
Cheers
JA
_______________________________________________
applescript-users mailing list | email@hidden