Re: Sorting IP addresses - Progress
Re: Sorting IP addresses - Progress
- Subject: Re: Sorting IP addresses - Progress
- From: Graff <email@hidden>
- Date: Sat, 20 Dec 2003 16:43:54 -0500
The sort I provided is basically a merge sort. A merge sort operates
as O(n log n), where a shell sort operates as O(n^2). This means that
given 2 lists, one with 10 items and one with 1000 items the merge sort
would take 100 log 100 times, or 200 times longer to sort the 1000 item
list than the 10 item list. The shell sort would take 100^2, or 10,000
times longer to sort the 1000 item list than the 10 item list. So for
long lists the merge sort is vastly faster than the shell sort.
You can see the different types of sorts here:
<
http://linux.wku.edu/~lamonml/algor/sort/sort.html>
- Ken
On Dec 20, 2003, at 3:00 PM, Marconi wrote:
Here's where I stand thus far for sorting IP addresses in AS.
The initial challenge is getting the addresses into a sortable format.
I'm using a list of lists. Each list item consists of the raw IP
address and a second version of the address in 'sortable' format. The
intention is to sort on the second version but output the raw version
after the sort.
I tried converting IPs to large numbers. Also tried it by padding each
address to add leading zeros to any quad with a length of less than 3.
Surprisingly, padding the quads is not much faster in preparing the
list to be sorted.
When it comes to actually sorting, however, the shell sort I'm using
is really slow for large lists (1000+ IPs). Bear in mind that the
following code is sorting a list of two-item lists, based on the
second item in each two item list. Each list item looks like
{24.50.216.61, 024.050.216.061} and so forth. The second entry is the
raw IP padded with leading zeros. The shell sort code is this:
to ShellSort(aList)
copy {aList, 0, 0, (aList's length) - 1} to {newList, Swaps, Comps,
theLength}
set Gap to theLength
repeat while Gap is greater than 1
set {Gap, Flag} to {Gap div 2, true}
repeat while Flag is true
set Flag to false
repeat with Index1 from 1 to (theLength - Gap)
set {Index2, Comps} to {Index1 + Gap, Comps + 1}
-- if Comps mod 1000 is 0 then log Comps
if item 2 of item Index1 of newList > item 2 of item Index2 of
newList then
set {Swaps, newList's item Index1, newList's item Index2, Flag} ,
to {Swaps + 1, newList's item Index2, newList's item Index1,
true}
end if
end repeat
end repeat
end repeat
-- log newList
return newList
end ShellSort
Thanks to Steve Mills for patiently explaining AS list handling and
providing the ShellSort code.
Now then, sorting my list of lists as above takes a really, really
long time with 2000 IP addresses.
The same 2000 addresses sort much more quickly when done with the code
below. It looks like maybe a bubble sort.
on sort(listToSort)
if length of listToSort = 1 then return listToSort
set firstItem to item 1 of listToSort
set restOfList to rest of listToSort
set list1 to {}
set list2 to {}
repeat with currItem in restOfList
if firstItem < currItem then
set list2 to list2 & currItem
else
set list1 to list1 & currItem
end if
end repeat
if length of list1 > 1 then set list1 to sort(list1)
if length of list2 > 1 then set list2 to sort(list2)
return list1 & firstItem & list2
end sort
Graff supplied the above code.
So I'm wondering, is it the fact that the faster code is sorting a
single list while the slower is sorting a list of lists? Or is it that
shell sorts become really inefficient with larger numbers of items to
be sorted? I seem to recall from years ago that the number of compares
increases exponentially in a shell sort as the number of items
increases.
I'd like to try the latter code -- the one I think may be a bubble
sort -- on the list of lists. That is, sorting on the second item of
each item in the list of entries list this: {24.50.216.61,
024.050.216.061}. I'm not sure how to change the latter code to do
that. Someone...?
BTW, I've considered using just one list (not the list of lists) by
padding each item in the original list, sorting it and then removing
any leading zeros after the sort. While I'm certain this would work,
it would doubtless make it take even longer than it does now. I'm
hoping that the latter sort above is simply more efficient for large
lists and can be modified to sort a list of lists.
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.