Re: Dynamic Records?? (or other alternatives)
Re: Dynamic Records?? (or other alternatives)
- Subject: Re: Dynamic Records?? (or other alternatives)
- From: email@hidden (Michael Sullivan)
- Date: Tue, 4 Dec 2001 14:44:35 -0500
- Organization: Business Card Express of Connecticut
Arthur J. Knapp writes:
>
A number of great vanila methods for acomplishing this have been
>
posted in the past. Check out these emulated hash techniques:
>
<http://www.esglabs.com/snippets/map.sit.hqx>
>
<http://files.macscripter.net/ScriptBuilders/ScriptTools/hashLib.hqx>
Thanks for this. I missed this post earlier and I've been wondering
what the heck has was talking about.
>
> Does anyone know if getting/setting the value of a record key is O(1) in
>
> Applescript? If so, that would be a good reason to seek out the hackish
>
> solution for your script, since the O(n) behavior of lists would start
>
> to show on long pieces of text for your word count (your program would
>
> end up O(n^2)).
>
The problem of using big O notation in AppleScript is that things occur
>
on two different levels, the AppleScript-algorithmic level and the C/C++/
>
Pascal level, meaning the code that underlies AppleScript.
Oh that's true, but a Big O difference can still potentially overwhelm
this with large enough data. Of course, when the constant factor
differences are as big as they are here (especially when dealing with
the event handler in addition to the interpreter) it can take a data set
larger than the program will ever see to make a difference.
>
Consider the difference between the these snippits:
>
on is_contained(aList, aItem)
>
repeat with x from 1 to length of aList
>
if item x of aList = aItem then return true
>
end
>
return false
>
end
>
is_contained( myBigList, myItem ) --> true/false
>
>
myBigList contains myItem --> true/false
>
>
>
It is a good bet that these two "commands" are both doing more
>
or less the same thing, traversing myBigList until it finds a
>
match, or not, but it is also a good bet that for even large N,
>
the "contains" syntax is going to always be lightning fast
>
compared to is_contained, making a big O comparison between them
>
irrelevent.
It's not just a bet. large N isn't going to change this one because
they have the same big O, so the huge constant difference is going to
ensure the second version wins, and keeps winning even bigger as N grow
larger.
OTOH, if you managed to implement a hashed or binary search in AS, then
a large N would eventually win out over the standard contains algorithm,
though depending on how what contortions had to be gone through in the
binary/hash method it could take an unreasonably large data set before
it made a difference.
In practice, the sorts in applemods are an interesting data point.
Greg's radixSort turns out to be slower than Serge's qSort on most sets
of data even though it is a smaller big O. But unless something in the
underlying implementation is confounding the radix, there should be some
size of data set for which it will generally outperform qSort.
But n*log(n) is not so much worse than n, while n^2 is just atrocious.
Probably either of their mods will beat a simple bubble/selection sort
on moderate size pieces of data (say a few thousand elements) even if
the naive sort were implemented in C (I should do this just to test it).
Anyway, the summation is that big O is not completely irrelevant in AS,
and the larger the data set, the more likely it is to matter. Large big
O differences (like n v 1, or n^ v n) are going to matter even on medium
size data sets, which is why people are going through such contortions
to figure out how to hash in AS.
>
P.S. I'm still just a scripter... ;-)
I don't know. Looks to me as though a programmer could be made of you,
grasshopper. :)
Michael
--
Michael Sullivan email@hidden
Business Card Express of Connecticut Thermographers to the Trade
"You hate your job -- why didn't you say so? There's a support group
for that. It's called everybody; they meet at the bar." -Drew Carey