[OT] Time complexity (was: Where is NSList?)
- Subject: [OT] Time complexity (was: Where is NSList?)
- From: Allan Odgaard <email@hidden>
- Date: Thu, 29 Jul 2004 23:59:27 +0200
On 29. Jul 2004, at 21:55, Stephen Checkoway wrote:
> Sorry if this seems pedantic, but...
>
> What many people fail to realize (including some in this thread) is that this is only supposed to be a rough first estimate, since constants are totally ignored in this analysis, but of course they really matter. Let me give an example:
>
> Quicksort has a worst-case bound of Theta(n^2) running time, while merge or heap sort have a worst-case bound of Theta(n log n), which is clearly better. However, both merge and heap sort are slower in practice than quicksort.
While your point about time complexity not revealing constants is valid (although if an algorithm has high constants, that should be added as a footnote), it is not constants that are at play here.

Quicksort is a randomized algorithm and its *expected* running time certainly is as good as or better than that of merge sort, so coupled with the better cache utilization we *expect* it to outperform merge sort -- but only for a (large) subset of inputs. For this reason the STL's introsort is a hybrid which internally switches to heap sort, IIRC when the recursion depth exceeds some threshold, to guarantee a better running time than worst-case quicksort, which, given the (im)proper input, will slow down to a crawl ;)

Introsort furthermore switches to insertion sort when there are fewer than 10-20 elements, because for such low values of n the constants dominate, and since quicksort is recursive, there will be lots of recursive calls with n < 20 or so.
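Here is a rough sketch of such a hybrid (my own illustration, not the actual STL implementation): quicksort that falls back to heap sort once a recursion depth budget is used up, and to insertion sort for small ranges. The depth limit of 2*lg n and the small-range threshold of 16 are just commonly used values, not anything from the post.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

const std::ptrdiff_t kSmallRange = 16;   // below this, constants favour insertion sort

template <typename It>
void insertionSort(It first, It last)
{
    for (It i = first; i != last; ++i)
        for (It j = i; j != first && *j < *(j - 1); --j)
            std::iter_swap(j, j - 1);
}

template <typename It>
It lomutoPartition(It first, It last)
{
    // Take the last element as pivot (median-of-three would be better);
    // afterwards the pivot sits in its final sorted position.
    auto pivot = *(last - 1);
    It store = first;
    for (It it = first; it != last - 1; ++it)
        if (*it < pivot)
            std::iter_swap(it, store++);
    std::iter_swap(store, last - 1);
    return store;
}

template <typename It>
void introsortImpl(It first, It last, int depthLimit)
{
    while (last - first > kSmallRange)
    {
        if (depthLimit-- == 0)
        {
            // Too many recursion levels: fall back to heap sort, which
            // guarantees O(n lg n) even on quicksort's worst-case input.
            std::make_heap(first, last);
            std::sort_heap(first, last);
            return;
        }
        It p = lomutoPartition(first, last);
        introsortImpl(p + 1, last, depthLimit);   // recurse on the right part ...
        last = p;                                 // ... and loop on the left part
    }
    insertionSort(first, last);                   // small range: insertion sort
}

template <typename It>
void introsort(It first, It last)
{
    std::ptrdiff_t n = last - first;
    int depthLimit = n > 1 ? 2 * static_cast<int>(std::log2(static_cast<double>(n))) : 0;
    introsortImpl(first, last, depthLimit);
}

int main()
{
    std::vector<int> v = { 5, 3, 9, 1, 4, 8, 2, 7, 6, 0 };
    introsort(v.begin(), v.end());
    for (int x : v) std::cout << x << ' ';        // prints: 0 1 2 3 4 5 6 7 8 9
    std::cout << '\n';
    return 0;
}

The point is only the structure: quicksort does the bulk of the work, heap sort caps the worst case, insertion sort handles the small tails.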
> Here's an example related to data structures rather than algorithms that operate on them. An implementation of a stack (or a queue) using a linked list has a best, average, and worst case running time of O(1) for push/pop (enqueue/dequeue). An array-based implementation on the other hand only has an amortized running time of O(1), since pushing too many elements causes the array to be resized, which takes linear time.
But that is sort of what the amortization is about. Adding n elements to the array will be O(n), so for adding n elements to an array, the time complexity is exactly the same as for the list. But adding only *one* element to the array may, for very few elements (every 2^n'th with the doubling strategy), be O(n), whereas for the rest it is O(1).

So again, it is easy to make an example where the array is in fact slower than the linked list: e.g. create an array with 16,383 elements (adding the next should then resize it to 64 KB) and time that add, then time an add for a list with the same number of elements. The list add will be much faster.
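A minimal sketch of that experiment, using std::vector and std::list (the reserve() call just forces the capacity to run out at exactly 16,383 elements, so the result does not depend on the library's particular growth policy):

#include <chrono>
#include <cstddef>
#include <iostream>
#include <list>
#include <vector>

int main()
{
    const std::size_t n = 16383;

    std::vector<int> vec;
    vec.reserve(n);                        // capacity is exactly n ...
    for (std::size_t i = 0; i < n; ++i)
        vec.push_back(static_cast<int>(i));

    std::list<int> lst(vec.begin(), vec.end());

    using clock = std::chrono::steady_clock;
    auto t0 = clock::now();
    vec.push_back(0);   // ... so this push reallocates and copies all n elements: O(n)
    auto t1 = clock::now();
    lst.push_back(0);   // one node allocation, no copying: O(1)
    auto t2 = clock::now();

    std::cout << "vector push (with resize): "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count()
              << " ns\n"
              << "list push:                 "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count()
              << " ns\n";
    return 0;
}

The vector push that triggers the reallocation has to copy all 16,383 existing elements, while the list push allocates and links a single node -- but that is the only push out of thousands where the list wins.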
> However, in practice, it is much faster to use an array and maintain an index (or two indices in the case of a queue) rather than a pointer to the top (or head and tail).
That is because you allocate memory for each list node, and memory allocation is often not O(1), so effectively adding to the list is not O(1) either (but secondly there are the constants, which are higher for the list because it requires setting up 4 pointers, and finally there are the cache advantages of the array).
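To make the "array plus indices" idea concrete, here is a rough sketch of a queue kept in one contiguous buffer with a head index and a size (the class name and the doubling growth policy are my own choices for the illustration): once the buffer is big enough, enqueue and dequeue never touch the allocator at all, and the elements stay contiguous for the cache.

#include <cstddef>
#include <iostream>
#include <vector>

template <typename T>
class ArrayQueue
{
public:
    ArrayQueue() : buffer_(8), head_(0), size_(0) {}

    void enqueue(const T &value)
    {
        if (size_ == buffer_.size())
            grow();                                   // rare; amortized O(1) overall
        buffer_[(head_ + size_) % buffer_.size()] = value;
        ++size_;
    }

    T dequeue()                                       // caller must check empty() first
    {
        T value = buffer_[head_];
        head_ = (head_ + 1) % buffer_.size();
        --size_;
        return value;
    }

    bool empty() const { return size_ == 0; }

private:
    void grow()
    {
        std::vector<T> bigger(buffer_.size() * 2);
        for (std::size_t i = 0; i < size_; ++i)       // unwrap into the new buffer
            bigger[i] = buffer_[(head_ + i) % buffer_.size()];
        buffer_.swap(bigger);
        head_ = 0;
    }

    std::vector<T> buffer_;   // contiguous storage, no per-element allocation
    std::size_t head_;        // index of the front element
    std::size_t size_;        // number of elements currently stored
};

int main()
{
    ArrayQueue<int> q;
    for (int i = 0; i < 100; ++i) q.enqueue(i);
    while (!q.empty()) std::cout << q.dequeue() << ' ';   // 0 1 2 ... 99, FIFO order
    std::cout << '\n';
    return 0;
}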
But here we are comparing algorithms with the same time complexity, and here it might be useful to look at the constants. When algorithms have different time complexity, however, constants rarely matter.
The time complexity is really only a measure of how the algorithm scales, which is what one should care about when choosing/designing data structures/algorithms. Which type of balanced tree we use for our data affects performance much less than, say, choosing the balanced tree over an array with linear searches.
Furthermore, the time complexity tells us whether we can expect problems. E.g. having something which is O(n^2) or higher is most likely going to be a problem for some user (if n is unbounded); changing this to, say, O(n lg n) will in 99% of the cases not be a problem for any user, no matter how high n gets.

For readers following this: try to plot the two graphs if you are not intuitively familiar with them. As can be seen, the first one explodes as n increases, whereas the second behaves quite nicely and stays "bounded".
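If you would rather print a table than plot, this little program shows the two growth rates side by side (the range of n is arbitrary):

#include <cmath>
#include <cstdio>

int main()
{
    std::printf("%10s %16s %14s\n", "n", "n^2", "n lg n");
    for (long long n = 10; n <= 10000000LL; n *= 10)
        std::printf("%10lld %16lld %14.0f\n",
                    n,
                    n * n,                                              // quadratic growth
                    static_cast<double>(n) * std::log2(static_cast<double>(n)));  // near-linear growth
    return 0;
}

Already at n = 10,000 the two columns differ by a factor of roughly 750 (n / lg n), and the gap keeps widening as n grows.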
O(n^2) can often occur in simple applications, like e.g. iTunes: assume we have all songs in an array and another array with the selected artists, and we now need to construct the array of all songs by the selected artists. A naive implementation would iterate over the array of selected artists and, for each, iterate over all the songs and copy those which match.

Imagine I have 10,000 songs, all by different artists, and select all the artists: that is 10,000 x 10,000 iterations = 100,000,000 -- even on my G4, that is an operation I can feel, and I am quite sure that a lot of programs work like this ;)
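For concreteness, here is a rough sketch of that naive filter next to the obvious fix -- look each song's artist up in a hash set instead of rescanning all songs per artist (representing songs and artists as plain strings is just a simplification for the example):

#include <cstdio>
#include <string>
#include <unordered_set>
#include <vector>

struct Song
{
    std::string title;
    std::string artist;
};

// O(artists * songs): for 10,000 songs by 10,000 selected artists this is
// the 100,000,000 iterations mentioned above.
std::vector<Song> naiveFilter(const std::vector<Song> &songs,
                              const std::vector<std::string> &selectedArtists)
{
    std::vector<Song> result;
    for (const std::string &artist : selectedArtists)
        for (const Song &song : songs)
            if (song.artist == artist)
                result.push_back(song);
    return result;
}

// O(artists + songs): build a set of the selected artists once, then scan
// the songs a single time.
std::vector<Song> setFilter(const std::vector<Song> &songs,
                            const std::vector<std::string> &selectedArtists)
{
    std::unordered_set<std::string> selected(selectedArtists.begin(),
                                             selectedArtists.end());
    std::vector<Song> result;
    for (const Song &song : songs)
        if (selected.count(song.artist))
            result.push_back(song);
    return result;
}

int main()
{
    std::vector<Song> songs = { { "Song A", "Artist 1" }, { "Song B", "Artist 2" } };
    std::vector<std::string> selected = { "Artist 2" };
    std::vector<Song> hits = setFilter(songs, selected);
    std::printf("%zu matching song(s)\n", hits.size());   // prints: 1 matching song(s)
    return 0;
}

The naive version does selectedArtists.size() * songs.size() comparisons; the set-based one does a single pass over each array, so it stays comfortable no matter how large the library gets.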