Re: objective-c / cocoa efficiency
Re: objective-c / cocoa efficiency
- Subject: Re: objective-c / cocoa efficiency
- From: b a r t o n <email@hidden>
- Date: Tue, 8 Mar 2005 18:13:59 -0800
All of your thoughtful answers are extremely helpful. I have run the
code against shark and my methods are trivial. I will let you know how
it turns out.
On Wed, 9 Mar 2005 12:59:59 +1100, Jeff Laing <email@hidden> wrote:
> > If you really have that many strings and want to pick out one of them
> > quickly, you should put them in a dictionary (i.e. NSDictionary), or
> > hash table (i.e. unordered_set). A balanced tree (i.e.
> > std::set) would
> > also be an improvement.
>
> Well, firstly, I did say this two paragraphs later on:
>
> (quote)
> (Yes, this is way suboptimal, use a hash-table or something similiar or
> better, a flex parser - in both cases, these are harder to code and keep
> straight during development (when addition of new keywords require more than
> just a .c recompile)
> (/quote)
>
> > If you want efficiency, switching on the first character and then a
> > block of strcmp's is really a pretty lame solution. You've just moved
> > the problem one character deeper into the string—in other words, you're
> > taking an O(n) problem and just dividing by a constant for an O(n/k)
> > solution. Why settle for less when there are constant-time (hashing) or
> > logarithmic-time (tree-based) solutions?
>
> The thing is, when you are pissing about at small values of N, O(n) becomes
> computationally confusing where cost vs benefit becomes non-intuitive.
> qsort() typically cripples itself back to a bubble sort for low N - no-one
> ever points out that qsort() has O(N*N) behaviour for small N, because of
> the overhead it has to set up.
>
> Lets assume 2 keywords for each of 'a' to 'z'. Clearly, its better to do 2
> compares than 52. Yes, k=26 isn't big in this case, but so what? Its a
> reasonable gain for very little coding effort. And if going to a hash table
> takes me from 2 to 1, then its harder to get excited about it. Thats, if it
> does.
>
> Its quite conceivable that I only have one token per first letter in my
> recognition table. In those cases, computing the hash of the input string
> is going to be pretty close to the cost of comparing two strings, especially
> on architectures that have a "smart" strcmp(). The hash solution still has
> to do a strcmp() anyway to ensure that the value it hashed to is actually
> the one being sought, since multiple strings can hash to the same value
> (unlikely, but it has to be coded that way). So, in the case where I have
> one keyword per letter, I can expect the hash solution to be SLOWER than the
> strcmp()
>
> Similiarly, there are about 230 first characters that will *never* have a
> token in the table, and the switch discards them immediately without
> calculating their hashes at all. Thus, guaranteed faster in those cases.
>
> Note that I'm not saying the switch/strcmp() approach is optimal in all
> cases, just that its more efficient that 100 consecutive strcmp()'s, and its
> not guaranteed that its worse than a hash table (in my experience).
> Personally, I think the sun shines out of flex parsers in this domain, but
> the tough part is the overhead of precanning an input stream for it to work
> through, and dealing with its tendency to "look ahead".
>
> Having said that, I've written compilers that don't waste their time using
> flex to recognise individual keywords because its just as easy to binary
> search through our limited keyword table afterward, and its a lot easier to
> write *re-usable* lexical scanners that way - we have one scanner we use in
> ALL our compilers, each of which have different keywords.
>
> The switch/strcmp() approach is a lot easier to read/debug than a flex
> parser. Of course, its also easier to goof up (by forgetting the strcmp()
> returns FALSE for a match).
>
> In conclusion, let me sum up by saying that the original contention that I
> was responding to (that the person recommending switch/strcmp() was a twit),
> I'd disagree strongly without knowing all the facts because in *my*
> experience, its a valid and very efficient construct for specific problems.
>
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Cocoa-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden