Re: Quick string-set search
Re: Quick string-set search
- Subject: Re: Quick string-set search
- From: Greg Titus <email@hidden>
- Date: Fri, 3 Aug 2001 11:15:53 -0700
On Friday, August 3, 2001, at 06:17 AM, Mark Onyschuk wrote:
>
Most often when you need to do this, you'll build a Trie which is a
>
kind of n-way tree with words at either the nodes or leaves (IIRC -
>
it's been a while since CS324 :-).
Actually, tries are great for fast string matching or prefix matching,
but they don't do arbitrary substring matching. To do fast substring
finding, you need a PAT tree instead, which is sort of like a trie that
includes every possible tail of each string. (That is, for the word
"tree", you'd include "tree", "ree", "ee", and "e" in the tree. But if
you know you are doing that then you can make a more efficient
implementation then just adding all those entries to a normal trie.)
You'll find a fast Trie implementation that also hooks into our string
scanner class in OmniFoundation. We use them for tag and attribute
matching in OmniWeb. But we don't have a PAT tree class, and you don't
really need one unless the text you are searching over is very large.
Ondra's suggestion of just running through all the strings and looking
for the substring in each is likely to be fast enough for most cases.
Especially if you add the obvious optimization: remember the last string
that you searched for, and if your new search string has the old search
string as a substring, you only need to search the previous set of
matches instead of the entire list.
-Greg
>
>>>>>>> Brian Webster (BW) wrote at Thu, 2 Aug 2001 22:32:17 -0500:
>
> BW> Can someone point me to sample code or an algorithm description
>
> BW> for doing a quick substring search in a set of strings? I'm
>
> BW> looking to implement a behavior like that in iTunes' search or
>
> BW> Explorer/OmniWeb's address completion; namely, given a set of
>
> BW> strings and a substring, find all the strings that contain the
>
> BW> substring.
>
>
>
> Would be something like
>
>
>
> NSArray *allTheStrings...
>
> NSString *searchThis...
>
>
>
> NSEnumerator *en=[allTheStrings objectEnumerator];
>
> NSString *s;
>
> NSRange rr;
>
> while (s=[en nextObject])
>
> if ((rr=[s rangeOfString:searchThis]).length>0) break;
>
> // either rr.length==0, or s contains sought string and rr shows where
>
>
>
> sufficient, or do you need more flexibility, effectivity, etc?
>
> ---
>
> Ondra Cada
>
> OCSoftware: email@hidden http://www.ocs.cz
>
> private email@hidden http://www.ocs.cz/oc
>
> _______________________________________________
>
> MacOSX-dev mailing list
>
> email@hidden
>
> http://www.omnigroup.com/mailman/listinfo/macosx-dev
>
>
>
_______________________________________________
>
MacOSX-dev mailing list
>
email@hidden
>
http://www.omnigroup.com/mailman/listinfo/macosx-dev