Re: Searching for "whole word" in NSString
- Subject: Re: Searching for "whole word" in NSString
- From: John Stiles <email@hidden>
- Date: Wed, 06 Feb 2008 15:33:55 -0800
Gah! Something is eating the contents of my messages. I apologize.
Resending again...
--
I tried switching to the Japanese locale, and this caused kanji and
hiragana to behave more like katakana: adjacent blocks of hiragana or
kanji would all be selected as a contiguous unit. It did not seem to
conform to word boundaries in any meaningful way; input like "行きます"
would be treated as two words, "行" and "きます".
At any rate, I've implemented word search in my app using Deborah's
suggestions, and it seems to work well enough for me. The algorithm is
as follows:
- Create a text-break locator with "kUCTextBreakWordMask" and all other
options default
- Use NSString -rangeOfString:options:range: to find the string
- After the string is found, expand the found "range" by up to ten
characters in each direction. (Ten is arbitrary, but I didn't know how
many characters of surrounding context would be necessary to fully
identify a word as such. I can raise or lower this value as needed.)
- Use NSString -getCharacters:range: to read the expanded range into a
buffer of unichars
- Use UCFindTextBreak with options "kUCTextBreakGoBackwardsMask |
kUCTextBreakLeadingEdgeMask", pointing at the start of the word within
the buffer. Make sure the result is still right at the beginning of the
word.
- Use UCFindTextBreak with options "0", pointing at the end of the word
within the buffer. Make sure the result is still right at the end of the
word.
If this all succeeds, then I consider the word to be a "whole" word.
Otherwise I keep searching.
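
For anyone who wants to see the shape of that loop without the Carbon
calls, here is a rough sketch in Python. It is not my actual code. The
is_break function is a made-up stand-in for UCFindTextBreak: it simply
treats any position between two alphanumeric characters as "inside a
word", which is far cruder than the real text-break locator and is only
reasonable for English-like text. The names find_whole_word, is_break
and is_word_char are hypothetical, and the context of ten characters is
the same arbitrary value as above.

```python
def is_word_char(ch):
    # Stand-in rule (an assumption): letters, digits and underscore are word characters.
    return ch.isalnum() or ch == "_"


def is_break(buf, offset):
    # Crude substitute for UCFindTextBreak: report whether a word break
    # falls exactly at `offset` within the buffer.
    if offset <= 0 or offset >= len(buf):
        return True  # the edges of the text always count as breaks
    return not (is_word_char(buf[offset - 1]) and is_word_char(buf[offset]))


def find_whole_word(haystack, needle, context=10):
    # Return the index of the first whole-word match, or -1 if there is none.
    if not needle:
        return -1
    start = 0
    while True:
        pos = haystack.find(needle, start)  # plays the role of -rangeOfString:options:range:
        if pos == -1:
            return -1
        # Expand the found range by up to `context` characters on each side.
        lo = max(0, pos - context)
        hi = min(len(haystack), pos + len(needle) + context)
        buf = haystack[lo:hi]  # plays the role of the unichar buffer
        word_start = pos - lo
        word_end = word_start + len(needle)
        # Both edges of the match must sit exactly on a break.
        if is_break(buf, word_start) and is_break(buf, word_end):
            return pos
        start = pos + 1  # not a whole word, so keep searching
```

With this stand-in rule, searching "This is his hat" for "is" skips the
"is" inside "This" and returns the standalone one, and a search for
"cat" in "catalog" finds nothing.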
In practice, this has worked really well for English so far. I haven't
tested it with other languages yet, but it is already better than
relying on the double-click method because it can find groups of
words ("Hi there") or search strings that include punctuation ("Hi.")
without problems.