Re: Searching for "whole word" in NSString
- Subject: Re: Searching for "whole word" in NSString
- From: Deborah Goldsmith <email@hidden>
- Date: Tue, 05 Feb 2008 19:30:38 -0800
This doesn't work for all languages. What constitutes a "word" is
rather more complex than this. In Thai, for a particularly egregious
example, you can't find word boundaries without looking up the words
in a dictionary.
On Tiger, you can use either the double-click API in Cocoa
(-[NSAttributedString doubleClickAtIndex:]) or UCFindTextBreak to find
word boundaries. On Leopard or later, use CFStringTokenizer. All of
them will do the right thing for word boundaries in every language we
support.
Deborah Goldsmith
Apple Inc.
email@hidden
On Jan 29, 2008, at 12:28 PM, Mike Wright wrote:
On Jan 29, 2008, at 10:12:21 -0800, John Stiles
<email@hidden> wrote:
I'm trying to find a substring in an NSString. But I want to find whole
words (e.g. like in the Find panel when you choose "Full word" from the
popup, rather than "Contains" or "Starts With").
Unless I'm missing something, it looks like NSString's
-rangeOfString:options:range:locale: doesn't have an option for finding
whole words.
How does the Find panel do it, then? Am I going to have to "roll my own"
code for string searching? That sounds error-prone to me; I'd much
rather have the OS do it.
Here's a Tiger approach that's worked pretty well for me (or, at
least, no non-English-using customers have complained--so far).
NSString *fieldContent;   // the string I'm searching in
NSString *targetString;   // the string to be found
NSRange hitRange;         // the range of targetString found within fieldContent
NSRange testRange;        // in the beginning, this covers all of fieldContent
BOOL caseSensitive;       // specified by the user
BOOL isWholeWord = NO;    // this is used in two sequential tests

// set up the search mask
unsigned searchMask = NSLiteralSearch;
if (! caseSensitive)
    searchMask |= NSCaseInsensitiveSearch;

// set up the character set for words
NSCharacterSet *wordCharacterSet = [NSCharacterSet alphanumericCharacterSet];

// look for targetString in fieldContent
hitRange = [fieldContent rangeOfString:targetString
                               options:searchMask
                                 range:testRange];

// if we found something, do the whole-word test
if (hitRange.location != NSNotFound)
{
    // test the beginning of targetString
    isWholeWord = ((hitRange.location == 0) ||
        (! [wordCharacterSet characterIsMember:
            [fieldContent characterAtIndex:(hitRange.location - 1)]]));

    // if the beginning is okay, test the end of targetString
    if (isWholeWord)
    {
        unsigned nextCharPosition = hitRange.location + hitRange.length;
        isWholeWord = ((nextCharPosition == [fieldContent length]) ||
            (! [wordCharacterSet characterIsMember:
                [fieldContent characterAtIndex:nextCharPosition]]));
    }
}
Finally:
if (isWholeWord)
{
    // show it to the user
}
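Note that the snippet above tests only the first hit returned by the
search, so a target that first appears inside a longer word would be
reported as not found even if a whole-word occurrence comes later. A
minimal sketch of the retry loop follows, written in plain C so that it
stands alone. find_whole_word is a hypothetical helper, not part of any
framework, and it assumes that only ASCII alphanumerics count as word
characters, so it shares the limitation described at the top of this
message for languages such as Thai.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the byte offset of the first whole-word
   occurrence of needle in haystack at or after start, or -1 if there is
   none. "Word character" means ASCII alphanumeric here (an assumption). */
static long find_whole_word(const char *haystack, const char *needle, size_t start)
{
    size_t hay_len = strlen(haystack);
    size_t needle_len = strlen(needle);

    if (needle_len == 0 || start > hay_len)
        return -1;

    const char *p = haystack + start;
    while ((p = strstr(p, needle)) != NULL) {
        size_t pos = (size_t)(p - haystack);
        size_t next = pos + needle_len;

        /* same two tests as above: the character before and the one after */
        int start_ok = (pos == 0) || !isalnum((unsigned char)haystack[pos - 1]);
        int end_ok = (next == hay_len) || !isalnum((unsigned char)haystack[next]);

        if (start_ok && end_ok)
            return (long)pos;

        p++;  /* not a whole word: resume the search one byte past this hit */
    }
    return -1;
}
```

For example, searching "concatenate the cat, then scatter" for "cat"
from offset 0 skips the hit inside "concatenate" and returns 16. In
Cocoa, the equivalent loop would probably move testRange to begin just
past the rejected hit and call rangeOfString:options:range: again.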
Hope this helps. (And, since it's not just copied from my own code,
I hope it doesn't contain any serious errors.)
Regards,
Mike Wright
http://www.idata3.com/
http://www.raccoonbend.com/
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden