Re: finding substring
- Subject: Re: finding substring
- From: Chuck Soper <email@hidden>
- Date: Fri, 31 Mar 2006 23:03:16 -0800
At 12:02 AM -0500 4/1/06, Daniel Jalkut wrote:
On Mar 31, 2006, at 10:26 PM, Chuck Soper wrote:
This answer makes sense from a programmer's
perspective, but from a user's perspective it
might be confusing. For example, if someone
searches for "San Jose", the results include
San Jose, California but not San José, Costa
Rica.
My English atlas shows San Jose, California and
San José, Costa Rica. I suspect that most users
think of the two city names as being the same,
but they're not.
Do you think that stripping diacritical marks
makes sense when comparing some geographical
names/languages, but not all, such as localized
Japanese names? If so, is there a way to make a
distinction?
It sounds to me like the problem you are dealing
with is more general than just diacritical
markings. The identification of geographical
location names seems especially dependent on
localization. A good example was in the news
lately in the US. The Winter Olympics were held
in Torino, Italy. But many US newspapers
described the event as being held in Turin. No
amount of diacritical exemption will help the
user who searches on Turin find the name if it's
indexed as Torino.
I handle that particular case by using the
geographical name "Torino (Turin)". Atlases use
the same approach. Currently, our application is
only in English and I'm still interested in
finding a way to search for sub-strings and
ignore diacritical marks. I believe that San
José, Costa Rica is referred to (localized) as
"San José" in Spanish and English, yet English
speaking users will want to search for it without
the accent.
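
One way to sketch a sub-string search that ignores diacritical marks is shown below. It is in Python rather than Cocoa and is only an illustration under assumptions: the helper names fold and contains_folded are hypothetical, and the approach (decompose to Unicode NFD, drop combining marks, case-fold) only covers accents that decompose into a base letter plus a combining mark.

```python
import unicodedata


def fold(text):
    """Return text with combining marks removed and case folded.

    Hypothetical helper: decomposes to NFD so that "é" becomes
    "e" + U+0301, then drops every combining character.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def contains_folded(haystack, needle):
    """True if needle occurs in haystack, ignoring accents and case."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    print(contains_folded("San José, Costa Rica", "San Jose"))
    print(contains_folded("Torino", "Turin"))
```

With this sketch, "San Jose" is found in "San José, Costa Rica", while "Turin" is still not found in "Torino", which is the limitation discussed in the rest of this thread.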
How you deal with this largely depends on where
you're getting your data from and what
localizations you're targeting. If your
geographical names are all "basically English,"
and your target localization is English, then
maybe it makes sense to just do a
diacritic-insensitive search. But if your
geographical names come from a more complicated
source like the real world, you might have to
resign yourself to identifying places by a number
of search terms.
So "San Jose" matches "San José", "Turin"
matches "Torino", etc.
It seems like this is a special situation where
planning for multiple matching keys might be the
right choice from the get-go.
Daniel
I completely agree (for a future version). Thanks for the suggestion.
Chuck
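
The multiple-matching-keys idea discussed above could be sketched roughly as follows. This is Python rather than Cocoa, and everything in it is an assumption for illustration: the PLACES table, its contents, and the fold and search helpers are hypothetical, not part of any existing application or library. Each place carries a list of alternate search terms, and a query matches if it is a sub-string of the display name or of any alternate, after accents and case are folded away.

```python
import unicodedata


def fold(text):
    """Hypothetical helper: strip combining marks and fold case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical sample data: display name -> extra search terms.
PLACES = {
    "San Jose, California": [],
    "San José, Costa Rica": [],
    "Torino, Italy": ["Turin"],
}


def search(query, places=PLACES):
    """Return display names whose name or alternate terms contain query."""
    folded_query = fold(query)
    hits = []
    for name, alternates in places.items():
        keys = [name] + list(alternates)
        if any(folded_query in fold(key) for key in keys):
            hits.append(name)
    return hits


if __name__ == "__main__":
    print(search("San Jose"))
    print(search("Turin"))
```

Folding handles the "San Jose" versus "San José" case without extra data, while names that differ by more than accents, such as Turin and Torino, need an explicit alternate term.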
Cocoa-dev mailing list (email@hidden)