Re: Spellchecking queries to a database
- Subject: Re: Spellchecking queries to a database
- From: Arturo Pérez <email@hidden>
- Date: Wed, 22 Oct 2003 22:28:59 -0400
On Wednesday, October 22, 2003, at 08:24 PM, petite_abeille wrote:
"google-known" spellchecking features
There is more to Google's "Did you mean?" functionality than meets the
eye...
In any case, you could get started by visiting WordNet [1] for some
raw data.
I used to do this sort of thing for a living (up until about 3 months
ago). That Google spellchecking thing was the bane of my existence.
I've hated them ever since they unveiled it :-)
It looks so simple.
It appears to work so well.
Everybody wants it.
but...
WordNet doesn't have all the information necessary to duplicate the
functionality. No taxonomy that I'm aware of does. The functionality
can't be duplicated with any RDBMS. Trying to do it with a natural
language search engine like Lucene would negatively impact search
performance, to put it mildly.
There's a reason that Google has 50 PhDs in mathematics and natural
language processing: to make the rest of us miserable. :-) For example,
how does it decide to suggest dogg from doggl? Or to suggest
immeasurable from imesurable? None of the stemming algorithms will do
that. It must be some sort of distance metric. Computing the edit
distance between two strings is not the hard part (it is a
quadratic-time dynamic program, not NP-complete); scoring every query
against a vocabulary the size of the web is. So you need massive
amounts of compute to do it. Google's 5-10K (that's thousands of) CPUs
probably strain a little doing it :-)
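
To make the distance-metric idea concrete, here is a minimal Python
sketch of Levenshtein edit distance plus a brute-force suggester. It is
only an illustration of the general technique, not a claim about how
Google does it. The names edit_distance and suggest, the max_distance
cutoff, and the toy word list are all made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def suggest(word, vocabulary, max_distance=2):
    """Hypothetical helper: vocabulary entries within max_distance, nearest first."""
    scored = [(edit_distance(word, v), v) for v in vocabulary]
    return [v for d, v in sorted(scored) if d <= max_distance]


if __name__ == "__main__":
    words = ["dog", "dogg", "google", "immeasurable"]  # toy vocabulary
    print(suggest("doggl", words))
    print(suggest("imesurable", words))
```

With this metric, doggl is one edit away from dogg and imesurable is
two edits away from immeasurable. A single comparison is cheap; the
linear scan over the whole vocabulary in suggest is the part that stops
scaling once the word list gets large.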
Anyways, I always had fun working on that stuff. I'd be happy to share
what I know about the topic.
-------
WebObjects in Philadelphia. You want a cheesesteak with that?
Visit http://webobjects.meetup.com