Re: SearchKit vs. Lucene

  • Subject: Re: SearchKit vs. Lucene
  • From: Joseph Heck <email@hidden>
  • Date: Thu, 20 Nov 2003 19:18:52 -0800

I've done some research of my own, but YMMV -

Ultimately Lucene is more flexible, simply because you've got it ALL right there to mess with if you want. SearchKit does some things "out of the box" that Lucene doesn't: forward indexes and "find similar documents". SearchKit doesn't have a concept of a custom analyzer that you can create, at least not yet. It does include a whole pile of really wonderful language options (Japanese word-boundary handling, for example) and a decent selection of "read in the document format and index it" options (Word, HTML, PDF) that Lucene doesn't have "out of the box". SearchKit includes the concept of a prefix search (which I'd normally think of as a filter function), which is the functionality you see when using iTunes. SearchKit supports boolean, ranked-relevance, similarity, inclusion/exclusion, and prefix searches.
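To make the prefix-search idea concrete, here is a minimal sketch in Python. It is not SearchKit's or Lucene's implementation; the function name `prefix_search` and the sample terms are made up for illustration, and it assumes the index terms are kept in a sorted list.

```python
from bisect import bisect_left

def prefix_search(sorted_terms, prefix):
    """Return every term in sorted_terms that starts with prefix.

    Hypothetical helper for illustration only; assumes sorted_terms
    is already sorted so matching terms sit next to each other.
    """
    # Binary search to the first term that is >= the prefix.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        # Terms sharing the prefix are contiguous, so stop at the first miss.
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches

terms = sorted(["lucene", "search", "searchkit", "seattle", "stemming"])
print(prefix_search(terms, "sea"))  # ['search', 'searchkit', 'seattle']
```

Narrowing the result list as each character is typed, the way iTunes behaves, would amount to calling this again with the longer prefix.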

Multiple indexes are easily (to me) searched in either.

In the category of "just different", there's a substitutions mechanism within SearchKit set up such that you define words that should be treated as identical for the purposes of search.
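As a rough illustration of what a substitutions mechanism does (this is a sketch of the general idea, not SearchKit's API; `build_canonical_map` and `normalize` are hypothetical names), each group of equivalent words can be mapped to one canonical form, applied to both documents and queries:

```python
def build_canonical_map(groups):
    """Map every word in each group to the group's first word.

    Hypothetical helper: each group lists words that should be
    treated as identical for the purposes of search.
    """
    canonical = {}
    for group in groups:
        head = group[0].lower()
        for word in group:
            canonical[word.lower()] = head
    return canonical

def normalize(text, canonical):
    """Lowercase, split on whitespace, and apply the substitutions."""
    return [canonical.get(token, token) for token in text.lower().split()]

canonical = build_canonical_map([
    ["color", "colour"],
    ["index", "indices", "indexes"],
])

query = normalize("Colour indices", canonical)
document = normalize("color index", canonical)
print(query == document)  # True: both become ['color', 'index']
```

Applying the same map at index time and at query time is what makes the substituted words match each other.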

I haven't hammered on the results enough to know if there's stemming happening under the covers (Lucene does Porter stemming). SearchKit also supports the concept of a document hierarchy, although it's not clear to me what the benefit is or how it should be used. There are only a few snippets of detail about it in the documentation.

I have not profiled "speed", either of searching or indexing. I've simply used it and found it acceptable for the small project that was my testbed.

(Worth noting that a forward index mechanism was implemented in MIT's Haystack code, which would prove useful to someone wanting to build a find-similar mechanism on top of Lucene that requires one.)
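For anyone wondering why a forward index helps with find-similar: it maps each document to its terms, so two documents can be compared directly. The sketch below assumes plain whitespace tokenization and cosine similarity over term counts; it is not how SearchKit or Haystack actually scores similarity, and all names in it are made up for illustration.

```python
from collections import Counter
from math import sqrt

def build_forward_index(docs):
    """Forward index: document id -> term counts for that document."""
    return {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def find_similar(forward_index, doc_id, limit=3):
    """Rank the other documents by similarity to doc_id, best first."""
    target = forward_index[doc_id]
    scored = [
        (cosine(target, terms), other)
        for other, terms in forward_index.items()
        if other != doc_id
    ]
    scored.sort(reverse=True)
    # Drop documents that share no terms at all with the target.
    return [other for score, other in scored[:limit] if score > 0]

docs = {
    "a": "searchkit indexes documents on mac os x",
    "b": "lucene indexes documents in java",
    "c": "report printing with cocoa views",
}
forward_index = build_forward_index(docs)
print(find_similar(forward_index, "a"))  # ['b']
```

An inverted index (term to documents) answers "which documents contain this word"; the forward index answers "which words does this document contain", which is the lookup a find-similar feature starts from.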

-joe

On Thursday, November 20, 2003, at 05:50 PM, Stuart Halloway wrote:
First, thanks to the several people who pointed me to SearchKit.

Now, anybody compared SearchKit vs. Lucene? I'll be picking one of the two within the next few weeks and will happily share my findings with anyone who's interested. I'll be working with modest data sets but many data formats so my issues are more around flexible analysis and ease of use than performance.

Stu
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.

