Re: Indexing text, pdf, .doc
- Subject: Re: Indexing text, pdf, .doc
- From: Joseph Heck <email@hidden>
- Date: Fri, 1 Nov 2002 16:30:18 -0800
It would require more setup and programming effort, but you might also
consider Lucene (Java/Apache). I'm not immediately aware of anything
that scans MS Word files and indexes their contents, but I suspect such
an addition has been made for Lucene, and its indexing and
full-text-search mechanisms are quite advanced.
-joe
On Friday, November 1, 2002, at 04:17 PM, Michael Johnston wrote:
Not sure about an embeddable library, but here are a couple of good
standalone indexing/search engines:
http://htdig.org/ is open source, and is perl
http://alkaline.vestris.com is very fast and inexpensive, but not open
source
PDF and Word files should be converted to text or HTML, and the
resulting file indexed. Everyone uses http://www.foolabs.com/xpdf/ for
PDF; htdig has a doc2html script for Word.
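
As a rough illustration of the convert-then-index approach, here is a minimal Java sketch. It assumes xpdf's pdftotext command is installed and on the PATH. The class TextIndex and its methods are hypothetical names made up for this example, and the index is a toy in-memory map from words to file names, not what htdig, Alkaline, or Lucene actually do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example class: a toy in-memory inverted index.
class TextIndex {
    // Maps each lowercased term to the names of the documents containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Converts a PDF to plain text by running xpdf's pdftotext.
    // Assumption: the pdftotext binary is installed and on the PATH.
    static String pdfToText(Path pdf) throws IOException, InterruptedException {
        Path out = Files.createTempFile("extracted", ".txt");
        try {
            Process p = new ProcessBuilder(
                    "pdftotext", "-enc", "UTF-8", pdf.toString(), out.toString())
                    .inheritIO()
                    .start();
            if (p.waitFor() != 0) {
                throw new IOException("pdftotext failed for " + pdf);
            }
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(out);
        }
    }

    // Splits the text on anything that is not a letter or digit and records
    // which document each term came from.
    void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Returns the names of the documents containing the term (case-insensitive).
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}
```

To index a PDF you would call something like index.add(name, TextIndex.pdfToText(path)); plain text files can be read and passed to add() directly. A real engine adds ranking, stemming, and an on-disk index on top of this idea.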
Michael Johnston
On Friday, November 1, 2002, at 06:12 PM, Steve Ivy wrote:
I'm doing some research for an app and one of the things I need is
the ability to index (and subsequently search, obviously) a store of
content in text documents, pdf files, and Word documents. It can be
Java or Obj-C. I prefer not to use straight C simply due to my own
limitations in the language. I'm wondering if anyone has knowledge of
anything like this. What is Apple using in Sherlock/iTunes/etc?
Whatever became of AIAT?
TIA,
--Steve
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.