Re: Indexing text, pdf, .doc

  • Subject: Re: Indexing text, pdf, .doc
  • From: Joseph Heck <email@hidden>
  • Date: Fri, 1 Nov 2002 16:30:18 -0800

It would require more setup and programming effort, but you might also consider Lucene (Java/Apache). I'm not immediately aware of anything that scans MS Word files and indexes their contents, but I suspect such an addition has been made for Lucene, and its indexing and full-text search mechanisms are quite advanced.
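
For readers unfamiliar with what a full-text indexer does underneath, here is a minimal sketch of an inverted index in plain Java. It is not Lucene's API and not how Lucene is implemented; the class name `TinyIndex` and its methods are hypothetical, and the tokenizer (lowercase, split on anything that is not a letter or digit) is a simplifying assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical, for illustration only):
// maps each lowercased term to the sorted set of document ids containing it.
class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Tokenize the text and record docId under every term found.
    void add(String docId, String text) {
        for (String term : tokenize(text)) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // AND query: return the ids of documents that contain every query term.
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            if (term.isEmpty()) {
                continue;
            }
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Simplifying assumption: terms are runs of letters and digits.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("notes.txt", "Indexing plain text is straightforward.");
        index.add("report.txt", "Text extracted from a PDF report.");
        System.out.println(index.search("text"));     // both documents
        System.out.println(index.search("pdf text")); // only report.txt
    }
}
```

A real engine adds ranking, stemming, phrase queries, and an on-disk format on top of this idea, which is where the extra setup effort goes.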

-joe

On Friday, November 1, 2002, at 04:17 PM, Michael Johnston wrote:

Not sure about an embeddable library, but here are a couple of good standalone indexing/search engines:

http://htdig.org/ is open source (written in C++, with Perl helper scripts)
http://alkaline.vestris.com is very fast and inexpensive, but not open source

PDF and Word files should be converted to text or HTML first, and the resulting file indexed. xpdf (http://www.foolabs.com/xpdf/), which includes the pdftotext converter, is the usual choice for PDF; htdig has a doc2html script for Word.
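
The convert-then-index step for PDF could be sketched from Java roughly as follows. This assumes the `pdftotext` tool from xpdf is installed and on the PATH; the class name `TextExtraction` and its methods are hypothetical, and `readAllBytes` needs Java 9 or later. The `-enc UTF-8` option and the `-` output argument (write to standard output) are pdftotext options, but check them against your installed version.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: shells out to pdftotext and returns the extracted text,
// which can then be handed to whatever indexer you choose.
class TextExtraction {
    // Build the command line; "-" asks pdftotext to write to standard output.
    static List<String> pdfToTextCommand(Path pdf) {
        return Arrays.asList("pdftotext", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Run pdftotext and collect its output. Assumes pdftotext is on the PATH.
    static String extract(Path pdf) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(pdfToTextCommand(pdf))
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int status = process.waitFor();
        if (status != 0) {
            throw new IOException("pdftotext exited with status " + status);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No input given: just show the command that would be run.
            System.out.println(String.join(" ", pdfToTextCommand(Paths.get("example.pdf"))));
            return;
        }
        System.out.println(extract(Paths.get(args[0])));
    }
}
```

The same pattern would apply to Word files with a different converter command in place of pdftotext.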

Michael Johnston


On Friday, November 1, 2002, at 06:12 PM, Steve Ivy wrote:

I'm doing some research for an app and one of the things I need is the ability to index (and subsequently search, obviously) a store of content in text documents, pdf files, and Word documents. It can be Java or Obj-C. I prefer not to use straight C simply due to my own limitations in the language. I'm wondering if anyone has knowledge of anything like this. What is Apple using in Sherlock/iTunes/etc? Whatever became of AIAT?

TIA,

--Steve
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.