Re: Search results in an out of memory exception
- Subject: Re: Search results in an out of memory exception
- From: Dov Rosenberg <email@hidden>
- Date: Mon, 25 Oct 2004 17:12:05 -0400
I agree with Chuck. We have integrated Lucene into our CMS, and it
ROCKS!!
The best part is that you don't have the EOF (Enterprise Objects
Framework) overhead of dealing with large numbers of records. Lucene's
capabilities seem limited only by disk space.
You can also index any binary document if you can extract the text from it.
We use PDFBox to grab the text out of attached PDFs for indexing.
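
To illustrate why an index-based search avoids touching every record, here is a minimal sketch of a toy inverted index in plain Java. It is not Lucene's API and not how Lucene is implemented internally; the class name TinyIndex and its methods are hypothetical, and it assumes the text has already been extracted from the document (for example by a PDF text extractor).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, for illustration): word -> ids of documents containing it. */
final class TinyIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Splits already-extracted text into lower-cased words and records docId under each word. */
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of documents containing the word, in ascending order (empty if none). */
    List<Integer> search(String word) {
        TreeSet<Integer> ids = postings.get(word.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }
}
```

A search only looks up one map entry and returns document ids, so the full text of the other documents never has to be loaded into memory. The application can then fetch just the matching records by id.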
--
Dov Rosenberg
Conviveon Corporation
http://www.conviveon.com
On 10/25/04 5:04 PM, "Chuck Hill" <email@hidden> wrote:
> I'm not certain that you need to get the content out of the database, but
> (2) and Lucene will certainly boost performance and drop memory usage.
> I'll strongly suggest going with something that includes Lucene in the mix.
> Lucene is awesome!
>
> Chuck
>
>
> At 10:17 AM 25/10/2004 -0700, David Holt wrote:
> I have just upped my test database of documents from 1500 records to
> 17,000. I have a WOComponent with a WODisplayGroup and the appropriate
> qualifier fields for searching. If I qualify the data source by putting a
> value in one of the search fields I get a list of documents as expected. If
> I submit the form without information in any of the qualifier fields (this
> used to return the whole data set divided into paged results), I get the
> following exception from the application after a minute or so of waiting:
> Error:
> com.webobjects.foundation.NSForwardException [java.lang.OutOfMemoryError]
> null
>
> The setup is a MySQL database, WO 5.2.3, and OS X Server 10.2.
> I have a blob field that holds the text content of the documents (for
> searching) as well as a URL field pointing to the original document on the
> file system. One of the qualifier fields is used to search the content field.
>
> Three strategies I can think of to fix the problem are:
> 1. Increase system memory (not a good long term solution as the documents
> will grow over time)
> 2. Put the blob field in a separate table so it is not loaded with the
> WODisplayGroup (not sure if I can still do searches on that field if I do
> that)
> 3. Get the content out of the database and use a combination of PDFBox and
> Lucene to provide the content searching separate from my database.
>
> What would you suggest is the best strategy? Or have I misidentified the
> problem?
>
> Thanks,
> David
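
The idea behind strategy 2 in the quoted message (keep the large content field out of the rows that the list view loads) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the WebObjects or EOF API, the class PagedContent and its methods are hypothetical, and the loader stands in for whatever fetches one document's content by id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical helper: hold only lightweight ids, and load heavy content one page at a time. */
final class PagedContent {

    /** Returns the ids on the given zero-based page, or an empty list when the page is past the end. */
    static List<Integer> pageOf(List<Integer> ids, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long from = (long) page * pageSize;
        if (from >= ids.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(ids.size(), from + pageSize);
        return new ArrayList<>(ids.subList((int) from, to));
    }

    /** Loads content only for the ids on one page, using a caller-supplied loader (an assumption). */
    static List<String> loadPage(List<Integer> ids, int page, int pageSize, IntFunction<String> loader) {
        List<String> contents = new ArrayList<>();
        for (int id : pageOf(ids, page, pageSize)) {
            contents.add(loader.apply(id));
        }
        return contents;
    }
}
```

With 17,000 ids and a page size of 10, displaying a page calls the loader 10 times, so memory use follows the page size and not the size of the whole table.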
Webobjects-dev mailing list (email@hidden)