Re: ERIndexing
- Subject: Re: ERIndexing
- From: Kieran Kelleher <email@hidden>
- Date: Thu, 5 Nov 2009 12:28:20 -0500
You are using the EO in ERAutoIndex, which would not work, but you do
allow for custom subclasses of ERIndex to be specified in indexModel,
so an Index class that faults fresh data from the db for the index
update would be feasible. I need to think more about multi-instance,
multi-node WO cluster ..... :-)
On Nov 5, 2009, at 11:52 AM, Anjo Krank wrote:
Maybe you're right, maybe not:) Did you take the various
synchronizers into account?
Also, I'm not sure if the transaction uses the EO or simply the GID.
If the EO, you'd be using the fault and not the DB.
And finally, I'm not really sure about the configuration of
relations. I *think* the one I used for ERCAuditTrail is better, but
there was some reason or other why I couldn't really use it.
Cheers, Anjo
Am 05.11.2009 um 17:33 schrieb Kieran Kelleher:
OK, back to ERIndexing discussion for my 5-minute "break" ......
Anjo, does it matter that p2 updates the index before p1 since
updating the index consists of reading the EO from the database?
In the example shown below "p2 updates index" would update the
index to reflect the EO after "p2 changes data" and the out of FIFO
order "p1 updates index" would simply cause the index to again
reflect the EO after "p2 changes data". In other words, because of
the latency or lack of synchronization, the index never
reflected the changes of "p1 changes data", but who cares since a
few milliseconds later p1's changes were replaced in the database
by p2's changes anyway.
Correct me if I am not thinking logically here, but the example
below should be:
p1 changes EO1,
p2 changes EO1,
p2 updates lucene 'Document' for EO1 from database,
p1 updates lucene 'Document' for EO1 from database.
Result, lucene Document for EO1 reflects the last state of the
database - all is good?
-Kieran
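A minimal sketch of the four-step sequence above, in plain Java with maps standing in for the database and the lucene index. The class and method names are hypothetical, and nothing here is ERIndexing or Lucene API. It contrasts re-reading the row at index time with indexing from each process's own in-memory copy, which is the EO-versus-GID concern raised elsewhere in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not ERIndexing code: maps play the roles of DB and index.
public class IndexOrderSketch {

    /** Replays: p1 changes EO1, p2 changes EO1, p2 updates index, p1 updates index. */
    static String replay(boolean refetchFromDb) {
        Map<String, String> db = new HashMap<>();    // stand-in for the database
        Map<String, String> index = new HashMap<>(); // stand-in for the lucene index
        String key = "EO1";

        // Both processes save a change; each also keeps its own in-memory copy.
        String p1Snapshot = "title from p1";
        db.put(key, p1Snapshot);
        String p2Snapshot = "title from p2";
        db.put(key, p2Snapshot);

        // Index updates arrive out of FIFO order: p2 first, then p1.
        index.put(key, refetchFromDb ? db.get(key) : p2Snapshot);
        index.put(key, refetchFromDb ? db.get(key) : p1Snapshot);

        return index.get(key);
    }

    public static void main(String[] args) {
        System.out.println("refetch from DB: " + replay(true));
        System.out.println("in-memory copy:  " + replay(false));
    }
}
```

With the refetch, the index ends up matching the last committed row regardless of the order of the two index updates. With the in-memory copies, the late p1 update leaves p1's stale value in the index.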
On Oct 19, 2009, at 9:23 AM, Anjo Krank wrote:
It's not a matter of thread safety, it's a matter of data in
lucene being the same as in your DB. When you run with multiple
instances and have heavy edits, you can easily construct a case
like: p1 changes data, p2 changes data, p2 updates index, p1
updates index.
When you only have one edit app, that's obviously OK (and I'm
using a queue for the single app mode anyway).
Cheers, Anjo
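The queue mentioned for single app mode could look roughly like the sketch below. This is an assumption about the general shape, not the actual ERIndexing implementation: a single-threaded executor from java.util.concurrent applies index updates one at a time in submission order, and the class and method names are hypothetical. It only orders updates inside one JVM, so it does not address the multi-instance case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: serialize index updates through one worker thread.
public class IndexQueueSketch {

    /** Submits each update to a single worker and returns the order in which they were applied. */
    static List<String> runQueued(List<String> updates) {
        List<String> applied = Collections.synchronizedList(new ArrayList<>());
        ExecutorService queue = Executors.newSingleThreadExecutor();
        for (String update : updates) {
            // A real task would rebuild the lucene Document here; this one only records the order.
            queue.execute(() -> applied.add(update));
        }
        queue.shutdown(); // accept no new tasks, finish the ones already queued
        try {
            if (!queue.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("index queue did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(applied);
    }
}
```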
Am 19.10.2009 um 15:06 schrieb David LeBer:
On 2009-10-19, at 8:26 AM, David LeBer wrote:
On 2009-10-19, at 8:13 AM, Mike Schrag wrote:
it's not like you wouldn't have the exact same problems in
lucene-proper, though ...
We are using ERIndexing for a multiple instance single server
deployment. However, the app is readonly for the indexed EOs,
there is an admin app that writes, but that is only a single
instance.
And based on the Lucene docs, its writers and readers are thread
and process safe, which means that multiple writers can access
the same index file.
Doug Cutting has posted on the topic of thread safety a couple
of times. Indexing and searching are not only thread safe, but
process safe. What this means is that:
• Multiple index searchers can read the lucene index files at
the same time.
• An index writer or reader can edit the lucene index files
while searches are ongoing
• Multiple index writers or readers can try to edit the lucene
index files at the same time (it's important for the index
writer/reader to be closed so it will release the file lock).
Not sure how well this works in practice and/or how file system
dependent it is for the file system locks to function correctly.
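To make the point about releasing the file lock concrete, here is a sketch using only the JDK's java.nio file locks. It is an analogy, not Lucene's own lock implementation, and the class and method names are hypothetical: a second would-be writer cannot take the lock until the first one releases it. Exact behavior depends on the operating system and file system, which is the caveat raised above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of an exclusive "write lock" file, using JDK file locks only.
public class WriteLockSketch {

    /** Tries to take an exclusive lock; returns null if someone already holds it. */
    static FileLock tryAcquire(FileChannel channel) throws IOException {
        try {
            return channel.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            return null;              // already held inside this JVM
        }
    }

    /** Returns {first acquired, second acquired while held, second acquired after release}. */
    static boolean[] demo() {
        try {
            Path lockFile = Files.createTempFile("write", ".lock");
            try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                FileLock held = tryAcquire(first);
                if (held == null) {
                    return new boolean[] { false, false, false };
                }
                boolean secondWhileHeld = tryAcquire(second) != null;
                held.release(); // the step that corresponds to closing the writer
                FileLock later = tryAcquire(second);
                boolean secondAfterRelease = later != null;
                if (later != null) {
                    later.release();
                }
                return new boolean[] { true, secondWhileHeld, secondAfterRelease };
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```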
On Oct 19, 2009, at 7:50 AM, Gustavo Pizano wrote:
NICE!, now my hopes are gone!.
So I guess I must use the lucene framework directly and
follow the examples in LIA?
ok.. what can one do... :(
thx
G.
On Mon, Oct 19, 2009 at 1:45 PM, Anjo Krank <email@hidden>
wrote:
Be aware that ERIndexing is only an experiment (and was write-
only code, I don't use it yet). In particular it has several
severe drawbacks:
- it doesn't really handle multiple instances (possibly) or
servers (definitely). That means, for the cases where you
actually *do* need the speed of lucene, i.e. high-traffic, high-
volume, which means many servers, you can't use it as is. At
least the auto-indexing won't work without some central
notification point that actually does the indexing and then
redistributes the indexes.
If you don't account for that, your indexes won't really match
your DB, which means that you will find the wrong stuff super-
fast...
I don't have a good solution to this, maybe someone who
actually uses it might.
- The DB store for the indexes was an experiment to fix at
least the redistribution problem, but this was truly write
only, so use at your own risk.
- it duplicates your DB indexes and depending on your DB type
and query, your query to resolve the faults probably won't be
that much faster than a normal query would have been.
- it should really be an EO adaptor instead, which would mean
that you could use it in a simple displayGroup. But then
again, one of the main points in Lucene is that you don't
really need a strict schema to work with it - although you'll
probably have one.
Cheers, Anjo
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden