Re: Manipulating large amounts of data with EOF - feedback, anyone?
- Subject: Re: Manipulating large amounts of data with EOF - feedback, anyone?
- From: Andrew Lindesay <email@hidden>
- Date: Sun, 11 Jan 2009 00:13:26 +1300
Hello Hugi;
"KMMassiveOperation" <--- that's a really great class name.
What I do is to pull GIDs or PKs for all of the objects involved
using raw rows. As you say, sometimes the data sets are large enough
that I have to further subdivide them on a domain-specific basis, but
I won't complicate matters further... In any case, I get lots of
GIDs or PKs; the raw-row fetch looks something like the sketch below.
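For illustration, here is a minimal raw-row sketch (the "Invoice"
entity and "invoiceId" PK attribute are made-up names, so substitute
your own):

    import com.webobjects.eocontrol.EOEditingContext;
    import com.webobjects.eocontrol.EOFetchSpecification;
    import com.webobjects.foundation.NSArray;

    // Fetch only the primary keys as raw rows; no EOs get instantiated.
    public NSArray fetchInvoicePKs(EOEditingContext ec) {
        EOFetchSpecification fs = new EOFetchSpecification("Invoice", null, null);
        fs.setFetchesRawRows(true);                     // rows come back as plain NSDictionaries
        fs.setRawRowKeyPaths(new NSArray("invoiceId")); // restrict the fetch to the PK column
        return ec.objectsWithFetchSpecification(fs);    // NSArray of NSDictionary
    }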
Then I batch them up into (for example) lots of 100 or so and farm
the work-load out over JMS (more recently using JSON-RPC through a
"JMS adaptor") so that the processing can run concurrently over a
number of instances on a number of hosts. Adding more instances
increases the concurrency and hence the pressure on the database
system. The batching and dispatch look roughly like the sketch below.
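The JSON-RPC-over-JMS adaptor is home-grown, so I'll sketch the
dispatch with plain JMS instead; the queue wiring is assumed to be in
place, and the comma-separated batch encoding is just for illustration:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import com.webobjects.foundation.NSArray;
    import com.webobjects.foundation.NSRange;

    // Chop an NSArray of PK values into batches of ~100 and send each
    // batch to the worker queue as a single JMS text message.
    public void dispatchBatches(NSArray pks, ConnectionFactory factory, Queue queue)
            throws Exception {
        int batchSize = 100;
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            for (int i = 0; i < pks.count(); i += batchSize) {
                NSRange range = new NSRange(i, Math.min(batchSize, pks.count() - i));
                NSArray batch = pks.subarrayWithRange(range);
                producer.send(session.createTextMessage(batch.componentsJoinedByString(",")));
            }
        } finally {
            connection.close();
        }
    }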
In the case of writing out CSV or Excel-readable XML files, I push
the results from the workers into a "BLOB stream" -- effectively just
a series of BLOBs that, read in order, make up one long piece of
contiguous data. Appending a chunk is sketched below.
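The append side might look like this, assuming a hypothetical
"BlobChunk" entity with "streamId", "sequence" and "data" attributes
(a reader later concatenates the chunks in sequence order):

    import com.webobjects.eoaccess.EOUtilities;
    import com.webobjects.eocontrol.EOEditingContext;
    import com.webobjects.eocontrol.EOEnterpriseObject;
    import com.webobjects.foundation.NSData;

    // Append one chunk of worker output to the BLOB stream.
    public void appendChunk(EOEditingContext ec, String streamId, int sequence, byte[] bytes) {
        EOEnterpriseObject chunk = EOUtilities.createAndInsertInstance(ec, "BlobChunk");
        chunk.takeValueForKey(streamId, "streamId");
        chunk.takeValueForKey(Integer.valueOf(sequence), "sequence");
        chunk.takeValueForKey(new NSData(bytes), "data");
        ec.saveChanges();   // each chunk is committed independently
    }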
The control and monitoring systems for all this are quite complex,
but it does work well, and I can do it all in EOF without resorting
to SQL.
cheers.
Anyway, I would love to hear how other folks are handling huge
datasets. I would love feedback on the technique I'm using, and ideas
for improvement would be great. Just about the only idea I'm not
open to is "just use JDBC" ;-). I've been there and I don't want to
be there. That's why I'm using EOF :-).
___
Andrew Lindesay
www.lindesay.co.nz
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden