Re: Expanding Import

Subject: Re: Expanding Import
From: Chuck Hill <email@hidden>
Date: Thu, 9 Mar 2006 10:03:12 -0800

Hi Scott,

On Mar 8, 2006, at 5:13 PM, Scott Winn wrote:

Also, a little voice keeps whispering in my ear, "This is not the problem." I keep looking at this and thinking that you just don't have enough objects in these relationships to cause this problem and that your processing should not be triggering off massive fetches all the time. How many objects, on average, do you expect to be in each of:
Company ->> Certificates
Shipper ->> Tickets
Location ->> Item
I ran the importer up to the file prior to the one it always chokes on. This file has two Certificates in it and I am printing out updatedObjects() to the Run Log before saveChanges() on each. The first of the two is the largest and starts out printing to the log quickly, but eventually slows down to a crawl. As soon as it hits the next smaller Certificate the updatedObjects() printing appears to go much faster. I'm not sure how much to read into that, but I did some more detailed memory analysis (see below).
By that point I have . . .
4 Locations with 24,676 : 2,350 : 1,171 : 570 Items respectively.
31 Shippers with 548, 36, 41, 34, 11, 6, etc. Tickets each (548 is by far and away the most).
1 Company with 81 Certificates.

In the larger of the two updatedObjects() calls for that file I have . . .
331 Certificate properties
582 Tickets properties
23,935 Item properties

None of that seems too out of line. Fetching that number of objects is not going to slow the app to a crawl if done efficiently. Have you tried turning on SQL logging to see if lots of SQL is getting generated when it slows down? Depending on the cause, configuring some batch faulting may give you a good performance boost. The default is none (single row selects). You could try changing it from 0 to 10 on each of these to-many relationships and see what difference that makes.
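To get a feel for what a batch size of 10 could buy, here is a back-of-the-envelope sketch in plain Java. It assumes, as a simplification, that each database round trip resolves up to batchSize pending faults and that a batch size of 0 behaves like one fault per trip. BatchFaultEstimate and roundTrips are hypothetical names, not EOF API; the 31 and 4 are the Shipper and Location counts quoted above.

```java
// Hypothetical helper for rough estimates only; this is not EOF code.
final class BatchFaultEstimate {

    /**
     * Round trips needed to resolve pendingFaults when each trip resolves
     * up to batchSize of them. A batchSize of 0 is treated as "no batching",
     * i.e. one fault per trip (an assumption of this sketch).
     */
    static long roundTrips(long pendingFaults, int batchSize) {
        if (pendingFaults < 0 || batchSize < 0) {
            throw new IllegalArgumentException("arguments must not be negative");
        }
        long perTrip = Math.max(1, batchSize);
        // Integer ceiling division.
        return (pendingFaults + perTrip - 1) / perTrip;
    }

    public static void main(String[] args) {
        System.out.println("31 Shipper faults, batch 0:  " + roundTrips(31, 0));
        System.out.println("31 Shipper faults, batch 10: " + roundTrips(31, 10));
        System.out.println("4 Location faults, batch 10: " + roundTrips(4, 10));
    }
}
```

Under those assumptions the 31 Shipper faults drop from 31 trips to 4, and the 4 Location faults collapse into a single trip. The real saving depends on how many faults are actually pending when one fires, which SQL logging should show.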

How much heap space is the application running in?
JVM_OPTIONS was set to -Xms128m and -Xmx512m. When I check total memory it never seems to get much above 128MB even though there should be room for it. I have tried -Xmx1024m and it doesn't seem to make any difference to the total memory or the app's performance. I also tried setting the minimum -Xms256m and didn't notice any significant speed gains. I did see a difference in free memory, obviously, but all the slow downs still occur in all the same places.

Yeah, that should be plenty of heap space. I think we can discard garbage collection as the culprit.

Could you just be running low on memory and going into repetitive garbage collection cycles?
I'm garbage collecting explicitly after each file. Usually I'm in the 70-80% range of free memory before and the 90% after. When things get busy it looks like this. . .
Before gc()
free memory: 4444456
total memory: 133103616
(3 % free)  (31 % free with 256MB)
After gc()
free memory: 127420928
total memory: 133103616
(95 % free)
. . . probably not too telling, since I have already disposed of my workhorse EC by the time this gets called.
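For reference, figures like the ones above can be produced with the standard java.lang.Runtime calls. This is a minimal sketch, not the importer's actual logging; MemoryReport, percentFree and snapshot are hypothetical names. The percentages are truncated to whole numbers, which matches the 3 % and 95 % shown above.

```java
// Hypothetical helper; not part of the original importer.
final class MemoryReport {

    /** Whole-number percentage of the heap that is free, truncated toward zero. */
    static long percentFree(long freeBytes, long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return freeBytes * 100 / totalBytes;
    }

    /** Formats one snapshot of the current JVM heap. */
    static String snapshot(String label) {
        Runtime rt = Runtime.getRuntime();
        long free = rt.freeMemory();
        long total = rt.totalMemory();
        return label + ": free=" + free + " total=" + total
                + " (" + percentFree(free, total) + " % free)"
                + " max=" + rt.maxMemory();
    }

    public static void main(String[] args) {
        System.out.println(snapshot("Before gc()"));
        System.gc(); // only a hint to the JVM, not a guarantee
        System.out.println(snapshot("After gc()"));
    }
}
```

Note that totalMemory() reports the heap currently claimed by the JVM, while maxMemory() reflects the -Xmx ceiling. That may be why total memory hovers near the -Xms value even when -Xmx is raised.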

When I increased the minimum memory to 256MB the app still takes several minutes to print out the largest of the updatedObjects() calls. The only other thing I can think to do on the memory front is get some output before and after the ec.saveChanges() rather than at the end of the file. . . so I did. Everything looks pretty normal. Free memory is about 70% before each ec.saveChanges and 80% after each ec.dispose(). The only odd thing is that every once in a while the memory drops significantly after the dispose(). On a few occasions it plummets from 80% before saving to 5% after the dispose, but that doesn't seem to be bogging down the app. It recovers and goes on its way.

By far the worst one is Item -> Location : Location ->> Item. As I am reading through the file, I hit a locationCode that may or may not have a matching DB entry. I need to look it up and create a new one if it doesn't exist. Then I create the Location/Item relationship to the Item I am currently reading in.
Using addObjectToBothSidesOfRelationshipWithKey?
Yup. That's what all the newbie examples say to do.
And, usually, that is the best practice. But in the case of this import you may be better off trading object graph consistency for speed and using item.setLocation(aLocation) instead of item.addObjectToBothSidesOfRelationshipWithKey(aLocation, "Location").
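The trade-off can be illustrated with plain Java stand-ins. This is only a simulation under stated assumptions, not EOF code: Location, Item and RelationshipCost are hypothetical classes, and the lazily built list stands in for a to-many relationship that has to be brought into memory before an object can be added to it.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins; these are NOT EOF classes.
final class Location {
    private final int rowsInDatabase;
    private List<Item> items;  // null until first touched, like an unfired fault
    int itemsLoaded = 0;       // counts the simulated fetch work

    Location(int rowsInDatabase) {
        this.rowsInDatabase = rowsInDatabase;
    }

    /** Touching the to-many side loads every existing Item first. */
    List<Item> items() {
        if (items == null) {
            items = new ArrayList<>();
            for (int i = 0; i < rowsInDatabase; i++) {
                Item existing = new Item();
                existing.setLocation(this);
                items.add(existing);
            }
            itemsLoaded += rowsInDatabase;
        }
        return items;
    }
}

final class Item {
    private Location location;

    /** Sets only the to-one side; the Location's list is not touched. */
    void setLocation(Location newLocation) {
        this.location = newLocation;
    }

    Location location() {
        return location;
    }
}

final class RelationshipCost {
    /** Updates both sides, which forces the Location's items to load. */
    static void addToBothSides(Item item, Location location) {
        item.setLocation(location);
        location.items().add(item);
    }

    public static void main(String[] args) {
        Location oneSided = new Location(24676);
        new Item().setLocation(oneSided);
        System.out.println("to-one only, items loaded: " + oneSided.itemsLoaded);

        Location bothSided = new Location(24676);
        addToBothSides(new Item(), bothSided);
        System.out.println("both sides, items loaded:  " + bothSided.itemsLoaded);
    }
}
```

In the simulation, setting only the to-one side loads nothing, while updating both sides pulls in all 24,676 existing Items for the largest Location. The price of the shortcut is that the Location's in-memory list no longer reflects the new Item.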
Looks like cutting out relationships is the next thing to try, unless I ought to jack up the minimum memory to a few GBs just to be sure.

It really does not look like that is the problem to me.

I might not have been too clear in my comments, but I do understand that the Items in the list are properties not individual objects to be updated. But that kind of takes me back to my original question . . . Is attempting to store this object (and a few others like it) with all of its to-many relationship properties what is bogging down the app?
Maybe.
Is it an obvious, no brainer, yes, or should WO be able to handle a few million properties like this? Seems a bit ridiculous to be asking, but hey, what do I know?
A few million? That would probably cause some performance issues, yes. Do you ever really need all few million of the properties? Will you need to know all the items stored in a location? Or just the (for example) undelivered ones? Or the ones for a specific Company?
No, I don't think I will ever want anyone pulling that amount of data out, and I can't imagine why I would need to. There should always be something else involved in a db lookup -- a company, date range, some other identifying code, etc.

That is a good clue that you may not want to model that relationship then. If you will always be fetching the items, there is no need to carry around the burden of keeping an unused relationship up to date.

It appears I need to streamline my relationships first, then when I'm building the bulk of the app try to figure out what relationships I can't live without. It was easy to create the relationships in the first place; it should be just as easy to put them back. Then I'll have to try and figure out how to pull off what Anjo was talking about.
Maybe Anjo will post or point you to some code. Or I can try and whip it out if I find time later.
Hopefully, not necessary, but I'd be grateful nonetheless, if only for some better insight into how one gets tricky with the database context.

Well, maybe a little warm up exercise this morning...

OK, for this I am assuming that import is all this program does and that we don't have to worry about concurrent users in other editing contexts etc.

EOEditingContext ec;  // I am assuming this exists and is locked

// Set database context delegate to the object processing the import.
// Assumes that no other objects will be using this context
EODatabaseContext dbContext = databaseContextForModelNamed(ec, "YourModelNameHere");
dbContext.lock();
try {
    dbContext.setDelegate(this);
}
finally {
    dbContext.unlock();
}

... process import here ...


// clear database context delegate
dbContext.lock();
try {
    dbContext.setDelegate(null);
}
finally {
    dbContext.unlock();
}


Then add this to this class:

protected static NSArray ignoredEntityNames = new NSArray(new Object []{"Item", etc.});

// Delegate method
public NSArray databaseContextShouldFetchObjects(EODatabaseContext dbCtxt,
                                                 EOFetchSpecification fetchSpec,
                                                 EOEditingContext ec) {
    // You might need to do some more exact matching here...
    if (ignoredEntityNames.containsObject(fetchSpec.entityName())) {
        return NSArray.EmptyArray;
    }

    return null;
}


Chuck

-- Coming in 2006 - an introduction to web applications using WebObjects and Xcode http://www.global-village.net/wointro

Practical WebObjects - for developers who want to increase their overall knowledge of WebObjects or who are trying to solve specific problems. http://www.global-village.net/products/practical_webobjects

