Re: Expanding Import
- Subject: Re: Expanding Import
- From: Chuck Hill <email@hidden>
- Date: Wed, 8 Mar 2006 10:04:42 -0800
On Mar 7, 2006, at 5:16 PM, Scott Winn wrote:
Let me try to give you (and all the other nice people reading
this post) a better picture of what I am doing. Maybe it is
horrifically bad database design, but I can always blame the
legacy system, right? :-)
Or blame me being thick. :-)
What you, thick? All of us wannabes bow to your WOKnowHow.
I dunno. I have my moments.
Certificate (average of 2 per File)
Ticket (about 12 per Certificate)
Item (about 30 per Ticket)
Part (usually 2 per Item)
The main hierarchy (above) consists of to-many relationships
going down and to-one relationships coming back up. The entities
in the hierarchy aren't the problem though because they are not
the objects being fetched. The issue I am having is mainly with
entities outside the hierarchy that are related to entities in
it. There are several relationships like this:
Certificate -> Company : Company ->> Certificates
Ticket -> Shipper : Shipper ->> Tickets
Item -> Location : Location ->> Item
OK, these are what I expect are causing you the problem. That is,
the to-many part of these. A certificate needs to know its
company. But does a company really need to know all of its
certificates? Does it use them? Do you use
company().certificates() in code rather than fetching the certificates for a particular
company? Would it present a problem to use a fetch on certificates
where company = X rather than use company().certificates()?
Consider the answers to these. It may be that you can avoid
modeling the Company ->> Certificates relationships. The same
questions apply to Shipper ->> Tickets and Location ->> Item. If
you do need them, and they do make sense in the context of your
application, then you should go with Anjo's suggestion. In a
nutshell, avoid relationships from lookup objects to transactional
objects. I've seen such relationships modeled very often just
because they can be modeled and it makes sense at first look. But
they can cause performance problems when there are many objects in
the relationship.
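To make the alternative concrete, here is a minimal sketch in plain Java rather than EOF code. The classes Company, Certificate, and CertificateStore are hypothetical stand-ins, and certificatesFor() plays the role that a fetch qualified on "company = X" would play in the real application. The point it illustrates is that Company holds no collection of certificates at all; only the to-one side is modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class FetchByCompanySketch {
    /** Lookup object: deliberately has no certificates() collection. */
    static class Company {
        final String name;
        Company(String name) { this.name = name; }
    }

    /** Transactional object: knows its company through a to-one reference. */
    static class Certificate {
        final int number;
        final Company company;
        Certificate(int number, Company company) {
            this.number = number;
            this.company = company;
        }
    }

    /** Hypothetical stand-in for the database table of certificates. */
    static class CertificateStore {
        private final List<Certificate> rows = new ArrayList<>();

        void insert(Certificate c) { rows.add(c); }

        /** Equivalent in spirit to fetching certificates where company = X. */
        List<Certificate> certificatesFor(Company company) {
            return rows.stream()
                       .filter(c -> c.company == company)
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        Company acme = new Company("Acme");
        Company other = new Company("Other");
        CertificateStore store = new CertificateStore();
        store.insert(new Certificate(1, acme));
        store.insert(new Certificate(2, other));
        store.insert(new Certificate(3, acme));
        // Inserting a certificate never touches the Company object.
        System.out.println(store.certificatesFor(acme).size()); // prints 2
    }
}
```

With this shape, adding a new Certificate changes only the Certificate; the Company is not modified and has no large collection to load or snapshot.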
I think you have probably hit the nail on the head. It is very
easy to model relationships without thinking of the consequences.
My particular problem is that the rest of the app isn't built. I'm
doing the data end first so I'll actually have some data to test
when building the interface, reports, etc. So for the moment it is
hard to say what I will and won't need. I'll have a specific
Company object in the user's session, that is a given. The rest is
a bit of a question mark.
It may well be that you need them in this scenario. Also, a little
voice keeps whispering in my ear, "This is not the problem." I keep
looking at this and thinking that you just don't have enough objects
in these relationships to cause this problem and that your processing
should not be triggering off massive fetches all the time. How many
objects, on average, do you expect to be in each of:
Company ->> Certificates
Shipper ->> Tickets
Location ->> Item
How much heap space is the application running in? Could you just be
running low on memory and going into repetitive garbage collection
cycles?
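One rough way to check the memory question is to log heap figures from inside the import loop. The sketch below uses only the standard java.lang.Runtime calls; the class name HeapCheck and the report format are made up for illustration. If the used figure sits close to the maximum for long stretches, repeated garbage collection is a plausible suspect.

```java
public class HeapCheck {
    /** Bytes currently in use: committed heap minus the free part of it. */
    static long usedBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary of heap use, suitable for logging every N records. */
    static String report(Runtime rt) {
        long mb = 1024L * 1024L;
        return String.format("heap used=%d MB, committed=%d MB, max=%d MB",
                usedBytes(rt) / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        System.out.println(report(Runtime.getRuntime()));
    }
}
```

The maximum is whatever the JVM was started with (the -Xmx launch argument), and starting the JVM with -verbose:gc prints a line for each collection, which shows directly whether the application is collecting over and over.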
By far the worst one is Item -> Location : Location ->> Item. As
I am reading through the file, I hit a locationCode that may or
may not have a matching DB entry. I need to look it up and create
a new one if it doesn't exist. Then I create the Location/Item
relationship to the Item I am currently reading in
Using addObjectToBothSidesOfRelationshipWithKey?
Yup. That's what all the newbie examples say to do.
And, usually, that is the best practice. But in the case of this
import you may be better off trading object graph consistency for
speed and using item().setLocation(aLocation) instead of
item().addObjectToBothSidesOfRelationshipWithKey(aLocation, "Location")
and in due course I save the EC.
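The trade-off between the two calls can be modeled in plain Java. This is a sketch, not EOF code: Location, Item, and addToBothSides() are hypothetical stand-ins, and the lazily built list imitates a to-many relationship that is only loaded when first touched. It assumes, as the discussion above suggests, that updating both sides has to touch the to-many collection and therefore loads it.

```java
import java.util.ArrayList;
import java.util.List;

public class RelationshipSketch {
    /** Stand-in for a lookup object whose to-many side is loaded lazily. */
    static class Location {
        private final int storedItemCount; // items already stored for this location
        private List<Item> items;          // null until first touched
        int loads = 0;                     // how many times the to-many was loaded

        Location(int storedItemCount) { this.storedItemCount = storedItemCount; }

        /** First access builds the whole collection, like a large fetch. */
        List<Item> items() {
            if (items == null) {
                loads++;
                items = new ArrayList<>();
                for (int i = 0; i < storedItemCount; i++) {
                    Item stored = new Item();
                    stored.setLocation(this);
                    items.add(stored);
                }
            }
            return items;
        }

        boolean itemsLoaded() { return items != null; }
    }

    static class Item {
        private Location location;

        /** One-sided update: sets the to-one and leaves the to-many alone. */
        void setLocation(Location l) { location = l; }

        Location location() { return location; }

        /** Two-sided update: also adds to the to-many, which forces it to load. */
        void addToBothSides(Location l) {
            location = l;
            l.items().add(this);
        }
    }

    public static void main(String[] args) {
        Location loc = new Location(5000);

        Item a = new Item();
        a.setLocation(loc);
        System.out.println("after setLocation, to-many loaded: " + loc.itemsLoaded());

        Item b = new Item();
        b.addToBothSides(loc);
        System.out.println("after addToBothSides, to-many loaded: " + loc.itemsLoaded()
                + ", size: " + loc.items().size());
    }
}
```

In this model the one-sided call leaves the 5000 stored items unloaded, while the two-sided call loads all of them just to append one more. The cost of the one-sided call is visible too: item a never appears in loc.items(), which is the loss of object graph consistency being traded for speed.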
When I do the Location fetch (after reading just a few files) I
seem to be pulling in thousands of Items through the relationship
that I don't want to change. The ec.updatedObjects() call looks
in part like this. . .
{
values = {
locationName = <com.webobjects.foundation.NSKeyValueCoding$Null>;
locationCode = "99";
Company = "<NCCompany 18424f _EOIntegralKeyGlobalID[Company
(java.lang.Integer)1]>";
locationDescription = <com.webobjects.foundation.NSKeyValueCoding
$Null>;
Items = (
"<NCItem 7ee9eb _EOIntegralKeyGlobalID[NCItem
(java.lang.Integer)538]>
< NCItem 2f7af3 _EOIntegralKeyGlobalID[NCItem
(java.lang.Integer)733]>
. . . A few thousand Items already stored . . .
< NCItem 6c5525 _EOIntegralKeyGlobalID[NCItem
(java.lang.Integer)308]>
. . . Then come the actual updates . . .
< NCItem 5f1fa0 <EOTemporaryGlobalID: 0 0 -64 -88 42 -3 0 0 -54
-4 80 8 0 0 1 9 -62 108 -102 -115 -97 90 -24 -73>>
. . . not nearly as many . . .
< NCItem 2462d5 <EOTemporaryGlobalID: 0 0 -64 -88 42 -3 0 0 -54
-4 67 15 0 0 1 9 -62 108 -102 -115 -97 90 -24 -73>>");
};
this = "<NCLocation 97048e _EOIntegralKeyGlobalID[NCLocation
(java.lang.Integer)1]>";
},
You are somewhat mis-interpreting that. You are confusing the
contents of an object that will be updated with the objects that
will be updated. What you have there shows a single NCLocation
object that will be updated. The NCItems are just properties of
that object and will not be updated unless they also appear
directly in the updatedObjects() list. It might be clearer to log
out
ec.updatedObjects().valueForKey("entityName")
ec.insertedObjects().valueForKey("entityName")
I might not have been too clear in my comments, but I do understand
that the Items in the list are properties not individual objects to
be updated. But that kind of takes me back to my original
question . . . Is attempting to store this object (and a few others
like it) with all of its to-many relationship properties what is
bogging down the app?
Maybe.
Is it an obvious, no-brainer yes, or should WO be able to handle a
few million properties like this? Seems a bit ridiculous to be
asking, but hey, what do I know?
A few million? That would probably cause some performance issues,
yes. Do you ever really need all few million of the properties?
Will you need to know all the items stored in a location? Or just
the (for example) undelivered ones? Or the ones for a specific Company?
It appears I need to streamline my relationships first, then when
I'm building the bulk of the app try to figure out what
relationships I can't live without. It was easy to create the
relationships in the first place; it should be just as easy to put
them back. Then I'll have to try and figure out how to pull off
what Anjo was talking about.
Maybe Anjo will post or point you to some code. Or I can try and
whip it out if I find time later.
Thanks much to everyone for the help. Let me know if anyone hears a
good rule of thumb for what does and does not need a relationship.
I have nothing to add to that.
Chuck
--
Coming in 2006 - an introduction to web applications using WebObjects
and Xcode http://www.global-village.net/wointro
Practical WebObjects - for developers who want to increase their
overall knowledge of WebObjects or who are trying to solve specific
problems. http://www.global-village.net/products/practical_webobjects