Re: Core Data performance advice... creating relationships.
- Subject: Re: Core Data performance advice... creating relationships.
- From: Chris Hanson <email@hidden>
- Date: Tue, 15 Jan 2008 01:56:43 -0800
On Jan 14, 2008, at 1:43 PM, Martin Linklater wrote:
I have created all my entities and am in the process of creating
the relationships. I have two entities, each with around 400,000
entries each. 'Foo' and 'Bar' are the two entities. They reference
each other using a common 'ID' integer.
What is this "common 'ID' integer" -- is it a critical part of the
model for your data, or is it something that you just thought you
should put in due to your experience with other frameworks?
If it's something you can possibly avoid having in your data model, do
so. Core Data's relationship management handles things like object
IDs for you. Maintaining your own parallel IDs is just duplicating
work, in a way that's almost guaranteed to be sub-optimal.
I have created a one to many relationship from Foo to Bar (rel),
along with the corresponding inverse.
'Foo' <------>> 'Bar'
My algorithm for creating these relationships is to fetch every
entry in 'Foo', and enumerate through the resulting array building a
fetch request for 'Bar' in the form of 'All entities in Bar where ID
== x'. Then when I get that result, I set 'Foo.rel' to the NSArray
returned by that fetch request.
That fetch request has to perform a full table scan for the instances
(not "entities") of Bar whose ID property equals x, because unless
you've told it to do so in your data model (and you're running
Leopard), it won't know to create an index on that property.
Furthermore, that table scan has to be against both data cached in
memory *and* against the SQLite database on disk, in case any other
users of that SQLite database (in the same process or in a different
one) changed data that would match the query.
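To make the cost of an unindexed lookup concrete, here is a minimal sketch that uses plain SQLite from Python rather than Core Data itself. The table name "bar", the column "ext_id", and the index name "bar_ext_id" are all hypothetical and are not Core Data's actual on-disk schema; the point is only that SQLite reports a scan of the whole table until an index on the looked-up column exists.

```python
import sqlite3

# Hypothetical stand-in for the store: a plain table "bar" with an "ext_id"
# column. This is an assumption for illustration, not Core Data's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bar (pk INTEGER PRIMARY KEY, ext_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO bar (ext_id, payload) VALUES (?, ?)",
    ((i % 1000, "row %d" % i) for i in range(10000)),
)


def query_plan(connection, ext_id):
    """Return SQLite's description of how it would look up rows by ext_id."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT pk FROM bar WHERE ext_id = ?", (ext_id,)
    ).fetchall()
    # The fourth column of each row is the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


# Without an index, every lookup has to walk the whole table.
plan_without_index = query_plan(conn, 42)

# With an index on the looked-up column, SQLite can search instead of scan.
conn.execute("CREATE INDEX bar_ext_id ON bar (ext_id)")
plan_with_index = query_plan(conn, 42)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan text differs between SQLite versions, so treat the printed strings as indicative only. Repeating the unindexed lookup once per Foo, as in the algorithm described above, means roughly 400,000 scans over roughly 400,000 rows.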
My first instinct is that, instead of importing your data by first
creating all instances and then establishing all relationships between
them, you should establish relationships as you create instances. If you
need to improve the performance of that, it's sometimes possible to do
so by keeping a cache of the instances you've already created so you
can relate things to them without constantly issuing fetch requests
that scan the database.
As an example -- and these numbers depend on your data -- imagine
keeping references to the last 100 instances your import process
created in a dictionary, keyed by the ID you mentioned above. As you
create a Bar instance, you can relate it instantly if its
corresponding Foo is in the cache; if not, you can pull its Foo into
the cache, or (if it doesn't exist) create a placeholder Foo that will
be populated with real data at the appropriate point in the import.
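The caching idea above can be sketched in a language-neutral way. The following is only an illustration in Python, not Core Data code: the classes Foo, Bar and FooCache, the functions import_foo and import_bar, and the dictionary standing in for the persistent store are all hypothetical names, and a real import would create managed objects and let Core Data maintain the inverse relationship.

```python
from collections import OrderedDict


class Foo:
    def __init__(self, ident, name=None):
        self.ident = ident
        self.name = name   # None marks a placeholder awaiting real data
        self.bars = []     # to-many side of the relationship


class Bar:
    def __init__(self, ident):
        self.ident = ident
        self.foo = None    # to-one inverse, set during import


class FooCache:
    """Keeps the most recently used Foo instances, keyed by ID."""

    def __init__(self, store, limit=100):
        self.store = store            # dict standing in for the slow store
        self.limit = limit
        self.recent = OrderedDict()   # ID -> Foo, least recently used first
        self.fetches = 0              # counts the expensive lookups

    def foo_for(self, ident):
        foo = self.recent.get(ident)
        if foo is None:
            self.fetches += 1         # cache miss: one lookup in the store
            foo = self.store.get(ident)
            if foo is None:
                foo = Foo(ident)      # placeholder, populated later
                self.store[ident] = foo
        self.recent[ident] = foo
        self.recent.move_to_end(ident)
        while len(self.recent) > self.limit:
            self.recent.popitem(last=False)   # drop the least recently used
        return foo


def import_foo(cache, ident, name):
    """Create a Foo, or fill in a placeholder created earlier."""
    foo = cache.foo_for(ident)
    foo.name = name
    return foo


def import_bar(cache, ident, foo_ident):
    """Create a Bar and relate it to its Foo at creation time."""
    bar = Bar(ident)
    foo = cache.foo_for(foo_ident)
    bar.foo = foo
    foo.bars.append(bar)   # done by hand here; Core Data manages inverses
    return bar


store = {}
cache = FooCache(store, limit=100)
import_foo(cache, 1, "first")
bar_a = import_bar(cache, 10, 1)   # Foo 1 is already in the cache
bar_b = import_bar(cache, 11, 2)   # Foo 2 not seen yet: placeholder created
import_foo(cache, 2, "second")     # real data arrives, placeholder is filled
```

In this run only two lookups reach the store, one for each distinct ID on first sight; every later reference is served from the dictionary. The limit of 100 is just the figure from the example above and would need tuning against real data.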
Can you explain what your data model is in slightly more concrete
terms than you have so far? I think that'll ultimately help clarify a
lot.
-- Chris
_______________________________________________
Cocoa-dev mailing list (email@hidden)