Re: CoreData & importing a large amount of data
- Subject: Re: CoreData & importing a large amount of data
- From: Chris Hanson <email@hidden>
- Date: Wed, 19 Oct 2005 13:31:28 -0700
On Oct 19, 2005, at 11:21 AM, Dominik Paulmichl wrote:
For testing and development purposes I use an XML data store, so I
know that Core Data performs its searches in memory.
Even when I save after each new entry, the Mac runs out of memory
very fast. :-(
How can I avoid this??
There are a number of things you're doing in your code that you can
do differently to improve both memory and time performance.
First of all, you're creating an awful lot of autoreleased objects.
You can use an inner autorelease pool that you release and re-create
every few times through your main loop to put a bound on how many
additional objects you add to the outer pool. Also, you can create
objects using +alloc/-init and then -release them when you're done to
avoid putting them in an autorelease pool in the first place.
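For example, here's a sketch of the inner-pool approach (the import loop and the batch size of 500 are made up for illustration):

```objc
// Drain and re-create an inner pool periodically so autoreleased
// objects don't pile up in the outer pool for the whole import.
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
unsigned i, count = [records count];
for (i = 0; i < count; i++) {
    NSDictionary *record = [records objectAtIndex:i];
    // ... create managed objects from record ...

    if ((i % 500) == 499) {
        [pool release];   // drains everything autoreleased so far
        pool = [[NSAutoreleasePool alloc] init];
    }
}
[pool release];
```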
Next, you're not just creating a predicate every time through your
loop, you're parsing one. You can create a predicate once that
includes a variable, and then just get a predicate with a value
substituted for that variable each time you actually need to use it
in a fetch.
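As a sketch (the attribute name and variable here are made up):

```objc
// Parse the predicate format string once, outside the loop.
NSPredicate *template = [NSPredicate predicateWithFormat:@"name == $NAME"];

// Inside the loop: no parsing, just variable substitution.
NSDictionary *vars = [NSDictionary dictionaryWithObject:name
                                                 forKey:@"NAME"];
NSPredicate *predicate = [template predicateWithSubstitutionVariables:vars];
```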
Also, when doing model object-level work like this, you shouldn't be
using your array controller to perform the fetches. Controllers are
for managing the interaction between your model objects and your
human interface. At the model object level, you should just be
asking your managed object context to perform the fetches directly.
(Then you can also choose to avoid sorting overhead if it doesn't
matter, etc.)
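A fetch straight from the context looks like this (the entity name is hypothetical):

```objc
// Ask the managed object context directly; no array controller,
// and no sort descriptors when ordering doesn't matter.
NSFetchRequest *request = [[NSFetchRequest alloc] init];
[request setEntity:[NSEntityDescription entityForName:@"Item"
                               inManagedObjectContext:context]];
[request setPredicate:predicate];

NSError *error = nil;
NSArray *results = [context executeFetchRequest:request error:&error];
[request release];
```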
Finally, probably the most significant thing you're doing is
following a "find-or-create" pattern, where you set up some data to
create, check to see if it's already been created, and then create it
if it hasn't been created already. This is generally *not* a pattern
you want to follow when importing data, because it turns an O(n)
problem into an O(n^2) problem.
It's much better -- when possible -- to just create everything "flat"
in one pass, and then fix up the relationships in a second pass. For
example, if you're importing data and you know you won't have any
duplicates (say because your initial data set is empty) you can just
create a bunch of managed objects to represent your data and not do
any searches at all. Or if you're importing "flat" data with no
relationships, you can just create managed objects for the entire set
you're importing, then weed out (delete) any duplicates before saving
using a single large IN predicate.
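A sketch of that weed-out fetch, assuming you've collected the identifying values of everything you just imported into an array (the entity and attribute names are made up):

```objc
// One round trip finds every pre-existing object whose name matches
// something in the freshly imported set.
NSFetchRequest *request = [[NSFetchRequest alloc] init];
[request setEntity:[NSEntityDescription entityForName:@"Item"
                               inManagedObjectContext:context]];
[request setPredicate:[NSPredicate predicateWithFormat:
                          @"name IN %@", importedNames]];

NSError *error = nil;
NSArray *duplicates = [context executeFetchRequest:request error:&error];
[request release];

// Delete the duplicates before saving.
unsigned i;
for (i = 0; i < [duplicates count]; i++)
    [context deleteObject:[duplicates objectAtIndex:i]];
```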
If you do need to follow a find-or-create pattern -- say because
you're importing heterogeneous data where relationship information is
mixed in with attribute information -- you'll be much better off if
you introduce a cache. You can just use an NSMutableDictionary or
CFMutableDictionaryRef for this purpose, using the criteria you're
finding on as the key. Check to see if the object you're looking for
is in the dictionary; if it isn't, do a fetch, and if the fetch comes
up empty, create the object. Whether it was found or created, save it
in the cache for the next time it's looked up. And of course you can
get rid of your cache when you're done with the import.
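Putting the cache and the fetch together, a find-or-create helper might look like this (the function, entity, and attribute names are all hypothetical):

```objc
// Hypothetical helper: look up an Item by name, creating it if needed.
// cache maps the find criterion (the name) to its managed object.
NSManagedObject *FindOrCreateItem(NSString *name,
                                  NSManagedObjectContext *context,
                                  NSMutableDictionary *cache)
{
    NSManagedObject *item = [cache objectForKey:name];
    if (item != nil)
        return item;                        // cache hit, no fetch

    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Item"
                                   inManagedObjectContext:context]];
    [request setPredicate:
        [NSPredicate predicateWithFormat:@"name == %@", name]];

    NSError *error = nil;
    NSArray *results = [context executeFetchRequest:request error:&error];
    [request release];

    if ([results count] > 0) {
        item = [results objectAtIndex:0];
    } else {
        item = [NSEntityDescription
                   insertNewObjectForEntityForName:@"Item"
                            inManagedObjectContext:context];
        [item setValue:name forKey:@"name"];
    }

    // Whether found or created, remember it for the next lookup.
    [cache setObject:item forKey:name];
    return item;
}
```

When the import finishes, just release (or empty) the dictionary; the cache only needs to live as long as the import itself.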
-- Chris
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Cocoa-dev mailing list (email@hidden)