basic Core Data scaling question


  • Subject: basic Core Data scaling question
  • From: Michael B Johnson <email@hidden>
  • Date: Mon, 1 Sep 2008 18:52:17 -0700

quick question:

Let's say I have 100,000 ManagedObjects of type A. Each has a to-one relationship to a ManagedObject of type B (the same B for all of them), which has a reciprocal to-many relationship back to all the As.

Assuming I have all 100,000 As around (I've just created them in the ManagedStore) - what's the most efficient way to set up the relationships between all those As and B?

'cause doing the things that seem obvious to me (either looping over the As setting their B or setting B to point to the collection of As) is taking way, way too freakin' long...

--------------------------
longer background:

So I've dabbled with Core Data the past few years, but it never really mapped well on to the kinds of apps I've been writing. Recently, though, I finally have an application that I started writing that I think maps really well on to it, but I'm having some initial scaling problems that I'm trying to understand.


Loosely, here's the scenario:

You have a Project.
Each Project has some set of Albums and Artists.
Each Artist makes some set of Images, many of which get collected in one or more Albums.
Each Image can have some set of ImageVersions, but most only have 1 or 2.
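
To make the shape of that object graph concrete, here is a rough sketch of it as plain Python dataclasses. This is only an illustration of the relationships described above, not Core Data code: the attribute names (name, label, versions, and so on) and the add_image helper are assumptions, not taken from the actual model.

```python
from dataclasses import dataclass, field
from typing import List


# eq=False keeps identity-based equality, which avoids recursive
# comparisons through the back-references between objects.
@dataclass(eq=False)
class Project:
    name: str
    artists: List["Artist"] = field(default_factory=list)
    albums: List["Album"] = field(default_factory=list)
    images: List["Image"] = field(default_factory=list)


@dataclass(eq=False)
class Artist:
    name: str
    project: Project
    images: List["Image"] = field(default_factory=list)


@dataclass(eq=False)
class Album:
    name: str
    project: Project
    images: List["Image"] = field(default_factory=list)


@dataclass(eq=False)
class ImageVersion:
    label: str
    image: "Image"


@dataclass(eq=False)
class Image:
    name: str
    project: Project
    artist: Artist
    albums: List[Album] = field(default_factory=list)
    versions: List[ImageVersion] = field(default_factory=list)


def add_image(name, project, artist, albums):
    """Hypothetical helper: create an Image and wire both sides of each relationship."""
    image = Image(name=name, project=project, artist=artist, albums=list(albums))
    project.images.append(image)   # inverse of Image.project
    artist.images.append(image)    # inverse of Image.artist
    for album in albums:
        album.images.append(image)  # inverse of Image.albums
    image.versions.append(ImageVersion(label="original", image=image))
    return image
```

Every relationship here has two sides, and the Image-to-Project one is the case from the quick question at the top: many As pointing at a single B.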


For a given Project, you'll probably have 50 or so Artists, 100 or so Albums, 80,000 or so Images, and a total of 150,000 or so ImageVersions.

Eventually, I expect to have dozens of Projects, maybe more, so that the eventual database of ImageVersions would be in the low millions.

Other than the actual image data (and its corresponding proxy and thumbnails), all the data you need to keep around is pretty simple, and maps nicely on to Core Data (strings, dates, integers of various sizes, etc.)

But to start with something reasonable (but non-trivial), let's take a Project that has 91K ImageVersions of 80K Images. There are 162 Albums and 29 Artists. I have a simple .csv file with all the info in it, and I iterate over it to build up an array of dictionaries of all the info.
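
For illustration, the "csv to array of dictionaries" step might look roughly like the sketch below. The column names (project, artist, album, image, version) and the load_rows helper are made up for the example; the real file's layout isn't shown here, and a small in-memory string stands in for it.

```python
import csv
import io


def load_rows(csv_file):
    """Read every CSV row into a dict keyed by the header line."""
    return list(csv.DictReader(csv_file))


# Tiny in-memory stand-in for the real file; column names are hypothetical.
sample = io.StringIO(
    "project,artist,album,image,version\n"
    "Demo,Ann,Sketches,img_0001,1\n"
    "Demo,Ann,Sketches,img_0001,2\n"
    "Demo,Bob,Finals,img_0002,1\n"
)
rows = load_rows(sample)

# Collect the distinct names so each object is created only once.
artists = sorted({row["artist"] for row in rows})
images = sorted({row["image"] for row in rows})
```

Here each row is one ImageVersion, so the set comprehensions collapse repeated Artist and Image names before any objects get created.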

Building that array of dictionaries from the 91K-line csv takes a few tens of seconds. I then iterate over it, making the appropriate ManagedObjects. I first pull out the Project(s) from the file (there's only 1 in this example, but there could be multiple), and then I make the Artist and Album objects. For each of the Artist and Album objects I find the Project instance and wire them up.

All of that runs at a reasonable speed.

The problem comes when I start adding the Images to the managed store. I time how long it takes to add 100 at a time. The first 100 go in 0.022 seconds, but by the time I've inserted 4,200 of them it's taking 1 sec/100; at 20,000 it's taking 6 sec/100; and by the time I'm up to 90,000 it's taking over 20 sec/100. It literally takes hours to chew through.
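
As a sanity check on "hours", the four timings above can be turned into a rough total. The sketch below assumes the per-100 cost rises linearly between the reported points (that assumption is mine, as are the names samples and estimate_total_seconds); it is a back-of-the-envelope estimate, not a measurement.

```python
# (images inserted so far, seconds per 100 inserts), as reported in the post.
samples = [(100, 0.022), (4200, 1.0), (20000, 6.0), (90000, 20.0)]


def estimate_total_seconds(points):
    """Trapezoidal estimate, assuming the cost rises linearly between samples."""
    total = 0.0
    for (n0, rate0), (n1, rate1) in zip(points, points[1:]):
        batches = (n1 - n0) / 100  # number of 100-image batches in this span
        total += batches * (rate0 + rate1) / 2
    return total


total_seconds = estimate_total_seconds(samples)
total_hours = total_seconds / 3600
print(f"about {total_hours:.1f} hours")  # prints "about 2.7 hours"
```

Nearly all of that comes from the last span (20,000 to 90,000 images), which is what you'd expect if the cost per insert keeps growing with the number of Images already wired to the Project.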

After much spelunking, I've found that setting the project relationship on the Image is what takes up all the time. If I bring in all the Projects, Artists, Albums and Images without wiring up the Images to the Project (although I do do it for the Albums), the whole thing runs in about 30 seconds.

This must be a common idiom - what idiotic thing am I doing wrong?

Thanks for any help.


--> Michael B. Johnson, PhD
--> http://homepage.mac.com/drwave (personal)
--> http://xenia.media.mit.edu/~wave (alum)
--> MPG Lead
--> Pixar Animation Studios



