Re: CoreData database sharing and migration
- Subject: Re: CoreData database sharing and migration
- From: Ben Trumbull <email@hidden>
- Date: Tue, 23 Mar 2010 13:26:01 -0700
On Mar 22, 2010, at 4:06 AM, Steve Steinitz wrote:
> On 18/3/10, Ben Trumbull wrote:
>
>> there wasn't a good solution for multiple machines simultaneously
>> accessing an SQLite db file (or most other incrementally updated
>> binary file formats). By "good" I mean a solution that worked
>> reliably and didn't make other more important things work less well.
>
> I'm curious about the reliability issues you saw. Also, by "less well" do you mean slower?
Because the different NFS clients have file caches with differing amounts of staleness, and the SQLite db is updated incrementally, it's possible for an NFS client to think it has the latest blocks, and then derive material from one block and write it into another (it is, after all, a b-tree). The written blocks have implicit dependencies on all the other active blocks in the database, so having stale data is bad.
>> For nearly all Mac OS X customers (sadly not you) achieving a near
>> 100x performance boost when accessing database files on an AFP or SMB
>> mount (like their home directory in educational deployments) is pretty
>> huge.
>
> I agree. But wouldn't those same educational institutions be prime candidates for multiple machine access?
No. The restriction is on multiple physical machines using the same database files simultaneously (i.e., having them open at the same time). While AFP will allow multiple logins to the same account when configured with an advanced setting, in general AFP servers actually prevent users from logging into the same account from more than one machine at once.
>> To address both sets of problems on all network FS, we enforce a
>> single exclusive lock on the server for the duration the application
>> has the database open. Closing the database connection (or logging
>> out) allows another machine to take its turn.
>
> Could my application close the database connection and re-open it to work around the problem? How would I do that? I suppose once I got it going I'd have to retry saves, fetches etc.
In theory, one could, but in practice that won't be very manageable without architectural changes. Something along the lines of: open the remote connection, pull down the interesting data and save it to a local file, then close the remote connection. Given that your current deployment setup provides adequate performance, and all you need is a bug fix, I'm not sure this would be very helpful.
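If you did want to try it, the shape would be roughly this (a sketch only; the model, the file paths, and the omitted error handling are all hypothetical):

    // e.g. inside your persistence controller
    NSManagedObjectModel *model = /* your merged model */ nil;
    NSPersistentStoreCoordinator *psc = [[[NSPersistentStoreCoordinator alloc]
        initWithManagedObjectModel:model] autorelease];

    NSURL *remoteURL = [NSURL fileURLWithPath:@"/Volumes/Server/shared.sqlite"];
    NSURL *localURL  = [NSURL fileURLWithPath:@"/tmp/local-copy.sqlite"];

    NSError *error = nil;
    NSPersistentStore *remote = [psc addPersistentStoreWithType:NSSQLiteStoreType
        configuration:nil URL:remoteURL options:nil error:&error];

    // Pull the data down into a local store. This also removes the remote
    // store from the coordinator, releasing the server-side lock so another
    // machine can take its turn.
    [psc migratePersistentStore:remote toURL:localURL options:nil
                       withType:NSSQLiteStoreType error:&error];

    // From here on, all fetching and saving goes against the local copy.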
>> You'll get the 10.5 performance characteristics, however.
>
> Again, the 10.5 performance over gigabit Ethernet is almost unbelievably fast. I may know why. Despite your helpful explanations I'm still not exactly clear on the relationship between caching and locking, but I wonder if the speed we are seeing is helped by the fact that the entire database fits into the Synology NAS's 128 MB cache?
That probably doesn't hurt.
> In another message in this thread, you made a tantalizing statement:
>
>> Each machine can have its own database and they can share their results
>> with NSDistributedNotification or some other IPC/networking protocol.
>> You can hook into the NSManagedObjectContextDidSaveNotification to track
>> when one of the peers has committed changes locally.
>
> Let me guess how that would work: before saving, the peer would create a notification containing the inserted, updated and deleted objects as recorded in the MOC. The receiving machine would attempt to make those changes on its own database. Some questions:
>
> Would that really be feasible?
yes, but as you observe, it's more tractable for simple data records than for complex graphs with frequent merge conflicts.
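The hookup itself is the standard observer dance; a rough sketch (the method names here are hypothetical):

    // in whatever controller owns the context
    - (void)startWatchingContext:(NSManagedObjectContext *)moc
    {
        [[NSNotificationCenter defaultCenter]
            addObserver:self
               selector:@selector(contextDidSave:)
                   name:NSManagedObjectContextDidSaveNotification
                 object:moc];
    }

    - (void)contextDidSave:(NSNotification *)note
    {
        // userInfo carries NSSets of the affected NSManagedObjects.
        NSSet *inserted = [[note userInfo] objectForKey:NSInsertedObjectsKey];
        NSSet *updated  = [[note userInfo] objectForKey:NSUpdatedObjectsKey];
        NSSet *deleted  = [[note userInfo] objectForKey:NSDeletedObjectsKey];
        // ...flatten these into plist dictionaries (sketched further down)
        // and ship them to the peers over DO or your IPC of choice...
    }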
> Would it be a problem that each machine would have different primary keys?
yes
> How would the receiving machine identify the local objects that changed remotely?
typically this is done with a UUID & a fetch. Since each database on each client is different, the stores themselves have different UUIDs, and any encoded NSManagedObjectID URI will note which store the objectID came from, so you could also map the objectID URIs to the local values directly.
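For example (assuming you've added a 'uuid' string attribute to the entity; the entity and variable names are hypothetical):

    // Option 1: fetch the local twin by its UUID.
    NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
    [request setEntity:[NSEntityDescription entityForName:@"Order"
                                   inManagedObjectContext:moc]];
    [request setPredicate:
        [NSPredicate predicateWithFormat:@"uuid == %@", remoteUUID]];
    NSError *error = nil;
    NSArray *matches = [moc executeFetchRequest:request error:&error];

    // Option 2: if you maintain a remote-to-local URI mapping, turn the
    // mapped URI back into an NSManagedObjectID for the local store.
    NSURL *localURI = [NSURL URLWithString:mappedURIString];
    NSManagedObjectID *objectID = [[moc persistentStoreCoordinator]
        managedObjectIDForURIRepresentation:localURI];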
> Could relationships (indeed the object graph itself) feasibly be maintained?
yes. Updates to to-many relationships require the use of an (additions, subtractions) pair instead of simply setting the new contents.
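For instance (objectsForUUIDs:inContext: is a hypothetical helper that resolves a list of UUIDs to local objects):

    // Apply a to-many delta as (additions, subtractions) rather than
    // clobbering the whole relationship.
    NSMutableSet *items = [localObject mutableSetValueForKey:@"items"];
    [items unionSet:[self objectsForUUIDs:addedUUIDs inContext:moc]];
    [items minusSet:[self objectsForUUIDs:removedUUIDs inContext:moc]];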
> How would relationships between the remote objects be identified? Hand-parsing?
Either by UUID or objectID URI.
> Has anyone done it that you know of?
yes. I'm aware of 4 solutions; however, I would only recommend one as appropriate for the general case (in skill, time, and pain threshold), and it avoids complex relationship graphs: basic data record replication over DO.
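The DO plumbing for that is small; a rough sketch (the connection name and the vended 'listener' object are hypothetical):

    // Server side: vend an object the peers can message.
    NSSocketPort *port = [[[NSSocketPort alloc] init] autorelease];
    NSConnection *server = [NSConnection connectionWithReceivePort:port
                                                          sendPort:nil];
    [server setRootObject:listener];
    [[NSSocketPortNameServer sharedInstance] registerPort:port
                                                     name:@"ChangeFeed"];

    // Client side: look up the port and grab a proxy.
    NSPort *remotePort = [[NSSocketPortNameServer sharedInstance]
        portForName:@"ChangeFeed" host:@"server.local"];
    NSConnection *client = [NSConnection connectionWithReceivePort:nil
                                                          sendPort:remotePort];
    id proxy = [client rootProxy];  // messages to 'proxy' cross the wire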
> Is there sample code?
no. The only real trick in converting the didSave notification into something appropriate for DO to consume is to copy it, replacing each NSManagedObject with a dictionary that has a UUID instead of an object ID, the attribute values, and the relationship contents expressed as lists of UUIDs.
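Sketched out (assuming the 'uuid' attribute from above; UUIDsForRelatedObjects: is a hypothetical helper that collects the UUIDs of a relationship's contents):

    - (NSDictionary *)payloadForObject:(NSManagedObject *)object
    {
        NSEntityDescription *entity = [object entity];
        NSMutableDictionary *payload = [NSMutableDictionary dictionary];

        // The UUID stands in for the store-specific NSManagedObjectID.
        [payload setObject:[object valueForKey:@"uuid"] forKey:@"uuid"];

        // Attribute values, keyed by attribute name.
        NSArray *attributeNames = [[entity attributesByName] allKeys];
        [payload setObject:[object dictionaryWithValuesForKeys:attributeNames]
                    forKey:@"attributes"];

        // Each relationship becomes a list of the related objects' UUIDs.
        NSMutableDictionary *rels = [NSMutableDictionary dictionary];
        for (NSString *name in [[entity relationshipsByName] allKeys]) {
            [rels setObject:[self UUIDsForRelatedObjects:
                                [object valueForKey:name]]
                     forKey:name];
        }
        [payload setObject:rels forKey:@"relationships"];
        return payload;
    }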
- Ben