Core Data & threads [re: Ping: Look for hints for "nested transaction" problem with Core Data]

Subject: Core Data & threads [re: Ping: Look for hints for "nested transaction" problem with Core Data]
From: Ben Trumbull <email@hidden>
Date: Thu, 15 Mar 2007 23:58:35 -0700

Pierre, et al,

The "nested transaction" error is one you will get only as the result of a mistake with threads.

We strongly discourage people from sharing managed objects or managed object contexts between threads. Sharing these objects safely is more trouble than it's worth. Object IDs (NSManagedObjectID), on the other hand, are immutable and can be shared safely.

Each NSManagedObjectContext has its own (duplicate) NSManagedObject for any given row in the database. In this way each MOC can have completely separate sets of changes, perform undo/redo separately, save distinct subsets of changes, etc.

An NSManagedObjectContext facilitates a level of isolation and transactional scope that is independent of every other NSManagedObjectContext.

Although each NSManagedObject instance is a distinct object between NSManagedObjectContexts, the actual data from the database is shared copy-on-write. So, for most apps, there is not a lot of memory to be saved by sharing managed objects anyway.

If you keep managed objects and their contexts private to each thread, and use -objectRegisteredForID:, -objectWithID:, or -executeFetchRequest:error: to "copy" managed objects between threads, you'll find yourself living in a much more maintainable and comprehensible threading model.

In addition, the software will be much more scalable: because these objects are private to a thread, you won't need to lock around them all the time.

Each thread can operate independently on its own copy of the NSManagedObject. Each lives in its own little world, running at full speed and without concern for the others. The typical communication points are -executeFetchRequest:error: and -save: (with merge policies, versioning, and conflict resolution).

It then becomes Core Data's problem to make its internal use of other objects thread safe and to perform the copy-on-write sharing efficiently.
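
The model above can be sketched in a language-neutral way. The Python below is not Core Data code: `Store`, `Context`, and `object_with_id` are hypothetical stand-ins for the persistent store, an NSManagedObjectContext, and -objectWithID:. It assumes the simplest possible save (last writer wins) and leaves out merge policies and conflict resolution entirely. The point is only that each thread keeps its own context and its own copies, and that nothing but the immutable ID crosses the thread boundary.

```python
import copy
import queue
import threading


class Store:
    """Hypothetical stand-in for the persistent store: rows keyed by an immutable ID."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._lock = threading.Lock()  # the store locks internally, callers do not

    def read(self, object_id):
        with self._lock:
            return copy.deepcopy(self._rows[object_id])

    def write(self, changes):
        with self._lock:
            for object_id, row in changes.items():
                self._rows[object_id] = copy.deepcopy(row)


class Context:
    """Hypothetical stand-in for a context: private copies, used by one thread only."""

    def __init__(self, store):
        self._store = store
        self._objects = {}  # this context's own copies, never handed to another thread

    def object_with_id(self, object_id):
        if object_id not in self._objects:
            self._objects[object_id] = self._store.read(object_id)
        return self._objects[object_id]

    def save(self):
        self._store.write(self._objects)  # assumption: last writer wins, no merging


def worker(store, inbox):
    context = Context(store)  # created and used on this thread only
    while True:
        object_id = inbox.get()
        if object_id is None:  # sentinel: no more work
            break
        row = context.object_with_id(object_id)  # "copy" the object by ID
        row["visits"] += 1
    context.save()


store = Store({"person/1": {"name": "Ann", "visits": 0}})
inbox = queue.Queue()

main_context = Context(store)
main_copy = main_context.object_with_id("person/1")

thread = threading.Thread(target=worker, args=(store, inbox))
thread.start()
inbox.put("person/1")  # only the immutable ID crosses the thread boundary
inbox.put(None)
thread.join()
```

After the worker saves, the store holds the new value, while `main_copy` still shows what the main thread's context read earlier. The main thread would have to refetch to see the change.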

The most common mistake with the Core Data-endorsed threading model is an application-level race condition between one thread inserting a new object, telling another thread about it (safely via objectID), and then saving the new object. There's a window in which the second thread is very confused and may throw exceptions during faulting over a not-yet-saved object.

The second most common mistake is a similar race condition where one thread deletes objects out from underneath another thread that has an objectID but has not yet faulted in the data.
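
One way to picture the fix for both races, again as a hedged Python sketch and not Core Data code: the producer saves before it publishes the ID, and the consumer treats "no row for this ID" as an expected outcome and not as an exception. The names `save`, `delete`, `fetch`, and `saved_rows` are hypothetical stand-ins invented for this illustration.

```python
import queue
import threading

saved_rows = {}  # hypothetical stand-in for what has actually been saved
saved_rows_lock = threading.Lock()


def save(object_id, row):
    with saved_rows_lock:
        saved_rows[object_id] = dict(row)


def delete(object_id):
    with saved_rows_lock:
        saved_rows.pop(object_id, None)


def fetch(object_id):
    """Return a private copy of the saved row, or None if it is not (or no longer) saved."""
    with saved_rows_lock:
        row = saved_rows.get(object_id)
        return dict(row) if row is not None else None


def producer(outbox):
    object_id, row = "note/1", {"text": "hello"}
    save(object_id, row)   # 1. save first ...
    outbox.put(object_id)  # 2. ... and only then tell the other thread
    outbox.put("note/2")   # an ID whose row was deleted before the consumer looked
    outbox.put(None)       # sentinel: nothing more to send


def consumer(inbox, results):
    while True:
        object_id = inbox.get()
        if object_id is None:
            break
        row = fetch(object_id)
        if row is None:
            results.append((object_id, "missing"))  # expected case, handled calmly
        else:
            results.append((object_id, row["text"]))


save("note/2", {"text": "short-lived"})
delete("note/2")  # deleted out from underneath anyone still holding the ID

channel = queue.Queue()
results = []
threads = [
    threading.Thread(target=producer, args=(channel,)),
    threading.Thread(target=consumer, args=(channel, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Swapping steps 1 and 2 in `producer` would reopen the window described above: the consumer could look up "note/1" before it exists in the saved data.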

- My application fetches some objects from the store and stores them in an NSDictionary for fast access. It usually does not load the whole store, only a part of it.

Remember that in order to access (read or write) any part of an NSManagedObject, the thread must own the lock of the MOC. This means you need to be extremely careful with KVC and collection objects working with NSManagedObjects shared between threads. KVC and dictionaries don't know anything about the thread safety requirements for Core Data objects.

A common mistake is to not lock around valueForKey: sent to an NSManagedObject. Core Data does not have any "read only" operations, as any operation may have side effects on caching and faulting.

All this trouble goes away if you don't share managed objects between threads.

In general, it is not appropriate for threads to share state at such a fine granularity as an individual property (ivar).

- I update those NSManagedObjects with common KVO. This is in a multi-threaded context.

KeyValueObserving is generally not appropriate for notifications between threads. The reason for this is that the granularity of the notifications is so small that the communication costs and thread safety issues overwhelm any benefit. Notifications between objects "owned" by separate threads are very difficult to get right; doubly so in a language without garbage collection.

Instead of multi-threaded KVO notifications, -performSelectorOnMainThread... or message passing integrated with the NSRunLoop is much easier to get right. If efficiency is a problem, simply make each message carry more information until you have the correct granularity. If you have a lot of experience, I'm a fan of condition variables (pthread, etc.). But performing selectors and run loops are good places to start.
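
The "make each message carry more information" advice can be sketched as follows. This is Python, not Cocoa: `main_queue` is a hypothetical stand-in for whatever feeds the main thread's run loop, and `background_import` and `run_main_loop` are names invented for the illustration. The background thread posts one message per batch of finished work, and only the main thread touches the state being displayed.

```python
import queue
import threading

main_queue = queue.Queue()  # hypothetical stand-in for the main thread's run loop queue


def background_import(items, batch_size):
    """Do the work privately, then post one message per batch, not one per item."""
    batch = []
    for item in items:
        batch.append(item.upper())  # the "work"
        if len(batch) == batch_size:
            main_queue.put(("imported", batch))
            batch = []  # start a new list; the old one now belongs to the receiver
    if batch:
        main_queue.put(("imported", batch))
    main_queue.put(("done", None))


def run_main_loop():
    """Runs on the main thread, the only place where `displayed` is touched."""
    displayed = []
    messages_handled = 0
    while True:
        kind, payload = main_queue.get()
        messages_handled += 1
        if kind == "done":
            return displayed, messages_handled
        displayed.extend(payload)


worker = threading.Thread(target=background_import,
                          args=(["a", "b", "c", "d", "e"], 2))
worker.start()
displayed, messages_handled = run_main_loop()
worker.join()
```

With a batch size of 2, five items arrive as three "imported" messages plus one "done" message. Raising `batch_size` is the knob for tuning granularity when communication costs dominate.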

I've done a lot of multi-threaded programming over the last decade on a wide range of hardware. Locks can be very fast when only 1 thread uses them, but they degrade exponentially as the number of active threads contending over that lock increases. For example, on a 4 core machine, 4 threads contending over a specific lock run more than 10x slower than 1 thread. I've seen more complicated scenarios in which the 4 core, 4 thread contention case runs over 500x slower. Exponential degradation is not pretty.

Scalable software minimizes contention and maximizes coordination between threads.

This is a truth that flows from the hardware itself. Data isolation is often the easiest and most efficient way to achieve both goals. Data isolation is also a pattern that is relatively more tractable for humans to debug.

When I say data isolation, I mean divvying up responsibilities between threads so that either they do not need to change the same data, or they each have their own copy of the (hopefully small) amount of mutable data that must conceptually overlap. The data isolation philosophy is "The best solution is to not have the problem."
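
A minimal sketch of data isolation, assuming a toy word-counting job chosen only for illustration (`count_words` and `parallel_word_count` are hypothetical names). Each thread gets its own slice of the input and its own private counter, so no lock is needed while the work runs. The results meet at exactly one point, after the joins. In CPython this particular workload will not get faster with more threads; the shape of the code is what matters here.

```python
import threading
from collections import Counter


def count_words(lines, results, index):
    local = Counter()  # private to this thread: no lock needed
    for line in lines:
        local.update(line.split())
    results[index] = local  # each thread writes only its own slot


def parallel_word_count(lines, thread_count):
    # Divvy up the input so that no two threads work on the same data.
    chunks = [lines[i::thread_count] for i in range(thread_count)]
    results = [None] * thread_count
    threads = [
        threading.Thread(target=count_words, args=(chunk, results, i))
        for i, chunk in enumerate(chunks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # the single point where the threads' results meet
    total = Counter()
    for partial in results:
        total += partial
    return total


totals = parallel_word_count(["a b", "b c", "c c"], 2)
```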

Locks are shared mutable data, and therefore as much a part of the problem as they are the solution.

You can supplement this with other patterns. But they're all just variations on minimizing contention and maximizing coordination in creative ways. <http://www.cise.ufl.edu/research/ParallelPatterns/publications.htm> is a reference to a book (hardcover) that has some interesting ideas.

Three rules of thumb that will help the efficiency of both your engineering performance and your software's performance:

(1) Communicate meaningfully, (2) Communicate only at checkpoints, (3) Assertions

(1) Communication between threads should occur at a large granularity. It's important that the amount of work accomplished as a result of the communication is larger than the cost of the communication. Threads should not share partial object state (i.e. an ivar instead of the entire object).

(2) Arbitrary inter-thread notification systems are usually undesirable. Properly done, they are a lot of work and overhead. The receiver of the notification needs to queue & defer the notification until it's safe to process in the context of the thread owning the receiver. This queue & defer pattern typically integrates with something that looks like a run loop or a checkpoint system that polls the queue.

(3) Your threading model should be something that can be audited by code. You should be able to add assertions about which thread owns any given resource, and whether the threading model has been violated. If you cannot even articulate what the assertions should be, how are you going to know if it's working?
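
Rules (2) and (3) can be combined in one small sketch. `ThreadOwned` is a hypothetical class invented for this illustration, written in Python and not taken from any framework. It records which thread created it, asserts ownership on every direct access, and lets other threads communicate only by queueing a notification that the owner processes at a checkpoint of its choosing.

```python
import queue
import threading


class ThreadOwned:
    """Hypothetical resource that only its creating thread may touch directly."""

    def __init__(self):
        self._owner = threading.get_ident()
        self._pending = queue.Queue()  # notifications from other threads wait here
        self.values = []

    def assert_owner(self):
        # Note: Python strips assert statements when run with -O.
        assert threading.get_ident() == self._owner, "touched from a non-owning thread"

    def append(self, value):
        self.assert_owner()
        self.values.append(value)

    def post(self, value):
        """Safe from any thread: queue the notification and defer the work."""
        self._pending.put(value)

    def checkpoint(self):
        """Called by the owner at a known point to process deferred notifications."""
        self.assert_owner()
        while True:
            try:
                value = self._pending.get_nowait()
            except queue.Empty:
                return
            self.append(value)


resource = ThreadOwned()
violations = []


def intruder():
    resource.post("via queue")  # allowed: queue and defer
    try:
        resource.append("direct")  # violates the threading model
    except AssertionError as error:
        violations.append(str(error))


thread = threading.Thread(target=intruder)
thread.start()
thread.join()
resource.checkpoint()  # the owner drains the queue when it is safe to do so
```

The direct call from the other thread trips the assertion, while the queued value is applied only when the owning thread reaches its checkpoint.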

You can see that multi-threaded KVO violates all these rules simultaneously.
We cannot break these rules, but only break ourselves upon them.

You'll save yourself a lot of grief by designing an architecture which has specific, countable, and tightly defined communication points between threads.

"countable" is crucial. If you cannot finitely enumerate the different points where your threads are supposed to communicate with each other, you're fubar.

- Ben

p.s. For the list: please do not send me any email about KVO & multi-threading. It cannot work, you'll learn from either my experience or your own, and I'm not interested in your personal pain threshold.

p.p.s. More locks do not magically make code thread safe. Thread safe is both protected and deadlock free. Scalable is both thread safe and faster in the presence of additional resources.

Cocoa-dev mailing list (email@hidden)