Core Data & threads [re: Ping: Look for hints for "nested transaction" problem with Core Data]
- Subject: Core Data & threads [re: Ping: Look for hints for "nested transaction" problem with Core Data]
- From: Ben Trumbull <email@hidden>
- Date: Thu, 15 Mar 2007 23:58:35 -0700
Pierre, et al,
The "nested transaction" error is an error you'll get exclusively in
the event of a mistake with threads.
We strongly discourage people from sharing managed objects or managed
object contexts between threads. Sharing these objects safely is
more trouble than it's worth. objectIDs are immutable and can be
shared safely.
Each NSManagedObjectContext has its own (duplicate) NSManagedObject
for any given row in the database. In this way each MOC can have
completely separate sets of changes, perform undo/redo separately,
save distinct subsets of changes, etc.
An NSManagedObjectContext facilitates a level of isolation and
transactional scope that is independent of every other
NSManagedObjectContext.
Although each NSManagedObject instance is a distinct object between
NSManagedObjectContexts, the actual data from the database is shared
copy-on-write. So, for most apps, there is not a lot of memory to be
saved by sharing managed objects anyway.
If you keep managed objects and their contexts private to each
thread, and use -objectRegisteredForID:, -objectWithID:, or
-executeFetchRequest: to "copy" managed objects between threads,
you'll find yourself living in a much more maintainable and
comprehensible threading model.
In addition, the software will be much more scalable, because objects
that are private to a thread do not need to be locked all the time.
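As a rough, language-neutral illustration of that model (this is
Python, not Core Data API): the sketch below passes only immutable
IDs between threads, and each thread materializes its own private
copy. The names STORE, Context and object_with_id are hypothetical
stand-ins for the persistent store, a per-thread context, and
something like -objectWithID:.

```python
import queue
import threading

# Hypothetical stand-in for the persistent store: row ID -> saved values.
STORE = {1: {"name": "alpha"}, 2: {"name": "beta"}}
STORE_LOCK = threading.Lock()  # guards the store only, never the copies


class Context:
    """Hypothetical per-thread context that hands out private row copies."""

    def __init__(self):
        self.owner = threading.get_ident()
        self.objects = {}

    def object_with_id(self, row_id):
        # Only the owning thread may touch this context or its objects.
        assert threading.get_ident() == self.owner, "context used off-thread"
        if row_id not in self.objects:
            with STORE_LOCK:
                self.objects[row_id] = dict(STORE[row_id])  # private copy
        return self.objects[row_id]


def worker(id_queue, results):
    context = Context()  # created on, and private to, this thread
    while True:
        row_id = id_queue.get()
        if row_id is None:  # sentinel: no more work
            break
        obj = context.object_with_id(row_id)
        obj["name"] = obj["name"].upper()  # no lock: nobody else sees obj
        results.put((row_id, obj["name"]))


id_queue, results = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(id_queue, results))
thread.start()
for row_id in (1, 2):
    id_queue.put(row_id)  # only the immutable ID crosses threads
id_queue.put(None)
thread.join()

received = sorted(results.get() for _ in range(2))
```

The worker changes its copies freely, and the store stays untouched
because nothing was saved; in this sketch the only lock is the short
one around reading the store.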
Each thread can operate independently on its own copy of the
NSManagedObject. Each lives in its own little world, running at full
speed and without concern for the others. The typical communication
points are -executeFetchRequest: and -save: (with merge policies,
versioning, and conflict resolution).
It then becomes Core Data's problem to make its internal use of other
objects thread safe and to perform the copy-on-write sharing
efficiently.
The most common mistake with using the Core Data endorsed threading
model is an application level race condition between one thread
inserting a new object, telling another thread about it (safely via
objectID), and then saving the new object. There's a window in which
the second thread is very confused and may throw exceptions during
faulting over a not-yet-saved object.
The second most common mistake is a similar race condition where one
thread deletes objects out from underneath another thread that has an
objectID but has not yet faulted in the data.
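A minimal sketch of how both races can be avoided at the application
level, again in plain Python rather than Core Data: the producer
saves before it publishes the ID, and the consumer tolerates an ID
whose row is no longer there. STORE, save, fetch and the IDs 42 and
99 are hypothetical names chosen for the illustration.

```python
import queue
import threading

STORE = {}                     # hypothetical saved rows: ID -> values
STORE_LOCK = threading.Lock()


def save(row_id, values):
    with STORE_LOCK:
        STORE[row_id] = dict(values)


def fetch(row_id):
    """Return a private copy of the row, or None if it is not saved."""
    with STORE_LOCK:
        row = STORE.get(row_id)
        return dict(row) if row is not None else None


def producer(ids):
    pending = {"title": "draft"}  # inserted, visible only to this thread
    save(42, pending)             # 1. save first...
    ids.put(42)                   # 2. ...then tell the other thread
    ids.put(99)                   # an ID whose row was deleted or never saved
    ids.put(None)                 # sentinel: nothing more to send


def consumer(ids, seen, missing):
    while True:
        row_id = ids.get()
        if row_id is None:
            break
        row = fetch(row_id)
        if row is None:
            missing.append(row_id)  # tolerate a row that is gone
        else:
            seen.append((row_id, row["title"]))


ids, seen, missing = queue.Queue(), [], []
threads = [
    threading.Thread(target=producer, args=(ids,)),
    threading.Thread(target=consumer, args=(ids, seen, missing)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Swapping steps 1 and 2 in producer reopens the window described
above: the consumer could then look up ID 42 before the row exists.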
-my application fetches some objects from the store, and stores them
in an NSDictionary
for fast access. It usually does not load the whole store, only a part of it.
Remember that in order to access (read or write) any part of an
NSManagedObject, the thread must own the lock of the MOC. This means
you need to be extremely careful with KVC and collection objects
working with NSManagedObjects shared between threads. KVC and
dictionaries don't know anything about the thread safety requirements
for CoreData objects.
A common mistake is to not lock around valueForKey: sent to an
NSManagedObject. Core Data does not have any "read only" operations,
as any operation may have side effects on caching and faulting.
All this trouble goes away if you don't share managed objects between threads.
In general, it is not appropriate for threads to share state at such
a fine granularity as an individual property (ivar).
-I update those NSManagedObjects with common KVO. This is in a
multi-threaded context.
KeyValueObserving is generally not appropriate for notifications
between threads. The reason for this is that the granularity of the
notifications is so small that the communication costs and thread
safety issues overwhelm any benefit. Notifications between objects
"owned" by separate threads are very difficult to get right; doubly
so in a language without garbage collection.
Instead of multi-threaded KVO notifications,
-performSelectorOnMainThread... or message passing integration with
the NSRunLoop are much easier to get right. If efficiency is a
problem, simply make each message carry more information until you
have the correct granularity. If you have a lot of experience, I'm a
fan of condition variables (pthread, etc.). But performing selectors
and run loops are good places to start.
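The "make each message carry more information" advice can be
sketched as follows. This is an illustration in Python, where a
queue.Queue stands in for run loop message passing; BATCH_SIZE and
the squaring "work" are assumptions made for the example.

```python
import queue
import threading

BATCH_SIZE = 100  # assumed value; raise it until message cost is negligible


def worker(outbox, item_count):
    batch = []
    for i in range(item_count):
        batch.append(i * i)        # stand-in for real work
        if len(batch) == BATCH_SIZE:
            outbox.put(batch)      # one message carries a whole batch
            batch = []
    if batch:
        outbox.put(batch)          # flush the final partial batch
    outbox.put(None)               # sentinel: worker is done


outbox = queue.Queue()
thread = threading.Thread(target=worker, args=(outbox, 1000))
thread.start()

messages = 0
total = 0
while True:
    batch = outbox.get()           # the receiving thread's "run loop"
    if batch is None:
        break
    messages += 1
    total += sum(batch)
thread.join()
```

A thousand per-item notifications collapse into ten messages here,
and the receiver does a meaningful amount of work for each one.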
I've done a lot of multi-threaded programming over the last decade on
a wide range of hardware. Locks can be very fast when only 1 thread
uses them, but they degrade exponentially as the active threads
contending over that lock increase. For example, on a 4 core
machine, 4 threads contending over a specific lock run more than 10x
slower than 1 thread. I've seen more complicated scenarios in which
the 4 core, 4 thread contention case runs over 500x slower.
Exponential degradation is not pretty.
Scalable software minimizes contention and maximizes coordination
between threads.
This is a truth that flows from the hardware itself. Data isolation
is often the easiest and most efficient way to achieve both goals.
Data isolation is also a pattern that is relatively more tractable
for humans to debug.
When I say data isolation, I mean divvying up responsibilities
between threads so that either they do not need to change the same
data, or they each have their own copy of the (hopefully small)
amount of mutable data that must conceptually overlap. The data
isolation philosophy is "The best solution is to not have the problem."
Locks are shared mutable data, and therefore as much a part of the
problem as they are the solution.
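One way to picture data isolation, sketched in Python under the
assumption of a simple counting job: each thread tallies its own
slice into a private Counter, and the results are merged once, on
one thread, after the join. WORDS, count_slice and isolated_count
are hypothetical names. The sketch shows the structure only;
CPython's global interpreter lock means it says nothing about
speed.

```python
import threading
from collections import Counter

WORDS = ["lock", "queue", "lock", "thread", "queue", "lock"] * 1000


def count_slice(words, out, index):
    local = Counter()         # private to this thread: no lock needed
    for word in words:
        local[word] += 1
    out[index] = local        # single hand-off; each thread owns one slot


def isolated_count(words, thread_count=4):
    chunk = (len(words) + thread_count - 1) // thread_count
    partials = [None] * thread_count
    threads = []
    for i in range(thread_count):
        piece = words[i * chunk:(i + 1) * chunk]
        t = threading.Thread(target=count_slice, args=(piece, partials, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()              # the only coordination point
    total = Counter()
    for partial in partials:
        total.update(partial)  # merge on one thread after the join
    return total


totals = isolated_count(WORDS)
```

The contended alternative would be a single shared Counter behind a
lock taken once per word, which is exactly the shared mutable data
this approach avoids.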
You can supplement this with other patterns. But they're all just
variations on minimizing contention and maximizing coordination in
creative ways.
<http://www.cise.ufl.edu/research/ParallelPatterns/publications.htm>
is a reference to a book (hardcover) that has some interesting ideas.
Three rules of thumb that will help the efficiency of both your
engineering performance and your software's performance:
(1) Communicate meaningfully, (2) Communicate only at checkpoints,
(3) Assertions
(1) Communication between threads should occur at a large
granularity. It's important that the amount of work accomplished as
a result of the communication is larger than the cost of the
communication. Threads should not share partial object state (i.e.
an ivar instead of the entire object).
(2) Arbitrary inter-thread notification systems are usually
undesirable. Properly done, they are a lot of work and overhead.
The receiver of the notification needs to queue & defer the
notification until it's safe to process in the context of the thread
owning the receiver. This queue & defer pattern typically integrates
with something that looks like a run loop or a checkpoint system that
polls the queue.
(3) Your threading model should be something that can be audited by
code. You should be able to add assertions about which thread owns
any given resource, and whether the threading model has been
violated. If you cannot even articulate what the assertions should
be, how are you going to know if it's working?
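Rules (2) and (3) can be sketched together. In this Python
illustration, other threads may only post to an inbox; the owning
thread applies the queued changes at a checkpoint, and every direct
access asserts ownership. OwnedModel, post, set and checkpoint are
hypothetical names, and Python drops assert statements when run
with -O, so a real audit would need a check that stays enabled.

```python
import queue
import threading


class OwnedModel:
    """Hypothetical resource that only its owning thread may touch."""

    def __init__(self):
        self.owner = threading.get_ident()
        self.inbox = queue.Queue()  # the one place other threads may touch
        self.values = {}

    def assert_owner(self):
        assert threading.get_ident() == self.owner, "threading model violated"

    def post(self, key, value):
        # Safe from any thread: queue the change, do not apply it.
        self.inbox.put((key, value))

    def set(self, key, value):
        self.assert_owner()
        self.values[key] = value

    def checkpoint(self):
        # Called by the owner at a point where applying changes is safe.
        self.assert_owner()
        applied = 0
        while True:
            try:
                key, value = self.inbox.get_nowait()
            except queue.Empty:
                return applied
            self.values[key] = value
            applied += 1


model = OwnedModel()                 # owned by the thread running this line
violations = []


def background():
    model.post("status", "done")     # allowed: goes through the inbox
    try:
        model.set("status", "oops")  # not allowed: wrong thread
    except AssertionError as error:
        violations.append(str(error))


thread = threading.Thread(target=background)
thread.start()
thread.join()
applied = model.checkpoint()         # owner drains the inbox
```

The inbox is the single, countable communication point, and the
assertion turns a silent violation of the model into an immediate
failure.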
You can see that multi-threaded KVO violates all these rules simultaneously.
We cannot break these rules, but only break ourselves upon them.
You'll save yourself a lot of grief by designing an architecture
which has specific, countable, and tightly defined communication
points between threads.
"countable" is crucial. If you cannot finitely enumerate the
different points where your threads are supposed to communicate with
each other, you're fubar.
- Ben
p.s. For the list: please do not send me any email about KVO &
multi-threading. It cannot work. You'll learn that from either my
experience or your own, and I'm not interested in your personal pain
threshold.
p.p.s. More locks do not magically make code thread safe. Thread
safe is both protected and deadlock free. Scalable is both thread
safe and faster in the presence of additional resources.