re:Core Data binary data optimization(s)
- Subject: re:Core Data binary data optimization(s)
- From: Ben Trumbull <email@hidden>
- Date: Mon, 15 Dec 2008 23:30:18 -0800
This was a bit slow, as Core Data seemed to fault each point object
individually, resulting in a lot of query overhead.
The performance chapter in the Core Data Programming Guide does
address this specifically.
but Core Data faults my objects so often that now I spend even more
time unarchiving
each polygon's point data than it took to read each point's row in
from the database!
Core Data is doing what you asked it to do. If there's a lot of
faulting, then either you need to prefetch as described in the
programming guide, or you are throwing away results too quickly.
The Core Data template in Instruments can help you find where your code
instigates excessive faulting. The stack backtrace for events in
Instruments is a great way to find issues. Simply walk up the
backtrace until you get to a frame in your code, and consider why it's
triggering the work underneath it.
If, after you've gathered Instruments data, you believe your code is
doing the right thing, that would be worth a trip to bugreport.apple.com.
I've thought about caching the result of my "get all polygons" fetch
to speed up redrawing, plus further optimizing my archiving code to
help initial load and final save speed
Each executeFetchRequest is an I/O. Core Data expects you to cache
the results. This may be how a misalignment of expectations has
occurred. If you want a hash table, Foundation offers an excellent
general purpose one, and Core Foundation provides extensive
customization in its dictionary callbacks.
This seems a little icky, though. Does Core Data provide a cleaner
way of
efficiently keeping a fetch result up-to-date?
There's no "live" fetch request. That would make an excellent
enhancement request at bugreport.apple.com.
If I model a (prefetched) to-one relationship to the
polygon's binary data instead of an attribute, and [moc
refreshObject:binaryData mergeChanges:NO] once unarchived, will this
make it fairly likely I'll only have my unarchived copy of the point
array data in memory?
Yes, for large binary data that is the canonical solution. If the data
is very small, then it may not be worth the effort.
For flat primitive data like this, is it possible to get performance
closer to a raw push of
the data to/from disk with Core Data?
You could write your own NSValueTransformer that doesn't need to copy/
transmogrify the bytes. Double scalar bytes are architecture
dependent, though, so that may not be useful. However, for a long
sequence of doubles, even with a copy, you can probably do a lot
better than NSKeyedArchiver of an NSArray of NSNumber. A double is 8
bytes and an NSNumber 16, so you'll have to put some effort into failing
to do 2x better.
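As a rough illustration of the packed representation described above, here is a minimal sketch in plain C (which compiles unchanged in an Objective-C file). It writes each double as 8 bytes in a fixed little-endian order, which is one way to sidestep the architecture dependence of raw double bytes. The function names are hypothetical and are not Core Data API, and the sketch assumes doubles are 64-bit IEEE 754 values. Wiring this into an NSValueTransformer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not part of Core Data. */

/* Pack `count` doubles into `out` as 8-byte little-endian values.
 * `out` must have room for count * 8 bytes. Returns the bytes written. */
size_t pack_doubles_le(const double *values, size_t count, uint8_t *out)
{
    /* Assumption: double is a 64-bit IEEE 754 value on this platform. */
    assert(sizeof(double) == sizeof(uint64_t));
    for (size_t i = 0; i < count; i++) {
        uint64_t bits;
        memcpy(&bits, &values[i], sizeof bits); /* reinterpret the bits safely */
        for (size_t b = 0; b < 8; b++) {
            out[i * 8 + b] = (uint8_t)(bits >> (8 * b)); /* low byte first */
        }
    }
    return count * 8;
}

/* Unpack little-endian bytes back into doubles.
 * `values` must have room for byte_count / 8 doubles.
 * Returns the number of doubles read. */
size_t unpack_doubles_le(const uint8_t *in, size_t byte_count, double *values)
{
    size_t count = byte_count / 8;
    for (size_t i = 0; i < count; i++) {
        uint64_t bits = 0;
        for (size_t b = 0; b < 8; b++) {
            bits |= (uint64_t)in[i * 8 + b] << (8 * b);
        }
        memcpy(&values[i], &bits, sizeof bits);
    }
    return count;
}
```

The resulting buffer costs exactly 8 bytes per point coordinate, with no per-object archiving overhead, and could be wrapped in an NSData for storage.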
In theory, if the binary data were large enough, I'd suggest storing
it as a memory mapped file, and just keeping the URL in the database.
But the lower bound on that is in the 16-32KB range (i.e. at some
point it's worse to make a separate file, and that point is > 8K).
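The memory mapped file approach mentioned above might look roughly like the following sketch, using the POSIX mmap call from plain C. The helper names and the file layout (a bare array of doubles in host byte order) are assumptions made for illustration. In this scheme the database would hold only the file's path or URL, and the approach only pays off above the size threshold described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helpers for illustration only; not part of Core Data. */

/* Write `count` doubles to `path` in host byte order. Returns 0 on success. */
int write_points_file(const char *path, const double *values, size_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    size_t written = fwrite(values, sizeof(double), count, f);
    int close_failed = fclose(f);
    return (written == count && close_failed == 0) ? 0 : -1;
}

/* Map the file read-only. On success, returns a pointer to the doubles and
 * stores their number in *count_out. Returns NULL on failure, including
 * for a missing or empty file. */
const double *map_points_file(const char *path, size_t *count_out)
{
    assert(count_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (base == MAP_FAILED) return NULL;

    *count_out = (size_t)st.st_size / sizeof(double);
    return (const double *)base;
}

/* Release a mapping returned by map_points_file. */
void unmap_points_file(const double *points, size_t count)
{
    munmap((void *)points, count * sizeof(double));
}
```

Pages are read from disk only when touched, so a caller that draws a few polygons does not pay for loading every point array up front.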
The marketing number for raw push speed of the disk is for large
quantities of sequential I/O. And the OS is optimized to hell and
back for it. You won't see anything remotely like that with a
relational database, even high end commercial client server ones. You
can get closer with special purpose streaming APIs for dedicated
BLOBs. On the flip side, good luck asking the file system for a
non-path query in the millisecond response range. How many seconds or
minutes does it take to search by file metadata or file contents?
Compare that to a database query, and the FS gets crushed.
Databases and file systems are intended to solve different problems.
A full text index like Spotlight or SearchKit is intended to solve yet
another problem. And something like memcached yet another. The world
has a lot of problems.
Coincidentally, those sequential I/O chunks are in the MB+ range, and
work quite neatly for the memory mapped file trick ...
- Ben
_______________________________________________
Cocoa-dev mailing list (email@hidden)