Re: re: CoreData async fetch request
- Subject: Re: re: CoreData async fetch request
- From: email@hidden
- Date: Tue, 06 Oct 2009 02:00:44 +0000
I am doing a simple query search for a text string pattern (i.e. 'SELF like
foo') on ~10 million small records stored persistently using SQLite. This
is a performance test to make sure I get reasonable performance from my
database engine before I commit too much code to it.
The query is taking over 3 minutes with a small result set. This is on a
new 13" MacBook Pro with 4 GB of memory.
The query is taking too long for a user to sit and wait for it. Is there a
way to speed it up? Can indexing be applied to it?
I had thought that displaying results as they are found might be
reasonable. In my tests, if I use setFetchBatchSize and setFetchOffset to
restart the fetch, it repeats the query from the beginning each time, so
the total time is multiplied by the number of restarts. Not reasonable. It
does not seem to resume the query where it left off, as a database cursor
would do.
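The difference between restarting with an offset and resuming like a
cursor can be sketched at the plain SQLite level. This is a minimal
illustration in Python's sqlite3 module, not Core Data code; the table
`records`, its columns, and the helper names are all hypothetical. Each
page is keyed on the last id already seen, so rows that were returned
earlier are skipped through the primary key instead of being counted past
with OFFSET.

```python
import sqlite3


def make_db(n=1000):
    """Build a small in-memory table; every tenth row starts with 'foo'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO records (body) VALUES (?)",
        ((f"foo {i}" if i % 10 == 0 else f"bar {i}",) for i in range(n)),
    )
    return conn


def next_page(conn, pattern, after_id, page_size):
    """Return up to page_size matches whose id is greater than after_id."""
    return conn.execute(
        "SELECT id, body FROM records "
        "WHERE id > ? AND body LIKE ? ORDER BY id LIMIT ?",
        (after_id, pattern, page_size),
    ).fetchall()


def all_matches_paged(conn, pattern, page_size=25):
    """Yield pages of matches, resuming after the last id of the previous page."""
    last_id = 0
    while True:
        page = next_page(conn, pattern, last_id, page_size)
        if not page:
            return
        yield page
        last_id = page[-1][0]


conn = make_db()
for page in all_matches_paged(conn, "foo%"):
    print(len(page), "rows, last id", page[-1][0])
```

Each page still has to scan forward until it has found enough matches, so
this does not make an unindexed pattern search cheap; it only avoids
re-reading the rows that earlier pages already covered.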
My impression is that my usage scenario is not an appropriate use of Core
Data.
I was planning to try SQLite directly. Would it be more appropriate?
Thanks
On Oct 5, 2009 7:14pm, Ben Trumbull <email@hidden> wrote:
Is there a way to do an asynchronous fetch request against Core data
returning partial results?
That depends on whether it's the query part that's expensive (eg WHERE
clause with complex text searching and table scans) or simply the
quantity of the row data that's your problem. For the latter, you can
just use -setFetchBatchSize: and be done.
You can use a separate MOC on a background thread to perform asynchronous
work. You can then pass over results to the main thread to display to the
user. However, unless your search terms are very expensive, it's usually
easier and faster to use -setFetchBatchSize: synchronously. For well
indexed queries, it can handle a million or two rows per second. Not sure
why you'd subject your users to that kind of experience. It's common to
use fetch limits, count requests, and only show the top N results. What's
your user going to do with a hundred thousand results anyway?
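The background-worker pattern described above can be sketched outside
Core Data as follows. This is an analogy in Python's sqlite3 module, not
the Core Data API: the worker thread opens its own connection (standing in
for a separate MOC), and finished batches are handed over through a queue.
The table, file, and function names are hypothetical.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def make_db_file(n=1000):
    """Create a temporary SQLite file; every tenth row starts with 'foo'."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO records (body) VALUES (?)",
        ((f"foo {i}" if i % 10 == 0 else f"bar {i}",) for i in range(n)),
    )
    conn.commit()
    conn.close()
    return path


def search_worker(path, pattern, results, batch_size):
    """Run the query on this thread's own connection and queue each batch."""
    conn = sqlite3.connect(path)
    try:
        cur = conn.execute(
            "SELECT id, body FROM records WHERE body LIKE ? ORDER BY id",
            (pattern,),
        )
        while True:
            batch = cur.fetchmany(batch_size)
            if not batch:
                break
            results.put(batch)
    finally:
        conn.close()
        results.put(None)  # sentinel: no more batches


def search_async(path, pattern, batch_size=50):
    """Start the worker and yield batches as they arrive."""
    results = queue.Queue()
    worker = threading.Thread(
        target=search_worker,
        args=(path, pattern, results, batch_size),
        daemon=True,
    )
    worker.start()
    while True:
        batch = results.get()
        if batch is None:
            break
        yield batch
    worker.join()


path = make_db_file()
for batch in search_async(path, "foo%", batch_size=30):
    print("received", len(batch), "rows")
os.remove(path)
```

In a GUI application the consumer would poll the queue from the main
thread's event loop rather than block on it as this sketch does.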
If you need to attack the computational expense of your query terms,
that's more complicated. Obviously it would be best to optimize the
queries and ensure they are using an index. But if that's not enough, you
can execute the queries in a background MOC, fetching objectIDs + row
data (put in the row cache) and then have the other MOC materialize
the objects by ID from the row cache. There's a BackgroundFetching
example in /Developer/Examples/CoreData. It shows how to do this.
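Whether a text query can use an index at all can be checked directly
against SQLite. The sketch below uses Python's sqlite3 module and a
hypothetical `records` table; it says nothing about the SQL that Core Data
itself generates. It assumes a stock SQLite build, where a LIKE pattern
with a literal prefix can use an index on a NOCASE column, while a pattern
that starts with a wildcard cannot.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# NOCASE matches the default case-insensitive behaviour of LIKE, which is
# one of the conditions for SQLite's LIKE optimization.
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX records_title ON records (title)")

# Prefix pattern: can be rewritten as a range search on the index.
prefix_plan = query_plan(
    conn, "SELECT id FROM records WHERE title LIKE 'foo%'"
)
# Leading wildcard: every row has to be examined.
infix_plan = query_plan(
    conn, "SELECT id FROM records WHERE title LIKE '%foo%'"
)

print("prefix:", prefix_plan)
print("infix: ", infix_plan)
```

The exact wording of the plan varies between SQLite versions, but a plan
that begins with SEARCH indicates an index range lookup and one that
begins with SCAN indicates a full pass over the table or index.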
Returning partial results incrementally would require some creativity on
your part to subdivide the query into several. Since most expensive
queries are text searches, it's usually possible to subdivide the result
set naturally. Like the first letter of 'title'. Similar to the thumb bar
index on the side of the Contacts app on the iPhone.
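One way to picture this subdivision is a generator that runs one smaller
query per first letter and hands back each group as soon as it is ready.
This is a sketch in Python's sqlite3 module with a hypothetical `records`
table and `search_by_letter` helper, not Core Data code. It assumes titles
start with an ASCII letter; anything else would need an extra bucket.

```python
import sqlite3
import string


def make_db():
    """Build a tiny in-memory table with an index on the title column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT COLLATE NOCASE)"
    )
    conn.execute("CREATE INDEX records_title ON records (title)")
    titles = [
        "apple pie", "Apple tart", "banana bread",
        "cherry pie", "Zucchini pie", "plum cake",
    ]
    conn.executemany(
        "INSERT INTO records (title) VALUES (?)", ((t,) for t in titles)
    )
    return conn


def search_by_letter(conn, needle):
    """Yield (letter, rows) for each first letter that has matching titles.

    The needle is inserted into a LIKE pattern as-is, so '%' and '_' in it
    act as wildcards.
    """
    for letter in string.ascii_lowercase:
        rows = conn.execute(
            "SELECT id, title FROM records "
            "WHERE title LIKE ? AND title LIKE ? ORDER BY title",
            (letter + "%", "%" + needle + "%"),
        ).fetchall()
        if rows:
            yield letter, rows


conn = make_db()
for letter, rows in search_by_letter(conn, "pie"):
    print(letter, [title for _id, title in rows])
```

Each sub-query only has to look at the titles under one letter, so the
first groups can be shown while the later ones are still being computed.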
There's also a DerivedProperty example on ADC for optimizing text queries.
Obviously, Apple's own Spotlight could not use something like
Core Data, since it heavily relies on returning asynchronous partial
results.
Which is neither here nor there. Most Cocoa applications wouldn't want
Spotlight to be the sole persistence back end of their data. The latency
of putting all your data in a full text index instead of a relational
database or keyed archive would be pretty absurd. Now, if you're writing
an app that's primarily structured around full text searching, you might
instead prefer to focus on putting your data in Spotlight via small
files, and using the Spotlight APIs. But it's not suitable for apps
interested in an OOP view of their data.
Frankly, this is the second application in which I've attempted to use
Core Data, only to find it come up surprisingly short. The first time, the
issue was Core Data not being thread safe.
Core Data can be used efficiently with multiple threads. It might help to
think of each MOC as a separate writeable view. If you'd like to know
more, you can search the archives for my posts.
What is the target market for Core Data? What sort of application is
ideal for its use? What size data store? Right now it escapes me.
Cocoa and Cocoa Touch applications, particularly done in an MVC style
with an OO perspective on their data. Some people also use it as a
persistent cache for data stored in another canonical format, such as XML
files. On the Mac side, we've had customers with databases of 3+ million
rows (multiple GB), and on the embedded side, roughly 400,000 rows
(hundreds of MB). However, it does take some care and feeding to handle
data sets like that, and most developers find it straightforward up to
about 10% of those numbers.
It sounds like you're having performance issues. What kinds of queries
are you trying to accomplish? How much data are you working with? How
have you modeled your primary entities?
You can fetch back just NSManagedObjectIDs, and
-setIncludesPropertyValues: to NO to effectively create your own cursors
if you prefer.
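The idea of fetching identifiers first and loading full rows later can be
sketched at the SQLite level as follows. This is an analogy in Python's
sqlite3 module, with a hypothetical `records` table and helper names; in
Core Data the identifiers would be NSManagedObjectIDs rather than integer
row ids.

```python
import sqlite3


def make_db(n=1000):
    """Build a small in-memory table; every tenth row starts with 'foo'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO records (body) VALUES (?)",
        ((f"foo {i}" if i % 10 == 0 else f"bar {i}",) for i in range(n)),
    )
    return conn


def matching_ids(conn, pattern):
    """Run the expensive query once, keeping only the ids of the matches."""
    cur = conn.execute(
        "SELECT id FROM records WHERE body LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in cur]


def materialize(conn, ids, batch_size=20):
    """Yield full rows for the saved ids, one batch at a time."""
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(chunk))
        yield conn.execute(
            f"SELECT id, body FROM records WHERE id IN ({placeholders}) "
            "ORDER BY id",
            chunk,
        ).fetchall()


conn = make_db()
ids = matching_ids(conn, "foo%")
first_batch = next(materialize(conn, ids))
print(len(ids), "matches; first batch has", len(first_batch), "rows")
```

The list of ids acts as the cursor: the pattern match runs once, and each
later batch is a cheap primary-key lookup. The batch size is kept small
here so the number of bound parameters stays well under SQLite's limit.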
- Ben
_______________________________________________
Cocoa-dev mailing list (email@hidden)