re: CoreData fetch is way too slow
re: CoreData fetch is way too slow
- Subject: re: CoreData fetch is way too slow
- From: Ben Trumbull <email@hidden>
- Date: Tue, 5 Feb 2008 16:07:02 -0800
James, Mike,
The tools on Leopard should help identify your performance problems.
Shark is my perennial favorite, but with the Core Data template,
Instruments is very handy.
For quick spot checking, you can use the user defaults:
-com.apple.CoreData.SyntaxColoredLogging 1 -com.apple.CoreData.SQLDebug 1
For reference, if I search a 4+ million row database with no
predicate and a limit of 1 million, sql logging shows:
2008-02-05 14:48:27.973 CoreDataFetchTool[6916:10b] CoreData: sql:
SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZLETTER, t0.ZTEXT, t0.ZLENGTH FROM
ZWORD t0 LIMIT 1000000
2008-02-05 14:48:28.818 CoreDataFetchTool[6916:10b] CoreData:
annotation: sql connection fetch time: 0.8225s
2008-02-05 14:48:29.029 CoreDataFetchTool[6916:10b] CoreData:
annotation: total fetch execution time: 1.0554s for 1000000 rows.
Fetched 1000000 items at 913297 rows per second.
and for a limit of 1:
2008-02-05 14:48:40.764 CoreDataFetchTool[6926:10b] CoreData: sql:
SELECT Z_VERSION, Z_UUID, Z_PLIST FROM Z_METADATA
2008-02-05 14:48:40.767 CoreDataFetchTool[6926:10b] CoreData: sql:
SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZLETTER, t0.ZTEXT, t0.ZLENGTH FROM
ZWORD t0 LIMIT 1
2008-02-05 14:48:40.768 CoreDataFetchTool[6926:10b] CoreData:
annotation: sql connection fetch time: 0.0007s
2008-02-05 14:48:40.768 CoreDataFetchTool[6926:10b] CoreData:
annotation: total fetch execution time: 0.0013s for 1 rows.
Fetched 1 items at 67 rows per second.
As your fetch for 1 object is 5,000x slower than no predicate (on a
much larger database, no less), I think we can hazard a guess that
your predicate involves significant complexity.
Thanks to Phil for handy reference to an earlier post describing how
indexing works with Core Data at
<http://www.cocoabuilder.com/archive/message/cocoa/2008/1/16/196552>
An arbitrary LIKE query is extremely expensive. If done correctly
with locale and Unicode support, it will be slow no matter what
storage technology you adopt. Databases typically do not support
indices for pattern matching queries. So this query is doing a full
table scan. That's what a full text search engine like Lucene,
SearchKit or Spotlight is for (although many databases now offer some
integration with FTS).
There's only one way to make performing a locale & unicode friendly
arbitrary LIKE operations on every row in a table faster: Don't.
Many developers write arbitrary LIKE queries when all they really
need are prefix or suffix matching. You can use a beginsWith or
endsWith predicate instead. As those predicates still support
unicode matching, they're only somewhat faster than a full LIKE.
To really speed things up, you need to go with locale & unicode
hostile searching. If you use the arithmetic predicate operators
(=, >, <, etc) Core Data will treat the column as binary data, ditch
all the unicode handling, and make your search eligible for indices.
This is typically done by denormalizing the data and having "display
text" and "search text" columns. Depending on the fidelity your
application requires, you can also do "fast lossy query on search
text" AND "slow unicode query on display text" which will at least
prune the number of rows evaluated by the slow query.
A prefix search can be rewritten as "attribute >= x and attribute <
y" A suffix search can be rewritten similarly using a column with
the characters reversed.
When you supplement this approach with an index (binary collation) to
the attribute, you should be able to get this ~100x faster.
It's possible that your application's usage patterns work better
loading everything into memory up front. You can do that with Core
Data too.
re: Mike's caching commonly used managed objects.
That's what we expect done. Core Data is not a big dictionary.
Foundation and Core Foundation already have a number of perfectly
good collection APIs, and you can cache in an intelligent,
application specific manner. And NSArray and NSSet support direct
filtering with predicates. It's not clear what would make things
easier for you.
If you have a problem you think is generally applicable, please file
a feature request. Core Data does not assume that your process is
the only process using the database. But if enough developers wanted
an option to facilitate that assumption, we could do that.
If you file performance bugs, you can really help us understand what
queries are most in need of improvement. If you include a sample
project (source) with a representative data set, we can make a
detailed investigation. Martin (from previous thread cited above)
was kind enough to do so, and those issues inspired some performance
improvements that will benefit everyone.
--
-Ben
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden