Lists

Terms and Conditions
Lists hosted on this site
Email the Postmaster
Tips for posting to public mailing lists

Re: CoreData async fetch request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: CoreData async fetch request

Subject: Re: CoreData async fetch request
From: David Melgar <email@hidden>
Date: Thu, 8 Oct 2009 23:28:11 -0400

I read a little on ICU and now understand that sqlite by default does not handle case insensitive unicode. Is there an easy way to make sqlite use ICU on the Mac, or do I have to build it myself with ICU enabled?

Based on the derived property example, it seems that I would need to duplicate any text fields I commonly search on where I want to support case insensitive queries.

How does spotlight manage to index so much text and respond quickly? Are those queries can insensitive?

I've done a bunch more performance tests. I don't understand some of the differences. I hadn't realized that coredata in the data model has a flag to index a field. When I did that for Coredata, it dramatically sped it up. Although adding an index to my sqlite test slowed it down.

Coredata, without an index, predicate "name like foo", 20 databases 400k records each. 2m17s. Coredata with an index, predicate "name like foo", 20 databases 1mil records each, 25 seconds. SQLite no index, sql "where name like foo�r", 20 databases 1mil records each 11 seconds. SQLite index, sql "where name like foo�r", 20 databases 1 mil records each 30 seconds.

As you said, with an index, and testing for equality, both coredata and sqlite queries responded in <1 sec. SQLite without an index was 9 seconds for same query.

I can only presume SQLite slowed down with an index because the database file was physically much larger and took longer to read from disk.

On Oct 8, 2009, at 2:25 AM, Ben Trumbull wrote:

On Oct 7, 2009, at 10:12 PM, David Melgar wrote:
Hello, I didn't mean to state threads as a requirement when I said "async", I just meant some way to get partial results, such as a call to a delegate I referenced in the previous note. And I'm certainly not seeking complexity of threads if I don't need it.
There isn't a simple way to do this. There is, however, a simple way to make your queries (in both Core Data, and direct SQLite) 100x faster.

In my case, == is not the query I need. I really want case insensitive match as I presumed LIKE would do, or better yet, 'LIKE foo%'.
Great. The Derived Property sample project on ADC shows you exactly what you need to do for replacing LIKE "foo" or LIKE "foo%"

In the meantime, I've just completed a test using SQLite. The query in Coredata using 'LIKE foo�r' across roughly 8 million records didn't return anything until the query was done, and the query took 2min 14sec,
This runs through ICU, and cannot use an index. It can trivially be made 100x faster, at least for equality, prefix, and suffix matching.
What's the actual SQL Core Data logs ?
Doing what I interpreted to be equivalent using SQLite across 20 million records returned in 11 seconds
What was the actual SQL you believed equivalent ?
and I was able to get my first result almost immediately without needing threads. They provide a callback scheme, analogous to calling a delegate in Cocoa. FYI, this is across 20 different databases, otherwise it may have been faster.

For what I'm doing, SQLite seems much faster, smaller footprint and easier to comprehend. I don't understand what I'm losing.

In your note you mention the cost of doing unicode aware regex and sorting. As far as I know, I don't need any of those, at least not yet.
All of your data is 100% 7 bit ASCII ? None of your customers will be Asian, European, or Hispanic ? None of your English customers will expect Unicode data that they copy and paste into your application from Safari to be preserved correctly ?

If your answer is no, then you'll need to implement your own string matching operations for SQLite using either CFString or ICU. SQLite's built in operators use memcmp(). Once you've done that, I'd love to see the performance results.

I took a brief look at the derived property example. Seems complex.
What about the example seems complex ? Everything you need is in 1 file, that's got 107 lines of code. You can just copy and paste that code. Make a searchText column that has preprocessed the text data into a simpler form that can be used with an index and a simpler query.

You say you want to do a case insensitive search across 20 million rows. You could absorb the cost of that, relative to a simple binary bitwise compare, for each of the 20 million rows. And pay it again every time you execute a query. Or, you could store a preprocessed column, and absorb that cost for just 1 string, the current search term "foo"

Perhaps for ASCII case insensitivity, that doesn't seem particularly interesting. But once you start dealing with real world text encodings, which every Mac OS X and iPhone OS customer expects from every app, this becomes a big deal. Unicode is complex and difficult to work with.

Something the framework should give me an option to handle rather than me having to generate such complex code. Does it advocate creating another field in the database as a normalized version of the field I really want to search on?
Yes.
Why would this be better than using SQLite directly?
Doing this in SQLite directly will also be 100x faster than what you're doing now. Regardless of whether or not you decide to use Core Data, this would be better.
- Ben


_______________________________________________

Cocoa-dev mailing list (email@hidden)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden


Follow-Ups:

Re: CoreData async fetch request
From: Ben Trumbull <email@hidden>


References:  
  >Re: re: CoreData async fetch request (From: email@hidden)
  >Re: CoreData async fetch request (From: Ben Trumbull <email@hidden>)
  >Re: CoreData async fetch request (From: David Melgar <email@hidden>)
  >Re: CoreData async fetch request (From: Ben Trumbull <email@hidden>)
  >Re: CoreData async fetch request (From: Ben Trumbull <email@hidden>)




Prev by Date:
Re: Alert prompts for BetterAuthorizationSample

Next by Date:
Re: Automatic language detection - a bug (again) or what. Snow	Foundation NSSpellServer.

Previous by thread:
Re: CoreData async fetch request

Next by thread:
Re: CoreData async fetch request

Index(es):

Date
Thread