Re: Core Data dog-slow when using first time after boot


Subject: Re: Core Data dog-slow when using first time after boot
From: "Melissa J. Turner" <email@hidden>
Date: Thu, 20 Aug 2009 13:38:22 -0700


On Aug 20, 2009, at 02:35, Ruotger Skupin wrote:

Complex locale-aware Unicode text queries can be slow. If you find yourself spending time in such a query, you should consider some of the techniques shown in the DerivedProperty example available on ADC.
Isn't all text Unicode?

No. Not all apps are Unicode based, and many of the ones that aren't will put things on the pasteboard quite happily. The web (and thus anything copied out of a web browser) is definitely not all Unicode, especially the older pages. And even within Unicode there are multiple encoding formats (8, 16, and 32 bit). In addition to the varying encoding sizes, Unicode also has multiple ways to represent conceptual characters. Characters that have diacritics, for example, can be represented as either one unichar ('é') or two ('e' followed by a combining '´').
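To make the two representations concrete, here is a minimal sketch. It uses Python's standard `unicodedata` module rather than Cocoa purely for illustration; the variable names are my own, and nothing here is taken from the DerivedProperty sample.

```python
import unicodedata

composed = "\u00e9"      # 'é' as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Both render as the same character, but a raw comparison sees different data.
same_raw = composed == decomposed

# Normalizing both sides to one form (NFC here) makes them compare equal.
same_nfc = (unicodedata.normalize("NFC", composed)
            == unicodedata.normalize("NFC", decomposed))

# The same character also occupies a different number of bytes per encoding.
sizes = {enc: len(composed.encode(enc))
         for enc in ("utf-8", "utf-16-le", "utf-32-le")}

print(same_raw, same_nfc, sizes)
```

The raw comparison fails even though a reader would call the two strings identical, which is the extra work a Unicode-aware comparison has to absorb.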

I don't understand. This shouldn't be a special case. But I will have a look at the sample.

In my case I'd guess that at least half of the objects contain Unicode strings (international names and addresses). What I mean is: write anything in German or French and you end up with Unicode.

Due to the multiplicity of representations, text comparisons in Unicode can be slow, since instead of just doing a byte-by-byte comparison, you end up needing to calculate character sizes, check for compositions/decompositions, check for analogues between different symbol systems used to represent a single language (e.g. kana and kanji), recognize and drop punctuation, etc. For apps that do repeated comparisons against a set of strings, it can be worth it to preprocess all strings into one canonical format to minimize the amount of work that needs to be done during a comparison (make all strings UTF-8/16/32, make all characters lowercase, strip all diacritics or ensure characters that have them are always in either their composed or decomposed forms, etc.) and then use a less expensive collation for the comparison.
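The preprocessing step can be sketched roughly as follows. This is an illustration in Python, not the DerivedProperty code; `search_key` is a hypothetical helper name, and the choice to strip diacritics (rather than just pick one composed/decomposed form) is an assumption that suits search but not every app.

```python
import unicodedata

def search_key(text: str) -> str:
    """Reduce a string to one canonical form for cheap comparisons."""
    # Decompose so base letters and combining marks become separate code points.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, which strips the diacritics.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case, then recompose so every key ends up in the same form.
    return unicodedata.normalize("NFC", stripped.casefold())

# Compute the keys once, when the strings are stored or changed.
names = ["Émile", "E\u0301MILE", "Müller"]
keys = [search_key(name) for name in names]

# Later comparisons are plain equality checks on the precomputed keys.
print(keys)
```

The expensive work happens once per string instead of once per comparison, and the stored key can then be matched with an ordinary, cheaper comparison.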

As a side note, if you want to use regular expressions on Unicode strings, you generally need to do the normalization anyway, since regex languages operate at the unichar level rather than at the conceptual character level.
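A small sketch of the regex issue, again in Python for illustration only (Python's `re` works on code points rather than UTF-16 unichars, but the effect is the same). The pattern and sample string are made up for the example.

```python
import re
import unicodedata

# "caf" followed by exactly one character, as the regex engine counts them.
pattern = re.compile(r"caf.")

decomposed = "cafe\u0301"  # 'e' + combining acute: five code points

# Without normalization, '.' consumes the 'e' and the accent is left over.
raw_match = pattern.fullmatch(decomposed) is not None

# After composing to NFC, 'é' is one code point and the pattern matches.
composed = unicodedata.normalize("NFC", decomposed)
nfc_match = pattern.fullmatch(composed) is not None

print(raw_match, nfc_match)
```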

+Melissa


References:  
  >re: Core Data dog-slow when using first time after boot (From: Ben Trumbull <email@hidden>)
  >Re: Core Data dog-slow when using first time after boot (From: Ruotger Skupin <email@hidden>)



