Re: Normalize an NSAttributedString
- Subject: Re: Normalize an NSAttributedString
- From: Ken Thomases <email@hidden>
- Date: Wed, 26 Aug 2009 12:21:03 -0500
On Aug 26, 2009, at 10:43 AM, Michael Ash wrote:
On Wed, Aug 26, 2009 at 5:42 AM, Ken Thomases <email@hidden> wrote:
On Aug 25, 2009, at 7:21 PM, Ross Carter wrote:
I haven't tried it, but this should work:
NSAttributedString* original = whatever;
NSMutableAttributedString* normalized = [[original mutableCopy] autorelease];
// -mutableString exists only on the mutable copy, not on the immutable original.
CFMutableStringRef str = (CFMutableStringRef)[normalized mutableString];
CFStringNormalize(str, kCFStringNormalizationFormD);
This works because -[NSMutableAttributedString mutableString] returns a proxy
that automatically fixes up the attribute runs held by its owner.
Hmm, this seems dangerous in the sense that the conversion may be lossy. As
far as I can see, there's no guarantee that CFStringNormalize will perform
minimal replacements. If it does not, then whole ranges of characters may
have their attributes reset to those of the first replaced character.
Even if testing shows it to be non-lossy in one environment, without a
guarantee the behavior might differ in any other environment.
http://en.wikipedia.org/wiki/Unicode_equivalence
[... quote snipped ...]
I'm well aware of what it means. The question is which exact
operations CFStringNormalize performs on the mutable string proxy.
If CFStringNormalize performs the minimal replace operations
to get the result, then it will preserve the attributes closely. It's
conceivable, though, that CFStringNormalize uses a side buffer to
compute the normalized form and then does one big replace of the whole
mutable string's range. Or, anywhere in between. Like, it might
replace a series of precomposed characters with their decompositions
all with one replace operation. In that case, the attributes of most
of the characters will be lost (replaced with the attributes of the
first character in the replace range).
So, it's clear that the _strings_ will always have a deterministic
value as a result of normalization. That's the point of
normalization. But the _attributed strings_ may not.
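One way to avoid depending on what CFStringNormalize does internally is to normalize one attribute run at a time, so that every replace covers characters that already share the same attributes. The following is an untested sketch along those lines, not something from the thread: DecomposedCopyByRun is a hypothetical helper name, it assumes manual reference counting like the snippet quoted earlier, and it assumes Form D is the target. Note that it may not match whole-string normalization when combining marks straddle a run boundary, because canonical reordering is then applied to each run separately.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not from the thread. Assumes manual reference
// counting (no ARC), like the snippet quoted earlier.
static NSAttributedString *DecomposedCopyByRun(NSAttributedString *original)
{
    NSMutableAttributedString *result = [[original mutableCopy] autorelease];

    // Walk the attribute runs from the end toward the start, so that a
    // length change in a later run never shifts a range still to be visited.
    NSUInteger end = [result length];
    while (end > 0) {
        NSRange run;
        (void)[result attributesAtIndex:end - 1 effectiveRange:&run];

        NSString *piece = [[result string] substringWithRange:run];
        NSString *decomposed = [piece decomposedStringWithCanonicalMapping];

        // Every character in `run` carries identical attributes, so the
        // replacement inheriting the first character's attributes loses nothing.
        [result replaceCharactersInRange:run withString:decomposed];

        end = run.location;
    }
    return result;
}
```

Because each replace stays inside a single run, the result no longer depends on whether the normalizer does minimal replacements or one big one. The cost is the run-boundary caveat above, which matters only if a base character and its combining marks can carry different attributes.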
Also, it should be self-evident that normalizing to a precomposed form will
obliterate attribute differences between a base character and any combining
characters, as discussed elsewhere in this thread.
Good thing he went and normalized to a *de*composed form then, isn't it?
Martin's example used Form D, but Ross never quite said that's what he
was normalizing to. He might have been adapting Martin's example but
using a different form.
Regards,
Ken