Re: Normalize an NSAttributedString
- Subject: Re: Normalize an NSAttributedString
- From: Michael Ash <email@hidden>
- Date: Wed, 26 Aug 2009 14:45:19 -0400
On Wed, Aug 26, 2009 at 1:21 PM, Ken Thomases<email@hidden> wrote:
> On Aug 26, 2009, at 10:43 AM, Michael Ash wrote:
>
>> On Wed, Aug 26, 2009 at 5:42 AM, Ken Thomases<email@hidden> wrote:
>>>
>>> On Aug 25, 2009, at 7:21 PM, Ross Carter wrote:
>>>
>>>>> I haven't tried it, but this should work:
>>>>>
>>>>> NSAttributedString* original = whatever;
>>>>> NSMutableAttributedString* normalized = [[original mutableCopy] autorelease];
>>>>> CFMutableStringRef str = (CFMutableStringRef)[normalized mutableString];
>>>>> CFStringNormalize(str, kCFStringNormalizationFormD);
>>>>>
>>>>> This works because -[NSMutableAttributedString mutableString] is a proxy
>>>>> that automatically fixes up the attribute runs held by its owner.
>>>
>>> Hmm, this seems dangerous in the sense that the conversion may be lossy.
>>> As far as I can see, there's no guarantee that CFStringNormalize will
>>> perform minimal replacements. If it does not, then whole ranges of
>>> characters may have their attributes reset to those of the first replaced
>>> character.
>>>
>>> Even if testing reveals it to be non-lossy under one testing environment,
>>> without a guarantee that might differ under any other testing
>>> environment.
>>
>> http://en.wikipedia.org/wiki/Unicode_equivalence
>>
>> [... quote snipped ...]
>
> I'm well aware of what it means. The question is which exact operations on
> the mutable string proxy CFStringNormalize performs. If
> CFStringNormalize performs the minimal replace operations to get the result,
> then it will preserve the attributes closely. It's conceivable, though,
> that CFStringNormalize uses a side buffer to compute the normalized form and
> then does one big replace of the whole mutable string's range. Or, anywhere
> in between. Like, it might replace a series of precomposed characters with
> their decompositions all with one replace operation. In that case, the
> attributes of most of the characters will be lost (replaced with the
> attributes of the first character in the replace range).
>
> So, it's clear that the _strings_ will always have a deterministic value as
> a result of normalization. That's the point of normalization. But the
> _attributed strings_ may not.
Fair enough. However, as Douglas pointed out, you aren't guaranteed
consistent results if you have multiple attributes within a single
decomposed character range *anyway*, so you're going to have trouble
regardless. Better to avoid that situation altogether.
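
For what it's worth, one way to sidestep the question of what
CFStringNormalize does to the proxy is to never hand it the proxy at all:
walk the attribute runs and normalize each run's text on its own, so a
replacement can never span two runs. The following is an untested sketch,
not a vetted implementation. The function name NormalizedByRun is made up
for illustration, and it assumes manual retain/release.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption, not an existing API).
// Builds a new attributed string by decomposing each attribute run
// separately, so every run keeps exactly its own attributes.
static NSAttributedString *NormalizedByRun(NSAttributedString *original)
{
    NSMutableAttributedString *result =
        [[[NSMutableAttributedString alloc] init] autorelease];
    NSString *text = [original string];
    NSUInteger length = [original length];
    NSUInteger index = 0;

    while (index < length) {
        // Find the run containing index and the attributes that apply to it.
        NSRange runRange;
        NSDictionary *attrs = [original attributesAtIndex:index
                                           effectiveRange:&runRange];

        // Canonically decompose (form D) just this run's characters.
        NSString *runText = [text substringWithRange:runRange];
        NSString *decomposed = [runText decomposedStringWithCanonicalMapping];

        NSAttributedString *piece =
            [[[NSAttributedString alloc] initWithString:decomposed
                                             attributes:attrs] autorelease];
        [result appendAttributedString:piece];

        // Continue with the first character after this run.
        index = NSMaxRange(runRange);
    }
    return result;
}
```

The catch is the same one Douglas raised: if a run boundary falls inside a
combining sequence, normalizing the pieces separately is not guaranteed to
give the same characters as normalizing the whole string, since combining
marks cannot be reordered across the boundary. So this only behaves well
when attribute runs begin and end on whole characters.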
Mike
_______________________________________________
Cocoa-dev mailing list (email@hidden)