Re: Number of chars
- Subject: Re: Number of chars
- From: email@hidden
- Date: Fri, 22 Mar 2013 14:28:47 +0900
To add to this a bit, you will want to consider normalizing your strings before and after conversion.
This gives you some consistency with regard to composed/decomposed characters.
There are several normalization forms (NFC, NFD, NFKC, NFKD), and NSString provides convenience methods for each.
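As a rough illustration of those NSString methods, here is a minimal Swift sketch, assuming Foundation is available. The sample strings and the `scalarValues` helper are made up for the example.

```swift
import Foundation

// "é" written two ways: one precomposed scalar vs. "e" + combining acute accent.
let composed = "\u{00E9}"
let decomposed = "e\u{0301}"

// The four normalization properties NSString exposes (bridged to Swift String).
let nfc  = decomposed.precomposedStringWithCanonicalMapping      // NFC
let nfd  = composed.decomposedStringWithCanonicalMapping         // NFD
let nfkc = "\u{FB01}".precomposedStringWithCompatibilityMapping  // NFKC: "fi" ligature becomes "f" + "i"
let nfkd = "\u{FB01}".decomposedStringWithCompatibilityMapping   // NFKD

// Hypothetical helper: compare scalar by scalar, because Swift's == on String
// already treats canonically equivalent strings as equal and would hide the difference.
func scalarValues(_ s: String) -> [UInt32] {
    return s.unicodeScalars.map { $0.value }
}

print(scalarValues(nfc), scalarValues(nfd), scalarValues(nfkc), scalarValues(nfkd))
```

The canonical forms (NFC/NFD) only change how the same character is spelled; the compatibility forms (NFKC/NFKD) can also replace characters such as ligatures, so they are the more lossy choice.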
Whether to normalize at all, and which form to choose, really requires thought. Consider whether or not your user needs the original form (which may be a mix of composed and decomposed sequences and may need to remain the same).
One impact is on how you actually count the string length: a composed "é" is one code point, while its decomposed form is two.
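To make the counting difference concrete, a small Swift sketch follows. The `lengths(of:)` helper is hypothetical; `utf16.count` is used because it matches what NSString's `length` reports.

```swift
import Foundation

// Hypothetical helper: three different notions of "length" for one string.
func lengths(of s: String) -> (utf16: Int, scalars: Int, graphemes: Int) {
    // utf16.count matches NSString's length; count is user-perceived characters.
    return (s.utf16.count, s.unicodeScalars.count, s.count)
}

let cafeNFC = "caf\u{00E9}"    // precomposed é
let cafeNFD = "cafe\u{0301}"   // e + combining acute accent

let nfcLengths = lengths(of: cafeNFC)  // (utf16: 4, scalars: 4, graphemes: 4)
let nfdLengths = lengths(of: cafeNFD)  // (utf16: 5, scalars: 5, graphemes: 4)
print(nfcLengths, nfdLengths)
```

Only the grapheme count stays the same across the two spellings; anything based on code units or code points changes with the normalization form.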
Another impact could be on hashing or checksum values, since canonically equivalent strings can have different underlying bytes.
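A sketch of the checksum issue, with FNV-1a standing in for whatever hash or checksum you actually use (that choice, and the helper names, are assumptions for the example):

```swift
import Foundation

// 32-bit FNV-1a over the UTF-8 bytes; a stand-in for a real checksum.
func fnv1a(_ s: String) -> UInt32 {
    var hash: UInt32 = 2166136261
    for byte in s.utf8 {
        hash ^= UInt32(byte)
        hash = hash &* 16777619   // wrapping multiply
    }
    return hash
}

// Hypothetical helper: normalize to NFC first so equivalent spellings agree.
func normalizedChecksum(_ s: String) -> UInt32 {
    return fnv1a(s.precomposedStringWithCanonicalMapping)
}

let composedName = "caf\u{00E9}"
let decomposedName = "cafe\u{0301}"

// The raw UTF-8 bytes differ, so a byte-level checksum will generally differ too.
print(Array(composedName.utf8) == Array(decomposedName.utf8))                     // false
print(normalizedChecksum(composedName) == normalizedChecksum(decomposedName))    // true
```

Whichever form you pick, the point is to pick one and apply it on both the writing and the verifying side.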
The answer is: it depends.
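On the truncation point in the quoted thread below, here is a sketch of why cutting at a fixed number of UTF-32 code units is unsafe, and one possible way around it. The `truncate(_:toScalarCount:)` helper is hypothetical, not an existing API.

```swift
import Foundation

// Hypothetical helper: keep at most `limit` Unicode scalars (UTF-32 code units)
// without splitting a grapheme cluster such as "e" + combining accent.
func truncate(_ s: String, toScalarCount limit: Int) -> String {
    var result = ""
    var used = 0
    for ch in s {                          // iterates grapheme clusters
        let n = ch.unicodeScalars.count
        if used + n > limit { break }      // the whole cluster must fit
        result.append(ch)
        used += n
    }
    return result
}

let word = "cafe\u{0301}"                  // decomposed "café", 5 scalars

// Naive cut after 4 scalars: the combining accent is chopped off.
var naive = ""
for scalar in word.unicodeScalars.prefix(4) {
    naive.unicodeScalars.append(scalar)
}

let safe = truncate(word, toScalarCount: 4)
print(naive, safe)                         // "cafe" vs. "caf"
```

The naive cut silently turns "é" into "e"; the cluster-aware version drops the whole character instead, which is usually the less surprising result.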
On 2013/03/22, at 10:12, Aki Inoue <email@hidden> wrote:
>
> On Mar 21, 2013, at 6:05 PM, Andrew Thompson <email@hidden> wrote:
>
>>
>>
>> On Mar 21, 2013, at 2:10 PM, Aki Inoue <email@hidden> wrote:
>>
>>> For that matter, UTF-32 (aka UCS-4) is not safe to find the truncation boundary just at the 4-byte boundary.
>>
>> You're thinking of combining marks here?
> Yes.
>
>> It's generally claimed that one can multiply character offsets by 4 to index into UCS-4 data… which I think I now see is only true depending on your definition of character; i.e., whether one considers a decomposed sequence to be one character or two.
>
>> I see how truncation would be unsafe because you'd chop off the accents etc?
> Yes.
>
> Aki
_______________________________________________
Cocoa-dev mailing list (email@hidden)