Re: iso-8859-1 over UTF8 (was: Re: cString deprecated!)
- Subject: Re: iso-8859-1 over UTF8 (was: Re: cString deprecated!)
- From: "Clark S. Cox III" <email@hidden>
- Date: Wed, 04 Sep 2002 09:11:09 -0400
On 09/04/2002 09:03, "Allan Odgaard" <email@hidden> wrote:

> On onsdag, sep 4, 2002, at 14:11 Europe/Copenhagen, Gregory Weston
> wrote:
>
> >>> No, it won't. Control codes are low ASCII values. [...]
> >> I meant codes defined by the functions, e.g. printf-template [...]
> > Ah, but as long as those codes are 7-bit values instead of 8-bit, it's
> > still not a problem because UTF8 _is_ ASCII in the first 7 bits.
>
> As I said in the original letter, these codes may appear from the
> multi-byte coded characters. I.e. some >7 bit character is encoded as
> 2-3 characters, now is there any guarantee that byte 2 or 3 of this
> sequence won't appear (to the only 8 bit aware program) as a control
> code (as defined in my previous letter)?

Yes, every byte in a UTF-8 multibyte character has the high bit set for
this very reason. You are guaranteed that no byte of a multi-byte character
will have a value in the range 0-127.
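
A minimal sketch in C of what that guarantee means in practice. The function names (is_ascii_byte, all_high_bit_set, count_ascii_byte) are made up for illustration and are not part of any Cocoa or libc API. The idea is that code which scans a UTF-8 buffer byte by byte for an ASCII code such as '%' can only ever match a real ASCII character, because every lead and continuation byte of a multi-byte sequence is 0x80 or above.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if b is a plain 7-bit ASCII byte (0x00-0x7F), 0 otherwise. */
static int is_ascii_byte(unsigned char b)
{
    return (b & 0x80) == 0;
}

/* Returns 1 if every byte in s[0..len) has the high bit set.
 * This holds for the bytes of any single multi-byte UTF-8 character. */
static int all_high_bit_set(const unsigned char *s, size_t len)
{
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        if (is_ascii_byte(s[i])) {
            return 0;
        }
    }
    return 1;
}

/* Counts occurrences of the ASCII byte `needle` (e.g. '%') in s[0..len)
 * using a plain byte comparison. Since no byte of a multi-byte UTF-8
 * sequence falls in 0x00-0x7F, a match is always a genuine ASCII
 * character and never a fragment of a longer sequence. */
static size_t count_ascii_byte(const unsigned char *s, size_t len,
                               unsigned char needle)
{
    assert(s != NULL && is_ascii_byte(needle));
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == needle) {
            count++;
        }
    }
    return count;
}
```

For instance, U+00E9 is encoded as the two bytes 0xC3 0xA9 and U+20AC as 0xE2 0x82 0xAC; all_high_bit_set returns 1 for both, so neither can be mistaken for a printf-style '%' or any other 7-bit code.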
--
http://homepage.mac.com/clarkcox3/
Clark S. Cox III