Re: wchar_t and printf not working
- Subject: Re: wchar_t and printf not working
- From: Clark Cox <email@hidden>
- Date: Mon, 28 Mar 2005 08:39:25 -0500
On Mon, 28 Mar 2005 03:51:50 -0500, Michael B Allen <email@hidden> wrote:
> On Mon, 28 Mar 2005 11:36:10 +0400
> Alexey Proskuryakov <email@hidden> wrote:
>
> > > Each character may occupy between 1 and 6 bytes [1].
> >
> > More precisely, between 1 and 4:
> > <http://www.unicode.org/faq/utf_bom.html#30>.
>
> At the risk of being pedantic, this is just talking about how to convert a
> UTF-16 character into a UTF-8 one. Because UTF-16 with a surrogate can
> only represent 21 bits of the Unicode code space, only 4 bytes are necessary
> to encode any character in UTF-8.
Unicode *only has* 21 bits of code space; even UTF-32 only uses 21 bits.
> But UTF-8 can encode the full 31 bit code space which needs at most 6 bytes to be
> represented in UTF-8. But unless you're doing Klingon you'll never actually see more
> than 4.
Actually, you will *never* see UTF-8 with more than 4 octets per
codepoint. Period. That is the way that UTF-8 is defined. If you see a
5 or 6 octet character, then you are not reading UTF-8 data.
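
To make the limit concrete, here is a minimal sketch in C of how many
octets UTF-8 needs for a given code point. The function name
utf8_length is hypothetical (it is not a library call), and the sketch
assumes the usual definition of a Unicode scalar value: anything in
U+0000..U+10FFFF except the surrogate range U+D800..U+DFFF.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the number of octets UTF-8 uses to
 * encode the code point cp, or 0 if cp is not a Unicode scalar value
 * and therefore has no UTF-8 encoding at all.
 */
size_t utf8_length(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not encodable */
    if (cp > 0x10FFFF)
        return 0;               /* beyond the Unicode code space */
    if (cp < 0x80)
        return 1;               /* U+0000..U+007F */
    if (cp < 0x800)
        return 2;               /* U+0080..U+07FF */
    if (cp < 0x10000)
        return 3;               /* U+0800..U+FFFF */
    return 4;                   /* U+10000..U+10FFFF */
}
```

The largest value the function accepts, 0x10FFFF, fits in 21 bits,
which is why there is no branch that returns 5 or 6.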
--
Clark S. Cox III
email@hidden
http://www.livejournal.com/users/clarkcox3/
http://homepage.mac.com/clarkcox3/