Re: wchar_t and printf not working
- Subject: Re: wchar_t and printf not working
- From: Clark Cox <email@hidden>
- Date: Mon, 28 Mar 2005 22:36:33 -0500
On Mon, 28 Mar 2005 14:09:11 -0500, Michael B Allen <email@hidden> wrote:
> On Mon, 28 Mar 2005 08:39:25 -0500
> Clark Cox <email@hidden> wrote:
>
> > Actually, you will *never* see UTF-8 with more than 4 octets per
> > codepoint. Period. That is the way that UTF-8 is defined. If you see a
> > 5 or 6 octet character, then you are not reading UTF-8 data.
>
> This is incorrect. Please read the first sentence in section 2 of
> RFC 2279
RFC 2279 was obsoleted by RFC 3629 in November 2003:
http://www.ietf.org/rfc/rfc3629.txt
From said RFC:
[snip]
3. UTF-8 definition
UTF-8 is defined by the Unicode Standard [UNICODE]. Descriptions and
formulae can also be found in Annex D of ISO/IEC 10646-1 [ISO.10646]
In UTF-8, characters from the U+0000..U+10FFFF range (the UTF-16
accessible range) are encoded using sequences of 1 to 4 octets.
[snip]
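To make the 1-to-4 octet limit concrete, here is a rough sketch in C of how a decoder might classify the first octet of a sequence. The function name utf8_seq_len is made up for this illustration and is not from any library; the ranges follow the UTF8-1 through UTF8-4 lead octets in the RFC 3629 grammar. It only inspects the lead octet and does not validate continuation octets, so treat it as a starting point rather than a complete validator.

```c
#include <stddef.h>

/*
 * Hypothetical helper (name chosen for this example only).
 * Given the first octet of a sequence, return how many octets the
 * sequence should occupy under RFC 3629 (1 to 4), or 0 if the octet
 * cannot begin a valid UTF-8 sequence.
 */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead <= 0x7F)
        return 1;   /* 0xxxxxxx: U+0000..U+007F */
    if (lead >= 0xC2 && lead <= 0xDF)
        return 2;   /* 110xxxxx: U+0080..U+07FF (0xC0, 0xC1 would be overlong) */
    if (lead >= 0xE0 && lead <= 0xEF)
        return 3;   /* 1110xxxx: U+0800..U+FFFF */
    if (lead >= 0xF0 && lead <= 0xF4)
        return 4;   /* 11110xxx: U+10000..U+10FFFF */

    /*
     * Everything else is rejected: continuation octets (0x80..0xBF),
     * overlong leads (0xC0, 0xC1), leads beyond U+10FFFF (0xF5..0xF7),
     * and the old RFC 2279 5- and 6-octet leads (0xF8..0xFD).
     */
    return 0;
}
```

With this sketch, utf8_seq_len(0xF8) and utf8_seq_len(0xFC) both return 0, which is the point above: a lead octet announcing a 5 or 6 octet sequence is simply not UTF-8 as currently defined.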
Also, note that the RFC is *not* authoritative. From the same RFC:
[snip]
NOTE -- The authoritative definition of UTF-8 is in [UNICODE]. This
grammar is believed to describe the same thing Unicode describes, but
does not claim to be authoritative. Implementors are urged to rely
on the authoritative source, rather than on this ABNF.
[snip]
--
Clark S. Cox III
email@hidden
http://www.livejournal.com/users/clarkcox3/
http://homepage.mac.com/clarkcox3/
Darwin-dev mailing list (email@hidden)