Re: wchar_t and printf not working
- Subject: Re: wchar_t and printf not working
- From: "Michael B Allen" <email@hidden>
- Date: Mon, 28 Mar 2005 23:13:39 -0500 (EST)
- Importance: Normal
Clark Cox said:
> On Mon, 28 Mar 2005 14:09:11 -0500, Michael B Allen <email@hidden>
> wrote:
>> On Mon, 28 Mar 2005 08:39:25 -0500
>> Clark Cox <email@hidden> wrote:
>>
>> > Actually, you will *never* see UTF-8 with more than 4 octets per
>> > codepoint. Period. That is the way that UTF-8 is defined. If you see a
>> > 5 or 6 octet character, then you are not reading UTF-8 data.
>>
>> This is incorrect. Please read the first sentence in section 2 of
>> RFC 2279
>
> RFC 2279 has been obsoleted by RFC 3629 for years now:
> http://www.ietf.org/rfc/rfc3629.txt
First, RFC 3629 is dated 16 months ago (November 2003). Pardon me for googling
"utf-8 rfc" but don't be a twit - it's not "years".
> From said RFC:
> [snip]
> 3. UTF-8 definition
>
> UTF-8 is defined by the Unicode Standard [UNICODE]. Descriptions and
> formulae can also be found in Annex D of ISO/IEC 10646-1 [ISO.10646]
>
> In UTF-8, characters from the U+0000..U+10FFFF range (the UTF-16
> accessible range) are encoded using sequences of 1 to 4 octets.
This just says that range takes up to 4 octets. Well yeah. Duh. As I stated
previously, that is the only practical range of values one would ever see, but
technically UTF-8 sequences can run to 6 bytes. RFC 3629 mentions this in the
Security Considerations section:
Another security issue occurs when encoding to UTF-8: the ISO/IEC
10646 description of UTF-8 allows encoding character numbers up to
U+7FFFFFFF, yielding sequences of up to 6 bytes. There is therefore
a risk of buffer overflow if the range of character numbers is not
explicitly limited to U+10FFFF or if buffer sizing doesn't take into
account the possibility of 5- and 6-byte sequences.
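To make the two limits concrete, here is a minimal C sketch of the length rules under discussion. It is only an illustration: the function names are made up for this message, the lead-octet check follows the older 6-octet bit layout and does not reject overlong forms, and the buffer helper assumes the caller picks 4 or 6 as the longest sequence it is prepared to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Octets needed for code point cp under the older RFC 2279 / ISO 10646
 * layout, which reaches U+7FFFFFFF. Returns 0 if cp is out of range.
 * (Hypothetical helper name, for illustration only.) */
static size_t utf8_len_rfc2279(uint32_t cp)
{
    if (cp < 0x80)        return 1;
    if (cp < 0x800)       return 2;
    if (cp < 0x10000)     return 3;
    if (cp < 0x200000)    return 4;
    if (cp < 0x4000000)   return 5;
    if (cp <= 0x7FFFFFFF) return 6;
    return 0;
}

/* Same rule restricted as in RFC 3629: nothing above U+10FFFF and no
 * surrogates (U+D800..U+DFFF), so the result never exceeds 4. */
static size_t utf8_len_rfc3629(uint32_t cp)
{
    if (cp > 0x10FFFF) return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    return utf8_len_rfc2279(cp);
}

/* Sequence length implied by a lead octet in the older 6-octet layout.
 * Returns 0 for a continuation octet (10xxxxxx) and for 0xFE/0xFF.
 * This is deliberately lenient: it does not reject overlong forms. */
static size_t utf8_seq_len_from_lead(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead < 0xC0) return 0;
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    if (lead < 0xF8) return 4;
    if (lead < 0xFC) return 5;
    if (lead < 0xFE) return 6;
    return 0;
}

/* Worst-case buffer size, including a terminating NUL, for n code points.
 * max_seq_len is the longest sequence the caller is prepared to see.
 * Ignores size_t overflow for very large n. */
static size_t utf8_worst_case_bytes(size_t n, size_t max_seq_len)
{
    assert(max_seq_len == 4 || max_seq_len == 6);
    return n * max_seq_len + 1;
}
```

A lead octet of 0xF8 or 0xFC announces a 5- or 6-octet sequence here, which is the case the Security Considerations text warns about: code that sizes its buffers at 4 octets per character has to reject those sequences explicitly.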
Man, it really doesn't pay to be pedantic these days.
EOT
Darwin-dev mailing list (email@hidden)