Re[2]: printf functions fail with non-ascii characters

Subject: Re[2]: printf functions fail with non-ascii characters
From: Peter Mulholland <email@hidden>
Date: Thu, 6 Sep 2007 09:01:26 +0100

Hello Alastair,

Wednesday, September 5, 2007, 9:53:12 PM, you wrote:

> The wide character API is a legacy part of the C standard; it was
> never designed to be used with Unicode.  This isn't really ANSI's
> fault, because (AFAIK) it predates Unicode, and in particular the
> decision that Unicode really would need more than 65536 code points
> (which creates problems if you use UTF-16 wchar_ts).

True, it would be a problem where wchar_t wouldn't necessarily be able
to hold 1 character, but they COULD have made it work well enough for
90% of cases. It's a shame something like libunicode didn't take off.
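
To make the "can't hold 1 character" case concrete, here is a minimal
sketch of UTF-16 encoding for a single code point. The function name
utf16_encode is made up for illustration and is not from any library;
it assumes the input is a valid Unicode scalar value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16 (hypothetical helper).
 * Writes 1 or 2 code units to out and returns how many were written. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Precondition: not a surrogate and not above U+10FFFF. */
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                   /* BMP: one 16-bit unit */
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Anything in the BMP (e.g. U+00E9) comes back as one unit; only code
points from U+10000 up (e.g. U+1D11E) need the two-unit surrogate pair,
which is the case a 16-bit wchar_t cannot hold in a single element.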

> ANSI *could* adopt a Unicode API for the C language, but given the
> complexity of doing so and the widespread availability of existing
> library code for dealing with Unicode, it would be rather redundant.

They should do so if they want Unicode to be adopted in a portable
fashion. Currently, there is no portable way to do it.

> That isn't true.  If by "everyone else" you mean Microsoft (which
> seems to be a very common "everyone else", particularly on Apple
> mailing lists), then you should be aware that some Microsoft software
> only supports UCS-2, not UTF-16 (i.e. they still don't support
> surrogates everywhere; SQL Server, for instance, doesn't work right
> in every instance).

I'm not talking about software, and Microsoft have had Unicode in
their API since Windows 2000. They were also smart enough to make
their C widechar stuff work with the Win32 Unicode stuff - unlike
Apple. I have to do EXTRA work if I want to mix the two; not so on
Windows.

> If by "everyone else", you were including other Unixen or Linux
> (and you should), then it's also not true, because on most of those
> platforms wchar_t is indeed 32-bits in size.

Probably because they don't want to follow Microsoft's lead!
UTF-32 is an unnecessary waste of space.
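
To put rough numbers on the space argument, a small sketch in C. The
helper names utf32_bytes and utf16_bytes are hypothetical, and the
sketch assumes the text is already available as an array of code
points no greater than U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store n code points as UTF-32 (no terminator). */
static size_t utf32_bytes(const uint32_t *cps, size_t n)
{
    (void)cps;                      /* every code point costs 4 bytes */
    return n * sizeof(uint32_t);
}

/* Bytes needed to store the same code points as UTF-16 (no terminator). */
static size_t utf16_bytes(const uint32_t *cps, size_t n)
{
    size_t units = 0;
    for (size_t i = 0; i < n; i++) {
        assert(cps[i] <= 0x10FFFF); /* assumed to be in Unicode range */
        units += (cps[i] < 0x10000) ? 1 : 2;  /* BMP: 1 unit, else 2 */
    }
    return units * sizeof(uint16_t);
}
```

For mostly-BMP text such as "héllo" (5 code points) this gives 20 bytes
as UTF-32 against 10 as UTF-16; a character outside the BMP costs 4
bytes in either encoding.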

> Using the wchar routines is often a mistake anyway.  You can't
> guarantee that a wchar_t will necessarily contain a Unicode code
> point value, because some systems (particularly in the Far East)
> provide other wide character systems (e.g. some variant of JIS or
> Big5).

Very rarely do you want to parse just ONE char though; usually you are
printing strings. For this, the wchar routines COULD be workable.
Sure, you couldn't guarantee that wchar_t would hold one character,
but for most cases it would work just fine - a lot better than it does
now!

> In practice, if you want good handling of Unicode characters, you
> either (a) code it yourself, referring copiously to the Unicode book,
> or (b) use a library like CoreFoundation or ICU.  (b) is an *awful
> lot* less effort...

Basically what you're saying is, a) do a lot of hard work, or b) make
your code non-portable. Great. I say again, this would be a lot less of
an issue if Apple had made wchar_t use UTF-16 in accordance with their
own CoreFoundation routines.

--
Best regards,
 Peter                            mailto:email@hidden

Xcode-users mailing list      (email@hidden)

Follow-Ups:
- Re: printf functions fail with non-ascii characters
  - From: Alastair Houghton <email@hidden>

References:
	>printf functions fail with non-ascii characters (From: "William H. Schultz" <email@hidden>)
	>Re: printf functions fail with non-ascii characters (From: Peter Mulholland <email@hidden>)
	>Re: printf functions fail with non-ascii characters (From: Alastair Houghton <email@hidden>)
