Re[2]: printf functions fail with non-ascii characters
- Subject: Re[2]: printf functions fail with non-ascii characters
- From: Peter Mulholland <email@hidden>
- Date: Thu, 6 Sep 2007 09:01:26 +0100
Hello Alastair,
Wednesday, September 5, 2007, 9:53:12 PM, you wrote:
> The wide character API is a legacy part of the C standard; it was
> never designed to be used with Unicode. This isn't really ANSI's
> fault, because (AFAIK) it predates Unicode, and in particular the
> decision that Unicode really would need more than 65536 code points
> (which creates problems if you use UTF-16 wchar_ts).
True, it would be a problem where wchar_t wouldn't necessarily be able
to hold 1 character, but they COULD have made it work well enough for
90% of cases. It's a shame something like libunicode didn't take off.
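
To make the "wchar_t can't always hold one character" point concrete,
here is a minimal sketch of UTF-16 encoding in plain C. The function name
utf16_encode is made up for illustration and is not part of any standard
or Apple API; it only shows why a code point above U+FFFF needs two
16-bit units (a surrogate pair).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Writes 1 or 2 code units to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values past U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Code points in the Basic Multilingual Plane fit in one unit. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Everything else is split into a high and a low surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

For example, U+00E9 comes back as a single unit, while U+1D11E (musical
G clef) comes back as the pair 0xD834 0xDD1E, so a 16-bit wchar_t needs
two elements for it.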
> ANSI *could* adopt a Unicode API for the C language, but given the
> complexity of doing so and the widespread availability of existing
> library code for dealing with Unicode, it would be rather redundant.
They should do so if they want Unicode to be adopted in a portable
fashion. Currently, there is no portable way to do it.
> That isn't true. If by "everyone else" you mean Microsoft (which
> seems to be a very common "everyone else", particularly on Apple
> mailing lists), then you should be aware that some Microsoft software
> only supports UCS-2, not UTF-16 (i.e. they still don't support
> surrogates everywhere; SQL Server, for instance, doesn't work right
> in every instance).
I'm not talking about software, and Microsoft have had Unicode in
their API since Windows 2000. They were also smart enough to make
their C widechar stuff work with the Win32 Unicode stuff - unlike
Apple. I have to do EXTRA work if I want to mix the two; not so on
Windows.
> If by "everyone else", you were including other Unixen or Linux
> (and you should), then it's also not true, because on most of those
> platforms wchar_t is indeed 32-bits in size.
Probably because they don't want to follow Microsoft's lead!
UTF-32 is an unnecessary waste of space.
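
To put rough numbers on the space argument, here is a small sketch that
compares the storage needed for the same text in UTF-16 and UTF-32. The
helper names utf16_bytes and utf32_bytes are invented for this example,
and the input is assumed to be an array of valid code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed to store n code points as UTF-16.
 * Assumes every entry in cps is a valid Unicode scalar value. */
size_t utf16_bytes(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);

    size_t units = 0;
    for (size_t i = 0; i < n; i++) {
        /* Code points at or above U+10000 take a surrogate pair. */
        units += (cps[i] >= 0x10000) ? 2 : 1;
    }
    return units * sizeof(uint16_t);
}

/* Hypothetical helper: bytes needed to store n code points as UTF-32,
 * which always uses one 32-bit unit per code point. */
size_t utf32_bytes(size_t n)
{
    return n * sizeof(uint32_t);
}
```

For the three code points U+0041, U+00E9 and U+1D11E this gives 8 bytes
in UTF-16 against 12 bytes in UTF-32. Text that stays inside the Basic
Multilingual Plane takes exactly half the space in UTF-16.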
> Using the wchar routines is often a mistake anyway. You can't
> guarantee that a wchar_t will necessarily contain a Unicode code
> point value, because some systems (particularly in the Far East)
> provide other wide character systems (e.g. some variant of JIS or
> Big5).
Very rarely do you want to parse just ONE char though; usually you are
printing strings. For this, the wchar routines COULD be workable.
Sure, you couldn't guarantee that wchar_t would hold one character,
but for most cases it would work just fine - a lot better than it does
now!
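
For the string-printing case, here is a minimal sketch using only
standard C calls (setlocale and wcstombs). The wrapper names
use_environment_locale and wide_to_multibyte are made up for this
example. It assumes the usual cause of trouble with non-ASCII text is
that the program is still in the default "C" locale, which may not apply
to every setup.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: switch from the default "C" locale to the one
 * named by the environment (LANG, LC_ALL, ...). Returns 1 on success. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: convert a wide string to the current locale's
 * multibyte encoding. Returns the number of bytes written (excluding
 * the terminator), or -1 if some character cannot be represented. */
int wide_to_multibyte(const wchar_t *ws, char *buf, size_t bufsize)
{
    assert(ws != NULL && buf != NULL && bufsize > 0);

    /* Leave room for the terminator; wcstombs does not add one
     * when the output fills the whole buffer. */
    size_t n = wcstombs(buf, ws, bufsize - 1);
    if (n == (size_t)-1) {
        buf[0] = '\0';
        return -1;
    }
    buf[n] = '\0';
    return (int)n;
}
```

A caller would typically run use_environment_locale() once at startup and
then check the result of wide_to_multibyte() before printing the buffer.
Whether a given non-ASCII character converts depends on the locale in
effect, so a -1 return should be treated as an expected outcome rather
than a bug.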
> In practice, if you want good handling of Unicode characters, you
> either (a) code it yourself, referring copiously to the Unicode book,
> or (b) use a library like CoreFoundation or ICU. (b) is an *awful
> lot* less effort...
Basically what you're saying is, a) do a lot of hard work, or b) make
your code non-portable. Great. I say again, this would be a lot less of
an issue if Apple had made wchar_t use UTF-16 in accordance with their
own CoreFoundation routines.
--
Best regards,
Peter mailto:email@hidden
Xcode-users mailing list (email@hidden)