> That is required by the Standard. There is nothing in the C standard
> that guarantees that wchar_t uses UTF-32; it could use any other encoding. As
> such, it *must* be interpreted by locale.
I don't agree with this comment. If swprintf can assume that wchar_t is
wider than 8 bits, it can safely ignore the current locale.
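To illustrate the point, here is a minimal sketch (not the replacement implementation discussed in this thread, and the helper name is made up): with %ls, a conforming swprintf copies wide characters straight through, so no multibyte conversion, and hence no locale lookup, is needed:

```c
#include <wchar.h>

/* Minimal sketch: format a wide string with %ls. A conforming
   swprintf copies the wide characters straight through here; no
   multibyte conversion (and hence no locale lookup) is involved.
   Returns the number of wide characters written, excluding the
   terminator, or a negative value on error. */
static int copy_wide(wchar_t *dst, size_t cap, const wchar_t *src) {
    return swprintf(dst, cap, L"%ls", src);
}
```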
>
> As specified by the C and C++ standards, wchar_t is largely useless. I would
> recommend avoiding it at all costs.
If, as I have, you replace what I consider to be a broken swprintf
implementation with something that treats wchar_t transparently, wchar_t
works just fine, and gives me (and the OP) what we seek: the ability to
re-use our existing TCHAR-based Windows code on the Mac. The same comment
applies to the entire wide vprintf and fprintf family, of course, but that
is not hard to cobble together. wfprintf is of limited utility anyway;
UTF-32-encoded files are very rare.
Another issue is bridging the gap between wchar_t strings and
NSStrings, but I have a couple of utility routines to do that. Anyone
who wants these is welcome to a copy.
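Those routines aren't reproduced here, but the core of such a bridge is a UTF-32-to-UTF-8 conversion (assuming, as on the Mac, that wchar_t holds UTF-32 code points); the resulting bytes could then be handed to something like -[NSString initWithBytes:length:encoding:]. A hypothetical sketch, not the author's actual code:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: encode a UTF-32 wchar_t string as UTF-8.
   Returns the number of bytes written (excluding the terminator),
   or 0 if the buffer is too small or a code point is out of range.
   (No check for stray surrogate values; real code should reject
   the range 0xD800-0xDFFF.) */
static size_t wide_to_utf8(const wchar_t *src, char *out, size_t cap) {
    size_t n = 0;
    for (; *src; ++src) {
        unsigned long c = (unsigned long)*src;
        if (c < 0x80) {                      /* 1-byte sequence */
            if (n + 1 > cap) return 0;
            out[n++] = (char)c;
        } else if (c < 0x800) {              /* 2-byte sequence */
            if (n + 2 > cap) return 0;
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {            /* 3-byte sequence */
            if (n + 3 > cap) return 0;
            out[n++] = (char)(0xE0 | (c >> 12));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else if (c <= 0x10FFFF) {          /* 4-byte sequence */
            if (n + 4 > cap) return 0;
            out[n++] = (char)(0xF0 | (c >> 18));
            out[n++] = (char)(0x80 | ((c >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else {
            return 0;                        /* beyond Unicode range */
        }
    }
    if (n < cap) out[n] = '\0';
    return n;
}
```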
> FYI, the conversion between UTF-16 and UTF-32 has never
> been a 1:1 mapping. You're thinking of UCS-2 (which is identical to UTF-16
> except for the surrogate pairs, and is often mistakenly called UTF-16).
> Once upon a time, there were no surrogate pairs.
That's what I meant, and I wanted to inform / remind any interested
parties of their existence. For Windows programmers, for whom all the
Win32 APIs assume that wchar_t is 16 bits, surrogate pairs are a
regrettable but necessary hack. But I guess I was a little sloppy in my
use of terminology there.
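For anyone who hasn't met them, the mechanics are simple enough to sketch (a hypothetical helper, using the standard UTF-16 encoding rule):

```c
#include <stdint.h>

/* Encode one Unicode code point as UTF-16. Returns the number of
   16-bit units written (1 or 2); code points above 0xFFFF need a
   surrogate pair, which is why UTF-16 <-> UTF-32 is not 1:1. */
static int encode_utf16(uint32_t cp, uint16_t out[2]) {
    if (cp < 0x10000) {            /* BMP: one unit, as in old UCS-2 */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                 /* 20 bits split across the pair  */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate  */
    return 2;
}
```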
Paul Sanders - 'focussing on the practical'.