Re: swprintf (Xcode issue)
- Subject: Re: swprintf (Xcode issue)
- From: glenn andreas <email@hidden>
- Date: Mon, 28 May 2007 14:32:04 -0500
On May 28, 2007, at 1:50 PM, Alexander von Below wrote:
I fully subscribe to Andy Finnell's rant on his blog:
>> Mac OS X, being a Unix variant, had implemented wchar_t as
UTF-32! All the cross platform code, code that used to work on
Windows and Mac, no longer worked. Apple felt their pain, and
issued this technical note, which essentially says: “instead of
using wchar_t, which used to be cross platform before we destroyed
it, use CFStringRef, which is not cross platform, has never been,
and never will be. P.S. This is really your own fault for ever
using wchar_t. Suckers.” <<
http://www.losingfight.com/blog/2006/07/28/wchar_t-unsafe-at-any-size/
Alex
I prefer to follow what the Unicode Standard says about this.
There is no implicit encoding for wchar_t (beyond requiring zero
extension for ASCII characters), and there's not even a guarantee
that it has at least 16 bits: it could be a single byte.
<http://unicode.org/versions/Unicode4.0.0/ch05.pdf> explicitly states
"programs that need to be portable across any C or C++ compiler
should not use wchar_t for storing Unicode text":
With the wchar_t wide character type, ANSI/ISO C provides for
inclusion of fixed-width, wide characters. ANSI/ISO C leaves the
semantics of the wide character set to the specific implementation
but requires that the characters from the portable C execution set
correspond to their wide character equivalents by zero extension.
The Unicode characters in the ASCII range U+0020 to U+007E satisfy
these conditions. Thus, if an implementation uses ASCII to code the
portable C execution set, the use of the Unicode character set for
the wchar_t type, in either UTF-16 or UTF-32 form, fulfills the
requirement.
The width of wchar_t is compiler-specific and can be as small as 8
bits. Consequently, programs that need to be portable across any C or
C++ compiler should not use wchar_t for storing Unicode text. The
wchar_t type is intended for storing compiler-defined wide
characters, which may be Unicode characters in some compilers.
However, programmers who want a UTF-16 implementation can use a macro
or typedef (for example, UNICHAR) that can be compiled as unsigned
short or wchar_t depending on the target compiler and platform. Other
programmers who want a UTF-32 implementation can use a macro or
typedef that might be compiled as unsigned int or wchar_t, depending
on the target compiler and platform. This choice enables correct
compilation on different platforms and compilers. Where a 16-bit
implementation of wchar_t is guaranteed, such macros or typedefs may
be predefined (for example, TCHAR on the Win32 API).
On systems where the native character type or wchar_t is implemented
as a 32-bit quantity, an implementation may use the UTF-32 form to
represent Unicode characters.
A limitation of the ISO/ANSI C model is its assumption that
characters can always be processed in isolation. Implementations that
choose to go beyond the ISO/ANSI C model may find it useful to mix
widths within their APIs. For example, an implementation may have a
32-bit wchar_t and process strings in any of the UTF-8, UTF-16, or
UTF-32 forms. Another implementation may have a 16-bit wchar_t and
process strings as UTF-8 or UTF-16, but have additional APIs that
process individual characters as UTF-32 or deal with pairs of UTF-16
code units.
More semantics can be found in <http://unicode.org/reports/tr17/>.
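To make the typedef suggestion from the excerpt concrete, here is a
rough sketch in C. The names UNICHAR16, UNICHAR32 and
combine_surrogates are made up for illustration (the standard only
suggests "UNICHAR" as an example name), and I'm assuming a C99
compiler that provides <stdint.h> with uint16_t and uint32_t. It
picks wchar_t only when its range matches, and shows the kind of
"pairs of UTF-16 code units" helper the last paragraph alludes to:

```c
#include <stdint.h>
#include <wchar.h>

/* Hypothetical typedefs: use wchar_t only where its range fits the
   encoding form, otherwise fall back to a fixed-width integer type. */
#if WCHAR_MAX == 0xFFFF
typedef wchar_t  UNICHAR16;   /* 16-bit unsigned wchar_t (e.g. Win32) */
#else
typedef uint16_t UNICHAR16;   /* one UTF-16 code unit */
#endif

#if WCHAR_MAX >= 0x10FFFF
typedef wchar_t  UNICHAR32;   /* wide enough for any code point */
#else
typedef uint32_t UNICHAR32;   /* one UTF-32 code unit */
#endif

/* Combine a UTF-16 surrogate pair into one code point.
   Returns U+FFFD (replacement character) if the inputs are not a
   valid high/low surrogate pair. */
static UNICHAR32 combine_surrogates(UNICHAR16 hi, UNICHAR16 lo)
{
    uint32_t cp;

    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return (UNICHAR32)0xFFFD;

    cp = 0x10000u
       + (((uint32_t)hi - 0xD800u) << 10)
       + ((uint32_t)lo - 0xDC00u);
    return (UNICHAR32)cp;
}
```

For example, combine_surrogates(0xD834, 0xDD1E) gives 0x1D11E
(U+1D11E), regardless of how wide wchar_t happens to be on the
platform. Code that stores text in UNICHAR16 or UNICHAR32 instead of
wchar_t keeps the same encoding form on both Windows and Mac OS X.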
Glenn Andreas email@hidden
<http://www.gandreas.com/> wicked fun!
quadrium2 | build, mutate, evolve, animate | images, textures,
fractals, art