Re: wchar_t and printf not working
On 28.03.2005 06:43, "Michael B Allen" <mba2000@ioplex.com> wrote:
For Darwin, or just about any other Unix based OS, you want to use the non-wide character mode.
Agreed. Just don't define XML_UNICODE and XML_UNICODE_WCHAR_T, and XML_Char becomes a UTF-8 encoded char.
Note that there are some minor gotchas to look out for when working with UTF-8 though. For example, you cannot necessarily iterate over each character by simply examining each element in the array.
Actually, that's not much of a drawback. You cannot index into UTF-16 strings, either - a single code point can be encoded as a surrogate pair in UTF-16.
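To make the point concrete, here is a minimal sketch in C (my own illustration, not code from expat; both function names are hypothetical). The first function counts code points in a UTF-8 string by skipping continuation bytes, and the second splits a code point above U+FFFF into a UTF-16 surrogate pair. It assumes well-formed input and does no validation.

```c
#include <stddef.h>

/* Count code points in a NUL-terminated UTF-8 string.
 * Every byte that is NOT a continuation byte (bit pattern 10xxxxxx)
 * starts a new code point. Assumes well-formed UTF-8; no validation. */
size_t utf8_count_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Split a code point in the range U+10000..U+10FFFF into a UTF-16
 * surrogate pair. The caller must ensure cp is in that range. */
void utf16_surrogates(unsigned long cp, unsigned short *hi, unsigned short *lo)
{
    cp -= 0x10000UL;
    *hi = (unsigned short)(0xD800 + (cp >> 10));   /* high (lead) surrogate */
    *lo = (unsigned short)(0xDC00 + (cp & 0x3FF)); /* low (trail) surrogate */
}
```

For the 6-byte string "a\xC3\xA9\xE2\x82\xAC" (a, U+00E9, U+20AC) the first function returns 3, so array index and character index diverge. Likewise U+1D11E needs two UTF-16 code units, 0xD834 and 0xDD1E, so indexing a UTF-16 array has the same problem.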
Each character may occupy between 1 and 6 bytes [1].
More precisely, between 1 and 4: <http://www.unicode.org/faq/utf_bom.html#30>.
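As a rough illustration of where the 1 to 4 range comes from, the following sketch encodes a single code point as UTF-8 and returns the number of bytes written. The function name is hypothetical and this is not how expat does it; it simply follows the standard bit layout, rejecting surrogates and anything above U+10FFFF.

```c
#include <stddef.h>

/* Encode one Unicode code point as UTF-8 into out[0..3].
 * Returns the number of bytes written (1 to 4), or 0 if cp is not a
 * valid scalar value (a surrogate, or greater than U+10FFFF). */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

Since U+10FFFF is the highest code point, the 4-byte form is the longest one that can occur; the 5- and 6-byte forms from older descriptions of UTF-8 are never needed.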
Could it be that expat is assuming wchar_t is 2 bytes instead of the 4 bytes of Darwin running on PowerPC?
I *think* Expat's wchar_t is hardcoded at 2 bytes (UTF-16LE), period.
The byte order is certainly configurable (WORDS_BIGENDIAN in expat_config.h), and I *think* that expat happily stores UTF-16 in 4-byte wchars - after all, it would take them extra effort to break this.
- WBR, Alexey Proskuryakov