Re: swprintf fails with extended character codes
- Subject: Re: swprintf fails with extended character codes
- From: Lehel Bernadt <email@hidden>
- Date: Wed, 14 Apr 2010 13:24:15 +0200
Hi,
On 04/14/2010 11:02 AM, Ben Staveley-Taylor wrote:
First, my apologies for not editing the subject line in my last post. I
have reverted back to the original title.
Knowing now how unportable and vaguely specified wchar_t is, I realise
that we should have avoided it all those years ago when we did our
Unicode Windows codebase. Things are always clearer in hindsight!
Using wchar_t is absolutely portable, you just have to understand the
concept behind it ;)
I don't like the idea of just treating TCHAR as char* and using UTF8
strings. It will work in many cases such as the simple printf shown
below, but I have past experience of working with multibyte strings and
it is an absolute nightmare. Certain of the C standard string functions
just don't work reliably.
A standard C string assumes that one character is one byte. If you use it
to hold UTF-8 encoded strings, then the traditional string functions
indeed won't work.
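To make the byte-versus-character mismatch concrete, here is a minimal sketch. The helper name utf8_length is hypothetical, and it assumes the input is already valid UTF-8; it is an illustration, not a validating decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8_length(const char *s) {
    assert(s != nullptr);
    std::size_t count = 0;
    for (; *s != '\0'; ++s) {
        // Every byte that is not a continuation byte starts a new code point.
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string "h\xC3\xA9llo" (that is, "héllo" in UTF-8), strlen reports the six bytes while utf8_length reports the five code points.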
I know we have code like this (I'm removing
our TCHAR-esque typedefs for simplicity):
char *pos = strrchr(path, '/');
and that won't respect multibyte characters.
That's because strrchr is not a multibyte-aware function. First you have
to convert the string from your encoding to wchar_t, e.g. with iconv, and
then use
wcsrchr(converted_path, L'/')
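A rough sketch of that convert-then-search approach follows. Note that it swaps iconv for the standard library's mbsrtowcs, which converts from whatever encoding the current LC_CTYPE locale uses, so for non-ASCII input the caller is assumed to have called setlocale(LC_CTYPE, "") with a suitable locale beforehand. The helper names to_wide and last_component are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: converts a multibyte string, in the encoding of the
// current LC_CTYPE locale, to a wide string. Returns an empty string if the
// input contains an invalid multibyte sequence.
std::wstring to_wide(const char *mb) {
    assert(mb != nullptr);
    std::mbstate_t state = std::mbstate_t();
    const char *src = mb;
    // With a null destination, mbsrtowcs only counts the wide characters.
    std::size_t n = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (n == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    std::vector<wchar_t> buf(n + 1);
    state = std::mbstate_t();
    src = mb;
    std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
    return std::wstring(buf.data(), n);
}

// Hypothetical helper: returns the text after the last '/', or the whole
// path if it contains no '/'.
std::wstring last_component(const std::wstring &path) {
    const wchar_t *start = path.c_str();
    const wchar_t *pos = std::wcsrchr(start, L'/');
    return pos ? std::wstring(pos + 1) : path;
}
```

Once the path is a wide string, wcsrchr works on whole wchar_t units rather than on the individual bytes of a multibyte sequence.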
Maybe C lib functions like
that can be made to work with a suitable locale setting, but in a large
team someone is always going to write code that iterates through a
string character by character using a "char *p = <string>; while (*p++)
{ ... }" idiom and that's going to go wrong with UTF8 encoding.
Yes, switching to multibyte string handling is a considerable effort and
change in the way of programming... you can't use char* anymore.
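For comparison, if the strings did stay in UTF-8, the "while (*p++)" idiom would have to step by whole code points instead of bytes. The following is only a sketch under the assumption of well-formed UTF-8 input; utf8_sequence_length and utf8_split are hypothetical names, and no validation of continuation bytes is attempted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: byte length of the UTF-8 sequence introduced by a
// lead byte. Returns 1 for ASCII and for bytes that cannot start a
// sequence, so the caller always makes progress.
std::size_t utf8_sequence_length(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1;  // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 1;
}

// Hypothetical helper: splits a UTF-8 string into one std::string per
// code point, advancing by whole sequences rather than single bytes.
std::vector<std::string> utf8_split(const std::string &s) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < s.size()) {
        std::size_t len = utf8_sequence_length(static_cast<unsigned char>(s[i]));
        assert(len >= 1 && len <= 4);
        if (len > s.size() - i) {
            len = s.size() - i;  // clamp a truncated trailing sequence
        }
        out.push_back(s.substr(i, len));
        i += len;
    }
    return out;
}
```

Splitting "h\xC3\xA9llo" this way yields five elements, with the two-byte sequence for "é" kept together as the second one.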
Also -- in an effort to be cross-platform -- we use the STL and Boost
string algorithms library quite a lot:
std::string str("Hello world");
if (boost::algorithm::istarts_with(str, "Hello")) { ... }
and I'm pretty sure the underlying Boost::Range mechanics require that
you can iterate a string solely by knowing the size of its character
type, so it would probably go wrong with UTF-8 in strings. (I'm not 100%
certain of that, and I know this is not the place for a Boost discussion.)
Anyway, I understand my choices now. I have filed a bugreporter issue
but even if it gets addressed it's not going to be a solution for my
current project of course.
Thanks.
Ben Staveley-Taylor
Xcode-users mailing list (email@hidden)