Re: Xcode-users Digest, Vol 7, Issue 153

  • Subject: Re: Xcode-users Digest, Vol 7, Issue 153
  • From: Lehel Bernadt <email@hidden>
  • Date: Wed, 14 Apr 2010 11:20:30 +0200


Well, here is my take on this issue:

The type wchar_t is the compiler's internal wide-character representation, optimised for string operations on a specific platform. Its size is implementation-defined, and it doesn't need to map to any existing encoding.
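
To see what "any size" means on a given platform, a minimal sketch like the one below can report the width of wchar_t. The helper names wchar_bits and describe_wchar are made up for this illustration; only sizeof, CHAR_BIT and snprintf come from the standard library.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <wchar.h>

/* Width of wchar_t in bits on the current implementation. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: write a one-line summary into buf.
   Returns the value of snprintf (negative on error). */
int describe_wchar(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "wchar_t is %zu bits wide", wchar_bits());
}
```

The result differs between platforms: wchar_t is typically 32 bits on Mac OS X and Linux and 16 bits on Windows, which is exactly why code should not assume a particular encoding behind it.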

So a wchar_t array is not a "string" in the conventional sense.
Basically, the concept is that at runtime you need to do
(string with a specific encoding) <---> wchar_t representation
conversions back and forth whenever you want to use string ops on your real strings. It's also not wise to store wchar_t string literals in your program unless they're ASCII strings. Store them in UTF-8, or whatever encoding you like that doesn't use null chars, in a C string. Then at runtime convert them to wchar_t, just like all the other strings you get as input, do your string operations, and when you need to display the result, convert it back to a char* string encoded according to the locale.
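
The round trip described above could be sketched roughly as follows. The names to_wide and to_multibyte are hypothetical helpers, not part of any library; they wrap the standard mbstowcs and wcstombs functions. The sketch assumes the POSIX behaviour where passing NULL as the destination returns the required length, and it assumes the input is encoded according to the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode a locale-encoded multibyte string into a
   newly allocated wchar_t array. Returns NULL on invalid input or when
   allocation fails. The caller frees the result. */
wchar_t *to_wide(const char *mbs)
{
    assert(mbs != NULL);
    /* With a NULL destination, POSIX mbstowcs only counts the
       wide characters that would be produced. */
    size_t n = mbstowcs(NULL, mbs, 0);
    if (n == (size_t)-1)
        return NULL;
    wchar_t *ws = malloc((n + 1) * sizeof *ws);
    if (ws == NULL)
        return NULL;
    mbstowcs(ws, mbs, n + 1);   /* n characters plus the terminator */
    return ws;
}

/* Hypothetical helper: encode a wchar_t string back into a newly
   allocated multibyte string in the current locale's encoding. */
char *to_multibyte(const wchar_t *ws)
{
    assert(ws != NULL);
    size_t n = wcstombs(NULL, ws, 0);   /* bytes needed, POSIX behaviour */
    if (n == (size_t)-1)
        return NULL;
    char *mbs = malloc(n + 1);
    if (mbs == NULL)
        return NULL;
    wcstombs(mbs, ws, n + 1);
    return mbs;
}
```

For non-ASCII input to convert, the program would need to call setlocale(LC_ALL, "") once at startup so that the conversions follow the user's locale (a UTF-8 one, for UTF-8 data). In the default "C" locale only ASCII is guaranteed to survive the trip.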


All of this means that there's a sharp divide between real-world strings, which use some specific encoding, and wchar_t strings.
The positive side is that this puts no limitation on the compiler. If implemented right, it should be no problem, for example, to change the size to 64 bits on a 64-bit platform if string comparisons would be faster that way, or to define it as 8 bits for embedded systems. It's not a problem if UTF-16 is succeeded by UTF-32 or some hypothetical UTF-64 or whatever, since we are encoding agnostic.
The negative side is that you need to convert *every single time*. For example, you cannot rely on this:
if(wc == L'ő') ... in a UTF-8 encoded source
because the value the compiler gives L'ő' depends on how it decodes the source file. First you need to convert 'ő' to wchar_t, and only then can you compare, which is pretty cumbersome. Because C has neither encapsulation nor the syntactic sugar that OO languages use to hide all these ops, it's also not entirely obvious why you can do c = 'a' with a char but not wc = 'ű' with a wchar_t.
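
One way to do that convert-then-compare step is sketched below. The function wide_char_equals is a made-up name for this example; it uses the standard mbrtowc to decode the first character of a multibyte string in the current locale and then compares the result with the wide character.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if wc equals the first character of
   the multibyte string mbchar (decoded in the current locale), 0 if it
   differs, and -1 if mbchar is not valid in that locale. */
int wide_char_equals(wchar_t wc, const char *mbchar)
{
    mbstate_t state;
    wchar_t decoded;
    size_t len;

    assert(mbchar != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    /* Include the terminator so that an empty string decodes to L'\0'. */
    len = mbrtowc(&decoded, mbchar, strlen(mbchar) + 1, &state);
    if (len == (size_t)-1 || len == (size_t)-2)
        return -1;                     /* invalid or incomplete sequence */
    return wc == decoded;
}
```

With a UTF-8 locale selected through setlocale, a call such as wide_char_equals(wc, "ő") would then compare against the character as it is actually stored in the UTF-8 source, instead of depending on how the compiler interpreted a wide literal.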


As you can see, TCHAR is "on the other side of the fence" compared to wchar_t, because it represents real-life strings using a specific encoding. POSIX has no support for this type; it offers only null-terminated C strings (which can hold UTF-8) plus wchar_t for use at runtime.

Regards,
Lehel