Re: Encoding of "long string"
- Subject: Re: Encoding of "long string"
- From: "Shawn Erickson" <email@hidden>
- Date: Wed, 12 Dec 2007 08:51:54 -0800
On Dec 12, 2007 8:32 AM, Steve Checkoway <email@hidden> wrote:
>
> On Dec 11, 2007, at 10:09 PM, Chris Espinosa wrote:
>
> > Basically, it's ISO 10646 in UCS-4, which is generally coherent with
> > UTF-16 for the subset that is defined in ISO 10646. For promotion
> > of 7-bit ASCII characters, which is what you'd usually find in
> > source, it will be more than adequate.
>
> Do you mean UTF-32 there? <http://en.wikipedia.org/wiki/UTF-32/UCS-4>.
If you are referring to his use of UTF-16 in the above, I would say no
(well, possibly)... I believe Chris' point may be that UCS-4/UTF-32
overlaps UTF-16. In other words, if you take a UCS-4 code point and
strip off the upper 2 bytes, you get a valid UTF-16 code unit (at
least for the subset of UCS-4/UTF-32 in the Basic Multilingual Plane,
i.e. code points below U+10000).
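To make the overlap concrete, here is a minimal sketch (the helper name
utf32_to_utf16 is mine, not from any library, and it assumes the input is
a valid code point no larger than U+10FFFF). Below U+10000 the single
UTF-16 unit is just the low 16 bits of the UTF-32 value; above that,
truncating would be wrong and a surrogate pair is needed:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point (a UTF-32 value) as UTF-16.
// Assumes cp is a valid Unicode scalar value (<= 0x10FFFF, not a surrogate).
std::vector<std::uint16_t> utf32_to_utf16(std::uint32_t cp) {
    if (cp < 0x10000) {
        // BMP: the one UTF-16 unit equals the low 16 bits of the UTF-32 value.
        return { static_cast<std::uint16_t>(cp) };
    }
    // Outside the BMP: split the 20-bit offset into a surrogate pair.
    std::uint32_t v = cp - 0x10000;
    return { static_cast<std::uint16_t>(0xD800 + (v >> 10)),
             static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)) };
}
```

So U+0041 comes out as the single unit 0x0041, while U+1F600 comes out
as the pair 0xD83D 0xDE00.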
Anyway, L"blah" on Windows gives you a string of 2-byte wchar_t units
(UCS-2, or UTF-16 once you go beyond the Basic Multilingual Plane). On
Unix systems L"blah" typically gives you UCS-4 (aka UTF-32).
So if you attempt to pass these strings across platforms you will have
problems if you don't account for the differences. You will have to
convert the strings to a portable external form (UTF-8, for example, if
you want to avoid byte order issues, etc.) before passing them between
platforms.
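As a rough sketch of that conversion (wide_to_utf8 and append_utf8 are
names I made up for illustration, and the code assumes the wide string
holds well-formed UTF-16 when wchar_t is 2 bytes or UTF-32 when it is
4 bytes; it does no validation of malformed input):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append one code point to out as UTF-8.
// Assumes cp <= 0x10FFFF.
static void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert a wide string to UTF-8 whether wchar_t
// is 2 bytes (UTF-16 units) or 4 bytes (UTF-32 code points).
std::string wide_to_utf8(const std::wstring& w) {
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);
        if (sizeof(wchar_t) == 2) {
            cp &= 0xFFFF;  // guard against sign extension of a 16-bit unit
            // Combine a high/low surrogate pair into one code point.
            if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < w.size()) {
                std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]) & 0xFFFF;
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                    ++i;
                }
            }
        }
        append_utf8(out, cp);
    }
    return out;
}
```

Calling wide_to_utf8(L"blah") should hand back the same four bytes on
either kind of platform, which is the point: the receiver only ever
sees UTF-8 and never needs to know how wide the sender's wchar_t was.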
-Shawn