Re: NSString and Unicode and Japanese character encodings
- Subject: Re: NSString and Unicode and Japanese character encodings
- From: Aki Inoue <email@hidden>
- Date: Thu, 25 Aug 2005 13:32:48 -0700
Izidor,
Much to my surprise, I read on the Ruby mailing list that Unicode is
not appropriate as a general character encoding; in particular, it
fails to support existing Japanese encodings, among other problems.
It is a common misconception that Unicode is not appropriate for
Japanese processing. The Unicode specification is designed to
provide interoperability with existing character sets (both standard
and proprietary), including all encodings widely used in Japan. In
fact, the Consortium has been communicating with Japanese standards
bodies.
There are some pages describing this (and other) problems in
Japanese encodings:
<http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559>
<http://www.miraclelinux.com/english/technet/samba30/iconv_issues.html>
Note that the major issue described in those docs is that Microsoft's
proprietary extensions to Shift-JIS can be lost when mapped to
Unicode. NSString and CFString inherit the same mapping semantics when
asked to convert using the CP932 encoding.
My question is: do NSString and NSText cope with this? Is it safe
to use NSString and assume that using initWithData:encoding:,
then modifying that string (e.g. inserting something), and then
using dataUsingEncoding: will get me back the same characters if
the encoding is Shift-JIS?
So, the answer is "yes, you might lose round-trip fidelity if you use
the CP932 converter."
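
The same kind of loss can be reproduced outside Cocoa. Below is a minimal
Python sketch, not NS/CFString code. It assumes that Python's built-in
"cp932" codec follows Microsoft's CP932 table, in which (as far as I know)
the byte pairs 0x81E6 and 0x879A both decode to U+2235. The helper name
round_trips is made up for illustration.

```python
def round_trips(data: bytes, encoding: str = "cp932") -> bool:
    """Return True if decoding and re-encoding reproduces the same bytes."""
    return data.decode(encoding).encode(encoding) == data


# Assumption: both byte pairs below map to U+2235 (BECAUSE) in CP932.
jis_form = b"\x81\xe6"  # JIS X 0208 position
nec_form = b"\x87\x9a"  # vendor-extension duplicate (NEC row 13)

# True when the two byte sequences decode to one and the same character.
same_char = jis_form.decode("cp932") == nec_form.decode("cp932")

# When same_char is True, the encoder can emit only one byte sequence for
# that character, so at most one of the two inputs can survive unchanged.
both_survive = round_trips(jis_form) and round_trips(nec_form)

print("same character:", same_char)
print("JIS form round-trips:", round_trips(jis_form))
print("NEC form round-trips:", round_trips(nec_form))
```

Plain ASCII and ordinary JIS X 0208 text should pass this check; only bytes
that share a Unicode code point with another byte sequence are at risk. A
comparable check in Cocoa would compare the original NSData with the result
of dataUsingEncoding: after initWithData:encoding:.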
Aki
Cocoa-dev mailing list (email@hidden)