Re: UTF8 question.

  • Subject: Re: UTF8 question.
  • From: Andrew Thompson <email@hidden>
  • Date: Wed, 24 Aug 2005 22:15:44 -0400


On Aug 24, 2005, at 5:52 AM, Chris Ridd wrote:

> wide char values
> Win: C0 30 (little endian)
>
> OK, you're encoding U+30C0 (KATAKANA LETTER DA)
>
> Mac: 30 BF 30 99 (big endian)
>
> This becomes KATAKANA LETTER TA + COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK.
>
> In other words, the conversion is splitting the original character into two
> decomposed (is that the right term?) pieces.
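
To make the byte values above concrete, here is a minimal Python sketch that decodes both sequences and prints the code points they contain. It assumes the "wide char values" are UTF-16 code units in the byte order stated for each platform; the names `win_bytes`, `mac_bytes` and `describe` are only illustrative.

```python
import unicodedata

# Byte values quoted in the thread (assumed to be UTF-16 code units).
win_bytes = bytes([0xC0, 0x30])              # Windows, little endian
mac_bytes = bytes([0x30, 0xBF, 0x30, 0x99])  # Mac, big endian

win_text = win_bytes.decode("utf-16-le")
mac_text = mac_bytes.decode("utf-16-be")


def describe(text):
    """Return a 'U+XXXX NAME' string for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(win_text))
# ['U+30C0 KATAKANA LETTER DA']
print(describe(mac_text))
# ['U+30BF KATAKANA LETTER TA',
#  'U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK']
```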

And that's a valid decomposition. The katakana syllable ta (タ) can indeed be combined with a mark to make da (ダ).
If you're not familiar with Japanese, perhaps a French example will be more understandable.
In Unicode, an e with an acute accent (é) can be represented as a single character, e-acute, or as a plain e followed by a combining acute.
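
As a rough illustration of the French example, this Python sketch builds both spellings of e-acute and shows that they differ as code point sequences and as UTF-8 bytes, even though they display the same. The variable names are illustrative only.

```python
# One code point: LATIN SMALL LETTER E WITH ACUTE
precomposed = "\u00e9"
# Two code points: plain "e" followed by COMBINING ACUTE ACCENT
decomposed = "e\u0301"

print(len(precomposed), len(decomposed))   # 1 2
print(precomposed == decomposed)           # False: a raw comparison sees different sequences

# The UTF-8 encodings differ too.
print(precomposed.encode("utf-8").hex())   # c3a9
print(decomposed.encode("utf-8").hex())    # 65cc81
```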


So I'm not sure I see a problem - these are two different, equally valid ways of representing the same character.
See here: http://www.unicode.org/faq/char_combmark.html


I believe there's an algorithm to get from one form to the other (decomposed or precomposed); Unicode calls this normalization, with NFD being the decomposed form and NFC the composed one.
So there's probably an API call on at least one of the platforms that can do the conversion.
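
The algorithm in question is Unicode normalization. The following is a minimal sketch using Python's standard `unicodedata` module rather than either platform's own API; `same_text` is a hypothetical helper name. On the Cocoa side, NSString's `precomposedStringWithCanonicalMapping` and `decomposedStringWithCanonicalMapping` should do the equivalent job, though I haven't checked how they behave in this particular case.

```python
import unicodedata

composed = "\u30c0"          # KATAKANA LETTER DA (the Windows result)
decomposed = "\u30bf\u3099"  # TA + combining voiced sound mark (the Mac result)

# NFD splits precomposed characters; NFC recombines them where possible.
to_nfd = unicodedata.normalize("NFD", composed)
to_nfc = unicodedata.normalize("NFC", decomposed)

print(to_nfd == decomposed)  # True
print(to_nfc == composed)    # True


def same_text(a, b):
    """Compare two strings after bringing both to the same normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)           # False
print(same_text(composed, decomposed))  # True
```

Normalizing both sides to one form before comparing or storing is usually simpler than trying to make each platform emit the same form in the first place.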


As others have said, please be careful: UTF-8 is not the only commonly encountered Unicode encoding. Double-check that the functions you're calling really do give you UTF-8 and not UCS-2, UCS-4 or UTF-16.
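
One quick way to check is to compare the bytes you get against what each encoding would produce for a known character. This Python sketch does that for U+30C0; the `is_valid_utf8` helper is a hypothetical name, and a strict decode is only a heuristic, not proof of the intended encoding.

```python
text = "\u30c0"  # KATAKANA LETTER DA

# The same character yields different bytes in each encoding.
for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-be"):
    print(name, text.encode(name).hex())
# utf-8 e38380
# utf-16-le c030
# utf-16-be 30c0
# utf-32-be 000030c0


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (strict mode)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The byte values quoted in the thread are not well-formed UTF-8.
print(is_valid_utf8(bytes([0xC0, 0x30])))              # False
print(is_valid_utf8(bytes([0x30, 0xBF, 0x30, 0x99])))  # False
print(is_valid_utf8(bytes([0xE3, 0x83, 0x80])))        # True
```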

AndyT (lordpixel - the cat who walks through walls)
A little bigger on the inside

        (see you later space cowboy ...)
