Re: [Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness


Subject: Re: [Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness
From: Christopher Nebel <email@hidden>
Date: Wed, 18 Jun 2008 13:39:59 -0700

On Jun 18, 2008, at 12:24 PM, Ken Thomases wrote:

> On Jun 18, 2008, at 1:49 PM, JongAm Park wrote:
>> Can anyone tell me why the two different data sources are both displayed as "자연" when their contents are different?
>
> I haven't looked into the specific character sequences in depth, but I suspect the difference is in Normalization Forms. Specifically, form C vs. D.
> http://unicode.org/reports/tr15/
> The idea is that the same character can be obtained from a single code point or by several combining code points.
>
> In Cocoa, see -precomposedStringWithCanonicalMapping and -decomposedStringWithCanonicalMapping.

Sure looks like it, based on the data. EC 9E 90 is U+C790, "자"; E1 84 8C E1 85 A1 is U+110C "ᄌ", U+1161 "ᅡ", which is the decomposed version of the same thing. -[NSString fileSystemRepresentation] may also be of use here, given that this is really a file path -- the normalization form used for file names is dictated by the file system.
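
As an illustration only (not part of the original message), the byte sequences above can be reproduced with a minimal sketch. It is written in Python rather than Cocoa, purely so that it runs anywhere; the helper name utf8_hex is hypothetical. Python's "NFC" and "NFD" correspond to what -precomposedStringWithCanonicalMapping and -decomposedStringWithCanonicalMapping produce, and urllib.parse.quote stands in here for percent-escaping UTF-8 bytes.

```python
import unicodedata
from urllib.parse import quote


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated uppercase hex (hypothetical helper)."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


# The precomposed syllable and its decomposed jamo sequence, as quoted in the message.
composed = "\uC790"          # U+C790
decomposed = "\u110C\u1161"  # U+110C followed by U+1161

# The two strings render alike but are different code point sequences.
print(composed == decomposed)  # False

# Their UTF-8 encodings differ accordingly.
print(utf8_hex(composed))      # EC 9E 90
print(utf8_hex(decomposed))    # E1 84 8C E1 85 A1

# Normalizing to a common form makes them compare equal.
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
print(unicodedata.normalize("NFC", decomposed) == composed)  # True

# Percent-escaping operates on the bytes, so the escaped forms differ too.
print(quote(composed))    # %EC%9E%90
print(quote(decomposed))  # %E1%84%8C%E1%85%A1
```

Normalizing both strings to the same form before comparing or escaping them removes the apparent discrepancy; which form a file path should use depends on the file system, as noted above.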


--Chris Nebel
AppleScript Engineering
