[Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness

  • Subject: [Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness
  • From: JongAm Park <email@hidden>
  • Date: Wed, 18 Jun 2008 11:49:02 -0700

Hello, all.

I found some interesting and strange behaviour in -[NSString stringByAddingPercentEscapesUsingEncoding:].

I got a UTF-8 string from a Final Cut Pro project file that was exported as XML.
It contains a video clip named "자연", which means "Nature" in Korean.
Its pathurl is file://localhost/Users/young/Movies/%E1%84%8C%E1%85%A1%E1%84%8B%E1%85%A7%E1%86%AB.mov
The 자연 part is %E1%84%8C%E1%85%A1%E1%84%8B%E1%85%A7%E1%86%AB.
So it is a percent-escaped string.


So I tried to get a UTF-8 version of "자연" in either of two ways:

1. NSString *anUTF16String = [NSString stringWithString:@"자연"];
const char *utf8Bytes = [anUTF16String UTF8String]; // -UTF8String returns a C string, not an NSString
NSString *anUTF8String = [NSString stringWithUTF8String:utf8Bytes];

or
2. NSString *anUTF8String = [NSString stringWithUTF8String:"자연"];

Both produced the same data.

Then I made a percent-escaped string by calling:
NSString *anUTF8PercentEscapedString = [anUTF8String stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];


and converted it back to the original string by calling:
NSString *revertedUTF8String = [anUTF8PercentEscapedString stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];


This gave me the same data as the original string from either 1 or 2 above.
I checked which bytes it contains by calling:

3. const char *revertedCStringOne = [revertedUTF8String cStringUsingEncoding:NSUTF8StringEncoding];

The bytes were: EC 9E 90 EC 97 B0
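
For reference, the byte sequence EC 9E 90 EC 97 B0 can be decoded by hand to see which Unicode code points it represents. The following is a minimal sketch in plain C (which also compiles as Objective-C). `utf8_decode` is a hypothetical helper, not a Foundation API, and it omits the overlong-form and surrogate checks that a production decoder would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes UTF-8 bytes into Unicode code points.
 * Returns the number of code points written, or -1 on malformed input
 * or when `out` is too small. Sketch only: overlong forms and
 * surrogates are not rejected. */
int utf8_decode(const unsigned char *in, size_t in_len,
                uint32_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t i = 0;
    int count = 0;
    while (i < in_len) {
        unsigned char lead = in[i];
        size_t need;   /* total bytes in this sequence */
        uint32_t cp;   /* code point being assembled */
        if (lead < 0x80)                { need = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { need = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { need = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { need = 4; cp = lead & 0x07; }
        else return -1;                 /* stray continuation byte */
        if (i + need > in_len || (size_t)count >= out_cap) return -1;
        for (size_t k = 1; k < need; k++) {
            if ((in[i + k] & 0xC0) != 0x80) return -1;
            cp = (cp << 6) | (in[i + k] & 0x3F);
        }
        out[count++] = cp;
        i += need;
    }
    return count;
}
```

Run on the six bytes above, this yields two code points, U+C790 and U+C5F0, which are the precomposed Hangul syllables 자 and 연.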

As mentioned above, the pathurl string in the FCP project looks different from result 3.
So I built a string from the bytes of the Korean part of the pathurl:


char test[] = { 0xE1, 0x84, 0x8C, 0xE1, 0x85, 0xA1, 0xE1, 0x84, 0x8B, 0xE1, 0x85, 0xA7, 0xE1, 0x86, 0xAB, 0 };
length = strlen( test );
for( i = 0; i < length; i++ )
{
    // Cast to unsigned char so bytes above 0x7F are not sign-extended (e.g. FFFFFFE1).
    NSLog(@"%X", (unsigned char)test[i] );
}
printf("\n");


// 4. It prints the same "자연"
NSString *questionedString = [NSString stringWithUTF8String:test];
NSLog(@"Questioned String = %@", questionedString );

When questionedString is converted to a percent-escaped string by calling:
NSString *questionedPercentEscapedString = [questionedString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@", questionedPercentEscapedString);


It was the same as the one in the FCP project pathurl, i.e. %E1%84%8C%E1%85%A1%E1%84%8B%E1%85%A7%E1%86%AB
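
The escaping step itself is mechanical: each byte of the UTF-8 encoding outside the ASCII range becomes %XX. The sketch below, in plain C, illustrates that under a simplifying assumption: only bytes 0x80 and above are escaped, whereas stringByAddingPercentEscapesUsingEncoding: also escapes some ASCII characters such as spaces. `percent_escape_utf8` is a hypothetical name, not a system function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the NUL-terminated UTF-8 string `in` to
 * `out`, replacing every byte >= 0x80 with "%XX". ASCII bytes are
 * copied unchanged (a simplification of what Foundation does).
 * Returns the length written, or -1 if `out` is too small. */
int percent_escape_utf8(const char *in, char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t in_len = strlen(in);
    size_t used = 0;
    if (out_cap == 0) return -1;
    for (size_t i = 0; i < in_len; i++) {
        unsigned char b = (unsigned char)in[i];
        if (b < 0x80) {
            if (used + 2 > out_cap) return -1;   /* byte + NUL */
            out[used++] = (char)b;
        } else {
            if (used + 4 > out_cap) return -1;   /* "%XX" + NUL */
            snprintf(out + used, 4, "%%%02X", (unsigned)b);
            used += 3;
        }
    }
    out[used] = '\0';
    return (int)used;
}
```

Because the escapes mirror the underlying bytes one for one, two strings that display identically but are stored as different byte sequences produce different escaped output.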

Can anyone tell me why the two different data sources are displayed as the same "자연" when their contents differ?
I would like to send an Apple event to Final Cut Pro, but I'm not sure whether it is OK to send the percent-escaped form from 1 or 2, or whether it must be the form used in the FCP project. (I don't know how to generate the form found in the FCP project XML file.)
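
One likely explanation is Unicode normalization. The bytes E1 84 8C E1 85 A1 E1 84 8B E1 85 A7 E1 86 AB decode to five conjoining jamo (U+110C, U+1161, U+110B, U+1167, U+11AB), which is the canonically decomposed spelling of the two syllables U+C790 and U+C5F0 encoded by EC 9E 90 EC 97 B0. The sketch below, in plain C, applies the Hangul composition arithmetic from the Unicode Standard to show that the two spellings are equivalent. `hangul_compose` is a hypothetical helper that handles Hangul only. In Cocoa, the NSString methods precomposedStringWithCanonicalMapping and decomposedStringWithCanonicalMapping should convert between the two forms; whether Final Cut Pro accepts both is not something this sketch can confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable algorithm. */
enum {
    S_BASE = 0xAC00, L_BASE = 0x1100, V_BASE = 0x1161, T_BASE = 0x11A7,
    L_COUNT = 19, V_COUNT = 21, T_COUNT = 28
};

/* Hypothetical helper: composes leading-consonant + vowel (+ optional
 * trailing consonant) jamo into precomposed Hangul syllables. Other
 * code points are copied unchanged. `out` must hold at least `n`
 * elements. Returns the number of code points written. */
size_t hangul_compose(const uint32_t *in, size_t n, uint32_t *out)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = in[i];
        if (c >= L_BASE && c < L_BASE + L_COUNT && i + 1 < n &&
            in[i + 1] >= V_BASE && in[i + 1] < V_BASE + V_COUNT) {
            uint32_t l = c - L_BASE;
            uint32_t v = in[i + 1] - V_BASE;
            c = S_BASE + (l * V_COUNT + v) * T_COUNT;   /* LV syllable */
            i++;
            if (i + 1 < n && in[i + 1] > T_BASE &&
                in[i + 1] < T_BASE + T_COUNT) {
                c += in[i + 1] - T_BASE;                /* LVT syllable */
                i++;
            }
        }
        out[o++] = c;
    }
    return o;
}
```

Feeding it the five jamo above produces U+C790 and U+C5F0, the same code points as the precomposed string, which is why both byte sequences render as "자연".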


I also tried a Java applet, http://www.profitcode.net/resources/tools/utf8_encoder_applet.html,
and its result is the same as what I got in 1 or 2 above. It differs from the one in the FCP project.


I would appreciate any help.

Thank you.


P.S. My whole code is here, just in case.

-----------------------------------------------------------------------------------------------------------------------------------------------

NSString *anUTF16String = [NSString stringWithString:@"자연"];
//NSString *anUTF16PercentEscapedString = [anUTF16String stringByAddingPercentEscapesUsingEncoding:NSUTF16StringEncoding];
const char *UTF16CString = [anUTF16String cStringUsingEncoding:NSUTF16StringEncoding];


// 1. Making an NSString object with a UTF8 encoding
NSString *anUTF8String = [NSString stringWithUTF8String:"자연"];
NSString *anUTF8PercentEscapedString = [anUTF8String stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSString *revertedUTF8String = [anUTF8PercentEscapedString stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
const char *revertedCStringOne = [revertedUTF8String cStringUsingEncoding:NSUTF8StringEncoding];


NSLog(@"Unicode 16 : %@", anUTF16String );
//NSLog(@"Unicode 16 Percent Escaped : %@", anUTF16PercentEscapedString );

NSLog(@"Unicode 8 : %@", anUTF8String );
NSLog(@"Unicode 8 Percent Escaped : %@", anUTF8PercentEscapedString );
NSLog(@"Reverted from Unicode 8 Percent Escaped : %@", revertedUTF8String );
NSLog(@"bytes : %s", revertedCStringOne );

// 2. The data : EC 9E 90 EC 97 B0
int length = (int)strlen( revertedCStringOne );
int i;
for( i = 0; i < length; i++ )
{
    // Cast to unsigned char so bytes above 0x7F are not sign-extended.
    NSLog(@"%X", (unsigned char)revertedCStringOne[i] );
}
printf("\n");

// 3. Data from a Final Cut Pro XML project file, which also displays as "자연".
// It looks very different from the bytes in // 2.
char test[] = { 0xE1, 0x84, 0x8C, 0xE1, 0x85, 0xA1, 0xE1, 0x84, 0x8B, 0xE1, 0x85, 0xA7, 0xE1, 0x86, 0xAB, 0 };
length = (int)strlen( test );
for( i = 0; i < length; i++ )
{
    NSLog(@"%X", (unsigned char)test[i] );
}
printf("\n");


// 4. It prints the same "자연"
NSString *questionedString = [NSString stringWithUTF8String:test];
NSLog(@"Questioned String = %@", questionedString );

// 5. Its percent-escaped representation matches // 3, not // 2
NSString *questionedPercentEscapedString = [questionedString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@", questionedPercentEscapedString);





