Re: converting unicode text representation to unichar


  • Subject: Re: converting unicode text representation to unichar
  • From: Daniel Child <email@hidden>
  • Date: Tue, 11 Aug 2009 13:30:59 -0400

Thank you. I think I'm almost there, though I'm getting incorrect values. I tried both %c and %C. These yield, respectively, î and  , which are incorrect.

2009-08-11 13:24:58.879 ParseTest[566:10b] The num is U+4E01
2009-08-11 13:24:58.889 ParseTest[566:10b] codeItself is 4E01
2009-08-11 13:24:58.891 ParseTest[566:10b] charAsString is 19969
2009-08-11 13:24:58.892 ParseTest[566:10b] strc is î
2009-08-11 13:24:58.893 ParseTest[566:10b] strC is 

UnicodeRecordParsingStrat *urps = [[UnicodeRecordParsingStrat alloc] init];
theUniWord = [urps parseUnicodeWord: unicodeLine]; // yields @"U+4E01"
codeItself = [urps theCharacterFromCode: theUniWord]; // yields @"4E01"
NSString *ox = @"0x";
NSString *hexString;
hexString = [ox stringByAppendingString: codeItself]; // yields @"0x4E01"
NSScanner *scanner = [NSScanner scannerWithString: hexString];
NSString *charAsString;
unsigned value;
if ([scanner scanHexInt:&value]) {
    charAsString = [NSString stringWithFormat: @"%u", value]; // yields 19969
    NSLog(@"charAsString is %@\n", charAsString);
    NSString *strc = [NSString stringWithFormat: @"%c", &value];
    NSLog(@"strc is %@\n", strc);
    NSString *strC = [NSString stringWithFormat: @"%C", &value];
    NSLog(@"strC is %@\n", strC);
} else {
    NSLog( @"Hex reading failed." );
}


Seems like a lot of code for a simple conversion.....
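
A probable cause of the wrong output, offered as a reading of the code above rather than a confirmed diagnosis: %c and %C expect the character itself, but both calls pass &value, the address of the variable. %C takes a 16-bit unichar by value, so something like (unichar)value is likely what is wanted, and %c only covers 8-bit characters in any case. Separately, Unihan.txt also lists code points above U+FFFF, which do not fit in a single unichar. The sketch below is plain C (it should also compile in an Objective-C file); utf16_encode is a hypothetical helper name, not a Cocoa function. It turns a scanned value into one or two UTF-16 code units, which could then be handed to +stringWithCharacters:length:.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written,
   or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;                      /* out of range, or a lone surrogate */
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;         /* BMP: the value is the code unit */
        return 1;
    }
    cp -= 0x10000;                     /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For U+4E01 this yields the single unit 0x4E01; a character such as U+20000 needs two units, so the length passed to +stringWithCharacters:length: would be the returned count rather than a fixed 1.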

On Aug 11, 2009, at 11:12 AM, Alastair Houghton wrote:

On 11 Aug 2009, at 15:40, Daniel Child wrote:

Unihan.txt provides text files showing characters in the format U+XXXX.
If I scan these in, naturally I can obtain the NSString representation XXXX.
But I need to convert this text to genuine unichars OR NSStrings (the actual characters represented).


Two questions:

1. I didn't see any relevant conversion methods under NSString or NSNumber. Are there Cocoa functions to perform this easily?

Use NSScanner's -scanHexInt: (or similar) to scan the hexadecimal part, then stick that in a unichar and either use +stringWithFormat:'s %C format code, or NSString's -initWithCharacters:length: / +stringWithCharacters:length: methods.


Or you can just scan the hex part yourself manually; it isn't hard.
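
As a rough illustration of scanning the hex part by hand, here is a minimal sketch in plain C. The function name parse_codepoint_manual and the choice to accept an optional "U+" prefix are assumptions made for this example, not anything from Cocoa or from the thread.

```c
#include <stdint.h>

/* Hypothetical helper: parse "U+4E01" or "4E01" digit by digit.
   Returns 1 and stores the code point in *out on success, 0 on failure. */
int parse_codepoint_manual(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    int digits = 0;

    if (s[0] == 'U' && s[1] == '+') {
        s += 2;                                   /* skip the optional prefix */
    }
    for (; *s != '\0'; s++) {
        uint32_t d;
        if (*s >= '0' && *s <= '9')      d = (uint32_t)(*s - '0');
        else if (*s >= 'A' && *s <= 'F') d = (uint32_t)(*s - 'A' + 10);
        else if (*s >= 'a' && *s <= 'f') d = (uint32_t)(*s - 'a' + 10);
        else return 0;                            /* not a hex digit */
        if (++digits > 6) return 0;               /* longer than any code point */
        value = (value << 4) | d;                 /* shift in the next nibble */
    }
    if (digits == 0 || value > 0x10FFFF) return 0;
    *out = value;
    return 1;
}
```

The result is a full code point, so it still has to be narrowed to a unichar (or split into a surrogate pair when it exceeds 0xFFFF) before it goes into an NSString.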

You could even turn it into UTF-8 and use strtoul() if you were feeling mildly masochistic.
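
The strtoul() route might look like the sketch below, assuming the text has already been turned into a C string (for example with NSString's -UTF8String). parse_codepoint_strtoul is a hypothetical name, and the validation rules are choices made for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "U+4E01" or "4E01" using strtoul().
   Returns 1 and stores the code point in *out on success, 0 on failure. */
int parse_codepoint_strtoul(const char *s, uint32_t *out)
{
    char *end;
    unsigned long value;

    if (strncmp(s, "U+", 2) == 0) {
        s += 2;                                   /* skip the optional prefix */
    }
    /* strtoul() itself tolerates leading whitespace and a sign,
       so insist on a hex digit up front. */
    if (!isxdigit((unsigned char)*s)) {
        return 0;
    }
    errno = 0;
    value = strtoul(s, &end, 16);
    if (errno != 0 || end == s || *end != '\0') {
        return 0;                                 /* overflow or trailing junk */
    }
    if (value > 0x10FFFF) {
        return 0;                                 /* beyond the Unicode range */
    }
    *out = (uint32_t)value;
    return 1;
}
```

With base 16, strtoul() also accepts an optional "0x" prefix, so a string such as "0x4E01" parses to the same value without any special handling.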

2. I am assuming I have to convert to the format '0xXXXX', but is it also possible to work with U+XXXX directly in Cocoa? I got error messages for all of the following formats:

unichar uch = 0x0041;
NSString *str = [NSString stringWithCharacters:&uch length:1];
NSLog (@"str is \"%@\".", str);

/* Output:

   str is "A" */

Kind regards,

Alastair.

--
http://alastairs-place.net



