Re: Strange character transformation
- Subject: Re: Strange character transformation
- From: Ali Ozer <email@hidden>
- Date: Mon, 2 Feb 2004 16:09:24 -0800
I read an RTF file into an NSString with stringWithContentsOfFile:.
The RTF contains a French capital E with acute accent. When I check
the RTF file, this character is represented as 0xC9, which is the
correct Unicode value for the character.
When imported into the string, the 0xC9 has become 0x2026, which is
the Unicode value for an ellipsis (...).
When I write this string to an NSTextView, it appears correctly on
screen as E acute.
However, when the character 0x2026 is displayed in an NSTableView cell,
it appears as an ellipsis.
Is anyone able to point me in the right direction?
When you load a file with stringWithContentsOfFile:, you are using the
"default system encoding" for C strings, which is very likely MacRoman
(assuming you're running in English or a Western European language).
0xC9 is the ellipsis in MacRoman, which is why it ends up as 0x2026 in
the string. (0xC9 is E-acute in both Unicode and the Windows Latin 1
encoding; I imagine the latter is the reason this file uses 0xC9 to
represent that character.)
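
To make the byte-level difference concrete, here is a small sketch. It uses Python only because its standard codecs make the comparison easy to run; it is not Cocoa code, and the variable names are just for illustration.

```python
# Illustration only: one byte, three interpretations.
raw = bytes([0xC9])

as_mac_roman = raw.decode("mac_roman")  # what a MacRoman default encoding yields
as_windows = raw.decode("cp1252")       # Windows Latin 1 (code page 1252)
as_latin1 = raw.decode("latin-1")       # ISO 8859-1, same values as U+0000..U+00FF

for label, ch in [("MacRoman", as_mac_roman),
                  ("Windows-1252", as_windows),
                  ("ISO Latin 1", as_latin1)]:
    print(f"{label:13} 0xC9 -> U+{ord(ch):04X} {ch}")

# In MacRoman, E-acute lives at a different byte value altogether.
e_acute_in_mac_roman = "\u00C9".encode("mac_roman")
```

Running it shows the same byte mapping to the ellipsis under MacRoman and to E-acute under the other two encodings.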
In an RTF file, there might be bytes representing characters in
different encodings, and only the fonts or other encoding tags can tell
you what the actual Unicode characters are. Interpreting the whole file
in one single encoding (as stringWithContentsOfFile: does) is not a
valid approach.
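
As a rough sketch of why the encoding tags matter, the Python snippet below decodes RTF \'hh hex escapes according to the character set the file itself declares. The helper names (declared_codec, decode_hex_escapes) and the sample strings are hypothetical. It only looks at the document-level \ansicpgN and \mac keywords, ignores per-font character sets entirely, and assumes Windows-1252 when nothing is declared, so it is nowhere near a real RTF reader.

```python
import re

def declared_codec(rtf_source: str) -> str:
    """Guess a Python codec name from document-level RTF charset keywords.

    Simplified assumption: only \\ansicpgN and \\mac are considered, and the
    fallback of cp1252 is a guess, not something the RTF format guarantees.
    """
    match = re.search(r"\\ansicpg(\d+)", rtf_source)
    if match:
        return "cp" + match.group(1)
    if re.search(r"\\mac(?![a-z])", rtf_source):
        return "mac_roman"
    return "cp1252"

def decode_hex_escapes(rtf_source: str) -> str:
    """Replace each \\'hh escape with the character it denotes in the declared charset."""
    codec = declared_codec(rtf_source)
    return re.sub(
        r"\\'([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]).decode(codec),
        rtf_source,
    )

# Hypothetical sample fragments: the same escape under two declared charsets.
windows_sample = r"{\rtf1\ansi\ansicpg1252 \'c9t\'e9}"
mac_sample = r"{\rtf1\mac \'c9}"

windows_text = decode_hex_escapes(windows_sample)
mac_text = decode_hex_escapes(mac_sample)
print(windows_text)
print(mac_text)
```

The same \'c9 escape comes out as E-acute in the first fragment and as an ellipsis in the second, which is the reason a single fixed encoding cannot be applied to the whole file.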
Two alternatives are to load the RTF file using the RTF methods in
AppKit/NSAttributedString.h, or to load the RTF file as pure data, using
-[NSData initWithContentsOfFile:]. The first will do all the work, and
if all you want is the text, you can easily get it from the
attributed string. The second will give you everything in the RTF and
let you parse it...
Ali