Re: Tao of string encodings (Re: Converting ASCII to UTF-8?)
- Subject: Re: Tao of string encodings (Re: Converting ASCII to UTF-8?)
- From: Andrew Thompson <email@hidden>
- Date: Wed, 31 Mar 2004 08:58:56 -0500
On Mar 30, 2004, at 5:37 PM, Marco Scheurer wrote:
On Mar 30, 2004, at 11:49 PM, Jim Rankin wrote:
On Mar 29, 2004, at 7:57 PM, Shawn Erickson wrote:
You just need to read things in correctly using the correct encoding
(an encoding matching the source file).
Here's the one piece of the whole encoding puzzle that I've never been
able to figure out.
Your program is handed a file path or URL that's ostensibly a text
file. From that point, how do you know the encoding of what you've
just been handed?
Probably missing something obvious...
No, you just can't without more information. That's why in TextEdit
you can manually specify the encoding to use to open a file. That
being said, there are heuristics for guessing a text encoding.
In one program I've simply used trial and error, using NSString's
-initWithData:encoding: (which returns nil if it fails to create a
valid string with the supplied data) and a small set of likely
encodings with good results.
Or google for sniffer and encoding.
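For illustration, the trial-and-error idea above could look roughly like the sketch below. It is written in Python rather than Cocoa, so bytes.decode() raising UnicodeDecodeError stands in for -initWithData:encoding: returning nil. The function name and the default candidate list are made up for the example; a real program would pick candidates that match its likely input.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical default candidates, strictest first. latin-1 accepts any
# byte sequence, so it only makes sense as the last resort.
DEFAULT_CANDIDATES = ("ascii", "utf-8", "latin-1")


def decode_by_trial(
    data: bytes, candidates: Sequence[str] = DEFAULT_CANDIDATES
) -> Optional[Tuple[str, str]]:
    """Return (text, encoding_name) for the first candidate that decodes
    the data without error, or None if every candidate fails."""
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
    return None
```

Order matters: a permissive single-byte encoding placed early will "succeed" on almost anything, and a successful decode only shows the bytes are valid in that encoding, not that it is the one the author used.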
Mozilla has a fairly good built-in sniffer (View->Character
Coding->Auto-detect->Universal).
It took them a while to develop and of course it's not perfect. There's
the problem that it assumes HTTP, where you may get a char encoding
header to go on. Of course, that's a mixed blessing because many
servers are misconfigured and serve the wrong one!
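To make the HTTP point concrete, here is a rough Python sketch (not how Mozilla does it) that treats the charset in a Content-Type header as a hint only: try it first, then fall back to other candidates if the bytes don't decode. The function names and the fallback list are invented for the example.

```python
from email.message import Message
from typing import Optional, Sequence, Tuple


def declared_charset(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value,
    lowercased, or None if the header is absent or names no charset."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(
    data: bytes,
    content_type: Optional[str],
    fallbacks: Sequence[str] = ("utf-8", "latin-1"),  # hypothetical defaults
) -> Optional[Tuple[str, str]]:
    """Try the server-declared charset first, then the fallbacks.
    Returns (text, encoding_name), or None if nothing decodes."""
    candidates = []
    declared = declared_charset(content_type)
    if declared:
        candidates.append(declared)
    candidates.extend(fallbacks)
    for name in candidates:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this charset, or a charset name
            # Python does not recognise: move on to the next candidate.
            continue
    return None
```

This only catches a wrong header when the bytes are actually invalid in the declared charset. A server that declares ISO-8859-1 for UTF-8 data will still "decode" fine and give you garbage, which is where a real sniffer earns its keep.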
AndyT (lordpixel - the cat who walks through walls)
A little bigger on the inside
(see you later space cowboy ...)
_______________________________________________
cocoa-dev mailing list | email@hidden