Re: Tao of string encodings (Re: Converting ASCII to UTF-8?)


  • Subject: Re: Tao of string encodings (Re: Converting ASCII to UTF-8?)
  • From: Ralph Pöllath <email@hidden>
  • Date: Wed, 31 Mar 2004 16:24:35 +0200

On 31.03.2004, at 15:58, Andrew Thompson wrote:

On Mar 30, 2004, at 5:37 PM, Marco Scheurer wrote:

On Mar 30, 2004, at 11:49 PM, Jim Rankin wrote:

On Mar 29, 2004, at 7:57 PM, Shawn Erickson wrote:

You just need to read things in correctly using the correct encoding
(an encoding matching the source file).

Here's the one piece of the whole encoding puzzle that I've never been
able to figure out.

Your program is handed a file path or URL that's ostensibly a text
file. From that point, how do you know the encoding of what you've
just been handed?

Probably missing something obvious...

No, you just can't without more information. That's why in TextEdit you can manually specify the encoding to use to open a file. That being said, there are heuristics to guess a text encoding.
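One cheap heuristic of that kind is checking for a byte-order mark (BOM) at the start of the data. The sketch below is only an illustration: EncodingFromBOM is a hypothetical helper name, and returning 0 for "no BOM found" is a convention assumed here, not something Foundation defines.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the encoding implied by a leading byte-order
// mark, or 0 (an assumed "unknown" value) when no recognized BOM is present.
static NSStringEncoding EncodingFromBOM(NSData *data)
{
    const unsigned char *bytes = (const unsigned char *)[data bytes];
    NSUInteger length = [data length];

    // UTF-8 BOM: EF BB BF
    if (length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF) {
        return NSUTF8StringEncoding;
    }
    // UTF-16 BOM: FE FF (big-endian) or FF FE (little-endian).
    // Caveat: a UTF-32 little-endian BOM (FF FE 00 00) also matches here.
    if (length >= 2 && ((bytes[0] == 0xFE && bytes[1] == 0xFF) ||
                        (bytes[0] == 0xFF && bytes[1] == 0xFE))) {
        return NSUnicodeStringEncoding;
    }
    return 0;
}
```

Most files carry no BOM at all, so a 0 result means "keep guessing", not "plain ASCII".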

In one program I simply used trial and error, with good results: NSString's -initWithData:encoding: (which returns nil if it fails to create a valid string from the supplied data), tried against a small set of likely encodings.
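A minimal sketch of that trial-and-error approach, assuming ARC. The helper name StringByTryingEncodings and the particular candidate list are assumptions made for illustration, not the code from the program mentioned above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (assumes ARC). Tries each candidate encoding in order
// and returns the first string that decodes, or nil if none does.
static NSString *StringByTryingEncodings(NSData *data, NSStringEncoding *usedEncoding)
{
    // Strict encodings go first. ISO Latin 1 maps every byte value to a
    // character, so it never fails and only makes sense as the last resort.
    const NSStringEncoding candidates[] = {
        NSUTF8StringEncoding,
        NSISOLatin1StringEncoding,
    };
    const NSUInteger count = sizeof(candidates) / sizeof(candidates[0]);

    for (NSUInteger i = 0; i < count; i++) {
        // -initWithData:encoding: returns nil when the bytes are not valid
        // in the given encoding.
        NSString *string = [[NSString alloc] initWithData:data encoding:candidates[i]];
        if (string != nil) {
            if (usedEncoding != NULL) {
                *usedEncoding = candidates[i];
            }
            return string;
        }
    }
    return nil;
}
```

A successful decode only shows that the bytes are legal in that encoding, not that the guess is right, which is why the order of the candidate list matters.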

Or google for sniffer and encoding.

Mozilla has a fairly good built-in sniffer (View->Character Coding->Auto-detect->Universal).
It took them a while to develop and of course it's not perfect. There's the problem that it assumes HTTP, where you may get a character encoding header to go on. Of course, that's a mixed blessing because many servers are misconfigured and serve the wrong one!
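When a charset name from such a header is available, it can be mapped to an NSStringEncoding through Core Foundation. This is only a sketch, assuming ARC: EncodingFromCharsetName is a hypothetical helper, and returning 0 for an unrecognized name is a convention assumed here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (assumes ARC). Maps an IANA charset name such as
// "utf-8" or "iso-8859-1" to an NSStringEncoding, or returns 0 (an assumed
// "unknown" value) when the name is missing or not recognized.
static NSStringEncoding EncodingFromCharsetName(NSString *name)
{
    if (name == nil) {
        return 0;
    }
    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)name);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        return 0;
    }
    return CFStringConvertEncodingToNSStringEncoding(cfEncoding);
}
```

Because the server may declare the wrong charset, it seems safer to treat the result as the first guess and still check that the data actually decodes with it.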

http://developer.apple.com/documentation/Carbon/Reference/ Text_Encodin_sion_Manager/tec_refchap/function_group_11.html

looks like Carbon supports encoding sniffers.

Cheers,
-Ralph.

