Re: Is there any support in Cocoa for stupidly encoded UTF-8 string?

  • Subject: Re: Is there any support in Cocoa for stupidly encoded UTF-8 string?
  • From: Clark Cox <email@hidden>
  • Date: Thu, 20 Jan 2005 15:15:04 -0500

On Thu, 20 Jan 2005 11:42:00 -0800, Andrew Farmer <email@hidden> wrote:
> On 20 Jan 2005, at 09:26, Stephane Sudre wrote:
> > In some e-mail subjects, people are using text that is supposed to be
> > UTF-8 encoded but is actually badly encoded Unicode.
> >
> > For instance, instead of 0xC3A9 for eacute, you end up with 0xE9
> > (where it should be 0x00E9).
> >
> > When you use NSString initWithBytes:length:encoding: with the UTF-8
> > encoding as the parameter, you obtain nil. I understand this.
> >
> > Now, the question is: is there a method in Cocoa to deal with stupidly
> > encoded UTF-8 string?
>
> What you're looking at is ISO-8859-1 encoded text. Decode it as such and
> you'll be fine.
>
> I'm pretty sure that there *should* be some easy way to detect whether
> text in the subject is encoded with ISO-8859-1 or UTF-8. Look up the
> standards (if they exist).

While you can make some educated guesses, there is no foolproof way to
conclusively determine if text is UTF-8 vs. ISO-8859-1. The best guess
that you can make is already made by NSString for you: It couldn't
convert the text and returned nil.
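
To illustrate why this can only ever be a guess, here is a minimal sketch in Python (not Cocoa, chosen only so that it runs anywhere; the helper name `looks_like_utf8` is made up for this example). Strict UTF-8 decoding rejects a lone 0xE9, which is the same kind of check that makes NSString return nil, but the two bytes 0xC3 0xA9 are valid in both encodings and simply mean different things.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")  # strict decoding: raises on malformed input
        return True
    except UnicodeDecodeError:
        return False


lone = b"\xe9"      # eacute as a single ISO-8859-1 byte
pair = b"\xc3\xa9"  # eacute as a two-byte UTF-8 sequence

print(looks_like_utf8(lone))      # False: cannot be UTF-8
print(looks_like_utf8(pair))      # True: well-formed UTF-8
print(pair.decode("utf-8"))       # é
print(pair.decode("iso-8859-1"))  # Ã©  (also a legal reading of the same bytes)
```

A failed UTF-8 decode is strong evidence that the text is something else, but a successful one does not prove the sender meant UTF-8.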

When you receive that nil, you have to guess at the format. If your
data is likely to be coming from Western Europeans or Americans, then
ISO-8859-1 is probably a good backup guess.
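
The try-then-fall-back approach can be sketched as follows. This is an illustration in Python rather than Cocoa, and `decode_subject` is a hypothetical name; in Cocoa the analogous step would presumably be to retry initWithBytes:length:encoding: with NSISOLatin1StringEncoding after the UTF-8 attempt returns nil. The choice of ISO-8859-1 as the fallback is itself an assumption about who is sending the mail.

```python
def decode_subject(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode raw as UTF-8, falling back to a guessed legacy encoding.

    Hypothetical helper: the fallback encoding is a guess, not a detection.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value to a character, so with the
        # default fallback this second attempt cannot fail.
        return raw.decode(fallback)


print(decode_subject(b"caf\xc3\xa9"))  # café (valid UTF-8)
print(decode_subject(b"caf\xe9"))      # café (invalid UTF-8, read as ISO-8859-1)
```

Because ISO-8859-1 accepts any byte sequence, the fallback always yields a string, which also means it will silently produce wrong characters if the text was really in some other 8-bit encoding.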

--
Clark S. Cox III
email@hidden
http://www.livejournal.com/users/clarkcox3/
http://homepage.mac.com/clarkcox3/