Re: Is there any support in Cocoa for stupidly encoded UTF-8 string?
- Subject: Re: Is there any support in Cocoa for stupidly encoded UTF-8 string?
- From: Bob Ippolito <email@hidden>
- Date: Thu, 20 Jan 2005 16:30:11 -0500
On Jan 20, 2005, at 4:23 PM, email@hidden wrote:
On Jan 20, 2005, at 11:42 AM, Andrew Farmer wrote:
What you're looking at is ISO8859-1 encoded text. Decode it as such
and you'll be fine.
I'm pretty sure that there *should* be some easy way to detect
whether text in the subject is encoded with ISO8859-1 or UTF-8. Look
up the standards (if they exist).
You would think so, but there isn't. UTF-8 uses multi-byte sequences
to encode characters beyond ASCII. It signifies that a character is
multi-byte by setting the high bit of the first byte.
What about TECM?
http://developer.apple.com/documentation/Carbon/Conceptual/ProgWithTECM/
http://developer.apple.com/documentation/Carbon/Reference/Text_Encodin_sion_Manager/
I think it knows how to guess better than you or I do :)
I wrapped some of this in Python a while ago
<http://undefined.org/python/#TECManager>.
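For comparison, here is a minimal sketch of the crudest possible guess, using only Python's built-in codecs. It does not use TECM or the TECManager wrapper, and the function name guess_decode is made up for illustration. It assumes the text can only be UTF-8 or ISO8859-1: strict UTF-8 decoding rejects malformed multi-byte sequences, so most ISO8859-1 text with accented characters fails that step and falls through to the second branch.

```python
def guess_decode(raw):
    """Return (text, encoding_name) for a byte string.

    Hypothetical helper: tries strict UTF-8 first, then ISO8859-1.
    """
    try:
        # Strict decoding raises on any invalid UTF-8 byte sequence.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value maps to a character in ISO8859-1,
        # so this fallback cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"
```

This is only a heuristic. A byte string that happens to be valid UTF-8 will always be reported as UTF-8 even if it was written as ISO8859-1, which is why a real guesser such as TECM may do better.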
-bob
Cocoa-dev mailing list (email@hidden)