Re: String Encoding Detection (Revisited)
- Subject: Re: String Encoding Detection (Revisited)
- From: Francisco Tolmasky <email@hidden>
- Date: Thu, 7 Aug 2003 20:39:53 -0700
How do I determine whether the data is big endian or little endian? Or
is checking for both FEFF and FFFE enough? Also, is there no big/little
endian difference in UTF-8? (Do I check for any one of EF, BB, BF, or
for all of them, one after the other?)
On Thursday, August 7, 2003, at 10:11 AM, Dustin Voss wrote:
On Thursday, August 7, 2003, at 01:44 AM, Francisco Tolmasky wrote:
OK, so I recently posted a question about auto-detecting string
encodings, and also looked through the archives. Basically, there's
no way unless the text is Unicode and has a BOM. I still want an
auto-detect feature though, like BBEdit's. So, how do I check for a
BOM? (I checked TextEdit's code and couldn't find it, though I found
lots of other stuff.) Other than that, and the spell-checking idea
someone suggested (using the spell checker to see whether the string
makes sense, which would be pretty useless if it's code or anything
other than plain sentences), are there any other "tricks"?
And when all else fails and I resort to just picking an encoding, which
one should I choose: Mac OS Roman, ASCII, or UTF-8?
I don't know about tricks, but the BOM will be one of the following:
UTF-16 BE: FE FF
UTF-16 LE: FF FE
UTF-8: EF BB BF
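As a rough illustration of checking for these three BOMs, here is a minimal sketch in Python (not Cocoa code). The function name detect_bom and the codec labels are assumptions made for the example. The bytes must match as a whole sequence at the very start of the data: FE FF first means big endian, FF FE first means little endian, and UTF-8 has no byte order at all, so its three bytes always appear in the same order.

```python
import codecs

# BOM byte sequences from the list above, paired with Python codec names.
# The names on the right are labels chosen for this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(data):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found.

    Hypothetical helper: the whole BOM must appear, in order, at offset 0.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

A caller would typically skip the BOM before decoding, for example: encoding, n = detect_bom(data), then data[n:].decode(encoding) when encoding is not None.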
You could use Carbon's Text Encoding Converter. It supports
something called a "sniffer" that analyzes text and tries to determine
the likeliest encoding. It looks pretty powerful.
Conceptual information is at
<http://developer.apple.com/documentation/Carbon/Conceptual/
ProgWithTECM/index.html>, but it does not discuss sniffers.
The API reference is at
<http://developer.apple.com/documentation/Carbon/Reference/
Text_Encodin_sion_Manager/index.html>.
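On the question of which encoding to fall back to when there is no BOM, one possible approach is sketched below in Python. It is an assumption about a reasonable default, not something confirmed in this thread and not a description of how BBEdit works. The idea is that pure ASCII is always valid UTF-8, and text in a legacy 8-bit encoding with accented characters is rarely valid UTF-8 by accident, so a strict UTF-8 decode can be tried first, with Mac OS Roman as the last resort. The name decode_with_fallback is hypothetical.

```python
def decode_with_fallback(data, fallback="mac_roman"):
    """Return (text, encoding_used) for BOM-less data.

    Tries strict UTF-8 first (which also covers plain ASCII); if the bytes
    are not valid UTF-8, decodes with the fallback codec instead.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # errors="replace" guards against a fallback codec that leaves
        # some byte values unmapped.
        return data.decode(fallback, errors="replace"), fallback
```

This only guesses between UTF-8 and a single fallback; it cannot tell one 8-bit encoding from another, which is where something like a sniffer would still be needed.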
Francisco Tolmasky
email@hidden
http://users.adelphia.net/~ftolmasky
_______________________________________________
cocoa-dev mailing list | email@hidden