Re: [Q] What encoding method can be automatically detected?
- Subject: Re: [Q] What encoding method can be automatically detected?
- From: JongAm Park <email@hidden>
- Date: Thu, 01 Feb 2007 08:56:12 -0800
Ricky Sharp wrote:
The docs on that API mention that encodings are only returned when the file contains BOMs. Since UTF-8 BOMs are optional, that older method would probably not be able to detect UTF-8 files lacking BOMs.
Is that documented? I read
http://developer.apple.com/documentation/Cocoa/Conceptual/Strings/Articles/readingFiles.html#//apple_ref/doc/uid/TP40003459
and
http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html#//apple_ref/occ/clm/NSString/stringWithContentsOfFile:usedEncoding:error:
but I don't see it there.
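If the method really does rely only on BOMs, as suggested above, the check amounts to something like the following. This is only a rough sketch in Python of what BOM-only detection implies, not NSString's actual implementation; the function name encoding_from_bom is made up for illustration.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM starts with the same
# two bytes as the UTF-16 LE BOM, so order matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def encoding_from_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A UTF-8 file saved without a BOM, or any file in a legacy encoding, yields None here, which would match the behavior described in the quoted text.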
Anyway...
Definitely try to migrate your users to using UTF-8 for their documents; makes life so much easier.
Yes, but my app is meant to work with encodings other than Unicode.
However, thank you for your explanation. It was very helpful.
At home, I built Firefox from source on my iMac. The Firefox source
code includes a chardet library that detects the character encoding of
a given string. Because it relies on statistics, it gives more accurate
results when the string is long enough.
Since I work with text files, the input should be long enough.
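To show why the amount of input matters, here is a much simpler stand-in for what such a detector does. It is not chardet's algorithm: it only tries a strict decode under each candidate encoding and keeps the ones that do not fail. The function name plausible_encodings and the candidate list are assumptions chosen for illustration.

```python
def plausible_encodings(data: bytes,
                        candidates=("utf-8", "euc_kr", "shift_jis")):
    """Return the candidate encodings under which data decodes cleanly.

    Hypothetical helper: trial decoding only, no statistics.
    """
    survivors = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        survivors.append(name)
    return survivors
```

Short input can leave more than one candidate standing, since the same bytes may be valid in several legacy encodings. Choosing among the survivors is where a statistical library such as chardet comes in, and more text gives it more evidence to work with.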
I also found IBM's ICU library. It contains a character encoding
detection library, but it is implemented in Java.
It seems to be easier to use.
So I guess it would be better to use one of those libraries instead of
NSString's method.
Thank you.
_______________________________________________
Cocoa-dev mailing list (email@hidden)