Re: How to detect string encoding before reading a file in NSString?
- Subject: Re: How to detect string encoding before reading a file in NSString?
- From: Laurent Daudelin <email@hidden>
- Date: Wed, 27 Apr 2011 12:08:38 -0700
On Apr 26, 2011, at 11:43, Nick Zitzmann wrote:
> On Apr 26, 2011, at 12:13 PM, Laurent Daudelin wrote:
>
>> I've found different ways to do that (some pure Cocoa, some using Carbon) but I was wondering about the wisdom of this list as to what is the best way to detect the encoding of a file before passing it to NSString initWithContentsOfFile:encoding:error:?
>
> TextEdit's encoding guesser just uses the built-in NSAttributedString method -initWithURL:options:documentAttributes:error:, which will guess the file's encoding when opening it. But it has been mentioned that heuristics are not infallible, and this method's heuristics are no exception. It does a good job overall, but I've found that it usually misinterprets UTF-8 format text.
I finally got around to building a little test program and tried it on a few files that a coworker sent me from a Windows machine, compressed into a zip archive. I unarchived the files and fed them to my test program. All of them failed mightily: just a bunch of symbols, showing clearly that the framework cannot guess the encoding. Each file had a specific encoding, but every time the framework guessed NSMacOSRomanStringEncoding, which is plainly wrong. I tried opening the files in TextEdit and they show up the same way, with the wrong encoding.
Looks like I'll have to look into the other suggested alternatives.
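
For reference, one common alternative that does not rely on the framework's guesser is to inspect the raw bytes first: look for a Unicode byte-order mark, and if there is none, check whether the data is well-formed UTF-8. The following is only a minimal sketch of that idea in plain C (so it can be called from Objective-C). The names sniffed_encoding, is_valid_utf8 and sniff_encoding are hypothetical and not part of any Apple framework, and the sketch cannot tell legacy 8-bit encodings (Windows code pages, Mac OS Roman, and so on) apart from one another.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not an Apple API. */
typedef enum {
    ENC_UNKNOWN = 0,  /* no BOM and not valid UTF-8: caller must guess or ask */
    ENC_UTF8,
    ENC_UTF16_LE,
    ENC_UTF16_BE,
    ENC_UTF32_LE,
    ENC_UTF32_BE
} sniffed_encoding;

/* Returns true if buf[0..len) is well-formed UTF-8. Rejects overlong
   forms, UTF-16 surrogates, and code points above U+10FFFF. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;          /* number of continuation bytes expected */
        unsigned long cp, min; /* decoded code point and its minimum legal value */

        if (c < 0x80) { i++; continue; }  /* plain ASCII byte */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;                /* stray continuation or invalid lead byte */

        if (len - i <= extra) return false;  /* sequence truncated at end of data */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

/* Checks for a byte-order mark first, then falls back to UTF-8 validation. */
sniffed_encoding sniff_encoding(const unsigned char *buf, size_t len)
{
    /* UTF-32 LE must be tested before UTF-16 LE: FF FE 00 00 starts with FF FE. */
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE && buf[2] == 0x00 && buf[3] == 0x00)
        return ENC_UTF32_LE;
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 && buf[2] == 0xFE && buf[3] == 0xFF)
        return ENC_UTF32_BE;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;

    /* No BOM: pure ASCII and BOM-less UTF-8 both pass this check. */
    if (is_valid_utf8(buf, len))
        return ENC_UTF8;
    return ENC_UNKNOWN;
}
```

In a Cocoa program the bytes could come from an NSData, and the result could be mapped to NSUTF8StringEncoding, NSUTF16LittleEndianStringEncoding and so on before creating the string. For ENC_UNKNOWN there is no reliable answer from the bytes alone; for files coming from a Windows machine, trying NSWindowsCP1252StringEncoding is one plausible fallback, but that is an assumption about the files, not something the check can confirm. Note also that a UTF-16 LE file whose first character after the BOM is U+0000 would be reported as UTF-32 LE by this sketch.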
-Laurent.
--
Laurent Daudelin
AIM/iChat/Skype:LaurentDaudelin http://www.nemesys-soft.com/
Logiciels Nemesys Software email@hidden
_______________________________________________
Cocoa-dev mailing list (email@hidden)