Re: Determine encoding of file
- Subject: Re: Determine encoding of file
- From: Nick Zitzmann <email@hidden>
- Date: Fri, 30 Jul 2010 16:24:56 -0600
On Jul 30, 2010, at 4:09 PM, Dave DeLong wrote:
> Hi everyone,
>
> I have a seemingly simple question, but I haven't been able to figure it out.
>
> Given a file, how can I determine the NSStringEncoding of the file, without reading the entire file into memory? (If the file isn't a text file, then defaulting to NSUTF8StringEncoding is just fine, since my code will only work properly if I'm working with text files anyway)
>
> I've found this: http://www.macosxguru.net/article.php?story=20030808081801868 but it seems ridiculously complex...
Check the first two bytes. If they are 0xFE 0xFF or 0xFF 0xFE, the file starts with a byte-order mark and is almost certainly in Unicode (UTF-16) format. Otherwise, it could be in pretty much any encoding, since encodings that are not Unicode generally don't carry an identifier of any sort.
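A rough sketch of that check in plain C, reading at most four bytes so the size of the file does not matter. The enum and function names (bom_kind, bom_from_bytes, bom_from_file) are made up for illustration and are not Cocoa API. The UTF-8 and UTF-32 marks go beyond the two-byte check described above and are included as an assumption about what a caller might also want to recognize. Note that FF FE 00 00 is ambiguous: it could also be UTF-16 little-endian text that begins with a NUL character, so treat the UTF-32 result as a guess.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type for this sketch; not part of any Apple API. */
typedef enum {
    BOM_NONE,
    BOM_UTF8,
    BOM_UTF16_BE,
    BOM_UTF16_LE,
    BOM_UTF32_BE,
    BOM_UTF32_LE
} bom_kind;

/* Classify the first n bytes of a buffer by their byte-order mark, if any. */
static bom_kind bom_from_bytes(const unsigned char *b, size_t n)
{
    /* Test the 4-byte UTF-32 marks first: FF FE 00 00 also begins with FF FE. */
    if (n >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
        return BOM_UTF32_BE;
    if (n >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
        return BOM_UTF32_LE;
    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return BOM_UTF8;
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return BOM_UTF16_BE;
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return BOM_UTF16_LE;
    return BOM_NONE;
}

/* Read only the first four bytes of the file at path and classify them.
   An unreadable file is reported as BOM_NONE. */
static bom_kind bom_from_file(const char *path)
{
    unsigned char head[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return BOM_NONE;
    size_t n = fread(head, 1, sizeof head, f);
    fclose(f);
    return bom_from_bytes(head, n);
}
```

A caller would then map the result to an NSStringEncoding and, for BOM_NONE, fall back to NSUTF8StringEncoding as the original question suggests.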
NSAttributedString has a heuristic importer in its AppKit category that tries to guess the encoding of the file, -initWithData:options:documentAttributes:error:, but it is not perfect (in particular, it tends to mistake data in UTF-8 format for something else). So if it's not Unicode, then you're probably better off just asking the user to tell the program the encoding of the file. We ended up doing this in several of our apps.
Nick Zitzmann
<http://www.chronosnet.com/>
_______________________________________________
Cocoa-dev mailing list (email@hidden)