Re: How to convert UInt8 array to NSString
- Subject: Re: How to convert UInt8 array to NSString
- From: "Clark Cox" <email@hidden>
- Date: Tue, 6 May 2008 10:56:55 -0700
On Tue, May 6, 2008 at 10:45 AM, Aki Inoue <email@hidden> wrote:
>
> On 2008/05/06, at 8:56, Jens Alfke wrote:
>
>
> >
> > On 6 May '08, at 7:03 AM, Thomas Engelmeier wrote:
> >
> >
> > > As the OP wants to create NSStrings with data created by his application,
> > > I'm pretty sure he will not want the Windows encoding - unless he parses
> > > text documents originating from Windows.
> > >
> >
> > He didn't say where the data originates from, or what those APIs are that
> > return the strings. If they're networking APIs, the data could very likely
> > have originated on Windows.
> >
> > Also, you missed my point about using CP1252 (WinLatin1). It's useful as a
> > fallback for any unknown C strings because (a) it's a superset of
> > ISO-Latin-1, which (b) has no gaps in it (as ISO does, from 0x80-0x9F), so
> > decoding text into an NSString will never fail and return nil. (I've
> > debugged several crashes that stemmed from nil NSStrings decoded from
> > garbage strings.)
> >
> Jens,
>
> Actually, I don't recommend using CP1252 as the generic fallback encoding
> like this.
>
> The encoding does have gaps, and the handling of those invalid bytes varies
> between conversion engines. CF/NSString treat the invalid bytes strictly
> and return nil on encountering them.
>
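The gap claim above can be checked empirically. The following is only a rough sketch in Python, not Cocoa: it assumes Python's built-in `cp1252` and `mac_roman` codecs are a reasonable stand-in for the converters discussed here, and the CF/NSString tables may differ in detail. The helper name `undefined_bytes` is made up for this illustration.

```python
# Illustration only: Python's built-in codecs stand in for the CF/NSString
# converters discussed in this thread; their tables may not match exactly.

def undefined_bytes(encoding):
    """Return the byte values that fail a strict single-byte decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            missing.append(value)
    return missing

# Collect the unmapped byte values for each candidate fallback encoding.
cp1252_gaps = undefined_bytes("cp1252")
mac_roman_gaps = undefined_bytes("mac_roman")

print("cp1252 gaps:   ", [hex(b) for b in cp1252_gaps])
print("mac_roman gaps:", [hex(b) for b in mac_roman_gaps])
```

Any byte value listed as a gap is one that a strict converter would reject, which is the situation in which a strict NSString decode would come back nil.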
> Also, being compatible with ISO Latin1 (aka ISO 8859-1) is becoming a less
> compelling reason on the Net, since the overall share of those encodings
> (ISO 8859-1 and CP1252 combined) is declining.
Not just declining, completely overtaken by UTF-8:
<http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html>
>
>
>
> >
> > > If the bytes come from MacOS text files he may want to use the MacRoman
> > > encoding, otherwise creating UTF8 and passing around NSStrings will be the
> > > way to go - especially in Europe where all those äöüñá goodies exist.
> > >
> >
> > For the most part only old (pre-OS X) files would still be using MacRoman.
> > Current Mac apps generally default to UTF-8.
> >
> So, our recommendation now is to try UTF-8 first; then try some other
> encoding deduced from the context (user's localization, intended
> source/destination of the data, etc). If all of those fail, try MacRoman as
> the ultimate fallback (the encoding has no gaps, so decoding never fails).
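
The order recommended above (UTF-8, then context-deduced encodings, then MacRoman) might be sketched like this. It is a Python illustration rather than Cocoa code; the function name `decode_with_fallbacks` and its `context_encodings` parameter are hypothetical, and Python's codecs are assumed to behave strictly in the same way the CF/NSString converters are described as doing.

```python
# Sketch of the fallback order: UTF-8, then caller-supplied guesses,
# then MacRoman. Python codecs stand in for the CF/NSString converters.

def decode_with_fallbacks(data, context_encodings=()):
    """Return (text, encoding_used) for the first encoding that decodes.

    `context_encodings` is a hypothetical parameter holding guesses derived
    from context (localization, data source, ...). Names must be codecs
    Python knows, otherwise a LookupError propagates.
    """
    candidates = ["utf-8", *context_encodings, "mac_roman"]
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # strict decode failed; try the next candidate
    # Only reachable if the final fallback also rejects some byte.
    raise ValueError("no candidate encoding could decode the data")

text, used = decode_with_fallbacks("äöüñá".encode("utf-8"))
print(used, text)
```

Trying UTF-8 first is safe because arbitrary 8-bit text is unlikely to form valid multi-byte UTF-8 sequences by accident, so a successful strict decode is a strong signal.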
--
Clark S. Cox III
email@hidden
_______________________________________________
Cocoa-dev mailing list (email@hidden)