Re: Xcode Editor's Regex now uses PCRE instead of ICU?
- Subject: Re: Xcode Editor's Regex now uses PCRE instead of ICU?
- From: "Clark Cox" <email@hidden>
- Date: Thu, 13 Mar 2008 08:00:14 -0700
On Thu, Mar 13, 2008 at 4:01 AM, Thomas Engelmeier
<email@hidden> wrote:
>
> On 13.03.2008, at 08:37, Clark Cox wrote:
>
> > On Wed, Mar 12, 2008 at 5:40 PM, Alastair Houghton
> > <email@hidden> wrote:
> >>
> >> Yes, that's true. You can see the sources for CFString in the Darwin
> >> source tree. Furthermore, string constants (even @"" and CFSTR("")
> >> ones) are encoded in ASCII by the compiler, which makes 8-bit strings
> >> quite common in practice.
> >
> > FYI: As of Leopard, this is no longer necessarily true (i.e. the
> > string constants being ASCII). Full UTF-8 strings are now supported
> > within @"" and CFSTR("") strings, so there are cases where even these
> > strings are encoded as UTF-16 by the compiler.
>
> As of Leopard or as of Xcode 3.x?
As of Xcode 3 (or more precisely, the gcc that ships therewith).
> And, if I read the paragraph above correctly, the compiler will expand
> @"UTF-8 string" in the source code to UTF-16 in the string constant in
> the TEXT section?
I'm not sure of the exact criteria, as I haven't ever felt the need to
look into it, but I have seen some UTF-8 @"" strings stored as an
8-bit encoding in the binary, and some expanded and stored as UTF-16
in the binary. Of course this is an implementation detail and
shouldn't be relied upon. My point was just to indicate that one
cannot assume that an NSString/CFString obtained from a @""/CFSTR("")
literal is stored in an 8-bit encoding.
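
To make the two storage forms concrete, here is a rough sketch in plain C. It is only a model: the rule it uses (keep the literal 8-bit when every byte is ASCII, otherwise expand to UTF-16) is my assumption, not the compiler's documented criteria, and the helper names literal_is_ascii and utf8_to_utf16 are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION (not the compiler's documented rule): a literal may stay in an
 * 8-bit encoding only when every byte is 7-bit ASCII. */
static bool literal_is_ascii(const char *utf8)
{
    assert(utf8 != NULL);
    for (const unsigned char *p = (const unsigned char *)utf8; *p != 0; ++p) {
        if (*p >= 0x80)
            return false;
    }
    return true;
}

/* Expand a NUL-terminated, well-formed UTF-8 string into UTF-16 code units.
 * Returns the number of units written, or SIZE_MAX if `cap` is too small.
 * Malformed input is not detected; this is a sketch, not a validator. */
static size_t utf8_to_utf16(const char *utf8, uint16_t *out, size_t cap)
{
    assert(utf8 != NULL && out != NULL);
    const unsigned char *p = (const unsigned char *)utf8;
    size_t n = 0;

    while (*p != 0) {
        uint32_t cp;
        if (*p < 0x80) {                       /* 1-byte sequence (ASCII) */
            cp = p[0];
            p += 1;
        } else if ((*p & 0xE0) == 0xC0) {      /* 2-byte sequence */
            cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
            p += 2;
        } else if ((*p & 0xF0) == 0xE0) {      /* 3-byte sequence */
            cp = ((uint32_t)(p[0] & 0x0F) << 12) |
                 ((uint32_t)(p[1] & 0x3F) << 6) |
                 (uint32_t)(p[2] & 0x3F);
            p += 3;
        } else {                               /* 4-byte sequence */
            cp = ((uint32_t)(p[0] & 0x07) << 18) |
                 ((uint32_t)(p[1] & 0x3F) << 12) |
                 ((uint32_t)(p[2] & 0x3F) << 6) |
                 (uint32_t)(p[3] & 0x3F);
            p += 4;
        }

        if (cp < 0x10000) {                    /* fits in one UTF-16 unit */
            if (n + 1 > cap)
                return SIZE_MAX;
            out[n++] = (uint16_t)cp;
        } else {                               /* needs a surrogate pair */
            if (n + 2 > cap)
                return SIZE_MAX;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 + (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        }
    }
    return n;
}
```

Under that assumed rule, "hello" would stay 8-bit while "caf\xC3\xA9" would be expanded to four UTF-16 units. Either way, code that consumes the resulting string should go through the NSString/CFString API and not depend on which form the constant happens to have in the binary.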
--
Clark S. Cox III
email@hidden