Re: Regular Expressions?
- Subject: Re: Regular Expressions?
- From: Jens Alfke <email@hidden>
- Date: Fri, 6 Jun 2008 08:03:00 -0700
On 6 Jun '08, at 3:23 AM, Jason Stephenson wrote:
> As a long time UNIX programmer, I'll suggest looking into the regexp
> library that already comes with OS X.
> man regcomp on the command line to find out how to use.
It doesn't look as though this library is Unicode-aware. The strings
it takes are C strings (char*) with no indication of what encoding is
used, and neither Unicode nor UTF-8 is mentioned in the man page. From
that, I'd guess that this library only works with single-byte
encodings (like ISO-Latin-1 or CP-1252, not UTF-8 or the various
non-Roman encodings) and that it will treat all non-ASCII characters as
being neither spaces nor letters.
In short, I think it only works correctly with plain ASCII. IMHO
that's much too limited for most purposes nowadays. Even if you don't
touch user-visible text with it, it's still pretty common to find
non-ASCII characters in HTML, XML, even source code.
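To make the point about the interface concrete, here is a minimal sketch
in C of calling the POSIX regex functions. The helper names
posix_matches and byte_length are hypothetical, made up for this
illustration; only regcomp, regexec and regfree come from the library.
Note that nothing in the calls says what encoding the char* is in.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the library): returns 1 if `text` matches
   the POSIX extended regex `pattern`, 0 if it does not, and -1 if the
   pattern fails to compile. */
int posix_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* regcomp() is handed a bare char*; no encoding travels with it. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* REG_NOSUB was requested, so no match offsets are reported. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Hypothetical helper: the number of bytes the regex functions are given
   for `text`. For UTF-8 input this can exceed the number of characters. */
size_t byte_length(const char *text)
{
    return strlen(text);
}
```

With plain ASCII input this behaves as expected, e.g.
posix_matches("^[[:alpha:]]+$", "hello") returns 1. A UTF-8 string such
as "caf\xc3\xa9" is handed over as five bytes for four characters, and
whether the last two bytes count as a letter depends on the locale and
the implementation, so I wouldn't rely on it.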
Of the regex libraries mentioned so far, I recommend RegexKitLite.
It's based on ICU, which is Unicode-savvy, already built into the OS,
and used by lots of Apple apps.
—Jens