Re: Regular Expressions?
- Subject: Re: Regular Expressions?
- From: glenn andreas <email@hidden>
- Date: Fri, 06 Jun 2008 10:13:10 -0500
On Jun 6, 2008, at 5:23 AM, Jason Stephenson wrote:
> Hi,
> You've gotten a lot of decent answers so far.
> As a long time UNIX programmer, I'll suggest looking into the regexp
> library that already comes with OS X.
> Run "man regcomp" on the command line to find out how to use it.
Note that NSStrings are usually stored internally as UTF-16, and
regcomp requires a "char *", so at the very least you'll need to
convert the NSString to UTF-8. That can be expensive: you have to make
a copy of a potentially very large string and walk through it before
doing any regex work on it.
Worse, once the string is converted to UTF-8, it's not documented that
regcomp works correctly on any UTF-8 input other than plain ASCII.
Even worse, converting from an index in the UTF-8 string back to the
corresponding index in the original NSString is also problematic - you
basically have to walk through the UTF-8 string, counting code points
(those outside the BMP count double, since NSString stores them as
surrogate pairs).
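To make that walk concrete, here is a minimal sketch in C. The function name utf16_index_for_utf8_offset is made up for illustration, and the code assumes the buffer is valid UTF-8 and that the offset falls on a character boundary; it does no validation.

```c
#include <stddef.h>

/* Hypothetical helper: converts a byte offset into a UTF-8 buffer to the
 * matching index in UTF-16 code units (the unit NSString indices use).
 * Assumes `utf8` is valid UTF-8 and `byte_offset` lies on a character
 * boundary; no validation is performed. */
size_t utf16_index_for_utf8_offset(const char *utf8, size_t byte_offset)
{
    size_t units = 0;   /* UTF-16 code units counted so far */
    size_t i = 0;       /* current byte position in the UTF-8 buffer */

    while (i < byte_offset) {
        unsigned char lead = (unsigned char)utf8[i];

        if (lead < 0x80) {          /* 1 byte: ASCII */
            i += 1;
            units += 1;
        } else if (lead < 0xE0) {   /* 2-byte sequence */
            i += 2;
            units += 1;
        } else if (lead < 0xF0) {   /* 3-byte sequence */
            i += 3;
            units += 1;
        } else {                    /* 4 bytes: outside the BMP, so it is
                                       a surrogate pair in UTF-16 */
            i += 4;
            units += 2;
        }
    }
    return units;
}
```

Each lookup costs time proportional to the offset, so translating many match positions in a long string this way adds up quickly.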
As a result, using regcomp works OK for shorter strings that are pure
ASCII to start with, but longer strings or non-ASCII characters make
these problems progressively worse...
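For the short, pure-ASCII case, the calls involved look roughly like the sketch below. The wrapper name first_match and its return convention are my own invention for illustration; regcomp, regexec, regfree, regmatch_t and REG_EXTENDED are the standard POSIX pieces described in the man page.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: finds the first match of `pattern` (POSIX
 * extended syntax) in `text`.
 * Returns 1 on a match and stores byte offsets in *start and *end,
 * 0 if nothing matched (regexec errors are treated the same way),
 * and -1 if the pattern fails to compile. */
int first_match(const char *pattern, const char *text,
                size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);   /* always release the compiled pattern */

    if (rc != 0)
        return 0;

    *start = (size_t)m[0].rm_so;
    *end = (size_t)m[0].rm_eo;
    return 1;
}
```

The rm_so and rm_eo values are byte offsets into the char buffer, so they only line up with NSString indices when the text is pure ASCII.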
One other possible solution is to use JavaScriptCore: make a
JSStringRef (which works with unichars, like NSString) and use
JavaScript's regex support - that way the results will at least have
consistent indices, work well with non-ASCII characters, etc.
Glenn Andreas email@hidden
<http://www.gandreas.com/> wicked fun!
JSKit | the easy way to unite JavaScript and Objective C