[ANN] RegexKitLite 3.0 - Release Candidate, feedback wanted
- Subject: [ANN] RegexKitLite 3.0 - Release Candidate, feedback wanted
- From: John Engelhart <email@hidden>
- Date: Thu, 30 Apr 2009 04:04:40 -0400
All,
RegexKitLite 3.0 development is wrapping up. I'm looking for feedback
before I freeze things in an actual release, particularly from current
RegexKitLite users. Because it's not released yet, you'll need to grab it
via svn. You can do so from the shell with 'svn co
http://regexkit.svn.sourceforge.net/svnroot/regexkit/RegexKitLite'. I'm
currently finishing up the documentation, and areas that are 'under
construction' have a red border around them.
Highlights of what's new in 3.0:
As always, bug fixes. :)
New methods added:
- (NSArray *)arrayOfCaptureComponentsMatchedByRegex:(NSString *)regex;
- (NSArray *)captureComponentsMatchedByRegex:(NSString *)regex;
- (NSInteger)captureCount;
- (NSArray *)componentsMatchedByRegex:(NSString *)regex;
- (void)flushCachedRegexData;
- (BOOL)isRegexValid;
Probably the most interesting are the methods that return an NSArray.
componentsMatchedByRegex: essentially replaces the functionality that was
previously available via RKLMatchEnumerator. For example, to extract every
match of a regular expression from a string, you can use:
NSArray *matchesArray = [searchString componentsMatchedByRegex:
    @"\\b(https?)://([a-zA-Z0-9\\-.]+)((?:/[a-zA-Z0-9\\-._?,'+\\&%$=~*!():@\\\\]*)+)"];
This creates an NSArray of all the http URLs in a string. Since the result
is an NSArray, you can use ObjC2's for...in feature to iterate over all the
matches.
The method arrayOfCaptureComponentsMatchedByRegex: also returns all the
matches in a string, but each match result is an NSArray which contains the
strings matched by all the capture groups in a regular expression. Building
on the previous example:
searchString = @"Visit http://www.cocoadev.com/index.pl?RecentChanges for more information";
for(NSArray *captureArray in [searchString arrayOfCaptureComponentsMatchedByRegex:
    @"\\b(https?)://([a-zA-Z0-9\\-.]+)((?:/[a-zA-Z0-9\\-._?,'+\\&%$=~*!():@\\\\]*)+)"])
{
  // Process each result..
  // captureArray == [NSArray arrayWithObjects:
  //   @"http://www.cocoadev.com/index.pl?RecentChanges", @"http",
  //   @"www.cocoadev.com", @"/index.pl?RecentChanges", nil];
}
The string at a given captureArray index corresponds to the regular
expression capture group, with 0 being the entire match.
Also included is support for RegexKitLite DTrace provider probe points. The
two probe points, compiledRegexCache and utf16ConversionCache, are meant to
provide visibility into the effectiveness of the caches. While some regex
performance problems can be due to the nature of the regex being executed,
these probe points can give you insight into whether or not part of the
performance problem is due to thrashing of the caches. If that's the case,
you can probably restructure the problem in a way that minimizes the
thrashing, and then test the changes. These probe points can give you hard
data to make informed choices.
Another feature added is the pre-processor tunable
'RKL_APPEND_TO_ICU_FUNCTIONS'. This is particularly useful if you do not
want to link to the Apple-provided /usr/lib/libicucore.dylib. Custom-built
ICU libraries will typically have the version appended to each ICU function
name, such as "_3_6". This allows you to easily target such a library.
Building your own custom ICU library to link against is not for the faint of
heart, though. RegexKitLite doesn't include instructions for doing so, but
this should make it easy to use RegexKitLite with such a library if you need
to do so. A custom build of ICU can easily weigh in at 10 (ten) megabytes.
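
To make the suffixing idea concrete, here is a minimal sketch of how a
pre-processor tunable like this can turn a plain function name into a
versioned one through token pasting. It is only an illustration and not
necessarily how RegexKitLite does it internally: the DEMO_* macros, the
demo_* functions, and the stand-in uregex_open_3_6 (which does not have the
real ICU signature) are all hypothetical names made up for this example. In
practice the suffix would presumably be passed on the compiler command line,
e.g. -DRKL_APPEND_TO_ICU_FUNCTIONS=_3_6.

```c
/* Assumed default for this sketch; normally supplied with -D at build time. */
#ifndef RKL_APPEND_TO_ICU_FUNCTIONS
#define RKL_APPEND_TO_ICU_FUNCTIONS _3_6
#endif

/* Two-level paste: the outer macro lets the suffix macro expand to _3_6
   before the inner macro glues the two tokens together with ##. */
#define DEMO_PASTE2(name, suffix) name##suffix
#define DEMO_PASTE(name, suffix)  DEMO_PASTE2(name, suffix)
#define DEMO_ICU_FUNCTION(name)   DEMO_PASTE(name, RKL_APPEND_TO_ICU_FUNCTIONS)

/* Two-level stringify, used only to show what the name expands to. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x)  DEMO_STR2(x)

/* Hypothetical stand-in for a versioned symbol exported by a custom ICU
   build. The real ICU function takes arguments; this one just reports
   its own name so the example stays self-contained. */
const char *uregex_open_3_6(void) { return "called uregex_open_3_6"; }

/* Returns the symbol name that uregex_open is rewritten to. */
const char *demo_resolved_symbol(void)
{
  return DEMO_STR(DEMO_ICU_FUNCTION(uregex_open));
}

/* Calls the versioned function through the unversioned name. */
const char *demo_call(void)
{
  return DEMO_ICU_FUNCTION(uregex_open)();
}
```

With the default suffix above, demo_resolved_symbol() yields the string
"uregex_open_3_6", and demo_call() ends up calling the versioned stand-in
even though the source only ever mentions uregex_open.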