Re: A Better Regex
- Subject: Re: A Better Regex
- From: T&B <email@hidden>
- Date: Sat, 18 Aug 2007 17:20:52 +1000
Hi Emmanuel,
> Did you try the regexp implementation included in Smile? It should
> satisfy all your regexp needs.
Does the Satimage OSAX support non-greedy matching (e.g. .*?) and
escape sequences such as \s (for whitespace characters)?
I tried using Satimage's "find text" command to replace the following
AppleScript subroutine, "GetRegexMatches", which uses do shell script
and Perl (inefficiently, no doubt), but Satimage doesn't seem to accept
some standard regexp syntax. Any clues as to how I could get the
GetRegexMatchesOSAX routine below to behave the same as the
GetRegexMatches routine, or at least how to get Satimage's "find text"
to handle the full PCRE syntax?
Thanks,
Tom
on run -- just a test
set regexString to ".*?(cd).*?(g.*?k).*?first\\s+([a-z]*)"
set searchInString to "abcdefghijklm is the first half of the alphabet"
GetRegexMatches of regexString into searchInString given modifiersString:"gis"
end run
-- test gives: {"cd", "ghijk", "half"}
on GetRegexMatches of regexString into searchInString given modifiersString:modifiersString
if regexString is "" then
-- match all
set resultList to {searchInString}
else if regexString is null or searchInString is null then
set resultList to null
else
set arraySeparator to ASCII character 1 -- arbitrary delimiter
-- need to do backslashing in perl script for better speed, but for now:
set searchInString to (ReplaceText of searchInString by "\\\\" instead of "\\")
set searchInString to (ReplaceText of searchInString by "\\'" instead of "'")
set regexString to (ReplaceText of regexString by "\\'" instead of "'")
set regexString to (ReplaceText of regexString by "\\/" instead of "/")
set perlScript to "
$searchInString = '" & searchInString & "';
$regexString = '" & regexString & "';
$arraySeparator = '" & arraySeparator & "';
@regexResults = $searchInString =~ /$regexString/" & modifiersString & ";" & "
print join( $arraySeparator, @regexResults );"
set perlResult to DoPerlScript(perlScript)
if perlResult is "" then
set resultList to {}
else
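set oldDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to arraySeparator
set resultList to text items in perlResult
set AppleScript's text item delimiters to oldDelimiters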
end if
end if
return resultList
end GetRegexMatches
on ReplaceText of textBlock instead of oldString by newString
if textBlock contains oldString then
set oldDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to oldString
set parsedList to text items of textBlock
set AppleScript's text item delimiters to newString
set textBlock to parsedList as text
set AppleScript's text item delimiters to oldDelimiters
end if
return textBlock
end ReplaceText
on DoPerlScript(perlScript)
set perlHeader to "#!/usr/bin/perl
"
set shellScript to "perl -e " & quoted form of (perlHeader & perlScript)
try
set perlResult to do shell script shellScript
on error errorMessage
error "Perl error: " & errorMessage
end try
return perlResult
end DoPerlScript
-- Attempted replacement (incomplete) using Satimage's "find text" command:
on GetRegexMatchesOSAX of regexString into searchInString given modifiersString:modifiersString
if regexString is "" then
-- match all
set resultList to {searchInString}
else if regexString is null or searchInString is null then
set resultList to null
else
set caseIsSensitive to (modifiersString does not contain "i")
set allOccurrences to (modifiersString contains "g")
try
set resultList to find text regexString in searchInString all occurrences allOccurrences case sensitive caseIsSensitive with regexp and string result
on error errorMessage
set resultList to {}
end try
end if
return resultList
end GetRegexMatchesOSAX