Re: Help with find text command
- Subject: Re: Help with find text command
- From: has <email@hidden>
- Date: Thu, 2 Aug 2007 18:28:08 +0100
I wrote:
-- find 13-character substrings that may be an ISBN
set possMatches to find text "\\<[[:digit:]][[:digit:]-]{11}[[:digit:]X]\\>" in theText with regexp and all occurrences
Additional testing uncovers a subtle problem with this pattern: the
word boundary patterns (\< and \>) treat hyphens as boundaries, so
something like "979-0-123-45678-X" would match as "0-123-45678-X",
which is not what you want.
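
The same pitfall can be reproduced outside 'find text'. Here is a minimal sketch in Python, assuming its \b assertion behaves like \< and \> for this input (Python's re module has no \< or \>, so \b stands in for both); the variable names are illustrative only:

```python
import re

# \b stands in for the \< and \> word-boundary patterns (an assumption:
# Python's re module does not provide \< or \>).
loose_pattern = re.compile(r"\b[0-9][0-9-]{11}[0-9X]\b")

sample = "979-0-123-45678-X"

# The hyphen after "979" counts as a word boundary, so the search can
# start at the "0" and pick up a 13-character tail of the longer string.
matches = loose_pattern.findall(sample)
print(matches)  # ['0-123-45678-X']
```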
If you can switch to a more powerful regexp command that supports
lookbehind and lookahead assertions, I think the following
Perl-compatible regexp will work as intended. As a bonus, it checks
both length and structure, so it provides the single-pass solution you
originally wanted:
(?<![\w-])(?=[0-9X-]{13}(?![\w-]))([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])
(Caveat emptor; do your own tests to make sure I've not missed
anything.) Basically it uses negative lookbehind and negative
lookahead assertions to check for a potential ISBN's beginning and
end, and a positive lookahead assertion to check for the correct
length in between. If all that matches, it then checks for a valid
ISBN structure.
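
For anyone who wants to try the pattern before wiring it into a script, here is a rough sketch using Python's re module, which accepts this particular pattern unchanged. The helper name find_isbns and the sample text are made up for illustration, and it is only a quick check, not a thorough test:

```python
import re

ISBN_PATTERN = re.compile(
    r"(?<![\w-])"                # not preceded by a word character or hyphen
    r"(?=[0-9X-]{13}(?![\w-]))"  # exactly 13 candidate characters, then a clean end
    r"([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])"  # four hyphen-separated groups
)

def find_isbns(text):
    """Return every substring of text that the pattern accepts (hypothetical helper)."""
    return [m.group(1) for m in ISBN_PATTERN.finditer(text)]

# The first candidate stands alone and matches; the second is part of a
# longer hyphenated string, so the lookbehind and lookahead reject it.
print(find_isbns("See 0-123-45678-X and 979-0-123-45678-X."))
# ['0-123-45678-X']
```

A run of 13 plain digits such as "1234567890123" passes the length check but fails the structure check, so it is not returned either.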
You could use TextCommands' search command for this, but if you do, be
aware that the 'finding match indexes' option currently has an
off-by-one bug in the indexes returned [1] and remember to compensate
for that. Alternatively, Smile's 'ufind text' command may be powerful
enough for the job (Emmanuel can advise here), or you can always call
out to Perl or Python (remembering to compensate for their
0-indexing, of course).
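
If you go the Python route, the index adjustment might look something like the sketch below. The function name applescript_ranges is hypothetical, and it assumes one Python character corresponds to one AppleScript character, which may not hold for all Unicode text:

```python
import re

ISBN_PATTERN = re.compile(
    r"(?<![\w-])(?=[0-9X-]{13}(?![\w-]))"
    r"([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])"
)

def applescript_ranges(text):
    """Return (first, last) 1-based inclusive positions of each match (hypothetical helper)."""
    ranges = []
    for m in ISBN_PATTERN.finditer(text):
        # Python: start is 0-based and end is exclusive.
        # AppleScript: first is 1-based and last is inclusive.
        # So add 1 to the start and use the end as it is.
        ranges.append((m.start() + 1, m.end()))
    return ranges

print(applescript_ranges("ISBN 0-123-45678-X"))  # [(6, 18)]
```

On the AppleScript side, the pair (6, 18) would then correspond to 'text 6 thru 18 of theText'.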
HTH
has
[1] AppleScript uses 1-indexing while Python uses 0-indexing, and I
forgot to adjust the numbers accordingly. I'll fix this for the next
release.
--
http://appscript.sourceforge.net
http://rb-appscript.rubyforge.org
http://appscript.sourceforge.net/objc-appscript.html