Re: Help with find text command

  • Subject: Re: Help with find text command
  • From: has <email@hidden>
  • Date: Thu, 2 Aug 2007 18:28:08 +0100

I wrote:

-- find 13-character substrings that may be an ISBN
set possMatches to find text "\\<[[:digit:]][[:digit:]-]{11}[[:digit:]X]\\>" in theText with regexp and all occurrences

Additional testing uncovers a subtle problem with this pattern: the word-boundary patterns (\< and \>) treat hyphens as boundaries, so something like "979-0-123-45678-X" would match as "0-123-45678-X", which is not what you want.


If you can switch to a more powerful regexp command that supports lookbehind and lookahead assertions, I think the following Perl-compatible regexp will work as intended. As a bonus, it checks both length and structure, so it will provide the single-pass solution you originally wanted:

(?<![\w-])(?=[0-9X-]{13}(?![\w-]))([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])

(Caveat emptor; do your own tests to make sure I've not missed anything.) Basically, it uses negative lookbehind and negative lookahead assertions to check for a potential ISBN's beginning and end, and a positive lookahead assertion to check for the correct length in between. If all that matches, it then checks for a valid ISBN structure.
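
For anyone who wants to try the pattern outside AppleScript, here is a minimal Python sketch. It assumes the space before the final [0-9X] in the pattern as posted is only a line-wrap artifact, and the function name find_isbns is made up for illustration. As above, treat it as a starting point for your own tests rather than a finished validator.

```python
import re

# Pattern from the message above, with the line-wrap space removed (assumption).
ISBN_RE = re.compile(
    r"(?<![\w-])"                # not preceded by a word character or a hyphen
    r"(?=[0-9X-]{13}(?![\w-]))"  # 13 ISBN characters, then no word char or hyphen
    r"([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])"  # four hyphen-separated parts
)


def find_isbns(text):
    """Return every substring of text that the pattern accepts (hypothetical helper)."""
    return [m.group(1) for m in ISBN_RE.finditer(text)]


print(find_isbns("See ISBN 0-123-45678-X for details."))
print(find_isbns("Not an ISBN-10: 979-0-123-45678-X"))
```

The second call is the case that tripped up the word-boundary version: the lookbehind rejects any starting point that follows a digit or a hyphen, so the tail of the longer number is not picked up.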

You could use TextCommands' search command for this, but if you do, be aware that the 'finding match indexes' option currently has an off-by-one bug in the indexes returned [1] and remember to compensate for that. Alternatively, Smile's 'ufind text' command may be powerful enough for the job (Emmanuel can advise here), or you can always call out to Perl or Python (remembering to compensate for their 0-indexing, of course).
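
If you do call out to Python, the index compensation could look something like the sketch below. Python reports a match as a 0-based start and an exclusive end, so adding 1 to the start and leaving the end alone gives 1-based inclusive positions of the kind AppleScript uses. The function name applescript_ranges is hypothetical, the pattern again assumes the wrap space is removed, and it assumes plain text where Python and AppleScript count characters the same way.

```python
import re

# Same pattern as in the message, with the line-wrap space removed (assumption).
ISBN_RE = re.compile(
    r"(?<![\w-])(?=[0-9X-]{13}(?![\w-]))"
    r"([0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X])"
)


def applescript_ranges(text):
    """Return (first, last) 1-based inclusive positions for each match (hypothetical helper)."""
    ranges = []
    for m in ISBN_RE.finditer(text):
        # Python: 0-based start, exclusive end. AppleScript: 1-based, inclusive.
        ranges.append((m.start(1) + 1, m.end(1)))
    return ranges


# "ISBN " occupies positions 1-5, so the 13-character match runs from 6 to 18.
print(applescript_ranges("ISBN 0-123-45678-X"))
```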

HTH

has

[1] AppleScript uses 1-indexing while Python uses 0-indexing, and I forgot to adjust the numbers accordingly. I'll fix this for the next release.
--
http://appscript.sourceforge.net
http://rb-appscript.rubyforge.org
http://appscript.sourceforge.net/objc-appscript.html

