Re: Help with find text command
  • Subject: Re: Help with find text command
  • From: Philip Aker <email@hidden>
  • Date: Wed, 01 Aug 2007 12:24:06 -0700

On 2007-08-01, at 10:32:30, Wallace, William wrote:

[…]
Seems to work fine up to a point. However, it occurred to me that the regexp could match this string: "0-0-0-0". Which is not at all what I want. I'm looking for 10 digit ISBNs in the block of text (which should always be 13 characters--10 digits divided into 4 substrings by 3 hyphens). Is there a way that I can maintain the flexibility in the number of digits within each substring, but insist that the total number of characters in the matched string remain constant at 13?

I slipped two ISBNs into the text of your email as a test. Used this way, the Tcl regexp returns a space-separated list of the hyphenated digit strings it finds. There are other options, such as returning offsets, but I think returning the matched items themselves is best. You could probably lift the regexp inside the braces for use in most other languages, but I can't say how they would handle the -inline and -all options (which are very effective for this kind of search).


set t to "I'm using the find text command from satimage.osax to search a block of text
to find a string that fits a pattern defined as a regular expression. I have
the basic regexp ISBN: 05-961-8253-7 working but I'm looking to refine it a little and, being a
regexp newb, I'm wondering if what I want to do is even possible. The
string(s) I'm looking for are in the following format:

[1-5 digits][hyphen][1-7 digits][hyphen][1-7 digits][hyphen][1 digit (which
may actually be an \"X\")]

This is the command that I have so far to match this:

--
find text
\"[[:digit:]]{1,5}-[[:digit:]]{1,7}-[[:digit:]]{1,7}-[[:digit:]X]{1}\" in
theText with regexp and all occurrences
--

Seems to work fine up to a point. However, it occurred to me that the regexp
could match this string: \"0-0-0-0\". Which is not at all what I want. I'm
looking for 10 digit ISBNs in the block of text (which should always be 13
characters--10 digits divided ISBN: 0-596-00053-7 into 4 substrings by 3 hyphens). Is there a
way that I can maintain the flexibility in the number of digits within each
substring, but insist that the total number of characters in the matched
string remain constant at 13?"

do shell script "tclsh <<< 'puts [regexp -inline -all -- {[[:digit:]X-]{13}} {'" & quoted form of t & "}]"

--> "05-961-8253-7 0-596-00053-7"
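
For anyone working outside Tcl, here is a rough Python sketch of the same search. It is not part of the approach above, only an illustration: `re.findall` stands in for `-inline -all`, and the names `LOOSE`, `STRICT`, and `find_isbn10` are hypothetical. `LOOSE` mirrors the 13-character class used above, so it would also accept a run such as 13 plain digits. `STRICT` is one possible way to keep the flexible group lengths while still insisting on a total of 13 characters, by combining a lookahead for the length with the four-group pattern.

```python
import re

# Same idea as the Tcl pattern: any run of 13 digits, "X", or hyphens.
LOOSE = re.compile(r"[0-9X-]{13}")

# Hypothetical stricter variant (an assumption, not from the original post):
# - the lookbehind and lookahead require a run of exactly 13 such characters
# - the body requires four hyphen-separated groups, the last a digit or "X"
# - the final lookahead forces the body to consume the whole run
STRICT = re.compile(
    r"(?<![0-9X-])(?=[0-9X-]{13}(?![0-9X-]))"
    r"[0-9]{1,5}-[0-9]{1,7}-[0-9]{1,7}-[0-9X](?![0-9X-])"
)


def find_isbn10(text, strict=True):
    """Return every hyphenated ISBN-10 candidate found in text, in order."""
    pattern = STRICT if strict else LOOSE
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "ISBN: 05-961-8253-7 and ISBN: 0-596-00053-7, but not 0-0-0-0."
    print(" ".join(find_isbn10(sample)))
```

Neither pattern verifies the ISBN check digit; both only test the shape of the string.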

Philip Aker
email@hidden

