Re: Help with find text command

  • Subject: Re: Help with find text command
  • From: Ed Stockly <email@hidden>
  • Date: Thu, 2 Aug 2007 22:10:37 -0700

>>Wallace>>I did start out with a vanilla AppleScript approach, but discovered that I had some concerns about that route (some of which are echoed in your solution below). First, I was concerned that examining every word in a repeat loop would be too slow. There is a fair amount of extraneous text that would be examined word by word unnecessarily and slow things down.

Right, my concern was that ISBN numbers are not always delimited with hyphens, but when I looked more closely at your message I realized that the numbers in your text would be properly formatted.

That made Text Item Delimiters a better, speedier route. 
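
To illustrate why this works, here is a minimal sketch of splitting on hyphens. The sample string and the variable names (sampleText, savedDelimiters, pieces) are made up for illustration and are not taken from the script further down.

```applescript
-- Sketch: split a string on hyphens with text item delimiters.
-- sampleText is a made-up string used only for illustration.
set sampleText to "ISBN: 0-596-00053-7 is in here"
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to "-"
set pieces to every text item of sampleText
-- Restore the previous delimiters so later code is not affected.
set AppleScript's text item delimiters to savedDelimiters
-- pieces is now {"ISBN: 0", "596", "00053", "7 is in here"}
```

Only the items between two hyphens come out clean. The first and last items still carry the surrounding text, so any approach built on this has to trim the group identifier from the right and the check digit from the left.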

>>Phil>>Here's a challenge for you Ed,

>>The script below takes some text paragraphs with 2 legal old style ISBN numbers and inserts the new style numbers (as HTML links) with the price from a trumped up conversion map and then writes the result   to an HTML file on the user's desktop. It avoids the two illegal ISBNs. Let's see your "pure AppleScript" solution. Let me know when it's done so we can compare speeds.


Phil, 

I'm not sure what the point of the HTML links is; they seem a bit superfluous and go beyond what the original poster intended. Plus, if all you're interested in is measuring speed, it seems like you're arbitrarily adding tasks, perhaps to exaggerate the difference? 

Yes, Shell scripts do some things faster than AppleScript.

I don't agree that speed is the only consideration.

If your script is only finding 2 legal ISBN numbers in that text, I think it needs some work. 

How does your script handle found ISBN numbers that don't have prices?

I pasted your script into my script editor and it produced an empty result. Is there some missing code needed to make it work on any Mac?

Here's my version. It is not optimized for speed.  

Ed

(Mind the line breaks)

------------
property newPrefix : "978"
--the line below begins one long multi-line text string read into a variable
--set ohMyWord to read (choose file with prompt "Select text file with ISBNs hiding inside")
set OhMyWord to "I'm using the find text command from satimage.osax to search a block of text to find a string that fits a pattern defined as a regular expression. I have the basic regexp 
ISBN: 05-961-8253-7 working but I'm looking to refine it a little and, being a regexp newb, I'm wondering if what I want to do is even possible. The string(s) I'm looking for are in the following format:

[1-5 digits][hyphen][1-7 digits][hyphen][1-7 digits][hyphen][1 digit (which may actually be an \"X\")]

This is the command that I have so far to match this:
__
find text \"[[:digit:]]{1,5}-[[:digit:]]{1,7}-[[:digit:]]{1,7}-[[:digit:]X]{1}\" in theText with regexp and all occurrences
__
Seems to work fine up to a point. nestled within it: fsdfh123@8X452P340-07-294509-5zzzzzz999999.
However, it occurred to me that the regexp could match this string: \"0-0-0-0\". Which is not at all what I want.
I'm looking for 10 digit ISBNs in the block of text (which should always be 13 characters--10 digits divided ISBN: 0-596-00053-7z into 4 substrings by 3 hyphens). 
Is there a way that I can 0-596-00053-7 maintain the flexibility in the number of digits within each substring, but insist that the total number of characters in the matched string remain constant at 13?"

--Wow! That is one long variable!

set exportFile to choose file name with prompt "Select a file to save ISBN numbers" default name "ISBN Export.html"
set startTime to time of (current date)
set AppleScript's text item delimiters to "-"
set OhMyWord to every text item of OhMyWord
if the (count of OhMyWord) < 4 then return
set x to 2
set foundNumbers to {}
--[1-5 digits][hyphen][1-7 digits][hyphen][1-7 digits][hyphen][1 digit or X]
repeat
  repeat
    set pubCode to item x of OhMyWord
    if numberTest(pubCode) then
      set pubCode to pubCode as string
      set pubCodeLength to the count of pubCode
      if pubCodeLength > 7 then exit repeat
    else
      exit repeat
    end if
    set itemNumber to (item (x + 1) of OhMyWord)
    if numberTest(itemNumber) then
      set itemNumberLength to the count of itemNumber
      if itemNumberLength > 7 then exit repeat
    else
      exit repeat
    end if
    set checkSum to character 1 of (item (x + 2) of OhMyWord) as string
    if checkSum is not in "1234567890xX" then exit repeat
    set groupIdLength to 9 - (itemNumberLength + pubCodeLength)
    set groupId to item (x - 1) of OhMyWord
    set firstSize to the count of groupId
    if firstSize < groupIdLength then exit repeat
    if firstSize > groupIdLength then
      set startChar to (groupIdLength)
      set AppleScript's text item delimiters to ""
      set groupId to (items -startChar thru -1 of groupId) as string
    end if
    if numberTest(groupId) then
      set AppleScript's text item delimiters to "-"
      set the end of foundNumbers to {newPrefix, groupId, pubCode, itemNumber, checkSum} as string
      exit repeat
    else
      exit repeat
    end if
  end repeat
  set x to x + 1
  if x > the ((count of OhMyWord) - 2) then exit repeat
end repeat
set AppleScript's text item delimiters to return
set numberPrices to {"978-05-961-8253-7 $49.99", "978-0-596-00053-7 $65.00"} as string
set numberPriceOutput to {"<html>", "<head>", "<title>New Listings</title>", "<style type='text/css'>p {font-family:Trebuchet MS;}</style>", "</head>", "<body>"}
repeat with thisNumber in foundNumbers
  set AppleScript's text item delimiters to thisNumber
  try
    set thisPrice to paragraph 1 of text item 2 of numberPrices
    set the end of numberPriceOutput to "New ISBN: <a href=''> " & thisNumber & "</a>  <b>" & thisPrice & "</b> (with 100% Pure AppleScript Discount)"
  on error
    set the end of numberPriceOutput to "New ISBN: <a href=''> " & thisNumber & "</a>  <b> <i>Price not found</i></b> (with 100% Pure AppleScript Discount)"
  end try
end repeat
set endTime to time of (current date)
set elapsedTime to endTime - startTime

set the end of numberPriceOutput to {"Elapsed time: " & elapsedTime, "</body>", ""}
set AppleScript's text item delimiters to "<BR>" & return
set numberPriceOutput to numberPriceOutput as string
try
  set openfile to open for access exportFile with write permission
on error
  close access exportFile
  set openfile to open for access exportFile with write permission
end try
set eof of openfile to 0
write numberPriceOutput to openfile
close access openfile
tell application "Safari"
  open exportFile
  activate
end tell
return elapsedTime

on numberTest(stringToTest)
  try
    stringToTest as real
    return true
  on error
    return false
  end try
end numberTest

---------
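
One caveat, offered as a sketch rather than a change to the script above: prefixing "978" and keeping the old check digit does not in general give a valid 13-digit ISBN, because the ISBN-13 check digit is calculated differently (alternating weights of 1 and 3, modulo 10). The handler name isbn13CheckDigit is hypothetical, and it assumes its argument is exactly 12 digits with the hyphens already removed.

```applescript
-- Sketch: compute the ISBN-13 check digit for a 12-digit string.
-- isbn13CheckDigit is a hypothetical helper, not part of the script above.
-- Assumes twelveDigits holds exactly 12 digit characters and no hyphens.
on isbn13CheckDigit(twelveDigits)
	set total to 0
	repeat with i from 1 to 12
		set d to (character i of twelveDigits) as integer
		if i mod 2 is 0 then
			-- Even positions are weighted 3.
			set total to total + 3 * d
		else
			-- Odd positions are weighted 1.
			set total to total + d
		end if
	end repeat
	return (10 - (total mod 10)) mod 10
end isbn13CheckDigit

set newCheckDigit to isbn13CheckDigit("978059600053")
-- newCheckDigit is 0, so 0-596-00053-7 becomes 978-0-596-00053-0
```

For the sample number in the text, the old check digit 7 is replaced by 0.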