On 08-05-21, at 04:32, Mark J. Reed wrote:
> Python and Tcl are locale-sensitive but more awkward to use from "do shell script" (e.g. can't write a loop without a newline in Python, can't supply a hunk of code to run as a command-line option to tclsh).
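I posted a solution for Tcl code hunks a while back. Basically, use the triple-whammy <<< operator (a here-string), which hands the code to tclsh on standard input, so the hunk can contain as many newlines as it likes.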
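To show the idea on its own, here is a minimal sketch run straight from a shell prompt rather than from AppleScript. It assumes bash (for <<<) and a tclsh on the PATH; the sample text and the variable names data and found are made up for illustration. The single-quoted chunks are the Tcl source, and "$data" is spliced in between them much as "quoted form of t" is spliced in from AppleScript.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash (for <<<) and a tclsh on the PATH.

# Made-up sample text, apostrophe included on purpose.
data="I'm after ISBN: 0-596-00053-7 here, not 0-0-0-0."

# Everything after <<< is one shell word: Tcl source in single quotes,
# with "$data" dropped inside a Tcl {...} literal. Newlines are allowed.
found=$(tclsh <<< 'set contents {'"$data"'}
foreach isbn [regexp -inline -all -- {[[:<:]][[:digit:]X-]{13}[[:>:]]} $contents] {
    puts $isbn
}')

echo "$found"
```

One caveat with this approach: the spliced text lands inside a Tcl {...} literal, so any braces in the data need to be balanced.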
Here's an example hunk that writes its output to a file on the desktop (isbns.html). The data is supplied via a variable:
set t to "I'm using the find text command from satimage.osax to search a block of text to find a string that fits a pattern defined as a regular expression. I have the basic regexp ISBN: 05-961-8253-7 working but I'm looking to refine it a little and, being a regexp newb, I'm wondering if what I want to do is even possible. The string(s) I'm looking for are in the following format:\n\n[1-5 digits][hyphen][1-7 digits][hyphen][1-7 digits][hyphen][1 digit (which may actually be an \"X\")]\n\nThis is the command that I have so far to match this:\n--\nfind text \"[[:digit:]]{1,5}-[[:digit:]]{1,7}-[[:digit:]]{1,7}-[[:digit:]X]{1}\" in theText with regexp and all occurrences\n--\nSeems to work fine up to a point. nestled within it: fsdfh123@8X452P340-07-294509-5zzzzzz999999.\nHowever, it occurred to me that the regexp could match this string: \"0-0-0-0\". Which is not at all what I want.\nI'm looking for 10 digit ISBNs in the block of text (which should always be 13 characters--10 digits divided ISBN: 0-596-00053-7z into 4 substrings by 3 hyphens). Is there a way that I can 0-596-00053-7 maintain the flexibility in the number of digits within each substring, but insist that the total number of characters in the matched string remain constant at 13?"
set isbns to (do shell script "tclsh <<< 'set contents {'" & quoted form of t & "'};\narray set imap {05-961-8253-7 {978-05-961-8253-7 $49.99} 0-596-00053-7 {978-0130603111 $65.00}}\nforeach found [regexp -inline -all -- {[[:<:]][[:digit:]X-]{13}[[:>:]]} $contents] {\n\tset new [lindex $imap($found) 0];\n\tset price [lindex $imap($found) 1];\n\tset http \"(New ISBN: <a href=''>$new</a>)\"\n\tregsub -- \\[\\[:<:]]$found\\[\\[:>:]] $contents \"$found $http <b>$price</b> (with membership discount)\" contents\n}\nset html {<html><head><title>New Listings</title><style type='text/css'>p {font-family:Trebuchet MS;}</style></head><body>\n}\nset LF [format %c 0xA];\nregsub -all -- [format %c 0xA] $contents </p>$LF<p> contents\nappend html {<h1>New Listings</h1>\n}\nappend html <p>$contents</p>\nappend html {\n</body>\n</html>\n}\nset f [open ~/Desktop/isbns.html w]\nputs $f $html\nclose $f\n'")