Re: do shell script "perl... " to find/replace in a string
- Subject: Re: do shell script "perl... " to find/replace in a string
- From: Christopher Nebel <email@hidden>
- Date: Thu, 18 Mar 2004 02:11:43 -0800
On Mar 16, 2004, at 10:26 PM, Joseph Weaks wrote:
I am at a loss when trying to figure out the syntax for perl/unix
commands. Given three variables, what would be a one liner do shell
script command that replaced every occurrence of findString in
sourceText with replaceString? I'm looking for something like:
set shellCommand to "perl blah blah" & quoted form of sourceText &
"blah blah" & quoted form of findString & "blah blah" & quoted form of
replaceString & "blah blah blah"
set sourceText to do shell script shellCommand
Not sure perl is the correct unix command to use for this.
Secondly, is it possible to pass a unicode hex number as the
replaceString?
What I'm working towards is a font conversion from a language font
with stacking diacriticals to the unicode equivalent.
iconv and piconv can convert text from one encoding to another for you,
but if you care about how your diacritics are composed or decomposed,
you may have to do it yourself, since I don't think iconv gives that
level of control. In that case, perl is probably your best bet, since
it has a reasonable grasp of Unicode, unlike sed.
What would that shellCommand look like? I read through the archives, and
see there is no way to specify a unicode character by its hex, but
I'm certain there is in a shell command. (I think it has to do with
declaring standard output as unicode and then print chr (1FFF) or
something?)
Something like that. Plus, you really can use Unicode constants in
AppleScript, but it's annoying:
set f to "aa"
set r to «data utxt00650301»
set s to "blah blaah blaah blah"
do shell script "echo " & quoted form of s & " | perl -pe 's/" & f & "/" & r & "/g'"
--> "blah bléh bléh blah" (though the e-acutes probably won't survive
the list server...)
(This particular form also works with sed; just substitute "sed -e" for
the "perl -pe".) Alternatively, use Perl's Unicode escapes:
set f to "aa"
set r to "\\x{0065}\\x{0301}"
set s to "blah blaah blaah blah"
do shell script "echo " & quoted form of s & " | perl -pe 's/" & f & "/" & r & "/g'"
--> "blah bléh bléh blah" (same thing)
...which also answers your question about replacing with a combining
character sequence. The search is case-sensitive by default; to make it
case-insensitive, use Perl and add an "i" at the end, right after (or
right before) the "g". (sed can't do this.) Bear in mind that AppleScript can't display
some Unicode strings correctly, so don't freak out if it doesn't look
right in the result window. Write the data out to a file or some such;
it's correct.
--Chris Nebel
AppleScript Engineering