Strange behavior with do shell script, the Unix-command sed and "»" or "«". Encoding problem?

  • Subject: Strange behavior with do shell script, the Unix-command sed and "»" or "«". Encoding problem?
  • From: quark67 <email@hidden>
  • Date: Tue, 21 Oct 2014 14:30:33 +0200

Hello, I have found strange behavior in AppleScript when using the Unix command sed with "«" or "»" (in French, we use "«" and "»" as quotation marks, called guillemets).

First, a short explanation of 'sed'. It is a Unix command that can, among other things, perform regex (regular expression) replacement. The synopsis I use is:
sed -E 's/regex/replacement/g'
(The -E option enables modern (extended) regular expressions, and the trailing 'g' makes the replacement global instead of stopping at the first match.)
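
As a small illustration of the 'g' flag described above, here is a minimal sketch. The sample input and the variable names are made up for this example and do not come from the original script.

```shell
#!/bin/sh
# Sample line with two occurrences of " »" (hypothetical input).
input='a » b » c'

# No g flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$input" | sed -E 's/ »/_»/')

# With the g flag: every match on the line is replaced.
all_matches=$(printf '%s\n' "$input" | sed -E 's/ »/_»/g')

printf '%s\n' "$first_only"   # prints: a_» b » c
printf '%s\n' "$all_matches"  # prints: a_» b_» c
```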

For this example, I will replace every occurrence of ' »' with '_»' (space+'»' is replaced by underscore+'»'). I know this can be done simply with 'text item delimiters', but for more complex replacements I want to use the 'sed' command.

In Terminal: echo "Bonjour « hello » world" | sed -E 's/ »/_»/g'
gives: Bonjour « hello_» world
as expected.

In AppleScript, the same: set r to do shell script "echo \"Bonjour « hello » world\" | sed -E 's/ »/_»/g'"
gives: Bonjour « hello_» world
as expected.

But in my code, I want to replace not only ' »' but also ' :' and ' ;', for example. This can be done with the regex ' [»:;]' ([»:;] means '»' or ':' or ';'). To simplify this example, I use only [»].

Then in Terminal: echo "Bonjour « hello » world" | sed -E 's/ [»]/_»/g'
gives: Bonjour « hello_» world
as expected.
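
The character set described above can be sketched with ASCII members only, which sidesteps any encoding question. This sketch assumes that the character that matched should be kept after the underscore, so it uses a capture group and \1 instead of the fixed replacement '_»'. That intent is an assumption, and the sample text is hypothetical.

```shell
#!/bin/sh
# Hypothetical sample text using French spacing before ':' and ';'.
input='un : deux ; trois'

# " ([:;])" matches a space followed by ':' or ';' and captures the
# punctuation; \1 puts the captured character back after the underscore.
result=$(printf '%s\n' "$input" | sed -E 's/ ([:;])/_\1/g')

printf '%s\n' "$result"   # prints: un_: deux_; trois
```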

In AppleScript, the same: set r to do shell script "echo \"Bonjour « hello » world\" | sed -E 's/ [»]/_»/g'"
gives: Bonjour_¬ª´ hello_¬ªª world
which is NOT expected (encoding garbage).
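
One possible explanation, offered here as an assumption and not something established in this message: if sed runs in a byte-oriented locale such as C, the two-byte UTF-8 sequence for '»' (0xC2 0xBB) inside brackets is read as two separate one-byte members. The regex ' [»]' then matches a space followed by the first byte of either guillemet, since '«' is 0xC2 0xAB, and the second byte is left behind. The sketch below forces LC_ALL=C in a terminal to reproduce this; whether 'do shell script' really runs sed in such a locale is not verified here.

```shell
#!/bin/sh
input='Bonjour « hello » world'

# Force the byte-oriented C locale for sed only. Here "[»]" is read as a set
# of two separate bytes (0xC2 and 0xBB), so " [»]" matches a space followed by
# just the first byte of either guillemet.
garbled=$(printf '%s\n' "$input" | LC_ALL=C sed -E 's/ [»]/_»/g')

# Dump the result as hex so the stray bytes are visible.
garbled_hex=$(printf '%s' "$garbled" | od -A n -t x1 | tr -d ' \n')
printf '%s\n' "$garbled_hex"
```

In the hex dump, the sequences 5f c2 bb ab and 5f c2 bb bb are the underscore, the inserted '»', and one leftover byte. Read as Mac OS Roman, those bytes display as '_¬ª´' and '_¬ªª', which is consistent with the garbage shown above.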

Can anybody explain what happened and how to work around this?
I know I can work around it with two 'sed' commands, like:
sed -E 's/ »/_»/g' (for the '»')
and then:
sed -E 's/ [:;]/_»/g'
for the others.

But can I use [»:;] in a single command without the encoding garbage?

Thanks if you can explain what happens and how to avoid the garbage.
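
A possible single-command variant, sketched under assumptions: alternation is used in place of the bracket expression, so '»' is matched as a literal sequence, the same way the first example in this message matches it. The capture group and \1 (to keep the matched character) and the sample text are additions made for this illustration, not part of the original script.

```shell
#!/bin/sh
# Hypothetical sample combining all three characters.
input='Bonjour « hello » world : fin ; ok'

# Alternation instead of a bracket expression: "»" is matched as a literal
# sequence, so this does not depend on sed decoding multi-byte characters.
# LC_ALL=C is set only to show that the result holds in a byte-oriented locale.
result=$(printf '%s\n' "$input" | LC_ALL=C sed -E 's/ (»|:|;)/_\1/g')

printf '%s\n' "$result"   # prints: Bonjour « hello_» world_: fin_; ok
```

Inside an AppleScript string literal the backslash would itself need escaping (\\1). Another avenue that might be worth trying, untested here, is prefixing the sed command with a UTF-8 locale setting such as LC_ALL=en_US.UTF-8, assuming that locale exists on the machine.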
