Re: Strange behavior with do shell script, the Unix-command sed and "»" or "«". Encoding problem?
- Subject: Re: Strange behavior with do shell script, the Unix-command sed and "»" or "«". Encoding problem?
- From: Emmanuel LEVY <email@hidden>
- Date: Tue, 21 Oct 2014 17:37:36 +0200
This encoding issue is really painful, and you will have to fix it if you want to use sed. But are you aware of the Satimage osax and its "change" command?
If you can use the Satimage osax, it might be an easier solution.
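
For the sed route, here is a minimal sketch of two possible fixes. It assumes the garbage comes from sed running in the byte-oriented C locale (do shell script does not seem to inherit Terminal's UTF-8 locale), so that [»] is read as a set of two separate bytes instead of one character. The variable names and the extended sample input are only illustrative, and option 1 assumes the en_US.UTF-8 locale is installed.

```shell
#!/bin/sh
# Sample input based on the original post, extended with ' :' and ' ;'.
input='Bonjour « hello » world : fin ; ok'

# Option 1 (assumption: the en_US.UTF-8 locale is available).
# In a UTF-8 locale, sed should treat » as a single character inside
# the bracket expression; \1 puts back whichever character matched.
with_locale=$(printf '%s\n' "$input" | LC_ALL=en_US.UTF-8 sed -E 's/ ([»:;])/_\1/g')

# Option 2: no UTF-8 locale required. Alternation matches » as a literal
# byte sequence, so it also works in the C locale (forced here to show it).
with_alternation=$(printf '%s\n' "$input" | LC_ALL=C sed -E 's/ (»|:|;)/_\1/g')

printf '%s\n' "$with_locale" "$with_alternation"
```

From AppleScript, option 1 should amount to starting the do shell script string with "export LC_ALL=en_US.UTF-8; " before the echo. Option 2 can be pasted as is, because it does not depend on the locale at all.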
Best,
Emmanuel
On Oct 21, 2014, at 2:30 PM, quark67 wrote:
> Hello, I have found a strange behavior in AppleScript when using the Unix command sed with "«" or "»" (in French, we use "«" and "»" as quotation marks, called guillemets).
>
> First, a little explanation about 'sed'. This is a Unix command for regex (regular expression) replacement. The synopsis I use is:
> sed -E 's/regex/replacement/g'
> (the -E option enables extended (modern) regular expressions, and the '/g' flag makes the replacement global, not just for the first match.)
>
> For this example, I will replace every occurrence of ' »' with '_»' (space+'»' is replaced by underscore+'»'). I know this can be done simply with 'text item delimiters'. But for more complex replacements, I want to use the 'sed' command.
>
> In Terminal: echo "Bonjour « hello » world" | sed -E 's/ »/_»/g'
> gives: Bonjour « hello_» world
> as expected.
>
> In AppleScript, the same command: set r to do shell script "echo \"Bonjour « hello » world\" | sed -E 's/ »/_»/g'"
> gives: Bonjour « hello_» world
> as expected.
>
> But in my code, I want to replace not only ' »' but also ' :' and ' ;', for example. This can be done with the regex ' [»:;]' ([»:;] means '»' or ':' or ';'). To simplify this example, I use only [»].
>
> Then in Terminal: echo "Bonjour « hello » world" | sed -E 's/ [»]/_»/g'
> gives: Bonjour « hello_» world
> as expected.
>
> In AppleScript, the same command: set r to do shell script "echo \"Bonjour « hello » world\" | sed -E 's/ [»]/_»/g'"
> gives: Bonjour_¬ª´ hello_¬ªª world
> which is NOT expected (encoding garbage).
>
> Can anybody explain what happened and how to work around this?
> I know I can work around it with two 'sed' commands, like:
> sed -E 's/ »/_»/g' (for the '»')
> and then:
> sed -E 's/ ([:;])/_\1/g'
> for the others.
>
> But can I use [»:;] in a single sed command without the encoding garbage?
>
> Thanks if you can explain what happens and how to get rid of the garbage.
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> AppleScript-Users mailing list (email@hidden)
> Help/Unsubscribe/Update your Subscription:
> Archives: http://lists.apple.com/archives/applescript-users
>
> This email sent to email@hidden