Re: Automator bug in "run shell script" ?
- Subject: Re: Automator bug in "run shell script" ?
- From: Ron Hunsinger <email@hidden>
- Date: Wed, 15 Jun 2011 12:02:39 -0700
On Jun 15, 2011, at 3:25 AM, Jean-Christophe Helary wrote:
>
> On 15 juin 2011, at 18:38, Ron Hunsinger wrote:
>
>> The problem isn't in Automator, AppleScript, or "do shell script".
>>
>> As I said, the problem is that:
>> a) HFS+ forces all filenames to their canonical fully decomposed form
>
> You never mentioned HFS+ earlier in the conversation.
>
> And I never mentioned or suggested I was working with filenames.
>
> I've read about that HFS+ thing, and saw plenty of explanations like you wrote, but that still does not tell me why the "run shell script" action of Automator does not produce the same output as the exact same command launched in Terminal with exactly the same string.
I said, in my first contribution to this thread:
> Apple is forced to expand characters in filenames into a canonical form, so that filenames with the same characters don't wind up on disk with different codepoint sequences. The canonical form they chose was to expand all characters into their fully decomposed form, in which any combining code that can be separated from the base character is so separated, and the combining codes (there may be more than one) are in a standard order.
I mention HFS+ only to tell you why and by when the expansion to canonical form must occur. If it hasn't happened already by the time it gets to HFS+, HFS+ will do it. Other filesystems have the same problem, that a given character may have more than one representation, but I don't know at what layer the solution is imposed. (Or if. HFS just ignored the problem, by assuming one character = one byte.) But c'mon, which filesystem did you think I was talking about?
Automator may do that expansion earlier, whenever it thinks a string it's dealing with is (or might be?) a filename. For example, if you feed the output of an action that produces files/folders to a Run AppleScript action, the filenames will have already been decomposed before they reach AppleScript. That's the "sometimes" I referred to earlier. In most cases, this makes things easier for the user.
I see belatedly that you asked about the Run Shell Script action, but I had it in my head that you were asking about "do shell script", as might appear in a Run AppleScript action. When I test with Run Shell Script, I see that the script itself has been fully decomposed. This is good! It means the problems I described with Terminal do not appear with Run Shell Script.
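One way to check which form a given string has taken is to dump its bytes. The following is only a sketch, assuming bash and the standard od and tr utilities; hexbytes is a made-up helper name, not something Automator or the shell provides.

```shell
# hexbytes (hypothetical helper): print the bytes of its argument as one hex string.
hexbytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Paste the character under test between the quotes.
# c3a4   means it arrived precomposed (U+00E4).
# 61cc88 means it arrived decomposed ("a" followed by U+0308).
hexbytes 'ä'
echo
```

Running the same lines in Terminal and in a Run Shell Script action should show whether the two environments hand the shell the same bytes.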
This should make your work easier, not harder. It's still not trivial, of course. If you want to deal directly with codepoints rather than characters, you'll have to quote them. Use the $'...' syntax, which enables the \ooo and \xXX octal and hex escapes, so you can spell out the UTF-8 sequence byte-for-byte. For example:
echo Gesprächen ist Gespr$'\xC3\xA4'chen.
echo Hiragana 'ga' is $'\xE3\x81\x8C'.
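To make the difference between the two forms concrete, here is a small sketch, assuming bash (for the $'...' syntax). The variable names are only illustrative. It spells the same visible word both ways and shows that the shell treats them as different strings.

```shell
# Two spellings of the same visible word, written byte-for-byte.
nfc=$'Gespr\xC3\xA4chen'     # U+00E4, precomposed a-umlaut
nfd=$'Gespra\xCC\x88chen'    # "a" followed by U+0308, combining diaeresis

# The byte counts differ by one; $(( ... )) strips any padding wc adds.
nfc_len=$(( $(printf '%s' "$nfc" | wc -c) ))
nfd_len=$(( $(printf '%s' "$nfd" | wc -c) ))
echo "precomposed: $nfc_len bytes, decomposed: $nfd_len bytes"

# A plain string comparison sees two different strings.
if [ "$nfc" = "$nfd" ]; then verdict=same; else verdict=different; fi
echo "$verdict"
```

Any byte-oriented tool will make the same distinction, so a pattern typed in one form will not match text stored in the other.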
Bear in mind that this puts the burden on you to worry about encoding methods and byte ordering. If you want to use grep to search a file produced elsewhere, it's up to you to know whether the file was stored as UTF-8, UTF-16, or UCS-4, and in the latter two cases whether it's big-endian or little-endian. grep deals only with bytes.
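As a rough illustration of the encoding point, the sketch below assumes bash, mktemp, and an iconv that knows the UTF-16LE name. The sample file is made up for the demonstration. It stores one line as UTF-16LE, searches the raw bytes, then converts to UTF-8 and searches again.

```shell
# Hypothetical sample data: one line, stored as UTF-16LE.
sample=$(mktemp)
printf 'Gespr\303\244chen\n' | iconv -f UTF-8 -t UTF-16LE > "$sample"

# Searching the raw bytes finds nothing: in UTF-16LE every ASCII letter
# is followed by a NUL byte, so the byte sequence "Gespr" never occurs.
raw_hits=$(grep -c 'Gespr' "$sample" || true)

# Converting to UTF-8 first lets the same pattern match.
converted_hits=$(iconv -f UTF-16LE -t UTF-8 "$sample" | grep -c 'Gespr' || true)

echo "raw: $raw_hits, converted: $converted_hits"
rm -f "$sample"
```

The conversion step only works if you already know the source encoding and byte order; iconv cannot guess them for you.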