Re: Do shell script and special characters
- Subject: Re: Do shell script and special characters
- From: Chris Espinosa <email@hidden>
- Date: Mon, 21 Oct 2002 12:02:21 -0700
On Sat, 19 Oct 2002 22:50:35 +0100, John Delacour
<email@hidden> wrote:
I expect this bug to exist this time next year unless by chance
someone from the shady AppleScript team reads this thread and
actually gives a damn. The last bug I reported formally (related to
this one) got me a reply after five weeks demonstrating quite clearly
that the "engineers" had not properly read the report and it took me
an hour to devise a method that would make it impossible for them to
avoid the issue.
Well, whether it's a "bug" or a design choice is subject to debate. By
definition, shell I/O in Mac OS X is encoded in UTF-8. You can of
course use shell commands to move or copy data between files and
devices in any encoding and in raw form, but to send text to the shell
or get output from it, the defined encoding is assumed to be UTF-8.
Doing a 'cat' of a file that doesn't contain UTF-8 is just like doing a
'cat' of a binary file; you get garbage -- in most cases, illegal
UTF-8. The Terminal is somewhat tolerant of garbage characters,
displaying the question-mark-in-a-diamond character for illegal UTF-8
characters. OSA, which performs the coercion from the shell's
purported UTF-8 output to styled text that Script Editor can display,
is not as tolerant of garbage UTF-8, so it returns a coercion error.
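To make the difference in tolerance concrete, here is a minimal sketch in Python. Python's codecs only stand in for OSA and the Terminal here; neither of them is implemented this way. The sketch assumes the offending file holds the MacRoman bytes for "café", and the function names are made up for illustration. A strict UTF-8 decode fails, much as the coercion does, while a tolerant decode substitutes U+FFFD, the replacement character that the Terminal draws as a question mark in a diamond.

```python
# Illustration only: Python's codecs stand in for OSA and the Terminal.
# Assumed sample data: "café" as stored in a MacRoman text file (é is byte 0x8E).
mac_bytes = "caf\u00e9".encode("mac_roman")


def strict_decode(data):
    """Decode as UTF-8 and refuse illegal sequences (analogous to the coercion error)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # stand-in for "coercion error"


def tolerant_decode(data):
    """Decode as UTF-8, replacing each illegal byte with U+FFFD."""
    return data.decode("utf-8", errors="replace")


print(strict_decode(mac_bytes))
print(tolerant_decode(mac_bytes))
```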
Here are some workarounds you can do today:
- Don't use shell commands to produce Mac-encoded text for return to
AppleScript. Use AppleScript's file read/write OSAXen, which are
designed to read Mac text files and return Mac formatted strings to
AppleScript.
- If you want to capture the output of a shell command that is not
guaranteed to produce UTF-8, postprocess it by piping it through vis -o
to get an escaped UTF-8 form. You can unescape it with unvis.
Unfortunately, there are no shell commands for translating MacRoman to
UTF-8 or back.
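For the translation step itself, one possibility outside the standard shell tools is a small script. The following is only a sketch in Python, assuming a Python 3 interpreter is available and that the input really is MacRoman. The function names are hypothetical, not part of any existing tool.

```python
def macroman_to_utf8(data: bytes) -> bytes:
    """Reinterpret MacRoman bytes as text and re-encode that text as UTF-8."""
    return data.decode("mac_roman").encode("utf-8")


def utf8_to_macroman(data: bytes) -> bytes:
    """Go the other way. Raises UnicodeDecodeError on illegal UTF-8 and
    UnicodeEncodeError on characters that MacRoman cannot represent."""
    return data.decode("utf-8").encode("mac_roman")


original = b"caf\x8e"                  # "café" in MacRoman
as_utf8 = macroman_to_utf8(original)   # é becomes the two bytes 0xC3 0xA9
print(as_utf8)
print(utf8_to_macroman(as_utf8) == original)
```

The conversion is only reversible when every character in the UTF-8 text also exists in MacRoman, so the second function is the one that can fail on arbitrary input.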
If you want this bug fixed by this time next year (by our "shady" team
that actually does "give a damn") then I'd appreciate some
recommendations from you on what specific behavior you want when a
shell command produces illegal UTF-8 to be returned to AppleScript in a
string variable:
- Translate all illegal UTF-8 characters into "missing character"
symbols, a la what the Terminal does? (This is lossy and irreversible,
but won't error)
- Assume that an error in coercing shell output to styled text means
that it's supposed to be in current text encoding? (This will produce
unpredictable results on non-MacRoman files; sometimes you'll get one
form, sometimes another, depending on what data is in the file)
- Allow an 'as' or 'encoding' parameter on do shell script so you can
specify what encoding, if any, you expect the result to be in if it's
not UTF-8 (this would include 'as data' so you could, for instance, cat
a .jpeg file and get the raw .jpeg in an AppleScript data object)
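The three behaviors can be compared side by side in a short sketch. This is Python rather than AppleScript, the function names are hypothetical, and MacRoman is assumed to be the "current text encoding" for the second option. It is meant to show the trade-offs, not to propose an implementation for do shell script.

```python
def decode_lossy(raw: bytes) -> str:
    """Option 1: never fail; each illegal byte becomes U+FFFD (lossy)."""
    return raw.decode("utf-8", errors="replace")


def decode_with_fallback(raw: bytes, fallback: str = "mac_roman") -> str:
    """Option 2: try UTF-8 first; on failure assume the fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def decode_as(raw: bytes, encoding: str = "utf-8"):
    """Option 3: the caller names the encoding; "data" returns the raw bytes."""
    if encoding == "data":
        return raw
    return raw.decode(encoding)


mac_cafe = b"caf\x8e"      # "café" in MacRoman, illegal as UTF-8
ambiguous = b"\xc3\xa9"    # MacRoman for "√©", but also legal UTF-8 for "é"

print(decode_lossy(mac_cafe))
print(decode_with_fallback(mac_cafe))
print(decode_with_fallback(ambiguous))
print(decode_as(ambiguous, "mac_roman"))
print(decode_as(b"\xff\xd8", "data"))
```

The `ambiguous` case shows why the second option is unpredictable: the same MacRoman file content decodes one way or the other depending on whether its bytes happen to form legal UTF-8. The third option avoids the guess at the cost of making the caller state the encoding.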
Advice welcome.
Chris