Re: Posix path and High Ascii Characters

  • Subject: Re: Posix path and High Ascii Characters
  • From: Christopher Nebel <email@hidden>
  • Date: Mon, 9 Sep 2002 22:03:47 -0700

On Monday, September 9, 2002, at 04:46 PM, alain content wrote:

> 1. you wrote
>
>> Actually, the problem is with "do shell script", not "POSIX path". It
>> doesn't understand the proper text encoding to use with shell commands.
>> The simplest solution is to upgrade to 10.2 Jaguar, which fixes the
>> bug. Failing that, a technique has been described here a few times to
>> mangle the path into UTF-8 by hand.
>
> Under 10.2, do shell script still fails when the path contains high-ASCII
> characters, e.g.
>
> do shell script "cd '~/Desktop/photos été'" or
> do shell script quoted form of "cd ~/Desktop/photos été"
>
> Both return an error (sh: cd ~/Desktop/photos été: no such file or
> directory)

The second has to do with misquoting -- it's complaining that there's no such file as "cd ~/Desktop/photos été". The "quoted form" needs to go around just the file, not the whole command.
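
As an illustration of the quoting point, here is a minimal Python sketch in which `shlex.quote` stands in for AppleScript's `quoted form`. The path is hypothetical, and the sketch only counts the words a POSIX shell would see; it does not run anything.

```python
import shlex

path = "/Users/ac/Desktop/photos été"  # hypothetical path, for illustration only

# Wrong: quoting the whole command collapses it into a single word, so the
# shell would look for a program literally named "cd /Users/ac/...".
wrong = shlex.quote("cd " + path)

# Right: leave the command name bare and quote only the path argument.
right = "cd " + shlex.quote(path)

print(shlex.split(wrong))  # a single word
print(shlex.split(right))  # two words: the command and its argument
```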

The first has to do with the composed/decomposed ambiguity I mentioned earlier. For slightly complicated reasons, you're passing a composed e-acute to the shell, but there isn't any file with that name: its name is e+acute mark. This is going to be a problem with just about any hardcoded path like this.
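
The ambiguity can be sketched in Python with the standard `unicodedata` module. This assumes, as described above, that the name on disk is the decomposed form; the two strings look identical on screen but do not compare equal until one is normalized.

```python
import unicodedata

composed = "\u00e9t\u00e9"      # each é is one precomposed code point (U+00E9)
decomposed = "e\u0301te\u0301"  # each é is "e" plus U+0301 COMBINING ACUTE ACCENT

print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 3 5

# Converting the composed string to decomposed form (NFD) makes them match.
same = unicodedata.normalize("NFD", composed) == decomposed
print(same)  # True
```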

> However, the following seems to work:
>
> set thePath to POSIX path of "U:Users:ac:Desktop:photos été"
> do shell script "cd " & quoted form of thePath & ";pwd"
> -- "/Users/ac/Desktop/photos été"

Because of the path through the APIs, this irons out the ambiguity properly.

> 2. None of these work when talking with Terminal, unfortunately:
>
> set thePath to POSIX path of "U:Users:ac:Desktop:photos été"
> tell application "Terminal"
>     activate
>     do script "cd " & quoted form of thePath in window frontmost
> end tell
>
> Or
> tell ...
>     do script ("echo " & quoted form of thePath) in window frontmost

Works fine for me, but it requires that the Terminal be set to UTF-8. Did you change yours to MacRoman?
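
To see why the Terminal's encoding setting matters, here is a small Python sketch. It assumes the path arrives as UTF-8 bytes and uses Python's `mac_roman` codec to mimic a Terminal set to MacRoman; the same bytes then read back as different text.

```python
name = "e\u0301te\u0301"  # decomposed "été"

utf8_bytes = name.encode("utf-8")

# Read back as UTF-8, the bytes give the original name.
print(utf8_bytes.decode("utf-8") == name)  # True

# Read back as MacRoman, every byte becomes its own character, so the
# accents show up as unrelated symbols.
garbled = utf8_bytes.decode("mac_roman")
print(garbled == name)  # False
```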

> 4. You wrote:
>
>> -- the proper UTF-8 sequence for an e-acute is {101, 204, 129}.
>
> Now, this is just plain curiosity, but what's the relation between that and what I'm seeing -- e\314\201 -- (except that perhaps 101 is 0065, hence the "e" ?)

Right -- I was using decimal to match JD's code.
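
The decimal and octal notations can be checked against each other with a short Python sketch; nothing is assumed here beyond the byte values quoted above.

```python
seq = [101, 204, 129]  # the decimal byte values quoted above

# The same bytes in octal, which is how the shell displays them.
print([oct(b) for b in seq])  # ['0o145', '0o314', '0o201']

# Decoded as UTF-8, the three bytes are two code points:
# "e" (U+0065) and the combining acute accent (U+0301).
text = bytes(seq).decode("utf-8")
print([hex(ord(c)) for c in text])  # ['0x65', '0x301']
```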

> Why is it that the very same folder seems to receive two different names/encodings in Terminal?
> Let me explain:
> When using the file completion feature of tcsh to change directory, I get
>
> ~/Desktop/photos e\314\201te\314\201
>
> However, using the trick mentioned above, I get
>
> ~/Desktop/photos \303\251t\303\251

Again, this is the composed vs. decomposed character difference. The former is decomposed, the latter is composed. The fact that both exist makes life decidedly more difficult, but that's how it is.
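
Both spellings can be reproduced with a short Python sketch. The helper `shell_escape_octal` is a hypothetical name made up for this illustration; it only mimics the shell's backslash-octal display of non-ASCII bytes.

```python
import unicodedata


def shell_escape_octal(text: str) -> str:
    """Render UTF-8 bytes above 0x7F as backslash-octal escapes."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))        # plain ASCII passes through
        else:
            out.append("\\%03o" % byte)  # e.g. 0xCC becomes \314
    return "".join(out)


name = "photos \u00e9t\u00e9"

# Decomposed (NFD): base letter followed by a combining accent.
print(shell_escape_octal(unicodedata.normalize("NFD", name)))
# photos e\314\201te\314\201

# Composed (NFC): a single precomposed code point per accented letter.
print(shell_escape_octal(unicodedata.normalize("NFC", name)))
# photos \303\251t\303\251
```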

As I understand it, the original Unicode design had only base characters and combining marks. This was a superior design in several ways -- it's more flexible, easily allows for multiple accents on a single base (not relevant in French, but critical for some languages), and helps keep the total number of code points down. Unfortunately, it was also not trivially compatible with the already existing ISO-8859 encodings -- a real issue if you've got lots of existing data. That didn't sit well with various Unicode consortium members, so they added pre-composed compatibility characters, producing the system we have now.


--Chris Nebel
AppleScript Engineering
