Re: urlEncode (AppleScript fails in Automator - Permissions error)

  • Subject: Re: urlEncode (AppleScript fails in Automator - Permissions error)
  • From: Christopher Nebel <email@hidden>
  • Date: Thu, 01 Aug 2013 12:05:01 -0700

On Jul 9, 2013, at 10:49 PM, Kaydell Leavitt <email@hidden> wrote:

> I went looking for a urlEncode script that I could call from AppleScript and found this handler called urlEncode()
> I believe that it may work OK, but I would like to understand what's going on better than I do. ...
>
> I understand that ASCII has been deprecated from AppleScript and that nowadays that everything is Unicode text = text = string, but I believe that what's different is UTF-8 which is what I want.
>
> Each of the following expressions returns 233:
>
> id of ("é" as string)
> id of ("é" as text)
> id of ("é" as Unicode text)
> id of ("é" as «class utf8»)
>
> I read that nowadays instead of calling ASCII number that we are supposed to use "id of" instead.
>
> I would like to develop my own urlEncode() handler in pure AppleScript so that I can understand how.  I've googled and found some that don't really work for UTF-8 because they assume that all characters are 8-bits wide.

All your guesses so far are correct; it sounds like you just need some confirmation and a few missing pieces.  I wouldn't say that ASCII has been deprecated from AppleScript, but it's true that, as you said, Unicode text = text = string, or as I put it, text in AppleScript is all Unicode all the time.  Related to that, the "ASCII character" and "ASCII number" commands have been genuinely deprecated -- they don't produce reliable results for non-ASCII characters -- and you should be using "id of" instead.

All those "id of" expressions return 233, because that's the decimal Unicode code point for a pre-composed "é" character.  The trick is that a particular code point may be represented as bytes in a number of different ways depending on the encoding used.  For URLs, the standard is to encode the string as UTF-8, and then percent-escape any non-allowed bytes.  Any non-ASCII character in UTF-8 will take at least two bytes -- "é" is C3 A9 -- so the resulting URL fragment would be "%C3%A9".  (Tip: the Character Viewer can tell you the UTF-8 encoding for any character.  You can turn it on in Keyboard Preferences.)
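To make the two steps concrete (encode as UTF-8, then escape the bytes), here is a minimal sketch.  It is in Python rather than AppleScript purely so that it is easy to run and check; the function name url_encode is my own choice for illustration, and it escapes everything outside the unreserved set (letters, digits, "-", ".", "_", "~"), which may be stricter than a given URL component needs.

```python
from urllib.parse import quote


def url_encode(text):
    """Percent-encode text for use in a URL (illustrative helper, not from the thread)."""
    # Step 1: turn the Unicode string into UTF-8 bytes.
    utf8 = text.encode("utf-8")
    # Step 2: percent-escape every byte that is not an unreserved ASCII character.
    # safe="" means that "/" is escaped too.
    return quote(utf8, safe="")


e_acute = "\u00e9"  # pre-composed "é"
print(ord(e_acute))                    # the code point, what "id of" reports
print(e_acute.encode("utf-8").hex())   # the UTF-8 bytes in hex
print(url_encode(e_acute))             # the escaped URL fragment
```

The point is that the code point (233) and the bytes that end up in the URL (C3 A9) are different things; the escaping applies to the bytes, not to the code point.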

As for doing this in "pure" AppleScript, you'll have to write the UTF-8 encoding code yourself, which will involve some bit-wise math that AppleScript is unfortunately not well suited for.  I use "pure" advisedly, because there's no particular reason *not* to shell out to perl, since both "do shell script" and perl(1) are guaranteed to be present -- unless, that is, you have a demonstrated performance problem with doing so, or it's a matter of principle.
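For anyone who does want to go the hand-rolled route, the bit-wise math can be expressed with nothing but integer division and remainder, which AppleScript has as "div" and "mod".  The following is a sketch of that arithmetic, written in Python so it can be run and checked; the names utf8_bytes, percent_encode and UNRESERVED are hypothetical, and it assumes each input is a valid code point (it does not reject surrogates or values above U+10FFFF).

```python
import string

# Characters that may appear unescaped (the "unreserved" set); an assumption
# about how strict you want to be, adjust to taste.
UNRESERVED = string.ascii_letters + string.digits + "-._~"


def utf8_bytes(code_point):
    """Return the UTF-8 bytes for one code point, using only // and %."""
    if code_point < 0x80:        # 1 byte:  0xxxxxxx
        return [code_point]
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return [0xC0 + code_point // 64,
                0x80 + code_point % 64]
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return [0xE0 + code_point // 4096,
                0x80 + (code_point // 64) % 64,
                0x80 + code_point % 64]
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return [0xF0 + code_point // 262144,
            0x80 + (code_point // 4096) % 64,
            0x80 + (code_point // 64) % 64,
            0x80 + code_point % 64]


def percent_encode(text):
    """Encode each character as UTF-8, then escape any byte not in UNRESERVED."""
    pieces = []
    for ch in text:
        for b in utf8_bytes(ord(ch)):
            if b < 0x80 and chr(b) in UNRESERVED:
                pieces.append(chr(b))
            else:
                pieces.append("%{:02X}".format(b))
    return "".join(pieces)


print(utf8_bytes(233))            # the two bytes for "é", as decimal integers
print(percent_encode("caf\u00e9"))
```

Dividing by 64, 4096 and 262144 stands in for shifting right by 6, 12 and 18 bits, and "mod 64" stands in for masking off the low six bits, so the same structure should carry over to an AppleScript handler that starts from "id of" each character.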


--Chris Nebel
AppleScript Engineering