I went looking for a urlEncode script that I could call from AppleScript and found this handler, called urlEncode(). I believe it works, but I would like to understand what it is doing better than I do.
-- This script uses the perl command from the shell to do a URL encoding
on urlEncode(str)
    local str
    try
        return (do shell script "/bin/echo " & quoted form of str & " | perl -MURI::Escape -lne 'print uri_escape($_)'")
    on error eMsg number eNum
        error "Can't urlEncode: " & eMsg number eNum
    end try
end urlEncode
-- Does this make sense? The id of the character is only 233, but the percent-encoding makes it look
-- like the accented é takes two bytes to encode.
-- Does it have anything to do with whether the character encoded here is UTF-8, UCS-2, UTF-16, or UTF-32?
set this_character to "é"
set this_encoding to urlEncode(this_character)
set this_id to id of (this_character)
display dialog "The character: " & this_character & " is percent-encoded with: " & this_encoding & " and its id is: " & this_id
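As far as I can tell, 233 is the Unicode code point (U+00E9) rather than a byte value, and UTF-8 spreads any code point from 128 to 2047 across two bytes. Here is a small sketch of that arithmetic, assuming the standard two-byte UTF-8 layout (110xxxxx 10xxxxxx). The variable names are my own:

```applescript
set cp to id of "é" -- 233, the code point U+00E9
-- Two-byte UTF-8: the lead byte carries the top 5 bits, the trail byte the low 6 bits
set leadByte to 192 + (cp div 64) -- 192 + 3 = 195 (hex C3)
set trailByte to 128 + (cp mod 64) -- 128 + 41 = 169 (hex A9)
{leadByte, trailByte} -- {195, 169}
```

If that reading is right, a two-byte result such as %C3%A9 for é would be the UTF-8 encoding of code point 233, not a sign that the id is wrong.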
I understand that the ASCII commands have been deprecated in AppleScript and that nowadays everything is Unicode (Unicode text = text = string), but I believe UTF-8 is something different, and UTF-8 is what I want.
Each of the following expressions returns 233:
id of ("é" as string)
id of ("é" as text)
id of ("é" as Unicode text)
id of ("é" as «class utf8»)
I read that nowadays, instead of calling ASCII number, we are supposed to use "id of".
I would like to develop my own urlEncode() handler in pure AppleScript so that I can understand how it works. I've googled and found some handlers, but they don't really work for UTF-8 because they assume that every character is 8 bits wide.
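To make the goal concrete, here is a rough sketch of the kind of handler I have in mind. It assumes AppleScript 2.0 or later (where "id of" returns Unicode code points), the standard UTF-8 byte layout, and the unreserved character set from RFC 3986. The handler names urlEncodeUTF8, isUnreserved, utf8Bytes and hexByte are hypothetical names of my own, not from any library:

```applescript
-- Percent-encode text as UTF-8 (a sketch under the assumptions stated above)
on urlEncodeUTF8(str)
    set encoded to ""
    -- "id of" gives one integer for a single character, a list otherwise
    repeat with cpRef in ((id of str) as list)
        set cp to contents of cpRef
        if isUnreserved(cp) then
            set encoded to encoded & (character id cp)
        else
            repeat with b in utf8Bytes(cp)
                set encoded to encoded & "%" & hexByte(contents of b)
            end repeat
        end if
    end repeat
    return encoded
end urlEncodeUTF8

-- True for A-Z, a-z, 0-9, "-", ".", "_", "~"
on isUnreserved(cp)
    if cp >= 65 and cp <= 90 then return true
    if cp >= 97 and cp <= 122 then return true
    if cp >= 48 and cp <= 57 then return true
    return ({45, 46, 95, 126} contains cp)
end isUnreserved

-- Split a code point into its 1 to 4 UTF-8 bytes
on utf8Bytes(cp)
    if cp < 128 then
        return {cp}
    else if cp < 2048 then
        return {192 + (cp div 64), 128 + (cp mod 64)}
    else if cp < 65536 then
        return {224 + (cp div 4096), 128 + ((cp div 64) mod 64), 128 + (cp mod 64)}
    else
        return {240 + (cp div 262144), 128 + ((cp div 4096) mod 64), 128 + ((cp div 64) mod 64), 128 + (cp mod 64)}
    end if
end utf8Bytes

-- Two uppercase hex digits for a byte value 0-255
on hexByte(b)
    set hexDigits to "0123456789ABCDEF"
    return (character ((b div 16) + 1) of hexDigits) & (character ((b mod 16) + 1) of hexDigits)
end hexByte

urlEncodeUTF8("é") -- "%C3%A9"
```

Working from code points instead of characters is deliberate: the handlers I found treat each character as one byte, whereas here each code point is expanded to its UTF-8 bytes first and only then percent-encoded.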