I went looking for a urlEncode script that I could call from AppleScript and found this handler called urlEncode().
I believe that it works OK, but I would like to understand what's going on better than I do.
-- This script uses the perl command from the shell to do a URL Encoding
on urlEncode(str)
	local str
	try
		return (do shell script "/bin/echo " & quoted form of str & " | perl -MURI::Escape -lne 'print uri_escape($_)'")
	on error eMsg number eNum
		error "Can't urlEncode: " & eMsg number eNum
	end try
end urlEncode
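As I understand it, the shell pipeline echoes the string and pipes it to perl, whose URI::Escape module percent-encodes every byte outside the RFC 3986 unreserved set. A rough Python sketch of the same behavior (the `safe=''` argument is my assumption, to make `quote` escape "/" the way uri_escape does by default):

```python
from urllib.parse import quote

def url_encode(s):
    # quote() encodes the string as UTF-8 bytes first, then
    # percent-encodes every byte outside the unreserved set
    # (A-Z, a-z, 0-9, "-", "_", ".", "~").
    # safe='' makes it escape "/" too, matching uri_escape's default.
    return quote(s, safe='')

print(url_encode("é"))      # %C3%A9
print(url_encode("a b/c"))  # a%20b%2Fc
```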
-- Does this make sense? The id of the character is only 233 but the percent-encoding makes it look
-- like the accented é takes two bytes to encode.
-- Does it have anything to do with whether the character encoded here is UTF-8, UCS-2, UTF-16, or UTF-32?
set this_character to "é"
set this_encoding to urlEncode(this_character)
set this_id to id of (this_character)
display dialog "The character: " & this_character & " is percent-encoded with: " & this_encoding & " and its id is: " & this_id
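From what I've read, the id (233, i.e. U+00E9) is the Unicode code point, which is independent of any encoding; how many bytes the character occupies depends on which encoding you pick, and UTF-8 uses two bytes for code points between 128 and 2047. A small Python illustration of that distinction:

```python
ch = "é"
# The code point (what AppleScript's "id of" returns) never changes:
print(ord(ch))                 # 233, i.e. U+00E9
# The byte count depends entirely on the encoding:
print(ch.encode("utf-8"))      # b'\xc3\xa9' -> two bytes -> %C3%A9
print(ch.encode("utf-16-be"))  # b'\x00\xe9' -> two bytes
print(ch.encode("utf-32-be"))  # four bytes
print(ch.encode("latin-1"))    # b'\xe9'     -> one byte
```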
I understand that the ASCII commands have been deprecated in AppleScript and that nowadays Unicode text = text = string, but I believe UTF-8 is something different, and UTF-8 is what I want.
Each of the following expressions returns 233:
id of ("é" as string)
id of ("é" as text)
id of ("é" as Unicode text)
id of ("é" as «class utf8»)
I read that nowadays, instead of calling ASCII number, we are supposed to use "id of".
I would like to develop my own urlEncode() handler in pure AppleScript so that I can understand how it works. I've googled and found some handlers that don't really work for UTF-8 because they assume every character is 8 bits wide.
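If I understand correctly, the algorithm such a handler would need is: first convert the string to its UTF-8 bytes, then percent-encode byte by byte (not character by character). A hedged Python sketch of that byte-at-a-time approach, which I'd then translate into AppleScript:

```python
# RFC 3986 unreserved bytes, which pass through unescaped.
UNRESERVED = set(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)

def url_encode(s):
    out = []
    # The crucial step: iterate over UTF-8 *bytes*, not characters,
    # so a character like é yields two escapes (%C3%A9), not one.
    for byte in s.encode("utf-8"):
        if byte in UNRESERVED:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)

print(url_encode("é"))   # %C3%A9 -- one escape per UTF-8 byte
```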