Re: URL Encoding With Pure AppleScript

  • Subject: Re: URL Encoding With Pure AppleScript
  • From: Ron Hunsinger <email@hidden>
  • Date: Wed, 10 Jul 2013 14:53:13 -0700


On Jul 9, 2013, at 11:45 PM, Kaydell Leavitt <email@hidden> wrote:
> Does this make sense?  The id of the character is only 233 but the percent-encoding makes it look like the accented é takes two bytes to encode.

That's correct for UTF-8. UTF-8 uses one byte to encode codepoints 0 to 127 (ASCII), two bytes to encode codepoints 128 to 2047, three bytes to encode codepoints 2048 to 65535, and four bytes for codepoints 65536 and up.
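As a rough illustration of those ranges, here is a minimal Python sketch. The function name utf8_length is hypothetical, and the loop simply cross-checks the result against Python's built-in encoder for a few sample codepoints.

```python
def utf8_length(codepoint: int) -> int:
    """Return how many bytes UTF-8 uses for one codepoint.

    Hypothetical helper for illustration; surrogates are not rejected.
    """
    if codepoint < 0 or codepoint > 0x10FFFF:
        raise ValueError("not a Unicode codepoint")
    if codepoint <= 127:
        return 1  # ASCII
    if codepoint <= 2047:
        return 2  # includes the upper half of Latin-1
    if codepoint <= 65535:
        return 3  # rest of the Basic Multilingual Plane
    return 4      # supplementary planes


for cp in (0x65, 233, 0x20AC, 0x1F600):
    # Cross-check against Python's own UTF-8 encoder.
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
    print(cp, utf8_length(cp))
```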

The range of characters that can be encoded in two bytes includes the upper half of Latin-1 (which is where you'll find é). Even though Latin-1 codepoints are small enough to fit in a single byte, UTF-8 can't encode them that way: a single-byte sequence must start with a 0 bit to mark it as being only one byte long, which leaves just seven bits for the value (0 to 127).

If the binary representation of a codepoint is xxx...xxx, UTF-8 uses the shortest of the following sequences that has enough xs to contain the value:

0xxxxxxx
110xxxxx 10xxxxxx
1110xxxx 10xxxxxx 10xxxxxx
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
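The four patterns can be sketched directly with shifts and masks. This is only an illustrative hand-rolled encoder (utf8_encode is a hypothetical name, and it does not reject surrogates); in practice you would call str.encode("utf-8"), which the loop at the end uses as a cross-check.

```python
def utf8_encode(codepoint: int) -> bytes:
    """Encode one codepoint using the shortest bit pattern that fits.

    Hypothetical helper for illustration; surrogates are not rejected.
    """
    if codepoint < 0 or codepoint > 0x10FFFF:
        raise ValueError("not a Unicode codepoint")
    if codepoint < 0x80:
        # 0xxxxxxx
        return bytes([codepoint])
    if codepoint < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (codepoint >> 6),
                      0b10000000 | (codepoint & 0b111111)])
    if codepoint < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (codepoint >> 12),
                      0b10000000 | ((codepoint >> 6) & 0b111111),
                      0b10000000 | (codepoint & 0b111111)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (codepoint >> 18),
                  0b10000000 | ((codepoint >> 12) & 0b111111),
                  0b10000000 | ((codepoint >> 6) & 0b111111),
                  0b10000000 | (codepoint & 0b111111)])


for cp in (0x41, 233, 0x301, 0x20AC, 0x1F600):
    # The hand-rolled result should match Python's built-in encoder.
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```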

233 in binary is 11101001, which UTF-8 breaks into 00011_101001 and encodes as 110_00011 10_101001 (hex C3 A9, hence the %C3%A9 you see after percent-encoding).
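If you want to check the arithmetic for é yourself, a short Python sketch follows. It uses only the standard library; the variable names are just for illustration.

```python
from urllib.parse import quote

e_acute = chr(233)                 # é, LATIN SMALL LETTER E WITH ACUTE
utf8_bytes = e_acute.encode("utf-8")

codepoint_bits = format(233, "08b")
byte_bits = " ".join(format(b, "08b") for b in utf8_bytes)
percent_encoded = quote(e_acute)   # quote() percent-encodes the UTF-8 bytes

print(codepoint_bits)    # 11101001
print(byte_bits)         # 11000011 10101001
print(percent_encoded)   # %C3%A9
```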

This isn't a consequence of using combining marks. If you decomposed é into LATIN SMALL LETTER E (U+0065) followed by COMBINING ACUTE ACCENT (U+0301), UTF-8 would need three bytes: one for the e (which wouldn't need to be URL-encoded) plus two more for the accent.
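To see the decomposed case concretely, here is a small Python sketch. It assumes NFD normalization as the way to split é into its base letter and combining accent; the variable names are illustrative.

```python
import unicodedata
from urllib.parse import quote

composed = "\u00e9"                                  # é as a single codepoint
decomposed = unicodedata.normalize("NFD", composed)  # e + combining acute accent

decomposed_codepoints = [hex(ord(c)) for c in decomposed]
decomposed_bytes = decomposed.encode("utf-8")

print(decomposed_codepoints)   # ['0x65', '0x301']
print(len(decomposed_bytes))   # 3
print(quote(composed))         # %C3%A9
print(quote(decomposed))       # e%CC%81
```

The plain e passes through unescaped, and only the two bytes of the accent are percent-encoded.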

-Ron Hunsinger
 _______________________________________________
AppleScript-Users mailing list      (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users
