Unicode/UTF confusion

  • Subject: Unicode/UTF confusion
  • From: Cameron Smith <email@hidden>
  • Date: Fri, 2 May 2003 15:12:28 -0700

From the AppleScript 1.9.0 Release Notes (http://www.apple.com/applescript/release_notes/190OSX.html): To convert a string value to the UTF-8 format, use the coercion as <<class utf8>>. So:
set s to s as <<class utf8>>

We also have:
set s to s as Unicode text

In either case, if I:
set s to "[open curly quote]test[close curly quote]"
[convert using either method above]
write s to theFile

I get, according to a BBEdit hex dump, a file that looks like:

20 1C 00 74 00 65 00 73 00 74 20 1D

which makes some sense, in that 201C is the Unicode code point for [open curly quote], 74 is a Unicode (and ASCII) "t", 65 is "e", 73 is "s", 74 is "t" again, and 201D is [close curly quote].

But as I understand it, this is not UTF-8, which is a variable-length, C-compatible encoding in which a "t" should appear simply as 74, not as 00 74, and the [open curly quote] should appear as something totally different. This isn't even a proper UTF-16 encoding, is it? (Although opening it as UTF-16 is the only way that BBEdit will open it and display it properly.)
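
For comparison, here is a minimal Python sketch of the two byte sequences for the same test string. Python is not part of the original question; it is used only as a neutral way to print the bytes, and the helper name hex_dump is made up for this illustration. It assumes the file holds nothing but the six characters, with no byte order mark.

```python
# Illustration only: compare two encodings of the test string from the post.
# "\u201c" and "\u201d" are the open and close curly quotes.
text = "\u201ctest\u201d"


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs (hypothetical helper)."""
    return " ".join(f"{b:02X}" for b in data)


# Big-endian UTF-16 with no byte order mark: two bytes per character here.
utf16_be = hex_dump(text.encode("utf-16-be"))

# UTF-8: one byte for each ASCII letter, three bytes for each curly quote.
utf8 = hex_dump(text.encode("utf-8"))

print("UTF-16BE:", utf16_be)  # 20 1C 00 74 00 65 00 73 00 74 20 1D
print("UTF-8:   ", utf8)      # E2 80 9C 74 65 73 74 E2 80 9D
```

The first line matches the hex dump above byte for byte, which would explain why BBEdit only displays the file correctly when it is opened as UTF-16. A UTF-8 file would instead contain the second sequence, with "t" as a single 74 byte.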

So what's going on? And can I get a proper UTF-8 encoding into a file?


--
Cameron Smith
Cutting Edge Technology Services, Inc.
Nanaimo, BC
http://www.cetsi.com/
tel: 1.250.729.9515 fax: 1.250.729.8201
------------------------------------------------
Websites ** Print Production ** Editorial Services
Publishers of the http://SaltSpringNews.com/
------------------------------------------------
