Re: Unicode/UTF confusion
- Subject: Re: Unicode/UTF confusion
- From: Christopher Nebel <email@hidden>
- Date: Fri, 2 May 2003 15:53:12 -0700
On Friday, May 2, 2003, at 03:12  PM, Cameron Smith wrote:
From AppleScript 1.9.0 Release Notes
(http://www.apple.com/applescript/release_notes/190OSX.html): To
convert a string value to the UTF-8 format, use the coercion as
<<class utf8>>. ... if I:
	set s to "[open curly quote]test[close curly quote]"
	[convert using either method above]
	write s to theFile
I get, according to a BBEdit hex dump, a file that looks like:
	20 1C 00 74 00 65 00 73 00 74 20 1D
But as I understand it, this is not UTF-8, which is a variable-length,
C-compatible encoding in which a "t" should appear simply as 74, not
as 00 74, and the [open curly quote] should appear as something totally
different. This isn't even a proper UTF-16 encoding, is it? (Although
opening it as UTF-16 is the only way that BBEdit will open it and
display it properly.)
That's proper UTF-16.  (It doesn't have a byte-order mark, but that's
not required.)  I'm not sure who wrote that note, but the second
sentence is very misleading.  There's no support for storing data
within AppleScript itself in UTF-8, so saying "s as <<class utf8>>" has
no real effect.  However, you can make it write UTF-8 to a file by
saying this:
	write s to theFile as <<class utf8>>
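
To make the difference between the two encodings concrete, here is a minimal sketch (in Python rather than AppleScript, purely for illustration) that encodes the same curly-quoted string both ways and prints the bytes in hex. The helper name hex_dump is hypothetical, and the big-endian, no-BOM codec is an assumption chosen to match the dump in the question.

```python
# Illustration only: the string from the question, with the curly quotes
# written as U+201C (open) and U+201D (close).
s = "\u201ctest\u201d"


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# Big-endian UTF-16 without a byte-order mark (assumed to match the dump
# in the question): every character here takes two bytes.
utf16_be = s.encode("utf-16-be")

# UTF-8: ASCII letters take one byte each, each curly quote takes three.
utf8 = s.encode("utf-8")

print("UTF-16BE:", hex_dump(utf16_be))
print("UTF-8:   ", hex_dump(utf8))
```

The first line of output should reproduce the hex dump quoted above, while the second shows what a UTF-8 file would contain: "t" as a single 74, and each curly quote as a three-byte sequence.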
--Chris Nebel
Apple Development Tools