Unicode
- Subject: Unicode
- From: Paul Berkowitz <email@hidden>
- Date: Wed, 26 Dec 2001 16:21:25 -0800
I've been trying to figure out what is needed in AppleScript in order to
transcribe non-MacRoman text from Entourage into AppleScript and out to text
files, and then back into Entourage (which encodes all text as Unicode in
its UI) and retain the character format. This is using AppleScript
properties whose data type is stated to be in Unicode in Entourage's
dictionary. I wanted to see whether it is necessary to include 'as Unicode
text' every single time text is read and set, or just some of the time.
What I have found is this:
AppleScript seems to need 'as Unicode text' when reading but not when
writing. This is true both for Entourage properties and for text files. I
added a Japanese keyboard (super-easy in OS X: just go to the International
System pref and add the keyboard), made an Entourage contact with
Japanese-character first and last names, and experimented.
I needed

    tell theContact
        set {fName, lName} to {first name as Unicode text, last name as Unicode text}
    end tell

and

    set g to open for access file fp
    set r to read g as Unicode text
    close access g

But I did not need 'as Unicode text' when building tab-delimited text from
those fName and lName variables:

    set txt to fName & tab & lName

nor when writing to the file, nor when setting new variables to the parsed
read text, nor when setting the first name and last name properties of a
new Entourage contact.
Once they were read correctly, they were "remembered" correctly by
AppleScript when using the read values to set other values or properties.
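
The round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the file name and the placeholder names are hypothetical, and the write step coerces explicitly even though the experiment above found that unnecessary, simply so that the file's encoding does not depend on AppleScript's defaults.

```applescript
-- Hypothetical file path and placeholder values, for illustration only.
set fp to (path to desktop as string) & "unicode-roundtrip.txt"
set fName to "Taro" as Unicode text
set lName to "Yamada" as Unicode text

-- Build the tab-delimited line; no coercion is applied here.
set txt to fName & tab & lName

-- Write the line. The explicit coercion is an assumption of this sketch,
-- not something the experiment above found to be required.
set g to open for access file fp with write permission
try
	set eof g to 0 -- truncate any earlier contents
	write txt to g as Unicode text
	close access g
on error errMsg number errNum
	close access g
	error errMsg number errNum
end try

-- Read the line back; this is the step that needed 'as Unicode text'.
set g to open for access file fp
try
	set r to read g as Unicode text
	close access g
on error errMsg number errNum
	close access g
	error errMsg number errNum
end try

-- Split the line on the tab, restoring the delimiters afterwards.
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab
set {fName2, lName2} to text items of r
set AppleScript's text item delimiters to oldDelims
```

The try blocks only make sure the file is closed if a read or write fails; they are not part of the Unicode behaviour in question.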
Interestingly, the text file itself always consists of a few regular
MacRoman ASCII characters, but mostly upper (non-ASCII) Latin characters,
like a Norwegian "a" with a circle above it, a Greek beta, and an A with a
circumflex. (There's no point even trying to show you with this list server
as it is.) But it gets transcribed as Japanese when read 'as Unicode text'
by AppleScript.
Is this a "not fully-implemented" version of Unicode in AS 1.8? Or is this
how it's meant to work?
--
Paul Berkowitz