Re: Please Help! Reading Unicode Text
- Subject: Re: Please Help! Reading Unicode Text
- From: Paul Berkowitz <email@hidden>
- Date: Mon, 29 Mar 2004 02:13:48 -0800
On 3/29/04 1:05 AM, "Rob Stott" <email@hidden> wrote:
> I have a file containing text in a foreign language (Turkish) - the
> file reads fine in TextEdit, BBEdit etc. However, when I read it into
> an Applescript, some characters go a bit peculiar. I should maybe point
> out that I'm reading the file using...
>
> do shell script "cat .... "
>
> Anyway, what happens to the text is that accents etc appear as separate
> characters to the characters they belong to. For example, the 'c' with
> the squiggle below it (as in the french word 'garçon') appears as two
> characters, a standard letter 'c' followed by a second character which
> is just the squiggle.
>
> A colleague had a similar problem and found a work-around by literally
> opening the file in TextEdit, then copying and pasting the text.
> Although this works, I'm sure there must be a better solution. I'm not
> too familiar with unicode, maybe I'm making a daft mistake...
>
> Does anyone have any ideas or suggestions? Any help much appreciated.
Assuming the text actually is in Unicode (UTF-16? 8?) as you believe, read
the file this way:
set r to read alias theFilePath as Unicode text
If it's not UTF-16 you'll have to do some converting first. That does
require a shell script. Check 'man iconv' in the Terminal.
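As a rough sketch of that conversion step, assuming the file turns out to be
UTF-8 (the file names below are hypothetical, and the sample line is only
there so the snippet runs on its own):

```shell
# Hypothetical file names; UTF-8 as the source encoding is an assumption.
src="turkish-utf8.txt"
dst="turkish-utf16.txt"

# Write a one-line UTF-8 sample ("garcon" with a c-cedilla, bytes C3 A7)
# so the sketch is self-contained.
printf 'gar\303\247on\n' > "$src"

# Convert: -f names the encoding to read, -t the encoding to write.
iconv -f UTF-8 -t UTF-16 "$src" > "$dst"
```

The converted file is the one you would then pass to 'read ... as Unicode
text'. Byte order and any byte-order mark that iconv writes for "UTF-16" can
differ between systems, so check the result against your AppleScript version.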
--
Paul Berkowitz