Re: "read from" and non-lo-ascii characters

Subject: Re: "read from" and non-lo-ascii characters
From: Matt Neuburg <email@hidden>
Date: Tue, 21 Jun 2005 15:07:00 -0700

On Mon, 20 Jun 2005 16:21:09 -0700, Christopher Nebel <email@hidden>
said:
>On Jun 20, 2005, at 1:36 PM, jj wrote:
>
>>> When I tried to read the file using "until" or "before" I found that they
>>> didn't work with those characters. Reading until or before "e" or "i" worked
>>> fine, but not until or before option-1 or option-2. Text encodings were fine
>>> (MacRoman was being used throughout, and when I read the whole file, the
>>> characters in question came in perfectly), so I can only conclude that
>>> reading "until" or "before" is just plain broken in case the character being
>>> used is not lo-ascii.
>>>
>> Seems a bug. I tried encoding a file as Unicode text, then reading until
>> "blah" as Unicode text. Seems that it recognizes the delimiters, but it will
>> skip some characters... So, more bugs...
>
>Matt's problem is indeed a bug.  It broke in Panther, in an attempt
>to improve the behavior of reading UTF-16 files using delimiters; to
>my knowledge, no one has reported it before.  If you use a UTF-16
>file, however -- add "as Unicode text" to the appropriate "read" and
>"write" commands -- it should work.

Excellent! That worked fine. The original export could not be done as
UTF-16, but it was easily converted from MacRoman to UTF-16, and then "read
until" works as advertised with these characters. Thanks for this
suggestion. m.
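
For readers who want to do the MacRoman to UTF-16 conversion outside AppleScript, here is a minimal sketch in Python. It is not the method used in this thread, which does not say how the conversion was done. The function name and the sample string are hypothetical, and the sample assumes a US keyboard layout, where option-1 and option-2 type "¡" and "™". Python's "utf-16" codec writes a byte-order mark in the machine's native byte order; whether AppleScript's "as Unicode text" needs a particular byte order is not covered here, so treat that as something to check.

```python
def macroman_to_utf16(data: bytes) -> bytes:
    """Decode MacRoman bytes and re-encode the text as UTF-16 (with a BOM)."""
    text = data.decode("mac_roman")
    return text.encode("utf-16")


# Hypothetical sample: two characters outside the ASCII range,
# stored as single bytes in MacRoman.
sample = "one¡two™three".encode("mac_roman")
converted = macroman_to_utf16(sample)

# Round trip: decoding the UTF-16 bytes gives back the original text.
roundtrip = converted.decode("utf-16")
```

To convert a real file, read it in binary mode, pass the bytes through the function, and write the result back in binary mode.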

>P.S.: Obligatory pedantry: there is no such thing as "high" or "low"
>ASCII.  ASCII defines interpretations for bytes in the range
>0...0x7F.  If it's not in that range, it's not ASCII.

PPS: Obligatory pedantry from a professional linguist: Language is usage.
The terms lo-ascii and hi-ascii were used consistently and meaningfully
throughout the 80s and 90s. Furthermore, you knew *exactly* what my
terminology meant, thus bearing witness against yourself. The defense rests.
m.
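
Whatever one calls them, the bytes in question are the ones outside the 0...0x7F range given in the quoted P.S. A small Python sketch (helper names are hypothetical) that finds such bytes in a piece of data, which can help when picking delimiters that stay clear of the bug:

```python
def is_ascii_byte(b: int) -> bool:
    """True if the byte value falls in the ASCII range 0...0x7F."""
    return 0 <= b <= 0x7F


def non_ascii_positions(data: bytes) -> list:
    """Return the offsets of all bytes above 0x7F."""
    return [i for i, b in enumerate(data) if not is_ascii_byte(b)]


# "e" and "i" are ASCII; the bytes 0xC1 and 0xAA are not.
positions = non_ascii_positions(b"e\xc1i\xaa")
```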

--
matt neuburg, phd = email@hidden, <http://www.tidbits.com/matt/>
A fool + a tool + an autorelease pool = cool!
AppleScript: the Definitive Guide
<http://www.amazon.com/exec/obidos/ASIN/0596005571/somethingsbymatt>
