Re: "read from" and non-lo-ascii characters


  • Subject: Re: "read from" and non-lo-ascii characters
  • From: Matt Neuburg <email@hidden>
  • Date: Tue, 21 Jun 2005 15:07:00 -0700

On Mon, 20 Jun 2005 16:21:09 -0700, Christopher Nebel <email@hidden>
said:
>On Jun 20, 2005, at 1:36 PM, jj wrote:
>
>>> When I tried to read the file using "until" or "before" I found that they
>>> didn't work with those characters. Reading until or before "e" or "i" worked
>>> fine, but not until or before option-1 or option-2. Text encodings were fine
>>> (MacRoman was being used throughout, and when I read the whole file, the
>>> characters in question came in perfectly), so I can only conclude that
>>> reading "until" or "before" is just plain broken in case the character being
>>> used is not lo-ascii.
>>>
>> Seems a bug. I tried encoding a file as Unicode text, then reading until
>> "blah" as Unicode text. Seems that it recognizes the delimiters, but it will
>> skip some characters... So, more bugs...
>
>matt's problem is indeed a bug.  It broke in Panther, in an attempt
>to improve the behavior of reading UTF-16 files using delimiters; to
>my knowledge, no one has reported it before.  If you use a UTF-16
>file, however -- add "as Unicode text" to the appropriate "read" and
>"write" commands -- it should work.

Excellent! That worked fine. The original export could not be done as
UTF-16, but it was easily converted from MacRoman to UTF-16, and then "read
until" works as advertised with these characters. Thanks for this
suggestion. m.
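
For anyone who wants to do the MacRoman to UTF-16 conversion outside AppleScript, here is a minimal Python sketch. The function names and file paths are hypothetical, not from this thread. `read_until` only approximates the delimiter behavior discussed above, on the assumption that "until" keeps the delimiter and "before" would drop it; it does not call AppleScript's "read" command.

```python
def convert_macroman_to_utf16(src_path, dst_path):
    """Re-encode a MacRoman text file as UTF-16 (Python writes a BOM)."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, "r", encoding="mac_roman", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)


def read_until(path, delimiter, encoding="utf-16"):
    """Return the text up to and including the first occurrence of delimiter.

    Assumption: this mirrors "until" (delimiter kept). If the delimiter is
    absent, the whole text is returned.
    """
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()
    head, sep, _rest = text.partition(delimiter)
    return head + sep
```

Once the file is converted, the AppleScript side would read it with "as Unicode text" as suggested in the quoted message.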

>P.S.: Obligatory pedantry: there is no such thing as "high" or "low"
>ASCII.  ASCII defines interpretations for bytes in the range
>0...0x7F.  If it's not in that range, it's not ASCII.
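
The range check described in the quoted P.S. can be sketched in a few lines of Python. The helper name `is_ascii` is hypothetical; the sketch just tests whether every byte falls in 0...0x7F.

```python
def is_ascii(data: bytes) -> bool:
    """True if every byte is in the ASCII range 0...0x7F."""
    return all(b <= 0x7F for b in data)
```

Under this check, plain letters such as "e" and "i" pass, while the MacRoman bytes for characters like the inverted exclamation mark or the trademark sign do not.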

PPS: Obligatory pedantry from a professional linguist: Language is usage.
The terms lo-ascii and hi-ascii were used consistently and meaningfully
throughout the 80s and 90s. Furthermore, you knew *exactly* what my
terminology meant, thus bearing witness against yourself. The defense rests.
m.

--
matt neuburg, phd = email@hidden, <http://www.tidbits.com/matt/>
A fool + a tool + an autorelease pool = cool!
AppleScript: the Definitive Guide
<http://www.amazon.com/exec/obidos/ASIN/0596005571/somethingsbymatt>



 _______________________________________________
Applescript-users mailing list      (email@hidden)
