Re: ASCII vs. MacRoman (was Re: Standard Additions 'read' command - basic questions)
- Subject: Re: ASCII vs. MacRoman (was Re: Standard Additions 'read' command - basic questions)
- From: Walter Ian Kaye <email@hidden>
- Date: Mon, 19 Jan 2004 16:07:16 -0800
At 02:21p -0800 01/19/2004, Chris Page didst inscribe upon an
electronic papyrus:
On Jan 19, 2004, at 12:36, Christopher Nebel wrote:
The problem is that your delimiter string is being fetched as
Unicode, but then it looks for the Unicode code point in the data,
which is *not* 254 (or 255, or whatever), since your primary
encoding is probably MacRoman, which doesn't agree with Unicode at
all above 127. (127 and below -- that is, ASCII -- is fine.)
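[Editor's illustration, not part of the original message.] The mismatch described above can be checked with a minimal Python sketch. It assumes Python's built-in "mac_roman" codec is an acceptable stand-in for the system's MacRoman encoding; the helper name macroman_code_point is hypothetical and only for illustration.

```python
# Sketch: compare a MacRoman byte value with the Unicode code point it decodes to.
# Assumption: Python's built-in "mac_roman" codec matches the MacRoman in question.

def macroman_code_point(byte_value: int) -> int:
    """Return the Unicode code point for a single MacRoman byte (0-255)."""
    return ord(bytes([byte_value]).decode("mac_roman"))

if __name__ == "__main__":
    for b in (65, 127, 254, 255):
        cp = macroman_code_point(b)
        status = "same" if cp == b else "differs"
        print(f"MacRoman byte {b} -> U+{cp:04X} ({status})")
```

Bytes 0 through 127 come back unchanged, while 254 and 255 decode to code points outside the 0-255 range, which is why a search for "code point 254" in Unicode text finds nothing.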
I realize this is a separate topic, but: In fact, it's misleading
for "ASCII character" to work with values above 127, which are not
ASCII. It's too bad people have played fast-and-loose with the term
ASCII lo these many decades, including nearly every mainstream
programming language. Really, "ASCII character" should produce an
error if the numeric value is not valid ASCII. Either it should have
been named "MacRoman character" or some other mechanism should have
been created for handling other character encodings.
In fact, it's not too late. It might be useful to rename it to
something like "MacRoman character" and provide "ASCII character" as
a synonym (so "ASCII character" could still be used, but it would
decompile to "MacRoman character").
On the BBEdit list, I once wished for the "ASCII Table" palette to be
renamed to something like "Character Table" for that reason. Hmm...
maybe I should send my wish directly to support...
I'll file a bug, but in the meantime, try using FS (field
separator, ASCII character 28) and RS (record separator, ASCII
character 30) instead -- that's what they were designed for.
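[Editor's illustration, not part of the original message.] A minimal Python sketch of the suggestion above, using characters 28 and 30 as field and record delimiters as the message proposes. Note that the ASCII standard's own name for character 28 is "file separator"; the sketch follows the message's usage. The function names encode_records and decode_records are hypothetical, and the sketch assumes the field data itself never contains either control character.

```python
# Sketch: delimit fields and records with ASCII control characters 28 and 30.
# Both are below 128, so they are identical in ASCII, MacRoman, and Unicode.

FS = chr(28)  # used here as the field delimiter, following the message
RS = chr(30)  # record separator

def encode_records(records):
    """Join a list of records (each a list of strings) into one string."""
    return RS.join(FS.join(fields) for fields in records)

def decode_records(text):
    """Split a string produced by encode_records back into lists of fields."""
    return [record.split(FS) for record in text.split(RS)]

if __name__ == "__main__":
    data = [["Ada", "Lovelace"], ["Grace", "Hopper"]]
    encoded = encode_records(data)
    print(decode_records(encoded) == data)
```

Because these delimiters sit in the range where the encodings agree, the same split works whether the text was read as MacRoman or as Unicode.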
I've always wondered where this is defined.
Well... in the spec, of course. ;)
(Wish I could locate my copy. You could always order your own if you
want to shell out the bucks.)
Every piece of documentation I've seen on ASCII fails to fully
describe the control characters and their meanings (though it's easy
to guess what FS and RS are for). Have you ever seen a detailed
definition for FS and RS anywhere?
Detailed? Nope, not detailed. I suppose we'd need to find out who was
on the committee which defined it, and see if they still have any
written notes...
I'd really like to see better documentation on ASCII.
I would too, if only to end the long-running grave-vs-quote debate.
(I say: since it's on the tilde key, it's a grave accent. Period.)
I'd also like to know why the 8-bit ASCII standard was withdrawn.
-boo
http://www.natural-innovations.com/computing/asciiebcdic.html
_______________________________________________
applescript-users mailing list | email@hidden