Re: Strange QP
- Subject: Re: Strange QP
- From: Paul Berkowitz <email@hidden>
- Date: Tue, 24 Sep 2002 11:24:13 -0700
On 9/24/02 10:18 AM, "email@hidden" <email@hidden> wrote:

> On Mon, 23 Sep 2002 19:26:52 -0700, Paul Berkowitz <email@hidden>
> asked,
>
>> [...] OK, the
>> character I'm asking about is the bullet, made on a US
>> keyboard with option-8. That's the one whose QP is "=80".
>> ASCII 128 on the Mac is "A" with an umlaut, and on Windows
>> is the Euro symbol, which is why I was wondering whether
>> the 128 code might have been recently assigned to it.
>> On Windows, the bullet's ASCII is 149, on Mac it's 165.
>
> You're getting tripped up by the Windows extension to the
> ISO-8859-1 (Latin-1) character set. The row of characters
> from 80 to 9F hex, 128 to 159 decimal, are formally "not
> defined" in ISO-8859-1. As I understand it, this was done
> as a safety precaution, because these characters are the
> high-bit-set partners of the ASCII control characters.
> It's desirable that if you have a seven-bit connection, you
> don't get control characters (like "clear screen" or "stop
> transmitting") when text was intended. But Microsoft has
> added some glyphs. (The superset is called Code Page 1252.)
>
> Apple also had their 8-bit character set, MacRoman, which
> predated ISO-8859-1, containing some glyphs Windows CP1252
> didn't have (like fi and fl ligatures, and heart), while
> Latin-1 had fractions like 1/2 and 1/4, and thorn (gotta
> keep Iceland happy).
>
> Bullet is not defined in Latin-1, although decimal 183
> (B7 hex) is multiplication dot, a bit smaller. Windows
> added the bullet at 149 (95 hex). MacRoman has the
> bullet at 165 (A5 hex), which would give you a yen sign
> in Latin-1.
Scott,

Thanks a lot for replying. It turns out (from another source) that the
quoted-printable "=80", i.e. decimal 128, is the correct coding for it in
something called MacLatin, whatever that may be: evidently yet another
extension to Latin-1. In MacRoman it's 165. I don't know which receiving
email clients are going to interpret "=80" as MacLatin, but I bet it's not
too many.
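
For anyone who wants to check the bullet's code points for themselves, here is a minimal Python sketch. It assumes Python's standard codec names ("latin-1", "cp1252", "mac_roman"); those names and the helper byte_for are illustrative and not from this thread, and there is no "MacLatin" codec to test against.

```python
# Sketch: where the bullet (U+2022) lives in each 8-bit character set.
# Codec names are Python's standard ones (an assumption, not from the thread).
BULLET = "\u2022"

def byte_for(char, charset):
    """Return the single byte value for char in charset, or None if absent."""
    try:
        encoded = char.encode(charset)
    except UnicodeEncodeError:
        return None
    return encoded[0]

bullet_codes = {cs: byte_for(BULLET, cs) for cs in ("latin-1", "cp1252", "mac_roman")}
for charset, code in bullet_codes.items():
    print(charset, "not defined" if code is None else f"{code} (hex {code:02X})")

# Reading the MacRoman bullet byte (A5 hex) as Latin-1 gives the yen sign.
misread = bytes([0xA5]).decode("latin-1")
```

None of these tables puts the bullet at 128, which fits the point that "=80" only means "bullet" to a receiver using some other Mac-specific table.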
> The Euro symbol was recently added to CP1252 at 128
> decimal (80 hex), but it was not added to Latin-1.
> A modified version of Latin-1, ISO 8859-15 (confusingly
> known as Latin-9 and sometimes Latin-0), adds the Euro,
> plus some missing French and Finnish accented characters,
> at the expense of marginally useful Latin-1 characters
> like the broken vertical bar (hex A6), the fractions, and
> naked accent characters. But in Latin-9, the Euro is at
> A4 hex (in place of the "generic currency" symbol).
I've heard about that from a third source. When you send a Euro symbol from
Entourage, it decides to send it in UTF-8, although not many email clients
are going to understand that yet and will probably just display it as "?".
But when you send a bullet, Entourage sends it as ISO-8859-1 (Latin-1) with
quoted-printable, converting the bullet to "=80". Other Macs (not this dumb
mailing list, of course) interpret that correctly and display it as a
bullet. Some Windows clients see it correctly as 128 but display it as a
Euro, as it happens, since that's what Windows 1252 says it is.
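
The Euro's different homes can be checked the same way. This is only a sketch using Python's standard codec names (an assumption on my part; "iso8859-15" is Python's name for Latin-9), and it says nothing about how Entourage itself chooses a charset.

```python
# Sketch: the byte(s) used for the Euro sign (U+20AC) in each encoding.
# None means the character set has no Euro at all.
EURO = "\u20ac"

euro_bytes = {}
for charset in ("latin-1", "cp1252", "iso8859-15", "utf-8"):
    try:
        euro_bytes[charset] = EURO.encode(charset).hex()
    except UnicodeEncodeError:
        euro_bytes[charset] = None

print(euro_bytes)

# The Latin-9 Euro byte (A4 hex) read as Latin-1 is the generic currency sign.
latin9_as_latin1 = EURO.encode("iso8859-15").decode("latin-1")
```

Since Latin-1 has no Euro, a client that wants to send one has to switch to another charset, and UTF-8 needs three bytes for it.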
> The Quoted-Printable character coded =80 is the Euro symbol
> in CP1252, A-umlaut in MacRoman, and an undefined control
> character in Latin-1. It all depends on the Content-Type
> specified in the header. (If you have a
> Content-Transfer-Encoding: quoted-printable header, you gotta
> have a Content-Type: text/plain; charset=xxxx as well.)

Because of the bullet, it was sent in ISO-8859-1. Plaintext without fancy
characters usually gets sent as 7-bit US-ASCII.
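
To make the dependence on the charset concrete, here is a small Python sketch using the standard quopri module. The helper name decode_qp is hypothetical, and a real mail client would take the charset from the Content-Type header rather than have it passed in by hand.

```python
import quopri

def decode_qp(body, charset):
    """Undo quoted-printable, then interpret the bytes in the given charset."""
    raw = quopri.decodestring(body)   # "=80" becomes the single byte 0x80
    return raw.decode(charset)

raw = quopri.decodestring(b"=80")
readings = {cs: decode_qp(b"=80", cs) for cs in ("cp1252", "mac_roman", "latin-1")}

for charset, text in readings.items():
    # Show the resulting Unicode code point for each interpretation.
    print(charset, f"U+{ord(text):04X}")
```

Quoted-printable only gets you back to bytes; which character the byte 0x80 stands for is decided entirely by the charset used in the second step.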

>> Yucch.
>
> Yup. There are glyphs in MacRoman that don't exist in
> Latin-1, and glyphs in Latin-1 and in CP1252 that don't
> exist in MacRoman. These are the difficult problems to
> solve without Unicode. The code scrambling (such as
> different codes for the bullet between MacRoman and CP1252)
> is not that challenging, as long as the Content-Type:
> MIME header is preserved and has the correct character set
> specified.

Unicode is the way to go. The sooner all email clients get there, the
better.
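
Here is a rough Python sketch of the two cases Scott distinguishes: re-mapping a code that exists in both sets (easy), and hitting a glyph the target set simply doesn't have (the hard part). The function name transcode is made up for illustration, and going through Unicode as the pivot is my assumption about how one would do it today, not a description of any particular mail client or list server.

```python
def transcode(data, source, target):
    """Re-encode bytes from source to target charset via Unicode.

    Returns None when the target charset lacks one of the glyphs.
    """
    text = data.decode(source)
    try:
        return text.encode(target)
    except UnicodeEncodeError:
        return None

bullet_win = transcode(b"\xa5", "mac_roman", "cp1252")    # bullet: in both sets
ligature_win = transcode(b"\xde", "mac_roman", "cp1252")  # fi ligature: MacRoman only
half_mac = transcode(b"\xbd", "cp1252", "mac_roman")      # one-half: not in MacRoman

print(bullet_win, ligature_win, half_mac)
```

The bullet just moves from A5 to 95 hex, but the ligature and the fraction have nowhere to go, which is exactly the problem that a common Unicode encoding removes.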

> But maybe I shouldn't say "not that challenging," since
> the applescript-users list server can't handle the text
> of an AppleScript. I guess it's rocket science after all.

This mailing list server is purposely configured to be stupid, and to not
understand Macintosh computers. It could be configured to understand us, but
it isn't. One reason I was given, several times, is that it is configured
for "standard" behavior, not current Macs. The Microsoft news server, which
must get 99% of its messages from Windows, nevertheless has no problem with
bullets and other non-ASCII characters sent from Macs. It's been configured
properly, as this one could be. But we've been down this road before.
They're just not interested.

--
Paul Berkowitz