Re: "read from" and non-lo-ascii characters

  • Subject: Re: "read from" and non-lo-ascii characters
  • From: Sander Tekelenburg <email@hidden>
  • Date: Wed, 22 Jun 2005 14:07:50 +0200

At 01:32 -0700 UTC, on 2005/06/22, Chris Page wrote:

[...]

> Since this is one of my hot-button items, I'll chime in, too: "hi" or
> "low" ASCII are misleading terms for "not ASCII".

[...]

> Furthermore, most of the interesting modern character sets / encodings
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]

> Just to paint a clear picture of how commonly used encodings differ and
> why it's important to be specific, the most common character sets /
> encodings in Mac programming are:

[...]

> I'm hoping all this detail will convince everyone that being vague
> about encodings -- and in particular, using the
> not-as-neutral-as-you-might-think, assumption-laden "high ASCII" -- is
> fraught with peril.

Hear, hear. Well said.

However, your mixing of "character set" and "character encoding" calls for a
small addendum ;)

There are 2 different relevant aspects:
[1] character sets = groups of characters (ASCII, Unicode, etc.)
[2] encoding methods (to transport documents safely between different systems)

Strictly speaking, "character set" refers to [1], but it is often confused with
"charset", which is something else: the name of the parameter on the
Content-Type header, whose value indicates *both* which character set applies
and which encoding does. For example, "Unicode" is one such character set, and
the charset values "utf-8" and "utf-16" both indicate the exact same Unicode,
but encoded differently.
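
To make the utf-8 / utf-16 point concrete, here is a minimal sketch in Python (my choice for illustration only; nothing here is AppleScript-specific, and the sample string is made up). It encodes the same four Unicode characters both ways and shows that the bytes differ while the characters do not:

```python
# One character repertoire (Unicode), two encodings of it.
text = "caf\u00e9"  # "café": four characters, the last one is U+00E9

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")  # big-endian, no byte order mark

print(utf8_bytes.hex(" "))   # 63 61 66 c3 a9
print(utf16_bytes.hex(" "))  # 00 63 00 61 00 66 00 e9

# Decoding each byte sequence with its own charset yields the same characters.
same = utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-be") == text
print(same)  # True
```

The only non-ASCII character here, U+00E9, becomes two bytes (c3 a9) in utf-8 and one 16-bit unit (00 e9) in utf-16, which is why the charset value has to name the encoding and not just the repertoire.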

To avoid the confusion caused by this, Alan Flavell
<http://ppewww.ph.gla.ac.uk/~flavell/charset/> coined the phrase "character
repertoire" when referring strictly to "character sets" and not to
encoding methods, which I think describes it well. It at least avoids the
confusion that "character set", "charset" or "character encoding" tend to
lead to.


--
Sander Tekelenburg, <http://www.euronet.nl/~tekelenb/>