Re: Weird error
- Subject: Re: Weird error
- From: Christopher Nebel <email@hidden>
- Date: Tue, 30 Sep 2003 11:03:34 -0700
On Sep 29, 2003, at 1:57 PM, Steve Mills wrote:
On Monday, Sep 29, 2003, at 14:57 US/Central, Jean-Baptiste wrote:
Could someone explain to me what a 'BOM' is ?
Byte Order Mark. The order of the bytes in each Unicode character.
It's at the front of most Unicode files. It's either 0xfffe or 0xfeff,
which mean the bytes are in DOS order or Mac order respectively. As
Unicode, the ASCII character 'A' would look like 0x0041 on a Mac and
0x4100 on DOS.
Steve's explanation obscures a few details, but is essentially correct.
For the terminally fussy:
"BOM" does indeed stand for "Byte Order Mark". The problem is that (a)
all Unicode encodings other than UTF-8 use multi-byte integers for
their code units -- for instance, UTF-16 uses 16-bit integers, and (b)
different computers use different byte orders for multi-byte integers.
[1] Therefore, if you try to send data from one part of the "world" to
another, the receiver may interpret the bytes the other way around and
get gibberish.
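
A minimal sketch of that failure in Python, assuming the sender writes
big-endian UTF-16 with no BOM. The variable names are only illustrative:

```python
# Encode "A" (U+0041) as UTF-16 in big-endian order, with no BOM.
sent = "A".encode("utf-16-be")      # the two bytes 0x00 0x41

# A receiver that assumes the same byte order recovers the text.
right = sent.decode("utf-16-be")

# A receiver that assumes the opposite order reads the code unit 0x4100.
wrong = sent.decode("utf-16-le")

print(sent.hex(), repr(right), hex(ord(wrong)))   # 0041 'A' 0x4100
```

Nothing in the bytes themselves says which reading is correct, which is
the gap the BOM fills.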
The solution is the BOM: it's a special "character" at the front of the
text. Its code point is 0xfeff, so if you see 0xfffe (which is defined
to be illegal, so it'll never show up under any other circumstances),
then you know you've got things the wrong way around.
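
Roughly, a reader can check the first two bytes before decoding. This
is a sketch in Python, not how any particular program does it. The
function name decode_utf16 is made up for illustration, and the
big-endian fallback for BOM-less data is an assumption:

```python
import codecs

def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes, using a leading BOM to pick the byte order."""
    if data.startswith(codecs.BOM_UTF16_BE):    # bytes 0xFE 0xFF
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):    # bytes 0xFF 0xFE
        return data[2:].decode("utf-16-le")
    # Assumption: with no BOM present, treat the data as big-endian.
    return data.decode("utf-16-be")

print(decode_utf16(b"\xfe\xff\x00A"))   # big-endian with BOM
print(decode_utf16(b"\xff\xfeA\x00"))   # little-endian with BOM
```

Python's own "utf-16" codec also consumes a leading BOM when decoding;
the check is spelled out here only to show the logic.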
--Chris Nebel
AppleScript Engineering
[1] There are only two orders anyone uses: big-endian and
little-endian. (See
<http://info.astrian.net/jargon/terms/b/big-endian.html>.) Motorola
and PowerPC use the former, Intel uses the latter. Say you've got a
2-byte integer, 0x1234. That's two bytes, 0x12 and 0x34. On a
big-endian system, the bytes show up in memory most-significant (i.e.,
the big end) first, so you get 0x12 0x34. On a little-endian system,
it's the other way around: 0x34 0x12.
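
The same 2-byte example can be checked with Python's struct module. A
small sketch, where ">H" and "<H" ask for an unsigned 16-bit integer in
big-endian and little-endian order respectively:

```python
import struct
import sys

value = 0x1234

big = struct.pack(">H", value)      # most-significant byte first
little = struct.pack("<H", value)   # least-significant byte first

print(big.hex(), little.hex())      # 1234 3412
print(sys.byteorder)                # byte order of the machine running this
```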
_______________________________________________
applescript-users mailing list | email@hidden