Re: Weird error
- Subject: Re: Weird error
- From: Jean-Baptiste <email@hidden>
- Date: Tue, 30 Sep 2003 23:09:12 +0200
I was aware of big-endian vs. little-endian, but not of the 'BOM' (now
I am). Thanks for the explanation.
See you, JB
On Tuesday, 30 Sep 2003, at 20:03 Europe/Paris, Christopher Nebel wrote:
On Sep 29, 2003, at 1:57 PM, Steve Mills wrote:
On Monday, Sep 29, 2003, at 14:57 US/Central, Jean-Baptiste wrote:
Could someone explain to me what a 'BOM' is ?
Byte Order Mark. It indicates the order of the bytes in each Unicode
character. It's at the front of most Unicode files. It's either 0xfffe
or 0xfeff, which mean the bytes are in DOS order or Mac order
respectively. As Unicode, the ASCII character 'A' would look like
0x0041 on a Mac and 0x4100 on DOS.
Steve's explanation obscures a few details, but is essentially
correct. For the terminally fussy:
"BOM" does indeed stand for "Byte Order Mark". The problem is that
(a) all Unicode encodings other than UTF-8 use multi-byte integers for
their code units -- for instance, UTF-16 uses 16-bit integers, and (b)
different computers use different byte orders for multi-byte integers.
[1] Therefore, if you try to send data from one part of the "world"
to another, the receiver may interpret the bytes the other way around
and get gibberish.
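To make the "gibberish" concrete, here is a minimal Python sketch (not part of the original thread; the sample string "Hi" is only an illustration). It encodes text as big-endian UTF-16 and then decodes the same bytes as if they were little-endian:

```python
# Encode two ASCII letters as big-endian UTF-16.
text = "Hi"
raw = text.encode("utf-16-be")   # bytes 00 48 00 69

# A receiver that assumes little-endian swaps each pair of bytes.
wrong = raw.decode("utf-16-le")  # code points U+4800 and U+6900
right = raw.decode("utf-16-be")  # the original text

print(raw.hex())
print([hex(ord(c)) for c in wrong])
print(right)
```

Both misread values happen to be valid characters, which is why the receiver gets no error, just the wrong text.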
The solution is the BOM: it's a special "character" at the front of
the text. Its code point is 0xfeff, so if you see 0xfffe (which is
defined to be illegal, so it'll never show up under any other
circumstances), then you know you've got things the wrong way around.
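The check described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not anything from AppleScript itself: the function names detect_utf16_order and decode_utf16 are hypothetical, and only UTF-16 is handled (a UTF-32 little-endian BOM also begins with the bytes FF FE, so a real detector would need to test for that first).

```python
import codecs


def detect_utf16_order(data: bytes) -> str:
    """Return 'big', 'little', or 'unknown' based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    return "unknown"


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 data, using the BOM to pick the byte order."""
    order = detect_utf16_order(data)
    if order == "big":
        return data[2:].decode("utf-16-be")     # skip the 2-byte BOM
    if order == "little":
        return data[2:].decode("utf-16-le")
    raise ValueError("no UTF-16 BOM found")


# The same text, written by a big-endian and a little-endian sender.
print(decode_utf16(b"\xfe\xff\x00A"))
print(decode_utf16(b"\xff\xfeA\x00"))
```

Python's built-in "utf-16" codec performs the same BOM check on its own; the explicit version is spelled out here only to show the logic.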
--Chris Nebel
AppleScript Engineering
[1] There are only two orders anyone uses: big-endian and
little-endian. (See
<http://info.astrian.net/jargon/terms/b/big-endian.html>.) Motorola
and PowerPC use the former, Intel uses the latter. Say you've got a
2-byte integer, 0x1234. That's two bytes, 0x12 and 0x34. On a
big-endian system, the bytes show up in memory most-significant (i.e.,
the big end) first, so you get 0x12 followed by 0x34. On a
little-endian system, it's the other way around: 0x34 followed by
0x12, which a big-endian reader would take for 0x3412.
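The 0x1234 example can be tried out in Python with a short sketch like this one (an added illustration; the variable names are arbitrary):

```python
value = 0x1234

# Lay the 2-byte integer out in each byte order.
big = value.to_bytes(2, "big")        # 0x12 first, then 0x34
little = value.to_bytes(2, "little")  # 0x34 first, then 0x12

# Reading big-endian bytes with little-endian rules swaps the two halves.
misread = int.from_bytes(big, "little")

print(big.hex(), little.hex(), hex(misread))
```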
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.