Michael Hall wrote:
>If the Unicode value is u0439 I might think the Unicode would have a
>0x0439. Or if the escaped value is taken as decimal a value of 439 =
>0x01b7. But I tested and neither was the case.
Most likely because you were testing the wrong thing.
If you test only the bytes returned from getBytes() or getBytes(String),
then the original Unicode is gone. The returned bytes will always be
byte-encoded. If the byte-encoding does not have an easily perceptible
relationship to the 16-bit Unicode chars, then of course it will not be
0x0439, or anything even vaguely resembling it.
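
A minimal sketch of that point, assuming Java 7 or later (for StandardCharsets). The class name EncodedBytesDemo and the hex() helper are made up for illustration; only getBytes() comes from the discussion above.

```java
import java.nio.charset.StandardCharsets;

final class EncodedBytesDemo {
    // Illustrative helper: two lowercase hex digits per byte, space-separated.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u0439"; // a single Java char with the value 0x0439

        // UTF-8 spreads the 16-bit value over two bytes.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));      // d0 b9

        // ISO-8859-1 has no mapping for this char, so the encoder
        // substitutes '?' (0x3f) and the original value is lost.
        System.out.println(hex(s.getBytes(StandardCharsets.ISO_8859_1))); // 3f
    }
}
```

Neither result resembles 0x0439, which is what you would expect from a translation rather than a raw dump.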
The only two encodings I know of that might be easily human-perceptible as
related to the 16-bit Unicode value are named "UnicodeBig" and
"UnicodeLittle", which emit the 16-bit Unicode codes as either big-endian
or little-endian byte-sequences. There are also "Unmarked" versions of
each which omit the byte-order indicator. UTF8 is directly related to the
16-bit Unicode values, but the bytes it produces are not easily
human-perceptible.
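
For comparison, here is a sketch using the standard UTF-16 charsets. As far as I know they correspond to the older encoding names above, but treat that mapping as an assumption. It also assumes Java 7 or later, and Utf16BytesDemo and hex() are illustrative names.

```java
import java.nio.charset.StandardCharsets;

final class Utf16BytesDemo {
    // Illustrative helper: two lowercase hex digits per byte, space-separated.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u0439";

        // Big-endian, no byte-order mark: the 16-bit value reads straight off.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // 04 39

        // Little-endian, no byte-order mark: same value, bytes swapped.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16LE))); // 39 04

        // Plain UTF-16 encodes big-endian and prepends the byte-order mark.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16)));   // fe ff 04 39
    }
}
```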
However, if you retrieve the actual 16-bit Java chars from the String,
either singly or in bulk, then the value of \u0439 is most definitely
0x0439. See String.charAt() and String.toCharArray().
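
A short sketch of reading the chars directly. CharValueDemo and hex16() are illustrative names, not anything from the earlier posts.

```java
final class CharValueDemo {
    // Illustrative helper: render a char's 16-bit value as 0xNNNN.
    // The cast to int is needed because %X does not accept a char argument.
    static String hex16(char c) {
        return String.format("0x%04X", (int) c);
    }

    public static void main(String[] args) {
        String s = "\u0439";

        // Singly, via charAt().
        System.out.println(hex16(s.charAt(0))); // 0x0439

        // In bulk, via toCharArray().
        for (char c : s.toCharArray()) {
            System.out.println(hex16(c));       // 0x0439
        }
    }
}
```

No encoding is involved at any point here, so there is nothing to translate the value away from 0x0439.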
For an example, look at the visible() function I provided earlier:
<http://lists.apple.com/archives/java-dev/2007/Apr/msg00194.html>
It calls charAt() and may convert the binary 16-bit value to hex (i.e.
characters that represent hexadecimal digits). If the Java char weren't
0x0439, there's no possible way that visible() would work. Yet it does
work.
The crucial element is that visible() starts with the original 16-bit char
and makes it visible, rather than working from a translated sequence of
bytes. It is a fundamental error to think that String.getBytes() returns
any underlying bytes; it always returns a translation.
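
To illustrate the approach without repeating the linked post, here is a hypothetical stand-in. It is not the visible() from that message, only a sketch of the same idea: start from charAt() and show each non-ASCII char as an escaped hex value. The class name and the exact output format are my assumptions.

```java
final class VisibleSketch {
    // Hypothetical stand-in for the idea described above, not the original code.
    // Printable ASCII passes through; every other char is shown as a
    // backslash, the letter u, and four hex digits of its 16-bit value.
    static String visible(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i); // the original 16-bit char, no encoding involved
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the letter a followed by the six ASCII characters of the escape.
        System.out.println(visible("a\u0439"));
    }
}
```

Because the output is pure ASCII, it survives any byte-encoding that Terminal or a PrintStream might apply afterwards.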
>For Terminal what you say regarding PrintStream would be pertinent. My
>original test appeared to never work for that, even when System Preferences
>should of set things to MacCyrillic. It might be that I had Terminal
>running before changing settings and it doesn't refresh or something like
>that.
Terminal's interpretation of bytes as text is a Terminal window setting,
IIRC. As I vaguely recall, it has several settings in different sub-panes
that interact with one another. It's been a long time since I relied on
Terminal displaying glyphs. I started redirecting stdout to a file or pipe
and then dumping the data with 'hexdump', because it was unambiguous what
the bytes were, regardless of what state Terminal's window was in.
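
A sketch of that workflow, assuming you want the output encoding fixed in the program rather than taken from the platform default. ExplicitStdoutDemo and encode() are illustrative names; the PrintStream constructor taking an encoding name is standard.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

final class ExplicitStdoutDemo {
    // Illustrative helper: push text through a PrintStream that uses the
    // named encoding and return the bytes it produced.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            PrintStream ps = new PrintStream(buf, true, charsetName);
            ps.print(text);
            ps.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported encoding: " + charsetName, e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Wrap stdout so the bytes written do not depend on the default encoding.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.print("\u0439");
        out.flush();
    }
}
```

Running it as `java ExplicitStdoutDemo | hexdump -C` should show the two bytes d0 b9, whatever the Terminal window is set to display.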
When you change primary languages in System Preferences, it presents a
little message that tells you currently running programs will not change;
you have to quit and reopen them. This message is presented in the current
primary language (i.e. the one that was primary when System Preferences
opened).
-- GG