Second, I was thinking each String getBytes() call would return a
direct hex translation of the Unicode value, which I have verified it
does not seem to do. I'm still not completely sure that makes what I
was suggesting complete nonsense.
I don't know what you mean by "a direct hex translation" of a
Unicode value.
If the Unicode value is \u0439, I might expect the bytes to be
0x0439. Or, if the escaped value is taken as decimal, 439 = 0x01b7.
But I tested and neither was the case.
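A minimal sketch of why neither guess holds: the char itself has the value 0x0439, but the bytes getBytes() returns depend on which charset does the encoding. The class and helper names here (GetBytesDemo, hex, encode) are made up for illustration, and java.nio.charset.StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Format a byte array as space-separated two-digit hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode a string with an explicit charset and return the bytes as hex.
    static String encode(String s, Charset cs) {
        return hex(s.getBytes(cs));
    }

    public static void main(String[] args) {
        String s = "\u0439";
        // The 16-bit char value is the Unicode code point.
        System.out.println(Integer.toHexString(s.charAt(0)));        // 439
        // The encoded bytes differ per charset.
        System.out.println(encode(s, StandardCharsets.UTF_8));       // D0 B9
        System.out.println(encode(s, StandardCharsets.UTF_16BE));    // 04 39
        // ISO-8859-1 has no Cyrillic, so the char is replaced by '?'.
        System.out.println(encode(s, StandardCharsets.ISO_8859_1));  // 3F
    }
}
```

Only UTF-16BE happens to produce the bytes 04 39. A Latin-only charset cannot represent the character at all and substitutes its replacement byte.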
Does java String.getBytes("\u0439") always produce the same hex value
- whatever it might be?
It doesn't: what it returns depends on what the default encoding
is. Read The Fine Manual section describing String.getBytes().
Or specified encoding. I was thinking providing an encoding on the
constructor didn't do any actual conversion to the bytes. But I did
check the javadoc and sometimes it sounds like it does.
And I suspect you meant this:
"\u0439".getBytes()
Yeah.
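As a rough illustration of the default-encoding point, the no-argument getBytes() behaves like getBytes(Charset.defaultCharset()), so its output can change from one machine or launch configuration to another. The class and method names are hypothetical. Which default you get depends on the JVM and platform (on JDK 18 and later it is UTF-8 unless overridden).

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class DefaultCharsetDemo {
    // True when the no-argument getBytes() matches an explicit
    // encode with the platform default charset.
    static boolean sameAsExplicitDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        String s = "\u0439";
        // The name printed here varies by platform and JVM settings.
        System.out.println("default charset: " + Charset.defaultCharset().name());
        // The byte values printed here vary with that default.
        System.out.println(Arrays.toString(s.getBytes()));
        System.out.println(sameAsExplicitDefault(s));
    }
}
```

Passing the charset explicitly is the usual way to get the same bytes everywhere.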
True. But you'd have to define SuperHybridLatinCyrillicCharSet as
the context.
There is nothing magical about charsets or encodings. Unicode is an
encoding. Even binary is an encoding. Contrast 2's-complement,
1's-complement, sign-magnitude, and BCD; they are all "binary", but
arithmetic is different in each one.
A bit-pattern only has meaning given an interpretation, aka an
encoding that tells you what the patterns mean.
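To make that concrete, here is a small sketch that reads the same two bytes, D0 B9, under several interpretations. The class and method names are invented for the example; the two bytes were chosen because they are the UTF-8 form of \u0439.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class InterpretationDemo {
    // One fixed bit pattern, read several different ways.
    static final byte[] BITS = { (byte) 0xD0, (byte) 0xB9 };

    static String asUtf8() {
        return new String(BITS, StandardCharsets.UTF_8);
    }

    static String asLatin1() {
        return new String(BITS, StandardCharsets.ISO_8859_1);
    }

    // ByteBuffer reads big-endian by default; short is 2's-complement.
    static short asSignedShort() {
        return ByteBuffer.wrap(BITS).getShort();
    }

    static int asUnsignedShort() {
        return asSignedShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(asUtf8().length());    // 1 (the single char \u0439)
        System.out.println(asLatin1().length());  // 2 (the chars \u00D0 and \u00B9)
        System.out.println(asSignedShort());      // -12103
        System.out.println(asUnsignedShort());    // 53433
    }
}
```

None of these readings is more "real" than the others; the bytes carry no record of which one was intended.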
Remembering back to when I did a little of this, I think it may be
more difficult than indicated, or you would also need a SuperHybrid
font. I think the byte converters sometimes subtract a fixed offset
to get back to a font mapping. From my understanding you are not
going to want a single font for every possible Unicode glyph.
So composites probably are mostly nonsense.
However, if it always produces the same hex bytes then you could vary
the encoding and it _might_ make sense.
I try not to write code that only _might_ make sense.
Sometimes that's why you test, which is what I was doing.
Basically rightly or wrongly you are claiming the encoding handles
the string.
No I'm not, because it doesn't. Read The Fine Source and see for
yourself.
Actually I did check the javadoc, and it went further than I had
thought: it did mean some byte conversion at least might happen.
Sorry about the _might_ word again. But I'm already conceding there
are definitely encodings that make no sense for some Unicode strings,
like Latin for Cyrillic.
Specifying the encoding does not itself do any byte
conversions in the String constructor.
False. The encoding, either a named one or the default, leads to a
converter object. That object converts the bytes you give it into
Java chars, which are a primitive type with a 16-bit size and Unicode
encoding. Those 16-bit Java chars are then arranged in an array or
sequence, with which it creates a String.
OK, that is what the javadoc said.
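The bytes-to-chars conversion described above can be sketched with the public java.nio.charset API. This is only an analogy using CharsetDecoder, not a claim about how the String constructor is implemented internally. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Run the bytes through a decoder (the "converter object") and
    // build a String from the resulting 16-bit chars.
    static String decode(byte[] bytes, Charset cs) {
        try {
            CharBuffer chars = cs.newDecoder().decode(ByteBuffer.wrap(bytes));
            return chars.toString();
        } catch (CharacterCodingException e) {
            // A bare decoder reports malformed input instead of replacing it.
            throw new IllegalArgumentException("bytes are not valid " + cs.name(), e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0xD0, (byte) 0xB9 };
        String viaDecoder = decode(bytes, StandardCharsets.UTF_8);
        String viaConstructor = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(viaDecoder.equals(viaConstructor));           // true
        System.out.println(Integer.toHexString(viaDecoder.charAt(0)));   // 439
        // Same bytes, different encoding, different chars.
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```

So naming an encoding on the constructor does cause a conversion: two bytes become one char under UTF-8 and two chars under ISO-8859-1.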
I couldn't comment on the IDEA of what your code was trying to do,
because I couldn't tell what that idea was. All I could comment on
was whether it was doing the correct things, given the context. And
the answer to that was "No."
Uh ok, I guess I should comment. I thought it would be sort of clear
that the code was trying some of the discussed string values with a
set of the possibly involved character sets.
Not sure on the PrintStream stuff. You're probably right that it all
depends on file.encoding.
Again, RTFM, or even RTFS included with Apple's Java Developer
downloads.
Or decompile it using 'javap' on PrintStream.
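One way to take file.encoding out of the picture when testing is to give PrintStream an explicit encoding and capture what it writes. This is a sketch, with a hypothetical class and method name, using the PrintStream(OutputStream, boolean, String) constructor and a ByteArrayOutputStream in place of Terminal.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class PrintStreamDemo {
    // Print text through a PrintStream that uses the named encoding
    // and return the raw bytes that reached the underlying stream.
    static byte[] printWith(String text, String encodingName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, encodingName)) {
            out.print(text); // print, not println, so no line separator is added
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown encoding: " + encodingName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : printWith("\u0439", "UTF-8")) {
            System.out.printf("%02X ", b & 0xFF);   // D0 B9
        }
        System.out.println();
        for (byte b : printWith("\u0439", "ISO-8859-1")) {
            System.out.printf("%02X ", b & 0xFF);   // 3F
        }
        System.out.println();
    }
}
```

System.out is a PrintStream set up at JVM startup, so comparing its output with an explicitly encoded stream like this one shows whether the default is what you expected.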
Actually I had second thoughts on this one, and it might rate
additional testing if I were curious enough. For Terminal, what you
say regarding PrintStream would be pertinent. My original test
appeared never to work there, even when System Preferences should
have set things to MacCyrillic. It might be that I had Terminal
running before changing settings and it doesn't refresh, or something
like that. When going to the Text component I am of course
redirecting I/O, and the PrintStream.println() is probably actually
getting switched to a widget.appendText(). The application, I was
starting and stopping after changing preferences.