If you want to convert \u0438\u0306 into Windows-1251, then the first
thing you must do is normalize it back into its canonical composed form.
There are some links to Unicode normalizers here:
<http://lists.apple.com/archives/java-dev/2007/Apr/msg00111.html>
Off the top of my head, I don't know if they work with the Cyrillic
alphabet, but it's worth a try. You could probably find more info by
googling for java unicode normalizer.
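
For what it's worth, here is a minimal sketch of the normalize-then-encode
step. It assumes Java 6 or later, where java.text.Normalizer is part of the
JDK, and a JVM that ships the windows-1251 charset. The class and method
names (NormalizeDemo, compose, toCp1251) are made up for illustration; they
are not from the thread or from any of the linked normalizers.

```java
import java.nio.charset.Charset;
import java.text.Normalizer;

class NormalizeDemo {
    // Compose base letter + combining mark into the precomposed form (NFC).
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Normalize first, then encode. Assumes the JVM supports windows-1251.
    static byte[] toCp1251(String s) {
        return compose(s).getBytes(Charset.forName("windows-1251"));
    }

    public static void main(String[] args) {
        // U+0438 (Cyrillic small i) followed by U+0306 (combining breve)
        String decomposed = "\u0438\u0306";
        String composed = compose(decomposed);
        System.out.printf("chars before: %d, after: %d%n",
                decomposed.length(), composed.length());
        // Print the encoded bytes as hex so the result does not depend
        // on how the console renders Cyrillic.
        for (byte b : toCp1251(decomposed)) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

After composing, the two characters become the single character U+0439,
which Windows-1251 can represent as one byte (0xE9). Without the
normalization step the combining breve has no mapping in that charset.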
OK, I just got a little bit curious about this for some reason.
It does seem a little tricky to see correct results here with Java on
OS X. You never seem to see the characters correctly from Terminal, period.
You can sometimes get them right from Java text components.
First, you have to put something other than English first in the
International system preferences to get the 'ru' locale.
I chose русский, but other Cyrillic-looking choices might also
work.
(I don't know how the mail process will handle some of these
characters; my apologies if they don't get through correctly.)
Then I set my test case like...
public class TestEnc {
I'm not sure MacRoman is actually a correct encoding. It is sort of odd
that the default encodings seem to work correctly, and differently
from a hard-coded MacRoman. I might be misremembering the correct
dashes for ISO-8859-1 too, for that matter.
It is a little strange that different Latin character sets would
behave differently in the Cyrillic Unicode range, if I do have them right.
But the OP might want to keep in mind that he needs the International
preferences set right, and that it's best to avoid Terminal for testing.
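
One way to test without relying on what Terminal displays is to print the
encoded bytes as hex instead of the characters themselves. The following is
only a rough sketch: the class and method names (EncDump, hex) are invented
for illustration, and "x-MacRoman" is my assumption for the charset name,
so it is only used if the JVM reports it as supported.

```java
import java.nio.charset.Charset;
import java.util.Locale;

class EncDump {
    // Encode s with the named charset and return the bytes as spaced hex.
    static String hex(String s, String charsetName) {
        byte[] bytes = s.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u0439"; // precomposed Cyrillic short i
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default locale:  " + Locale.getDefault());
        String[] names = {"UTF-8", "ISO-8859-1", "windows-1251", "x-MacRoman"};
        for (String name : names) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + hex(s, name));
            } else {
                System.out.println(name + ": not supported by this JVM");
            }
        }
    }
}
```

A Latin charset such as ISO-8859-1 has no mapping for Cyrillic letters, so
getBytes substitutes '?' (0x3F) there, which makes it easy to tell a
charset problem from a display problem.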