Excellent! Thanks, Greg. You were right: it was the encoding used by
native2ascii. Re-running native2ascii with "-encoding UTF-8" on my
text file produced "button.cancel=\u53d6\u308a\u6d88\u3059", which does
appear as the proper Japanese characters in the button.
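
For anyone following along, here is a minimal sketch of why the escaped
form works, assuming the line ends up in a file read by
java.util.Properties. The class and method names are made up for
illustration; only the property line itself comes from this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the application discussed in the thread.
class EscapedPropertyDemo {
    // Parses properties text and returns the value for one key.
    // Properties.load turns backslash-u escapes into real characters.
    static String lookup(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes literal in Java source,
        // so the parser sees the same ASCII text native2ascii wrote.
        String line = "button.cancel=\\u53d6\\u308a\\u6d88\\u3059";
        String label = lookup(line, "button.cancel");
        System.out.println(label.length() + " characters");
        for (char c : label.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The value comes back as four Java chars, one per escape, which is what
the button label then displays.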
Thanks for walking me through this. I've not had to do this sort of
translation before, so I appreciate the step-by-step instructions.
Mike
On Sep 8, 2008, at 10:32 PM, Greg Guerin wrote:
Mike Dougherty wrote:
Sorry, my bad. I'm used to using UTF-8 on things so I wrote that,
but meant to write Unicode. I've run native2ascii on the file and
ended up with the following Unicode:
Referring to a Unicode reference, \u201e is a double low quotation
mark, and \u00c7 is capital-C with cedilla. Since those are the
first two things I see in the pushbutton's image, I think it's doing
exactly what you told it to do. I think the problem occurs at some
earlier point in the production pathway.
Exactly what was the input file to native2ascii that caused it to
generate the \u201e\u00c7 etc. sequence? Are you certain it was
encoded as UTF-8? Using what program?
If someone sent you that file of Japanese text, you should confirm
that what they sent you is what you received, i.e. that no data was
mangled or distorted in transit, or at any other step along the
way. See the man-page for the 'md5' command.
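
If the 'md5' command is not available on one end, the same check can be
sketched in Java. This is only an illustration: the class name is made
up, and the file path is whatever you pass on the command line. Both
parties run it on their copy and compare the printed digests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing two copies of a file by digest.
class Md5Demo {
    // Returns the MD5 digest of the given bytes as lowercase hex.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Md5Demo <file>");
            return;
        }
        // Reads the raw bytes, so no character encoding is involved.
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(md5Hex(raw) + "  " + args[0]);
    }
}
```

Matching digests mean the bytes are identical; they say nothing about
which encoding those bytes are in.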
You should also use another tool to independently verify the actual
binary data in the file of Japanese text. See the man-page for the
'hexdump' command, for example.
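
As a rough idea of what to look for in the dump, the sketch below prints
the bytes one Japanese character occupies in two encodings. The class
name is made up, and the sample character (U+53D6) is just an example;
substitute a character you know is in your file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that prints bytes the way a hex dump would show them.
class HexBytesDemo {
    // Formats bytes as space-separated lowercase hex pairs.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "\u53d6"; // example character only
        System.out.println("UTF-8:    " + toHex(sample.getBytes(StandardCharsets.UTF_8)));
        System.out.println("UTF-16BE: " + toHex(sample.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

A UTF-8 file of Japanese text shows mostly three-byte groups starting
with e3 through e9, while UTF-16 shows two bytes per character.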
Also, exactly what was the native2ascii command-line you used for
the conversion? If you omit the -encoding option, it will use the
default from the "file.encoding" property, and on US-English
configurations under Mac OS X, that's MacRoman, not UTF-8. Frankly,
it looks an awful lot like UTF-8 bytes were interpreted as MacRoman
characters.
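
The suspected failure can be reproduced in a few lines. This is a sketch
under assumptions: the class name is made up, the sample character
(U+30AD) is only a stand-in picked because its UTF-8 bytes are
e3 82 ad, and it relies on the JDK's extended "x-MacRoman" charset being
installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of UTF-8 bytes decoded with the wrong charset.
class MisdecodeDemo {
    // Encodes text as UTF-8, then decodes those bytes with another charset.
    static String misdecode(String text, String wrongCharsetName) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(wrongCharsetName));
    }

    // Renders non-ASCII chars as backslash-u escapes, like native2ascii.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // What native2ascii falls back to when -encoding is omitted.
        System.out.println("default charset: " + Charset.defaultCharset());
        String sample = "\u30ad"; // stand-in character, not from the real file
        System.out.println(escape(misdecode(sample, "x-MacRoman")));
    }
}
```

With that sample, the second line printed is \u201e\u00c7\u2260: one
Japanese character becomes three unrelated MacRoman characters.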
So to summarize, I think you need to carefully examine every step
you took in the production and processing of the file that's
supposed to contain the Japanese text. A little detective work
should nail down the point where things are going wrong, even if it
may not be clear why they're going wrong.
Also, you can look up Unicode chars and code-points at
www.unicode.org or at www.fileformat.info.
Or just google:
unicode 201e
unicode 00c7
unicode 2260
-- GG
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Java-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/java-dev/mjdougherty%40gmail.com
This email sent to email@hidden