Re: Converting nonprintable Ascii Value



Mike Jackson wrote:
| The problem is that the 0x80 is ALREADY decomposed into a pair of bytes

That one sentence contains at least three errors:

First, Strings do not contain bytes. They contain characters. Bytes are eight bits wide. Characters (Java chars) are sixteen bits wide. Bytes are not characters. Characters are not bytes. Until you stop confusing the two, you're not going to understand, much less solve, your problem.

Second, the String contains the problem character decomposed into two *characters* (not bytes): a *base character* (the "A", \u0041) followed by a combining *accent character* (the diaeresis, \u0308), according to the standard Unicode decomposition rules.
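To see the two characters for yourself, a small sketch along these lines may help. The class name DecomposedChars and the helper describe() are made up for illustration, and the sample String is only my assumption of what a decomposed name looks like in memory:

```java
final class DecomposedChars {
    // "A" followed by COMBINING DIAERESIS: an assumed example of a decomposed name
    static final String DECOMPOSED = "A\u0308";

    // Lists each 16-bit char of s as a backslash-u escape, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(DECOMPOSED.length());   // number of chars, not bytes
        System.out.println(describe(DECOMPOSED));
    }
}
```

Note that length() counts chars. No bytes exist until the String is run through an encoding.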

Third, you wouldn't get 0x80 by recomposing the accent and base characters. You'd get \u00C4, LATIN CAPITAL LETTER A WITH DIAERESIS. The 0x80 is an artifact of converting the Unicode characters into whatever eight-bit encoding is being used. (It's probably not UTF-8. The UTF-8 encoding of \u0308 would be 0xCC 0x88; that of \u00C4 would be 0xC3 0x84. Neither of these contains 0x80, so if the byte encoding contains 0x80, either there's another non-ASCII character, or some encoding other than UTF-8 is being used.)
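The byte values quoted above can be checked with a sketch like the following. It assumes a Java version that has java.nio.charset.StandardCharsets; the class EncodingBytes and its helpers are hypothetical names, not anything from the original poster's code:

```java
import java.nio.charset.StandardCharsets;

final class EncodingBytes {
    // Formats a byte array as space-separated hex values such as "0xC3 0x84".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Bytes only appear at this point, when an encoding is applied to the chars.
    static String utf8Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_8));
    }

    static String latin1Hex(String s) {
        return hex(s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("A\u0308"));   // decomposed pair
        System.out.println(utf8Hex("\u00C4"));    // composed character
        System.out.println(latin1Hex("\u00C4"));  // same character, different encoding
    }
}
```

The same String produces different bytes under different encodings, which is why a stray byte value says more about the encoding in use than about the String.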

The way to eliminate the "extra" character is either to run the Unicode composing algorithm on the String (not the byte array), transforming the decomposed character pair into the single Unicode character LATIN CAPITAL LETTER A WITH DIAERESIS, or to change the name of the file to eliminate the accent mark.
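For the composing route, one possible sketch uses java.text.Normalizer, which is part of the standard library from Java 6 onward (on older VMs you would need some other implementation). The class ComposeName and the method compose() are names I made up for the example:

```java
import java.text.Normalizer;

final class ComposeName {
    // Composes base + combining-mark pairs into single characters where Unicode
    // defines one (normalization form NFC). Operates on the String, not on bytes.
    static String compose(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String decomposed = "A\u0308";            // two chars
        String composed = compose(decomposed);    // one char
        System.out.println(decomposed.length() + " -> " + composed.length());
    }
}
```

Encode the result to bytes only after composing, and only with an encoding you have chosen explicitly.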

The only way to guarantee in a cross-platform way that there will be only one byte per character in file names is to restrict the characters used in file names to the ASCII characters from 0 to 127. (Anything above 127 isn't platform-independent.)
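If you go the ASCII-only route, a check along these lines might be enough. This is a rough sketch, not a complete file-name sanitizer: AsciiNames, isAscii() and stripAccents() are hypothetical names, and stripAccents() simply throws the accent marks away, which may or may not be acceptable for your file names:

```java
import java.text.Normalizer;

final class AsciiNames {
    // True if every char in the name is in the ASCII range 0 to 127.
    static boolean isAscii(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (name.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    // Decomposes accented characters (NFD), then drops the combining marks.
    // Characters with no ASCII base letter are left as they are.
    static String stripAccents(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        String name = "\u00C4rger.txt";
        System.out.println(isAscii(name));
        System.out.println(stripAccents(name));
    }
}
```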

Glen Fisher
_______________________________________________
java-dev mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/java-dev
Do not post admin requests to the list. They will be ignored.


References: 
 >Re: Converting nonprintable Ascii Value (From: Werner Randelshofer <email@hidden>)
 >Re: Converting nonprintable Ascii Value (From: Mike Jackson <email@hidden>)




Copyright © 2011 Apple Inc. All rights reserved.