Re: Translating filenames for command line?
- Subject: Re: Translating filenames for command line?
- From: Rainer Brockerhoff <email@hidden>
- Date: Wed, 2 Jan 2002 12:37:12 -0200
At 12:52 +0100 02/01/2002, Ondra Cada wrote:
>>>>>>> Rainer Brockerhoff (RB) wrote at Tue, 1 Jan 2002 16:42:08 -0200:
> RB> As a side question - but probably related to this - how are we supposed
> RB> to type an ellipsis (or any non-7-bit-ASCII character) into a @"" string?
>
>We are not. Unless something have changed with new Apple tools, only 7-bit
>chars are OK in @"...". You should never use non-ASCII characters in @"...",
>since the results would be more or less random
>
>It was even documented somewhere (can't remember just now whether in
>Compiler docs, or Release Notes, or whatever, and don't have time to search
>docs) for years.

I've searched the docs and couldn't even find a place where the syntax @"..." is defined... :-(
There's a reference in the release notes to using hi-bit characters in CFSTR("..."), saying that this currently assumes MacOSRoman encoding but may be changed to UTF-8 in the future. In the Darwin version of CoreFoundation, the headers say that hi-bit characters are unsupported... but in __CFStringMakeConstantString there's code to try UTF-8 first, then fall back to MacOSRoman.
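To make the "try UTF-8 first, then fall back" idea concrete, here is a rough sketch in plain C. It is not the CoreFoundation source; the names guess_encoding, is_valid_utf8 and guessed_encoding are made up for illustration, and the only assumption is that the fallback is triggered when the bytes are not well-formed UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; this is not Apple's code. */
typedef enum { ENC_UTF8, ENC_MACROMAN } guessed_encoding;

/* Returns true if the n bytes at s are well-formed UTF-8
   (rejects overlong forms, surrogates and values above U+10FFFF). */
bool is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        size_t extra;            /* number of continuation bytes expected */
        unsigned long cp, min;   /* decoded value and smallest legal value */

        if (c < 0x80) { i++; continue; }   /* plain 7-bit ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;       /* stray continuation or invalid lead byte */

        if (n - i <= extra) return false;  /* sequence is truncated */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char t = s[i + k];
            if ((t & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (t & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

/* Try UTF-8 first; if the bytes are not well-formed UTF-8, assume MacRoman. */
guessed_encoding guess_encoding(const unsigned char *s, size_t n)
{
    return is_valid_utf8(s, n) ? ENC_UTF8 : ENC_MACROMAN;
}
```

With this sketch, the three bytes E2 80 A6 (an ellipsis saved as UTF-8) are classified as UTF-8, while the single byte C9 (an ellipsis saved as MacRoman) fails validation and falls back to MacRoman. A guess like this can never be fully reliable, since some MacRoman byte pairs, C3 A9 for example, are also well-formed UTF-8.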
In practice, it seems that the compiler simply copies the string inside @"..." to the generated code... if the source file was UTF-8, it's in UTF-8; if the source file was MacOSRoman, it's in MacOSRoman. But the runtime always assumes MacOSRoman encoding. So I changed all my source files back to MacOSRoman encoding and now it works. It seems that the Mac OS X runtime is different from the Darwin runtime in this regard.
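A small plain-C illustration of why the source file encoding matters when the bytes are copied verbatim and later read as MacRoman. The function names are hypothetical, and the table is deliberately partial: it covers only the four high-bit bytes used here, which is not how a real MacRoman decoder would behave.

```c
#include <stddef.h>

/* Partial MacRoman table: only the four high-bit bytes this example needs.
   Every other high-bit byte maps to U+FFFD in this sketch; the real
   encoding assigns a character to each of them. */
unsigned int macroman_to_unicode(unsigned char b)
{
    if (b < 0x80) return b;        /* 7-bit ASCII is unchanged */
    switch (b) {
    case 0x80: return 0x00C4;      /* A with diaeresis */
    case 0xA6: return 0x00B6;      /* pilcrow sign */
    case 0xC9: return 0x2026;      /* horizontal ellipsis */
    case 0xE2: return 0x201A;      /* single low-9 quotation mark */
    default:   return 0xFFFD;      /* not covered by this sketch */
    }
}

/* Decode n bytes as MacRoman into out (one code point per byte).
   Returns the number of code points written, which is always n. */
size_t decode_macroman(const unsigned char *s, size_t n, unsigned int *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = macroman_to_unicode(s[i]);
    return n;
}
```

An ellipsis typed into a MacRoman source file is the single byte C9, which decodes to U+2026 as intended. The same ellipsis in a UTF-8 source file is the three bytes E2 80 A6, which a MacRoman reader turns into three unrelated characters (U+201A, U+00C4, U+00B6).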
Well, it seems a silly oversight - for an OS that boasts of being Unicode-compatible - that constant strings are restricted to English ASCII, especially when the OS itself uses ellipses and other hi-bit characters. Depending on further comments here, I'll file a bug to have this changed to UTF-8. Interface Builder stores text as UTF-8, so I think they should be consistent.
And I've tried to convert source files to Unicode, but the compiler definitely chokes on that - hundreds of error messages result. I wonder why Project Builder supports Unicode files at all, when the compiler doesn't?
--
Rainer Brockerhoff <email@hidden>
Belo Horizonte, Brazil
"Originality is the art of concealing your sources."
http://www.brockerhoff.net/ (updated Dec. 2001)