On Aug 7, 2008, at 9:49 AM, Christopher Nebel wrote:
On Aug 6, 2008, at 3:49 PM, Philip Aker wrote:
On 2008-08-06, at 11:35:40, Christopher Nebel wrote:
The issue of having : be a word break character in the main rules and pushing the Swedish/Finnish behavior into a locale was raised with the Unicode Technical Committee, and the decision was that wouldn't be the correct behavior. The en_US_POSIX locale behavior was then added specifically at Apple's request, that being considered the correct locale for behavior specific to programmers.
This is my locale:
# env | grep LANG
LANG=en_US.UTF-8
If I grab from CFLocale, then it shows en_US_POSIX.
I run this in Script Editor:
words of "1e:c2"
-->{"1e:c2"}
Please explain.
There isn't just one "locale" preference that controls everything; there are several, each controlling a different aspect. The relevant one in this case is the text break locale, which you can get from the global AppleTextBreakLocale preference -- if it's absent, the system uses AppleLanguages[0]. (Normal users would set it in System Preferences > International > Language > Word Break.) AppleLocale and the LANG environment variable have no effect on text breaks.
Ah, I see. With AppleTextBreakLocale POSIX-ly set, I get:
--> {"1e", "c2"}
Here are the preference keys for the various International Preferences settings (all technically subject to change, though that's unlikely):
Language preference order: AppleLanguages
Order for sorted lists: AppleCollationOrder if present, else AppleLanguages[0]

[~]# defaults read -g | grep Apple
AppleCollationOrder = root;
AppleEnableMenuBarTransparency = 0;
AppleKeyboardUIMode = 0;
AppleLanguages = ("en-US", en, ja, de, "zh-Hans", "zh-Hant", ko)
AppleLocale = "en_US_POSIX";
AppleTextBreakLocale = "en_US_POSIX";
[~]#

For POSIX fit, is this 'root' or 'en' (Standard or English)?

Text break behavior: AppleTextBreakLocale if present, else AppleLanguages[0]
Region/Calendar/Currency: AppleLocale (get the current locale, then get its ID). This is the current NSLocale/CFLocale.
Format customizations: a variety of separate preferences
These are all totally independent of one another. Setting AppleLocale should only change what's displayed in the Formats pane. You might have to relaunch System Preferences since setting it via the command line doesn't send any kind of notification.
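For anyone following along, the text-break rule described above (AppleTextBreakLocale if present, else AppleLanguages[0]) can be sketched as a small shell function. This is only an illustration of the fallback order as stated in this thread, not how the system implements it; pick_text_break_locale is a made-up helper name, and the caller is assumed to supply both values.

```shell
#!/bin/sh
# Sketch of the fallback order described in this thread.
# pick_text_break_locale is a hypothetical helper, not a system API.
#   $1: value of AppleTextBreakLocale (empty string if the key is absent)
#   $2: first entry of AppleLanguages, i.e. AppleLanguages[0]
pick_text_break_locale() {
    text_break_locale=$1
    first_language=$2
    if [ -n "$text_break_locale" ]; then
        # An explicit text break locale wins.
        printf '%s\n' "$text_break_locale"
    else
        # Otherwise fall back to the top language preference.
        printf '%s\n' "$first_language"
    fi
}
```

On a Mac, the first argument could be filled in with something like "$(defaults read -g AppleTextBreakLocale 2>/dev/null)", which comes back empty when the key is not set; the second argument is whatever language sits at the top of AppleLanguages.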
So, my remaining question (not for you I think) is why, when selecting the en_US_POSIX locale in System Preferences, the word break setting (and everything POSIX-ly associated) wouldn't be changed to match? IOW, I would have to manually alter AppleTextBreakLocale _after_ setting the locale if I really wanted that non-matching behavior. I understand that if I were to do it at the command line, then I would have to do the two-step. But System Preferences is supposed to behave like Mac users expect -- put the bread in the toaster and push down -- oops, only the right hand side went down. Damn, have to go to the other side of the toaster and push the left side down separately. :-(
Philip Aker
Democracy: Two wolves and a sheep voting on lunch.