Re: Words of Bug in Leopard (Japanese environment)
- Subject: Re: Words of Bug in Leopard (Japanese environment)
- From: Christopher Nebel <email@hidden>
- Date: Wed, 6 Aug 2008 11:35:40 -0700
On Aug 5, 2008, at 11:20 PM, Christopher Nebel wrote:
On Aug 5, 2008, at 9:39 PM, Takaaki Naganoya wrote:
I tried to get the MAC address from en0 in a Japanese environment. ...
--> {"ether", "00", "1e", "c2", "01", "45", "bf"} (Tiger)
--> {"ether", "00", "1e:c2", "01", "45", "bf"} (Leopard)
AppleScript just does what the international word break rules say.
You could file a bug against them -- I think this particular case
involves the treatment of ":" in Swedish and Finnish; they use it
much like English uses apostrophe -- but I'll remind you of this bit
from the revised AppleScript Language Guide:
"word: A continuous series of characters, with word elements parsed
according to the word-break rules set in the International
preference pane. Because the rules for parsing words are thus under
user control, your scripts should not count on a deterministic text
parsing of words."
Basically, don't use "word" elements to process anything other than
natural language text. In your case, you probably want text items
breaking on ":". Alternatively, you could set your word break
preference (System Preferences > International > Language > Word
break) to "English (United States, Computer)", which always treats
":" as a word break, but that would probably do horrible things to
your Japanese word breaking.
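
As a minimal sketch of the text-items approach: the original script was not shown in full, so the input line here is assumed to be a single cleaned-up "ether" line from ifconfig, and the variable names are only illustrative. It splits on the space to isolate the address, then on ":" to get the individual octets, and restores the delimiters afterwards.

```applescript
-- Assumed input: one "ether" line taken from ifconfig output.
-- The original script's input handling is not shown, so this is hard-coded.
set etherLine to "ether 00:1e:c2:01:45:bf"

-- Save the current delimiters so they can be restored afterwards.
set savedDelimiters to AppleScript's text item delimiters
try
	-- Split on the space first to isolate the address itself.
	set AppleScript's text item delimiters to space
	set macAddress to last text item of etherLine
	
	-- Then split the address on ":" to get the individual octets.
	set AppleScript's text item delimiters to ":"
	set octets to text items of macAddress
on error errMsg number errNum
	-- Restore the delimiters before passing the error along.
	set AppleScript's text item delimiters to savedDelimiters
	error errMsg number errNum
end try
set AppleScript's text item delimiters to savedDelimiters

octets
```

Because text item delimiters are set explicitly by the script, the split should not depend on the word-break setting in the International preference pane, which is the reason to prefer it over "words of" for this kind of machine-generated text.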
Following up a bit, I asked our International folks about this. Their
answer was that Unicode Consortium policy is to not make people use
different text break locales unless the local rules conflict with the
main rules, so support for particular languages gets folded in
whenever possible. The only language (currently) where there's a
conflict is Japanese (it conflicts with Chinese, of which there are
more speakers), hence it has its own word-break setting.
The issue of making ":" a word break character in the main rules and
pushing the Swedish/Finnish behavior into a locale was raised with the
Unicode Technical Committee, and the decision was that this wouldn't be
the correct behavior. The en_US_POSIX locale behavior was then added
specifically at Apple's request, that being considered the correct
locale for behavior specific to programmers.
--Chris Nebel
AppleScript Engineering