Re: Unicode and languages
- Subject: Re: Unicode and languages
- From: email@hidden
- Date: Thu, 8 Apr 2004 18:49:12 +0900
Why exactly do you want to do this? Part of the point of Unicode is
that all languages are treated equally -- you just preserve
everything. I suppose you could check for the presence of Japanese
characters, but their presence doesn't guarantee that the text is
Japanese (some strings are valid Japanese and Chinese), nor does their
absence guarantee that the text is English. (Exactly how you'd do
this in the first place is an interesting question, seeing as how
there's no "Unicode number" command.)
If the only two possible options for his text are Japanese or English,
it is likely that all CJK characters will be Japanese and not Chinese
or Korean. Besides, English is written in the Roman alphabet, and the
ASCII character set that covers it occupies the first 128 code points
of Unicode. So I suppose it would not be too hard to decide that any
string that can be mapped to ASCII is English, while any string that
can't be is Japanese. Another possibility would be to check the
characters one by one, since all CJK characters have a defined position
in Unicode (a unique code :). Any string whose characters fall in the
CJK blocks could be considered CJK. Since Japanese makes _very_ little
use of alphabetic characters, it is very likely that only strings made
up of such characters would be Japanese.
Of course, implementing that in AS is a different matter...
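For comparison outside AppleScript, both heuristics can be sketched in a few lines of Python. This is only an illustration under the thread's assumption that the text is either English or Japanese. The names `is_japanese_char` and `guess_language` are made up for this example, and the three ranges are a deliberately small subset of the blocks used for Japanese text, not a complete list.

```python
# Hypothetical helpers for illustration only. The ranges cover three common
# Unicode blocks; the ideograph block is shared with Chinese, so a hit only
# means "Japanese" under the two-language assumption.
JAPANESE_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
)


def is_japanese_char(ch):
    """True if the character's code point falls in one of the ranges above."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in JAPANESE_RANGES)


def guess_language(text):
    """Return 'japanese', 'english', or 'unknown' for a string."""
    # Character-by-character check: any kana or ideograph settles it.
    if any(is_japanese_char(ch) for ch in text):
        return "japanese"
    # ASCII check: every code point is below 128.
    if all(ord(ch) < 128 for ch in text):
        return "english"
    # Non-ASCII text outside the chosen ranges (accented Latin, Cyrillic, ...).
    return "unknown"
```

Text such as "café" lands in the "unknown" bucket, which is exactly the case where the simple ASCII-versus-everything-else rule would misfire.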
If you want to know if the text can be losslessly encoded in MacRoman
vs. MacJapanese, then that's a different question; in that case I'd
suggest attempting to pipe the text through iconv(1) -c and seeing if
it errors.
How would you segment the string before conversion? By doing that you
only know that if a given string fails to convert to MacRoman, it must
contain something else somewhere, but you can't determine where, can
you?
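One way around the segmentation problem is to skip the all-or-nothing conversion and test each character on its own, which yields the positions directly. A rough Python sketch, assuming Python's built-in `mac_roman` codec is an acceptable stand-in for MacRoman; the helper name `unencodable_positions` is made up for this example.

```python
def unencodable_positions(text, encoding="mac_roman"):
    """Return the indexes of characters that cannot be encoded in `encoding`."""
    positions = []
    for index, ch in enumerate(text):
        try:
            # Strict encoding raises on any character without a mapping.
            ch.encode(encoding)
        except UnicodeEncodeError:
            positions.append(index)
    return positions
```

An empty result means every character has a MacRoman mapping; otherwise the returned indexes show where the "something else" sits.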
JC Helary
_______________________________________________
applescript-users mailing list | email@hidden