Re: Chinese Characters
- Subject: Re: Chinese Characters
- From: Philip Aker <email@hidden>
- Date: Wed, 5 Aug 2009 13:13:45 -0700
On 2009-08-05, at 12:18:19, Simon Topliss wrote:

I have some text and want to know if a character is a Chinese character. Using AppleScript, I know that the id of the character "用" is 29992.

http://www.unicode.org/Public/UNIDATA/Blocks.txt provides a list of ranges for each Unicode 'block'. So, how should I test whether the decimal number 29992 falls within a range of hex numbers, to determine what its 'block' is?

I suppose that I need to create a lookup table of 'Blocks.txt' in AppleScript, with column 1 being the 'from' range, column 2 being the 'to' range and column 3 being the 'block name'.

What I still can't work out is how to either convert the decimal 29992 to a hex number and (somehow) test if it's in a hex range, or convert the 'from' and 'to' columns to decimal values. I guess the latter would be easier for the comparison.
CJK Unified Ideographs seems to be 0x4E00 - 0x9FFF; however, I'm not sure if there are any subranges that should be accounted for.
set char_id to 29992
do shell script "tclsh <<< 'set test " & char_id & ";if {$test <= 0x9FFF && $test >= 0x4E00} then {puts 1;} else {puts 0;}'"
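For the lookup-table idea in the question, here is a rough sketch in plain sh that leaves the hex-to-decimal conversion to shell arithmetic, so the 'from' and 'to' columns never need converting by hand. The function name block_of is made up for illustration. The three sample lines follow the "start..end; name" layout of Blocks.txt, and a real run would feed it the downloaded file instead. For the single range used above, 0x4E00 is 19968 and 0x9FFF is 40959, so a comparison against those two decimals in AppleScript itself would also do.

```shell
#!/bin/sh
# block_of (hypothetical helper): print the name of the Unicode block that
# contains the decimal code point $1. Blocks.txt-format lines
# ("4E00..9FFF; CJK Unified Ideographs") are read from stdin.
block_of() {
    cp=$1
    while IFS=';' read -r range name; do
        case $range in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        start=${range%%..*}             # hex digits before the ".."
        end=${range##*..}               # hex digits after the ".."
        # $((0x...)) converts the hex bounds to decimal for the comparison
        if [ "$cp" -ge "$((0x$start))" ] && [ "$cp" -le "$((0x$end))" ]; then
            printf '%s\n' "${name# }"   # drop the space after the ';'
            return 0
        fi
    done
    return 1                            # no block matched
}

# Usage with a small excerpt in Blocks.txt format:
block_of 29992 <<'EOF'
# sample excerpt
0000..007F; Basic Latin
3400..4DBF; CJK Unified Ideographs Extension A
4E00..9FFF; CJK Unified Ideographs
EOF
# prints: CJK Unified Ideographs
```

From AppleScript this could presumably be run through do shell script with Blocks.txt redirected to stdin, though the quoting would need some care.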
echo email@hidden@nl | tr a-z@. p-za-o.@
Democracy: Two wolves and a sheep voting on lunch.