Re: Chinese Characters
- Subject: Re: Chinese Characters
- From: "Mark J. Reed" <email@hidden>
- Date: Wed, 5 Aug 2009 16:05:13 -0400
On Wed, Aug 5, 2009 at 3:18 PM, Simon Topliss <email@hidden> wrote:
> Hello all,
>
> I have some text and want to know if a character is a Chinese character.
>
> Using AppleScript, I know that the id of the character "用" is 29992.
Converting decimal to hex and vice versa is pretty straightforward; I
use dc(1), but there are other shell tools that would do the job as
well, or you could code up the conversion manually using AppleScript
to do the math. Here are some handlers that use dc:
on fromHex(someValue)
    do shell script " dc <<<' 16i " & someValue & "p' "
end fromHex

on toHex(someValue)
    do shell script " dc <<<' " & someValue & " 16op' "
end toHex
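
If you'd rather skip the shell round-trip, the manual conversion mentioned above could look something like the following. This is only a sketch: hexToDecimal and decimalToHex are handler names made up for illustration, and it assumes non-negative integers and uppercase hex digits.

```applescript
-- Hypothetical pure-AppleScript counterparts to fromHex/toHex (no dc needed).

-- Convert a string of hex digits (uppercase assumed) to an integer.
on hexToDecimal(hexString)
    set hexDigits to "0123456789ABCDEF"
    set total to 0
    repeat with aChar in (characters of hexString)
        set theDigit to contents of aChar
        -- offset returns 0 when the character is not found
        set digitValue to (offset of theDigit in hexDigits) - 1
        if digitValue < 0 then error ("Not a hex digit: " & theDigit)
        set total to total * 16 + digitValue
    end repeat
    return total
end hexToDecimal

-- Convert a non-negative integer to a string of hex digits.
on decimalToHex(n)
    set hexDigits to "0123456789ABCDEF"
    if n is 0 then return "0"
    set hexString to ""
    repeat while n > 0
        set hexString to (character ((n mod 16) + 1) of hexDigits) & hexString
        set n to n div 16
    end repeat
    return hexString
end decimalToHex

hexToDecimal("7528")
==> 29992
decimalToHex(29992)
==> "7528"
```

The dc versions are shorter, but these avoid launching a shell for every conversion, which matters if you call them once per block.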
As for actually checking the ranges, something like this will work.
It's not speedy, but the initial load of the block data is the slowest
part, so (for example) making findBlock use a binary search wouldn't
be a big win:
property UnicodeBlocks : {}

on fromHex(someValue)
    do shell script " dc <<<' 16i " & someValue & "p' "
end fromHex

on toHex(someValue)
    do shell script " dc <<<' " & someValue & " 16op' "
end toHex

on findBlock(someCharacter)
    -- Load and cache the block table on first use.
    if (count UnicodeBlocks) is 0 then
        set oldDelimiters to text item delimiters
        -- Adjust the Perl version in this path to match your system.
        repeat with aLine in (paragraphs of (read POSIX file "/System/Library/Perl/5.8.8/unicore/Blocks.txt"))
            set lineText to contents of aLine
            -- Skip blank lines and "#" comment lines.
            if length of lineText is not 0 and lineText does not start with "#" then
                set text item delimiters to "; "
                set blockRange to text item 1 of lineText
                set blockDescription to text item 2 of lineText
                set text item delimiters to ".."
                set blockStart to fromHex(text item 1 of blockRange)
                set blockEnd to fromHex(text item 2 of blockRange)
                set end of UnicodeBlocks to {blockStart, blockEnd, blockDescription}
            end if
        end repeat
        set text item delimiters to oldDelimiters
    end if
    -- Linear scan for the block whose range contains the character's code point.
    set someCharacterId to id of someCharacter
    repeat with aBlock in UnicodeBlocks
        if someCharacterId >= item 1 of aBlock and someCharacterId <= item 2 of aBlock then
            return item 3 of aBlock
        end if
    end repeat
end findBlock

findBlock("用")
==> CJK Unified Ideographs
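
If all you need is a yes/no answer for the original question, you may not need the block table at all. Here is a minimal sketch that checks only the main CJK Unified Ideographs block (4E00..9FFF, i.e. 19968 through 40959 in decimal). The handler name isCJKUnifiedIdeograph is made up for illustration, and it assumes it is passed a single character. It does not cover the extension blocks (for example Extension A, 3400..4DBF), and the block also holds ideographs used in Japanese and Korean, so treat "Chinese" loosely here.

```applescript
-- Hypothetical helper: true if the character lies in the
-- CJK Unified Ideographs block (U+4E00..U+9FFF).
-- Assumes someCharacter is a single character.
on isCJKUnifiedIdeograph(someCharacter)
    set codePoint to id of someCharacter
    return (codePoint >= 19968 and codePoint <= 40959)
end isCJKUnifiedIdeograph

isCJKUnifiedIdeograph("用")
==> true
isCJKUnifiedIdeograph("A")
==> false
```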
--
Mark J. Reed <email@hidden>