Re: specifying "text" language
- Subject: Re: specifying "text" language
- From: Jean-Christophe Helary <email@hidden>
- Date: Tue, 9 Dec 2008 10:50:10 +0900
Thank you Takaaki for the idea.
I am assuming that the language is known.
My problem is that AppleScript's tokenizer seems to depend on the
International preferences.
So, if I want to parse the following sentence:
"事前に必ずこの取扱説明書を熟読の上、正しい操作に基づき最良の状態でご使用下さい。"
the word count and structure will differ depending on whether the
International preference is set to Japanese or, say, French (if the
user has chosen a French locale).
I need AppleScript to be told which language the sentence is in so
that it can give me the proper tokenization.
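
To make the requirement concrete, here is a minimal sketch in Python (not AppleScript) of a tokenizer that takes the language as an explicit argument instead of reading it from the system settings. The names tokenize and script_class are hypothetical, and the Japanese rule (break wherever the script changes between kanji, hiragana and katakana) is only a crude stand-in for real word-break rules, not what AppleScript or Mac OS X does.

```python
import re
import unicodedata


def script_class(ch):
    """Return a coarse script label for one character (hypothetical helper)."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    if ch.isalnum():
        return "other"
    return None  # punctuation, whitespace, symbols


def tokenize(text, language):
    """Split text with rules selected by the explicit language code."""
    if language == "ja":
        # Crude assumption: a token ends wherever the script class changes.
        tokens, current, current_class = [], "", None
        for ch in text:
            cls = script_class(ch)
            if cls is None or cls != current_class:
                if current:
                    tokens.append(current)
                current = ""
            if cls is not None:
                current += ch
            current_class = cls
        if current:
            tokens.append(current)
        return tokens
    # Default rule for space-delimited languages: runs of word characters.
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    sample = "取扱説明書を読む"
    print(tokenize(sample, "ja"))  # Japanese rules
    print(tokenize(sample, "fr"))  # same text, French rules
```

The point of the sketch is only that the result depends on the language argument and not on the environment: the same string yields different tokens under "ja" and "fr", whatever locale the machine is set to.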
Jean-Christophe Helary
On Tuesday, 9 Dec. 2008, at 10:28, Takaaki Naganoya wrote:
How about picking one line from the unidentified text object (UTO)
and searching for its words on Google?
The result pages list URLs, and each URL contains a country-code
top-level domain. :-)
Another, more serious approach is to calculate the character code
distribution for each language.
The character code distribution, together with characteristic
characters (e.g. umlauts), may help you determine which language the
text is in.
<distGraph_s.jpg>
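
The character-distribution idea can be sketched roughly as follows, in Python rather than AppleScript. Everything here is an assumption made for illustration: the function names, the MARKERS table and the decision rules are hypothetical, and a real detector would need far more data than a handful of marker characters.

```python
import unicodedata
from collections import Counter

# Hypothetical marker characters; deliberately tiny and incomplete.
MARKERS = {"de": "äöüß", "fr": "éèêàçùœ"}


def script_histogram(text):
    """Count letters per script, using the first word of the Unicode name."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def guess_language(text):
    """Very rough guess from script counts and marker characters."""
    counts = script_histogram(text)
    # Kana only occurs in Japanese text, so it is a strong signal.
    if counts["HIRAGANA"] or counts["KATAKANA"]:
        return "ja"
    lowered = text.lower()
    scores = {
        lang: sum(lowered.count(c) for c in chars)
        for lang, chars in MARKERS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    print(script_histogram("abcあい"))
    print(guess_language("この取扱説明書を読む"))
    print(guess_language("Bitte lesen Sie die Anleitung sorgfältig"))
```

Short texts without any marker character come back as "unknown", which is the main weakness of this kind of heuristic.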
On 2008/12/02, at 16:13, Jean-Christophe Helary wrote:
Is there a way to specify a text language without having to rely on
the International preferences?
The reference for "word" indicates:
A continuous series of characters, with word elements parsed
according to the word-break rules set in the International
preference pane.
Because the rules for parsing words are thus under user control,
your scripts should not count on a deterministic text parsing of
words.
But what if I need to parse a multilingual text, or a foreign text
in a different environment, for example Japanese in a French
"International" setting?
Are there programmatic ways to accomplish that?
Jean-Christophe Helary
_______________________________________________
Do not post admin requests to the list. They will be ignored.
AppleScript-Users mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
Archives: http://lists.apple.com/archives/applescript-users
This email sent to email@hidden
--
Takaaki Naganoya
Piyomaru Software
http://piyo.piyocast.com
email@hidden
PiyoCast Web (Podcasting with Music!)
http://www.piyocast.com
Free AppleScript Library "AS Hole"
http://www.piyocast.com/as/