I have already posted here asking how I could create a script that determines what language a given document is written in. Some of your recommendations were too complex for me at that point, so I decided to find a workaround on my own using tools I'm already familiar with, rather than going with the more sophisticated AppleScriptObjC.
So now I'm struggling to build a script that counts the characters belonging to the alphabet of one language, sets them against the characters that presumably belong to another language, and acts on the result. The logic rests on the assumption that if a document's language is A, then the count of characters of language A should considerably exceed the count of characters of language B. In the script below, Greek characters serve only as an example. In the early stages I thought in terms of the entire text, but then I switched to a paragraph-by-paragraph approach.
At one point I thought I had caught luck by the tail with the script below. But every time I run it on a non-Greek text (I test it under various conditions, including texts with no Greek letters at all, where it should just return the number of non-Greek characters), it either gives the error "Variable GreekCharactersCount not defined" or, when I initialize the count variables beforehand (to 0 or to missing value), simply returns those initial values, as if the rest of the script were not there at all. Why? The script looks polished to me, logically sound and well thought out. I have commented the individual steps so that you can adjust parts of it should you wish to give it a shot. What am I missing?
tell application "TextEdit"
    activate
    open file "(*colon delimited path-string to a text (.txt) file*)"
    set MyDoc to text of document named "(*name of the file*)"
    set visible of window 1 to false --optional
    set theParagraphs to paragraphs of MyDoc
    set GreekCharacterSet to {"α", "β", "γ", "δ", "ε", "ζ", "η", "θ", "ι", "κ", "λ", "μ", "ν", "ξ", "ό", "о", "π", "ρ", "ς", "Ύ", "φ", "χ", "ψ", "ω"}
    set GreekYes to {} --Greek characters of the text go here
    set GreekNo to {} --Non-Greek characters of the text go here
    repeat with i from 1 to (count theParagraphs) --1st loop. Selects each paragraph of the text of the document.
        set ParagraphCharacters to characters of paragraph i of MyDoc
        repeat with item_ref in GreekCharacterSet --2nd loop. Compares each character of GreekCharacterSet with the characters of the selected paragraph.
            if contents of item_ref is in ParagraphCharacters then --1st situation. Greek characters are in this paragraph. If there are none, the script moves on to the next paragraph, skipping all the conditional statements below.
                repeat with j from 1 to (count ParagraphCharacters) --3rd loop. Walks through the characters of the selected paragraph.
                    if contents of item_ref = contents of item j of ParagraphCharacters then --3rd loop, 1st sub-situation: on finding a matching character, places it in the GreekYes variable and counts its items.
                        set end of GreekYes to contents of item j of ParagraphCharacters
                        set GreekCharactersCount to (count GreekYes)
                    else
                        --3rd loop, 2nd sub-situation: otherwise ignores the special characters in the list below and stores all other (supposedly non-Greek) characters in the GreekNo variable, which is then counted.
                        if contents of item j of ParagraphCharacters is not in {" ", "(", ")", "-", "+", "=", "|", "{", "}", "°", "[", "]", "^", "/", "\\", "·", "$", "€", "‡", "±", "*", "<", ">", "≥", "≤", "≠", ":", ";", ".", ",", "⁄", "‹", "›", "—", "_", "?", "!", "«", "»", "", linefeed, "1", "2", "3", "4", "5", "6", "7", "8", "9", "0"} then
                            set end of GreekNo to contents of item j of ParagraphCharacters
                            set NonGreekCharactersCount to (count GreekNo)
                        end if
                    end if
                end repeat
            end if
        end repeat
    end repeat
end tell
{GreekYes: GreekCharactersCount, GreekNo: NonGreekCharactersCount}
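For reference, the counting idea described above (characters of language A should clearly outnumber characters of language B) can be reduced to a single pass over a plain string, with TextEdit left out entirely. This is only a minimal sketch under stated assumptions: the handler name countLetters, its parameters, the record labels greekCount and otherCount, and the sample values are all made up for illustration and are not part of my script. Both counters start at 0 before the loop, so they are defined even when no character matches.

```applescript
-- Hypothetical handler (illustration only): counts how many characters of
-- theText appear in letterSet, and how many appear in neither letterSet
-- nor ignoredSet.
on countLetters(theText, letterSet, ignoredSet)
	-- Counters are initialized up front, so they exist even for empty text.
	set hitCount to 0
	set missCount to 0
	set theChars to characters of theText
	repeat with charRef in theChars
		set c to contents of charRef
		if c is in letterSet then
			set hitCount to hitCount + 1
		else if c is not in ignoredSet then
			set missCount to missCount + 1
		end if
	end repeat
	return {greekCount:hitCount, otherCount:missCount}
end countLetters

-- Sample values, assumed for the example only.
set sampleText to "αβγ abc 123"
set greekLetters to {"α", "β", "γ"}
set ignoredCharacters to {" ", "1", "2", "3"}
countLetters(sampleText, greekLetters, ignoredCharacters)
-- for this sample: {greekCount:3, otherCount:3}
```

Each character is classified exactly once here (Greek, ignored, or other), which is the behaviour I am trying to get from the nested loops. Note that AppleScript string comparisons ignore case but consider diacriticals by default, so accented letters such as "ό" would have to be listed separately in letterSet.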