Re: Sorting characters of the text - script doesn't work as expected
- Subject: Re: Sorting characters of the text - script doesn't work as expected
- From: Deivy Petrescu <email@hidden>
- Date: Sun, 16 Apr 2017 09:18:33 -0400
> On Apr 15, 2017, at 19:50 , ILJA SHEBALIN <email@hidden> wrote:
>
> Hello,
> I posted here earlier asking how to create a script that determines what language a given document is in. Some of your recommendations were too complex for me at that point, so I decided to find a workaround on my own using tools I'm already familiar with rather than the more sophisticated AppleScriptObjC.
> So now I'm struggling to build a script that counts the characters belonging to one language's alphabet, sets aside the characters that presumably belong to another language, and acts accordingly. The logic rests on the assumption that if a document is in language A, then the count of language A characters should considerably exceed the count of language B characters. In the script below, Greek characters serve only as an example. At first I worked with the entire text, but then I switched to a paragraph-by-paragraph approach.
> I thought I had got it right with the script below, but every time I run it on non-Greek text (to test it under various conditions, including text with no Greek letters at all, where it should just return the number of non-Greek symbols), it either gives the error "Variable GreekCharactersCount not defined" or just returns the values I initialized the variables to (0 or missing value), as if the rest of the script weren't there at all. Why? It looks polished and logically sound to me. I commented certain parts of the script so that you can adjust them to your requirements should you wish to give it a shot. What am I missing?
>
>
>
>
>
> tell application "TextEdit"
> activate
>
> open file "(*colon delimited path-string to a text (.txt) file*)"
> set MyDoc to text of document named "(*name of the file*)"
> set visible of window 1 to false --optional
>
> set theParagraphs to paragraphs of MyDoc
> set GreekCharacterSet to {"α", "β", "γ", "δ", "ε", "ζ", "η", "θ", "ι", "κ", "λ", "μ", "ν", "ξ", "ό", "о", "π", "ρ", "ς", "Ύ", "φ", "χ", "ψ", "ω"}
> set GreekYes to {} --Greek characters of the text go there
> set GreekNo to {} --Non-Greek characters of the text go there
>
> repeat with i from 1 to (count theParagraphs) --1st loop. Selects every paragraph of the text of the document.
> set ParagraphCharacters to characters of paragraph i of MyDoc
> repeat with item_ref in GreekCharacterSet --2nd loop. Compares every characters of GreekCharacterSet to every characters of selected paragraph of the text of the document
> if contents of item_ref is in ParagraphCharacters then --1st situation. Greek characters are in this paragraph. If Greek characters are not in this paragraph then goes ahead to the next targeted paragraph ignoring all subsequent conditional statements.
>
> repeat with j from 1 to (count ParagraphCharacters)
> if contents of item_ref = contents of item j of ParagraphCharacters then --3rd loop, 1st sub situation: on finding those characters places them inside GreekYes variable and counts its items
> set end of GreekYes to contents of item j of ParagraphCharacters
>
> set GreekCharactersCount to (count GreekYes)
> else
> if contents of item j of ParagraphCharacters is not in {" ", "(", ")", "-", "+", "=", "|", "{", "}", "°", "[", "]", "^", "/", "\\", "·", "$", "€", "‡", "±", "*", "<", ">", "≥", "≤", "≠", ":", ";", ".", ",", "⁄", "‹", "›", "—", "_", "?", "!", "«", "»", "", linefeed, "1", "2", "3", "4", "5", "6", "7", "8", "9", "0"} then -- 3rd loop, 2nd sub situation: if Greek characters are not in these paragraphs then ignores the special characters of this list and stores all others (supposed Non-Greek characters) in the GreekNo variable, which is then counted.
>
> set end of GreekNo to contents of item j of ParagraphCharacters
>
> set NonGreekCharactersCount to (count GreekNo)
> end if
> end if
> end repeat
> end if
> end repeat
> end repeat
> end tell
>
> {GreekYes: GreekCharactersCount, GreekNo: NonGreekCharactersCount}
Ilja,
If you want to find Greek characters, don't test against a list of the characters themselves; test against their ids instead.
Here is an example using your script above:
——————
#getting the text
set GreekCharacterSet to "open file (*colon delimited path-string to a text (.txt) file*)
set MyDoc to text of document named (*name of the file*)
set visible of window 1 to false --optional
set theParagraphs to paragraphs of MyDoc
set GreekCharacterSet to {α, β, γ, δ, ε, ζ, η, θ, ι, κ, λ, μ, ν, ξ, ό, о, π, ρ, ς, Ύ, φ, χ, ψ, ω}
set GreekYes to {} --Greek characters of the text go there
set GreekNo to {} --Non-Greek characters of the text go there"
# below are the ids of all the characters you posted as Greek (note that 1086 is the Cyrillic о from your list, not the Greek omicron, which is 959)
set Greekid to {945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 972, 1086, 960, 961, 962, 910, 966, 967, 968, 969}
set GreekYes to {}
repeat with ch in GreekCharacterSet
if id of ch is in Greekid then set end of GreekYes to contents of ch
end repeat
return GreekYes
—————
If you want to do the same thing with the list of “special characters”, then use this list:
{32, 40, 41, 45, 43, 61, 124, 123, 125, 176, 91, 93, 94, 47, 92, 183, 36, 8364, 8225, 177, 42, 60, 62, 8805, 8804, 8800, 58, 59, 46, 44, 8260, 8249, 8250, 8212, 95, 63, 33, 171, 187, 10, 49, 50, 51, 52, 53, 54, 55, 56, 57, 48}
However, remove the empty string. It is not a character.
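As a rough sketch only, here is one way the two id lists could be combined to produce the counts Ilja is after. The handler name countCharactersByID and its parameters are hypothetical, not something from the script above. Both counters start at 0, so they are always defined, even when the text has no Greek characters at all; that should avoid the "Variable GreekCharactersCount not defined" error. It assumes each character has a single id; a character that AppleScript reports with several ids (for example a letter plus a combining accent) would simply be counted as "other".

```applescript
-- Hypothetical handler (not from the original thread): classify each character by its id.
on countCharactersByID(theText, greekIDs, ignoredIDs)
	-- Start both counters at 0 so they are defined even when nothing matches.
	set greekCount to 0
	set otherCount to 0
	repeat with ch in (characters of theText)
		set chID to id of (contents of ch)
		if chID is in greekIDs then
			set greekCount to greekCount + 1
		else if chID is not in ignoredIDs then
			-- Neither Greek nor in the list of ids to skip.
			set otherCount to otherCount + 1
		end if
	end repeat
	return {greekCount:greekCount, otherCount:otherCount}
end countCharactersByID

-- Example call with shortened id lists: 945-947 are α, β, γ; 32 and 46 are space and period.
countCharactersByID("αβγ abc.", {945, 946, 947}, {32, 46}) -- {greekCount:3, otherCount:3}
```

In practice you would pass the full Greekid list and the full list of special character ids given above.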
To get the id of a character, run this:
return id of "α"
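Asking for the id of a longer string gives a list of ids, which can save typing the numbers one by one, and string id converts a list of ids back into text. The two small handlers below are only an illustration; their names, idsOfText and textFromIDs, are made up for this sketch.

```applescript
-- Hypothetical helper: always return the ids of theText as a list.
on idsOfText(theText)
	set theIDs to id of theText
	-- A one-character string yields a bare integer; wrap it so callers always get a list.
	if class of theIDs is not list then set theIDs to {theIDs}
	return theIDs
end idsOfText

-- Hypothetical helper: turn a list of ids back into text.
on textFromIDs(theIDs)
	return string id theIDs
end textFromIDs

idsOfText("αβγ") -- {945, 946, 947}
```

Calling textFromIDs on the result should give back the original text, which is a quick way to check a hand-typed id list.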
Hope this helps.
Deivy Petrescu
email@hidden