Re: similar strings
- Subject: Re: similar strings
- From: Adam Bell <email@hidden>
- Date: Mon, 9 Jan 2006 13:17:51 -0400
The problem, as I see it, is choosing which short string to match. For example, how do you choose one here:
set longString to "Now is the time for all good men to come to the aid of the party while the quick brown fox jumps over the lazy dog. How now brown cow"
set shortString to "Now is the time for a party with foxes and cows but no good men"
By inspection, the answer is "Now is the time for", which happens to be at the beginning, but in general...
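One way to make that choice mechanical is to look for the longest common substring of the two strings. What follows is only a minimal sketch, written in Python rather than AppleScript; the function name longest_common_substring is made up for illustration, and it assumes plain text where lowercasing does not change a string's length.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest run of characters shared by a and b, ignoring case.

    The result is sliced from `a`, so it keeps a's original capitalisation.
    Assumption: lower() does not change the length of either string
    (true for ASCII text such as the examples in this thread).
    """
    a_low, b_low = a.lower(), b.lower()
    best_len, best_end = 0, 0          # length and end index (in a) of best match
    prev = [0] * (len(b_low) + 1)      # match lengths for the previous row
    for i in range(1, len(a_low) + 1):
        curr = [0] * (len(b_low) + 1)
        for j in range(1, len(b_low) + 1):
            if a_low[i - 1] == b_low[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(repr(longest_common_substring("xxx abc zzz", "ppp abc qqq")))  # ' abc '
```

Because the comparison is character by character, the two strings above yield "Now is the time for a", one character longer than the answer by inspection. Matching whole words only would need an extra step, such as splitting both strings into words first.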
On 1/9/06, Gary (Lists) <email@hidden> wrote:
"Feat" wrote:
> Hi, list!
>
> I need to sort strings by their degree of similarity, ignoring case. Is there
> a quick way to tell that "xxx abc zzz" and "ppp abc qqq" share the "abc"
> segment?
>
> --
> Jym Feat -- Paris FR 75018
Jym,
In general, one common measure of text variation (or similarity) is the
Levenshtein Distance.
Luckily, levenshtein() is a standard function in PHP, so you do not have to
implement the algorithm yourself, and PHP is easily accessible from
AppleScript via 'do shell script', presuming you are using OS X.
The basic explanation of the Lev Distance is that the value tells you the
minimum number of single-character edits (insertions, deletions, or
substitutions) needed to make one string identical to the other.
So, the word "the" and the word "they" have a Levenshtein Distance of 1.
The strings "hello world" and "hell worm" have an LD of 3 (3 character
edits must be applied to make string 2 identical to string 1).
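For anyone curious what such a function computes, here is a minimal sketch of the textbook dynamic-programming approach. It is written in Python purely for illustration and is not PHP's actual implementation; the helper rank_by_similarity is a made-up name, showing one possible way to sort strings by distance while ignoring case.

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn s into t."""
    prev = list(range(len(t) + 1))     # distances from "" to each prefix of t
    for i, sc in enumerate(s, start=1):
        curr = [i]                     # distance from s[:i] to ""
        for j, tc in enumerate(t, start=1):
            cost = 0 if sc == tc else 1
            curr.append(min(prev[j] + 1,          # delete sc
                            curr[j - 1] + 1,      # insert tc
                            prev[j - 1] + cost))  # substitute, or keep a match
        prev = curr
    return prev[-1]


def rank_by_similarity(target: str, candidates: list[str]) -> list[str]:
    """Hypothetical helper: sort candidates from most to least similar
    to target, ignoring case (smaller distance means more similar)."""
    return sorted(candidates,
                  key=lambda c: levenshtein(target.lower(), c.lower()))


if __name__ == "__main__":
    print(levenshtein("the", "they"))               # 1
    print(levenshtein("hello world", "hell worm"))  # 3
```

Note that a distance only says how many edits separate two strings; it does not say which segment they share.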
Here is a very simple working example that calls PHP's levenshtein() from
AppleScript. It could be cleaned up and/or condensed, but I've clipped it
from my test sheet just as it is (a bit windy).
----------------------------------------------------------------------
-- CALCULATING LEVENSHTEIN DISTANCE VIA PHP+APPLESCRIPT
set thisText to "hello"
set thatText to "hell"
-- Single Line, No Wrap!
-- (Text containing a single quote would break the PHP string literal below.)
set phpScr to "$lev=levenshtein('" & thisText & "', '" & thatText & "');echo $lev;"
set shCmdStub to "php -r "
set sh to shCmdStub & (quoted form of phpScr)
set res to do shell script sh
--> "1"
----------------------------------------------------------------------
Note that in my simple sample, the result is returned as text, not an
integer. You can coerce it to an integer (res as integer) if you want to
do some other math or value comparison with the value.
I hope that helps.
--
Gary
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Applescript-users mailing list (email@hidden)
This email sent to email@hidden
--
Some minds remain open long enough for a truth to both enter and leave without processing.