Re: Spellcheck a list
- Subject: Re: Spellcheck a list
- From: Graff <email@hidden>
- Date: Mon, 04 Oct 2004 18:37:38 -0400
On Oct 4, 2004, at 2:08 PM, Paul Berkowitz wrote:
On 10/4/04 9:07 AM, "Graff" <email@hidden> wrote:
On Oct 4, 2004, at 9:06 AM, Jan Steinman wrote:
From: Graff <email@hidden>
you can use the "grep" command:
...
do shell script "grep " & quotedWord & " /usr/share/dict/words"
With a bit of regular expressions thrown in you could have a decently
powerful spell checker.
If you aren't going to use regular expressions, then fgrep(1) is much
faster than grep, although the difference is hardly noticeable unless
you're really crunching a huge file. ("f" as in "fixed strings" as
opposed to regular expressions.)
I did use a regular expression. I added the start-of-line marker "^"
and the end-of-line marker "$" to the string before I quoted it. I did
this so that I would get only exact matches rather than every line that
contains the string. For example:
using:
grep '^youth$' /usr/share/dict/words
gives me:
youth
using:
fgrep 'youth' /usr/share/dict/words
gives me:
overyouthful
<snip>
youthy
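To make the difference easy to reproduce, here is a minimal sketch that runs both searches against a tiny stand-in word list. The temporary file and its four entries are assumptions for the demo, not the real /usr/share/dict/words. It also shows grep's -F (fixed string) and -x (whole line) options used together, which should give the same exact match as the anchored pattern without any regular expression.

```shell
# Stand-in word list (an assumption for this demo; the thread itself
# searches /usr/share/dict/words).
words=$(mktemp)
printf '%s\n' overyouthful youth youthful youthy > "$words"

# Anchored regular expression: only the line that is exactly "youth".
exact=$(grep '^youth$' "$words")

# Fixed-string search (what fgrep does): every line containing "youth".
loose=$(grep -F 'youth' "$words")

# Fixed string restricted to whole lines: an exact match with no regex.
fixed_exact=$(grep -F -x 'youth' "$words")

printf 'anchored regex:\n%s\n' "$exact"
printf 'fixed string:\n%s\n' "$loose"
printf 'fixed string, whole line:\n%s\n' "$fixed_exact"

rm -f "$words"
```

With -F the word is never interpreted as a pattern, so a word containing "." or "*" would not need escaping.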
youthy? youthily? youthwort? youthen? unyouthfully? preyouthful?
What nonsense is all this? These aren't real words. They're stupid guesses
made by a computer which has been told how to add suffixes and prefixes to
root words to make putative parts of speech which _might_ be words, but
patently are not so for most root words - including "youth".
This is not something which should be used by anyone looking for a proper
spellchecker. It's a joke. Using grep looking for specific words with it
should be OK.
Right, that's how I was using it. I had the script set up to use grep
to _only_ find the exact match. If you use fgrep to simply search for
a string then you get all that other nonsense.
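Since the thread is about checking a whole list, here is a rough sketch of the exact-match idea applied to several words at once. The function name check_words, the temporary dictionary file and the sample words are all hypothetical, made up for illustration; point the function at /usr/share/dict/words to use the real list.

```shell
# check_words WORDFILE WORD...
# Prints each WORD that has no exact, whole-line match in WORDFILE.
# (check_words is a hypothetical helper, not part of any existing tool.)
check_words() {
    wordfile=$1
    shift
    for w in "$@"; do
        # -F fixed string, -x whole line, -q quiet (exit status only);
        # -e keeps a word starting with "-" from being read as an option.
        if ! grep -F -x -q -e "$w" "$wordfile"; then
            printf '%s\n' "$w"
        fi
    done
}

# Demo against a small stand-in dictionary (an assumption for this sketch).
dict=$(mktemp)
printf '%s\n' apple banana youth > "$dict"

misspelled=$(check_words "$dict" apple bananna youth yuoth)
printf '%s\n' "$misspelled"

rm -f "$dict"
```

The match is case-sensitive as written; adding -i to the grep call would relax that, at the cost of accepting odd capitalisation.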
By the way, here's the README file that accompanies that word file:
# $NetBSD: README,v 1.2 1997/03/26 07:14:32 mikel Exp $
# @(#)README 8.1 (Berkeley) 6/5/93
WEB ---- (introduction provided by jaw@riacs) -------------------------
Welcome to web2 (Webster's Second International) all 234,936 words worth.
The 1934 copyright has elapsed, according to the supplier. The
supplemental 'web2a' list contains hyphenated terms as well as assorted
noun and adverbial phrases. The wordlist makes a dandy 'grep' victim.
-- James A. Woods {ihnp4,hplabs}!ames!jaw (or jaw@riacs)
So those are all words according to Webster's Second International
dictionary. The wordlist is not just some computer-generated list of
words. However, I also felt that this was a bit dodgy, which is why I
also gave a link to aspell, a spelling tool which can be installed
through Fink:
<http://fink.sourceforge.net/pdb/package.php/aspell>
- Graff