Re: NSSpeechRecognizer and Speech Recognition calibration
- Subject: Re: NSSpeechRecognizer and Speech Recognition calibration
- From: Ricky Sharp <email@hidden>
- Date: Sat, 27 Dec 2008 07:47:08 -0600
On Dec 26, 2008, at 4:56 AM, Christopher Corbell wrote:
> I'm working on an accessibility app for the visually impaired and was hoping to use NSSpeechRecognizer.
> I've found it extremely difficult to get NSSpeechRecognizer to behave predictably on my system. Does anyone on the list have experience with this class & success with the Speech Recognition system preference panel? Any tips or tricks?
> I find that the calibration dialog for the Speech Recognition settings doesn't work at all for me. I'm using a pretty standard external microphone (built into a Logitech webcam) with an Intel Mac mini. I can see my signal just fine and I'm speaking clearly in as accent-neutral a way as I can, and still none of the test sentences ever highlights. Is a headset mic typically required, or is there some other gotcha here?
It must be your particular setup. I've been doing SR ever since it
debuted (Mac OS 8.x days) and have not had trouble when words/phrases
are unique enough (as yours clearly are).
> When I give NSSpeechRecognizer a very small and unambiguous command set, I find it badly misses the mark. For example I might have "Play", "Next", and "Stop" in my command set, and it will interpret "Next" as "Play", but it will never interpret "Play" as a command - pretty unusable. I'm hoping it's just a calibration issue.
Since the calibration dialog isn't working for you, it's not
surprising that it's getting your phrases confused. Make sure to get
your setup working in the calibration area first.
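For reference, here is a minimal sketch of how a three-command recognizer like the one described above might be wired up. The class name is illustrative; the NSSpeechRecognizer API calls are the standard AppKit ones, and this would need a running app with microphone access to actually do anything:

```swift
import Cocoa

// Minimal sketch: a controller that listens for three unambiguous commands.
// NSSpeechRecognizer's initializer is failable, hence the optional chaining.
final class CommandListener: NSObject, NSSpeechRecognizerDelegate {
    private let recognizer = NSSpeechRecognizer()

    override init() {
        super.init()
        recognizer?.commands = ["Play", "Next", "Stop"]
        recognizer?.delegate = self
        // Only match while this app is frontmost.
        recognizer?.listensInForegroundOnly = true
        recognizer?.startListening()
    }

    // Called by the speech recognition server when a command is matched.
    func speechRecognizer(_ sender: NSSpeechRecognizer,
                          didRecognizeCommand command: String) {
        print("Recognized command: \(command)")
    }
}
```

Keeping the command list this small is the right instinct; the recognizer only ever chooses among the phrases in `commands`, so the fewer and more distinct they are, the better.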
> One last note - is there any way to do proper dictation with this class or will it only recognize the preset command list you give it? I'm thinking for example of prompting for a file name to save to, or a term to search on - it would be nice to have true dictation; otherwise I'll resort to providing an alphabet as a command set so the user can spell it out (assuming I can get that to work).
No. And you definitely do _not_ want to add letters to your language model. English letters have too many cases where the sounds are extremely similar: 'B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z' is probably the largest such set.
When I worked on numeric input, I had to offer two modes (two different speech models, driven by user preference). For example, 'sixteen' and 'sixty' were often confused. Recognition got better over time, but never reached 100%. Users who had trouble could switch to the other model, in which they spoke individual digits instead: 'one six' and 'six zero'. Those phrases were unique enough to remove any confusion.
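The digit-by-digit fallback described above can be sketched as a second command set plus a small accumulator (the class name and structure here are illustrative, not from the original project):

```swift
import Cocoa

// Sketch of the digit-spelling fallback mode: each spoken word maps to
// one digit character, and recognized digits accumulate into a string.
final class DigitListener: NSObject, NSSpeechRecognizerDelegate {
    private let digitWords: [String: Character] = [
        "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
        "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"
    ]
    private let recognizer = NSSpeechRecognizer()
    private(set) var entered = ""

    override init() {
        super.init()
        // The command set is exactly the ten digit words.
        recognizer?.commands = Array(digitWords.keys)
        recognizer?.delegate = self
        recognizer?.startListening()
    }

    func speechRecognizer(_ sender: NSSpeechRecognizer,
                          didRecognizeCommand command: String) {
        if let digit = digitWords[command] {
            entered.append(digit)   // "six" then "zero" yields "60"
        }
    }
}
```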
You really only have two options: (1) the user has a third-party dictation solution, or (2) your solution uses words/phrases for letter input - for example the military (NATO) alphabet (alpha, bravo, charlie, etc.), which was designed to remain intelligible over very low-quality audio.
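A spelling mode along those lines might look like the following sketch, applied to the file-name prompt mentioned earlier (the class name and the "stop spelling" escape phrase are illustrative choices):

```swift
import Cocoa

// Sketch: spell out a file name using NATO-alphabet words as commands,
// mapping each recognized word back to its initial letter.
final class SpellingListener: NSObject, NSSpeechRecognizerDelegate {
    private let nato = ["alpha", "bravo", "charlie", "delta", "echo",
                        "foxtrot", "golf", "hotel", "india", "juliett",
                        "kilo", "lima", "mike", "november", "oscar",
                        "papa", "quebec", "romeo", "sierra", "tango",
                        "uniform", "victor", "whiskey", "xray",
                        "yankee", "zulu"]
    private let recognizer = NSSpeechRecognizer()
    private(set) var spelled = ""

    override init() {
        super.init()
        recognizer?.commands = nato + ["stop spelling"]
        recognizer?.delegate = self
        recognizer?.startListening()
    }

    func speechRecognizer(_ sender: NSSpeechRecognizer,
                          didRecognizeCommand command: String) {
        guard command != "stop spelling" else {
            recognizer?.stopListening()
            return
        }
        if let first = command.first {
            spelled.append(first)   // "bravo" appends "b"
        }
    }
}
```

Because every word in the set begins with a different letter and sounds nothing like the others, this sidesteps the 'B'/'C'/'D' confusion entirely.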
___________________________________________________________
Ricky A. Sharp mailto:email@hidden
Instant Interactive(tm) http://www.instantinteractive.com
_______________________________________________
Cocoa-dev mailing list (email@hidden)