Pocketsphinx writing to console on word detection instead of waiting for silence
Currently, the following command starts pocketsphinx, waits for the volume to hit a specific threshold on the microphone, starts recording, and, when the volume drops below that threshold, starts processing the recorded audio and outputs "hello" if the word was detected.
pocketsphinx_continuous -inmic yes -keyphrase "hello" -kws_threshold 1e-30
Because my environments can be a tad noisy, waiting for the volume to drop below the threshold can take longer than expected. Is there a way to have pocketsphinx output recognizable words as they're being spoken, without needing to wait for silence?
Overall, if you have significant noise, it's better to cancel it in hardware somehow. A microphone array with source separation, a directed beam microphone, and so on should reduce noise significantly. It's not a good idea to rely on pocketsphinx to deal with noise; it's not designed for that.
If you want an immediate reaction on spotting, you'd better use pocketsphinx through the API, not pocketsphinx_continuous. Here is a simple example in Python, if you want one:
    import sys, os, pyaudio
    from pocketsphinx.pocketsphinx import *
    from sphinxbase.sphinxbase import *

    modeldir = "../../../model"

    # Create a decoder with the given model
    config = Decoder.default_config()
    config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
    config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
    config.set_string('-keyphrase', 'forward')
    config.set_float('-kws_threshold', 1e+20)

    # Open the microphone: 16 kHz, mono, 16-bit samples
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000,
                    input=True, frames_per_buffer=1024)
    stream.start_stream()

    # Process audio chunk by chunk. When the keyphrase is detected,
    # perform an action and restart the search.
    decoder = Decoder(config)
    decoder.start_utt()
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
        else:
            break
        if decoder.hyp() is not None:
            print([(seg.word, seg.prob, seg.start_frame, seg.end_frame)
                   for seg in decoder.seg()])
            print("Detected keyphrase, restarting search")
            decoder.end_utt()
            decoder.start_utt()
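If it helps, the same chunk-by-chunk loop can be pulled into a small function so the "react on detection, then restart the search" logic can be tried without a microphone. This is only a sketch: `spot_keyphrase` and `mic_chunks` are hypothetical helper names, not part of pocketsphinx, and the only assumption about `decoder` is that it exposes `start_utt()`, `process_raw()`, `hyp()` and `end_utt()` the way the Decoder in the example above does.

```python
from typing import Callable, Iterable


def spot_keyphrase(decoder, chunks: Iterable[bytes],
                   on_detect: Callable[[str], None]) -> int:
    """Feed audio chunks to a decoder and react as soon as a hypothesis appears.

    Hypothetical helper. `decoder` is assumed to behave like the pocketsphinx
    Decoder used above. Returns the number of detections reported.
    """
    detections = 0
    decoder.start_utt()
    for buf in chunks:
        if not buf:  # an empty chunk means the audio source is exhausted
            break
        decoder.process_raw(buf, False, False)
        hyp = decoder.hyp()
        if hyp is not None:
            on_detect(hyp.hypstr)
            detections += 1
            # Restart the utterance so the same detection is not reported twice.
            decoder.end_utt()
            decoder.start_utt()
    decoder.end_utt()
    return detections


def mic_chunks(stream, frames_per_buffer: int = 1024):
    """Yield raw buffers from an already opened PyAudio input stream, forever."""
    while True:
        yield stream.read(frames_per_buffer)
```

With the objects from the example above, it would be called as spot_keyphrase(Decoder(config), mic_chunks(stream), print). With 1024-frame buffers at 16000 Hz, each chunk covers 64 ms of audio, so that is roughly the granularity at which a detection can be reported, independent of any silence in the input.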