gooofy
Posts: 2
Joined: Sat Mar 05, 2016 5:04 pm

Zamia Speech: Open source state of the art speech recognition [ASR] [STT]

Sun Jun 24, 2018 10:19 am

Hey guys,

for those who want to add speech recognition (speech to text, STT) capabilities to their projects but don't want to use cloud services or proprietary software, our Zamia Speech project might be worth a look:

http://zamia-speech.org

We offer pre-built Kaldi ASR (http://kaldi-asr.org/) packages for Raspbian, complete with pre-trained models for English and German. Everything is free, cloudless and open source.
Getting started on a Raspberry Pi 3 is easy: first, set up the Zamia Speech apt source and install the packages (you need root permissions here: sudo -i):

echo "deb http://goofy.zamia.org/repo-ai/raspbian/stretch/armhf/ ./" >/etc/apt/sources.list.d/zamia-ai.list
wget -qO - http://goofy.zamia.org/repo-ai/raspbian/stretch/armhf/bofh.asc | sudo apt-key add -
apt-get update
apt-get install kaldi-chain-zamia-speech-de kaldi-chain-zamia-speech-en python-kaldiasr python-nltools pulseaudio-utils pulseaudio

Next, download a few speech audio recordings (or bring your own, of course) and one of our sample Python scripts:

wget http://goofy.zamia.org/zamia-speech/misc/demo_wavs.tgz
tar xfvz demo_wavs.tgz
wget http://goofy.zamia.org/zamia-speech/misc/kaldi_decode_wav.py

Then run the demo:

$ python kaldi_decode_wav.py -v demo?.wav
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model... done, took 1.473226s.
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder... done, took 0.143928s.
DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645
i cannot follow you she said 
DEBUG:root:demo2.wav decoding took     0.54s, likelyhood: 1.572326
i should like to engage just for one whole life in that 
DEBUG:root:demo3.wav decoding took     0.42s, likelyhood: 1.709773
philip knew that she was not an indian 
DEBUG:root:demo4.wav decoding took     1.06s, likelyhood: 1.715135
he also contented that better confidence was established by carrying no weapons 
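
For those who would rather call the decoder from their own code than run the demo script, here is a minimal sketch of what that could look like. The class names KaldiNNet3OnlineModel and KaldiNNet3OnlineDecoder and the two decoder methods are quoted from memory of the py-kaldi-asr README, so please treat them as assumptions and check them against the version you installed. The helper names load_decoder and decode_files are made up for this example, and the model path is the one shown in the log output above.

```python
from __future__ import print_function

# Model directory as it appears in the demo log output.
MODEL_DIR = '/opt/kaldi/model/kaldi-generic-en-tdnn_sp'


def load_decoder(model_dir=MODEL_DIR):
    """Create a decoder for the model in model_dir.

    Assumption: kaldiasr.nnet3 provides these two classes, as described
    in the py-kaldi-asr README. The import is done lazily so that the
    rest of this sketch can be used without the package installed.
    """
    from kaldiasr.nnet3 import KaldiNNet3OnlineModel, KaldiNNet3OnlineDecoder
    model = KaldiNNet3OnlineModel(model_dir)
    return KaldiNNet3OnlineDecoder(model)


def decode_files(decoder, wav_paths):
    """Decode each wav file and return (path, text, likelihood) tuples.

    text and likelihood are None for files the decoder could not handle.
    """
    results = []
    for path in wav_paths:
        if decoder.decode_wav_file(path):
            text, likelihood = decoder.get_decoded_string()
            results.append((path, text.strip(), likelihood))
        else:
            results.append((path, None, None))
    return results


def main(wav_paths):
    """Load the model once, then print one line per wav file."""
    decoder = load_decoder()
    for path, text, likelihood in decode_files(decoder, wav_paths):
        if text is None:
            print('%s: decoding failed' % path)
        else:
            print('%s: %s (likelihood %s)' % (path, text, likelihood))
```

To try it, call main(['demo1.wav', 'demo2.wav']) from a Python prompt in the directory with the demo recordings. Loading the model is the expensive part, so the sketch creates the decoder once and reuses it for all files.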

Interested? Then check out our complete documentation on GitHub:

https://github.com/gooofy/zamia-speech#zamia-speech

Here, you will find more example scripts and instructions on how to adapt the models we provide to your application domain for more accuracy and performance.

Feedback is appreciated (keep it friendly, please! :) ) as well as suggestions and contributions.

carisbrookes
Posts: 4
Joined: Wed Sep 13, 2017 2:56 pm

Re: Zamia Speech: Open source state of the art speech recognition [ASR] [STT]

Thu Aug 02, 2018 1:27 pm

Zamia looks brilliant. I currently talk to my RPi (2B) using my Google Home, which sends the resulting text via IFTTT to a Flask HTTP server on the RPi, where my own Python functions can process it.
It would be great to do it all offline.
To test it out I've loaded Zamia on a clean Raspbian install on my 2nd RPi (3B). It is extremely slow, and va_simple.py runs once around the loop and then drops out.
I ran the demo and this shows the speed:
python kaldi_decode_wav.py -v demo?.wav
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model... done, took 29.029657s.
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder... done, took 2.262711s.
DEBUG:root:demo1.wav decoding took 11.36s, likelyhood: 2.093089
i cannot follow you she said
DEBUG:root:demo2.wav decoding took 15.28s, likelyhood: 1.709598
i should like to engage just for one whole life in that
DEBUG:root:demo3.wav decoding took 11.78s, likelyhood: 1.926260
philip knew that she was not an indian
DEBUG:root:demo4.wav decoding took 32.69s, likelyhood: 1.919863
he also contented that better confidence was established by carrying no weapons
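
To put a number on the difference, the decoding times can be pulled out of the two logs with a short script. This is only a rough sketch: the function names decode_times and slowdown are made up for this example, the regular expression assumes the log format shown in this thread, and the two log excerpts are copied from the posts above (including the "likelyhood" spelling the script prints).

```python
import re

# Matches lines such as:
#   DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645
DECODE_RE = re.compile(r'DEBUG:root:(\S+) decoding took\s+([0-9.]+)s')


def decode_times(log_text):
    """Map wav file name -> decoding time in seconds."""
    return {m.group(1): float(m.group(2))
            for m in DECODE_RE.finditer(log_text)}


def slowdown(fast_log, slow_log):
    """Per-file ratio slow/fast for files that appear in both logs."""
    fast = decode_times(fast_log)
    slow = decode_times(slow_log)
    return {name: slow[name] / fast[name] for name in fast if name in slow}


# Decoding lines from the first post.
FIRST_POST = """
DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645
DEBUG:root:demo2.wav decoding took     0.54s, likelyhood: 1.572326
DEBUG:root:demo3.wav decoding took     0.42s, likelyhood: 1.709773
DEBUG:root:demo4.wav decoding took     1.06s, likelyhood: 1.715135
"""

# Decoding lines from the run on the RPi 3B.
THIS_RUN = """
DEBUG:root:demo1.wav decoding took 11.36s, likelyhood: 2.093089
DEBUG:root:demo2.wav decoding took 15.28s, likelyhood: 1.709598
DEBUG:root:demo3.wav decoding took 11.78s, likelyhood: 1.926260
DEBUG:root:demo4.wav decoding took 32.69s, likelyhood: 1.919863
"""

for name, factor in sorted(slowdown(FIRST_POST, THIS_RUN).items()):
    print('%s: %.1fx slower' % (name, factor))
```

For these two logs every file comes out at roughly 28 to 31 times slower than the timings in the first post.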

There is nothing else running on the RPi, i.e. the browser was not running at the time.
Any thoughts?
