Zamia Speech: Open source state of the art speech recognition [ASR] [STT]

Sun Jun 24, 2018 10:19 am

Hey guys,

For those who want to add speech recognition (speech to text, STT) capabilities to their projects but don't want to use cloud services or proprietary software, our Zamia Speech project might be worth a look:

We offer pre-built Kaldi ASR packages for Raspbian, complete with pre-trained models for English and German. Everything is free, cloudless and open source.
Getting started on a Raspberry Pi 3 is easy. First, set up the Zamia Speech apt source and install the packages (you need root permissions here: sudo -i):

Code:

echo "deb ./" >/etc/apt/sources.list.d/zamia-ai.list
wget -qO - | sudo apt-key add -
apt-get update
apt-get install kaldi-chain-zamia-speech-de kaldi-chain-zamia-speech-en python-kaldiasr python-nltools pulseaudio-utils pulseaudio
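
Once the packages are installed, a quick way to confirm that the Python bindings are visible is to try importing them. This is only a minimal sketch: the module names kaldiasr and nltools are assumed from the package names python-kaldiasr and python-nltools, and check_modules is a hypothetical helper, not part of Zamia Speech.

```python
import importlib


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Assumed module names, derived from the apt package names above.
    for name, ok in sorted(check_modules(["kaldiasr", "nltools"]).items()):
        print("%-10s %s" % (name, "ok" if ok else "MISSING"))
```

Run it with the same python interpreter you plan to use for the demo; a MISSING entry usually means the apt step did not complete or a different interpreter is being picked up.
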
Next, download a few speech audio recordings (or bring your own, of course) and one of our sample Python scripts:

Code:

tar xfvz demo_wavs.tgz
Then run the demo:

Code:

$ python -v demo?.wav
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model... done, took 1.473226s.
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder...
DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp creating decoder... done, took 0.143928s.
DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645
i cannot follow you she said 
DEBUG:root:demo2.wav decoding took     0.54s, likelyhood: 1.572326
i should like to engage just for one whole life in that 
DEBUG:root:demo3.wav decoding took     0.42s, likelyhood: 1.709773
philip knew that she was not an indian 
DEBUG:root:demo4.wav decoding took     1.06s, likelyhood: 1.715135
he also contented that better confidence was established by carrying no weapons 
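
If you want to feed these results into your own project, the output above is easy to post-process. The following is a rough sketch, not part of Zamia Speech: parse_demo_output is a hypothetical helper, and it assumes the DEBUG lines and the transcripts were captured together in one text (for example by redirecting both stdout and stderr to the same file). The pattern keeps the "likelyhood" spelling because that is what the script prints.

```python
import re

# Matches lines such as:
#   DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645
DECODE_RE = re.compile(
    r"^DEBUG:root:(?P<wav>\S+) decoding took\s+(?P<secs>[\d.]+)s, "
    r"likelyhood: (?P<score>[-\d.]+)$"
)


def parse_demo_output(text):
    """Pair each decoded file with its timing, score and transcript."""
    results = []
    pending = None
    for line in text.splitlines():
        match = DECODE_RE.match(line.strip())
        if match:
            # Start a new record; the transcript is expected on the next line.
            pending = {
                "wav": match.group("wav"),
                "seconds": float(match.group("secs")),
                "likelihood": float(match.group("score")),
            }
        elif pending is not None and line.strip() and not line.startswith("DEBUG:"):
            pending["text"] = line.strip()
            results.append(pending)
            pending = None
    return results


sample = "\n".join([
    "DEBUG:root:/opt/kaldi/model/kaldi-generic-en-tdnn_sp loading model...",
    "DEBUG:root:demo1.wav decoding took     0.37s, likelyhood: 1.863645",
    "i cannot follow you she said",
    "DEBUG:root:demo3.wav decoding took     0.42s, likelyhood: 1.709773",
    "philip knew that she was not an indian",
])

for item in parse_demo_output(sample):
    print("%(wav)s (%(seconds).2fs): %(text)s" % item)
```

Each entry in the returned list is a plain dict, so it can be filtered by likelihood or written out as JSON without further changes.
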
Interested? Then check out our complete documentation on GitHub:

There you will find more example scripts and instructions on how to adapt the models we provide to your application domain for better accuracy and performance.

Feedback, suggestions and contributions are all appreciated (keep it friendly, please! :) ).
