Speech Recognition with Pocketsphinx


by observing » Tue Jun 26, 2012 11:25 pm
I'm currently working on a project that requires speech recognition. Though I haven't completed it yet, I thought some of you might be interested in the fairly detailed steps I took to install Pocketsphinx on the RPi. The steps start with a clean image of Debian wheezy and end with continuous speech testing. You can find it at https://sites.google.com/site/observing ... spberry-pi
Posts: 43
Joined: Mon Feb 27, 2012 12:18 pm
by aonsquared » Thu Jun 28, 2012 9:01 pm
Thanks for this! Your instructions are quite good and clear. It at least gives an alternative to Julius (which I've written tutorials for in http://www.aonsquared.co.uk/raspi_voice_control) for speech recognition on the Raspberry Pi. Great possibilities!
Posts: 21
Joined: Sat Jan 28, 2012 6:40 pm
Location: Bristol, UK
by observing » Fri Jun 29, 2012 12:56 am
Actually, I used Julius for a Windows version of my project (a universal remote control operated by speech commands) and liked it a lot. However, I think Pocketsphinx might be better suited to small systems like the RPi, so I'm giving it a try. The big hurdle is getting the audio input clean enough for decent accuracy. The USB audio adapter I'm using does a great job under Windows, so I know it's not the problem. I'm going to take another shot at using a Bluetooth headset and see if I can get better results.
Posts: 43
Joined: Mon Feb 27, 2012 12:18 pm
by ax206geek » Fri Jun 29, 2012 2:01 pm
I'd like to suggest another alternative: use the Google Speech API and off-load the recognition to Google. Here's a quick example:

Code:
arecord -D plughw:1,0 -f cd -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o out.flac
wget -O - -o /dev/null --post-file out.flac --header="Content-Type: audio/x-flac; rate=16000" "http://www.google.com/speech-api/v1/recognize?lang=en" | sed -e 's/[{}]//g'
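
For anyone who would rather script the upload than use wget, here is a minimal sketch of the same POST request in Python. It only builds the request and does not send it. The URL and Content-Type header are copied from the one-liner above; the function name build_recognize_request and the placeholder bytes are made up for illustration.

```python
import urllib.request

# URL and header value copied from the shell one-liner in this post.
SPEECH_URL = "http://www.google.com/speech-api/v1/recognize?lang=en"


def build_recognize_request(flac_bytes, rate=16000):
    """Build (but do not send) the POST request that the wget call issues.

    build_recognize_request is a hypothetical helper name.
    """
    headers = {"Content-Type": "audio/x-flac; rate=%d" % rate}
    return urllib.request.Request(
        SPEECH_URL, data=flac_bytes, headers=headers, method="POST"
    )


# Placeholder bytes stand in for the contents of out.flac.
req = build_recognize_request(b"fLaC")
print(req.get_method(), req.full_url)
```

Sending it would be a call to urllib.request.urlopen(req); what comes back is whatever the service returns, which this sketch does not try to parse.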


This came from here.

There's also a C library, libsprec, but I'm still struggling with its dependencies on the RPi.

I would like to hear from others on the accuracy of this approach.
Posts: 52
Joined: Sat Jun 16, 2012 2:55 pm
by ax206geek » Sat Jun 30, 2012 2:35 pm
ax206geek wrote:I'd like to suggest another alternative: use the Google Speech API and off-load the recognition to Google. Here's a quick example:

I cleaned it up a little and posted it here.

This is now good enough for executing simple voice commands like hostname and halt:
Code:
eval `gs_sp_to_txt.sh`


I'll order my r-pi to halt with it :D
Posts: 52
Joined: Sat Jun 16, 2012 2:55 pm
by mikerr » Tue Jul 24, 2012 10:39 am
Hmm, I got pocketsphinx compiled OK using the above instructions, but had ZERO success with word recognition.

I haven't got it to recognize a single word correctly, and the words it produces don't sound anything like what was said.
It can tell the difference between silence and speech, but doesn't recognise anything.

Recording a .wav from the mic (Webcam Pro 9000) produces a reasonable sound file, so I wonder if something else is wrong.
Can pocketsphinx run from a .wav file, and are there known-good test .wav files out there?

So I tried using Google's online recognizer with the same setup, and it's near 100% accuracy!
So my mic setup seems OK.

Of course Google only works with an internet connection, but it gets me up and running. Thanks, ax206geek.
Got a Pi Camera? View it in my android app - Raspicam Remote ! No software required on the pi
Posts: 1294
Joined: Thu Jan 12, 2012 12:46 pm
Location: NorthWest, UK
by recantha » Tue Jul 24, 2012 10:56 am
How are you actually getting the speech onto the Pi in the first place? USB microphone or something?
My Raspberry Pi blog with all my latest projects and links to articles
http://raspberrypipod.blogspot.com. +++ Current project: PiPodTricorder - lots of sensors, lots of mini-displays, breadboarding, bit of programming.
Posts: 209
Joined: Mon Jun 25, 2012 10:41 am
by mikerr » Tue Jul 24, 2012 11:17 am
I've tried a cheap (£3!) USB sound card and an old mic, which was just too noisy.

Best results were with the Logitech Pro 9000 webcam, which shows up as a mic in Raspbian.
It's not cheap, but I already had it for Skype.

Some webcams allow the mic to be used as a separate device in Linux, but not all.
Posts: 1294
Joined: Thu Jan 12, 2012 12:46 pm
Location: NorthWest, UK
by observing » Tue Jul 24, 2012 1:22 pm
@mikerr,
When I ran pocketsphinx_continuous using the setup I described, it did recognize various words in its dictionary despite the poor sound quality. If you give it a word not in its dictionary, it will always pick what it thinks is the closest match regardless of how far off it is.

There is an example of a pocketsphinx application that can process an audio file at this link: http://cmusphinx.sourceforge.net/wiki/t ... cketsphinx. I would expect good results from it, because I think the real problem with live USB audio input is USB packet loss (see viewtopic.php?f=28&t=5249), which a pre-recorded file avoids.

I wish I could use the solution that ax206geek proposed, but my application isn't intended to be used with an internet connection.
Posts: 43
Joined: Mon Feb 27, 2012 12:18 pm
by biscuitehh » Sat Sep 01, 2012 12:44 am
I'm going to build pocketsphinx tonight and try and get some love with it, wish me luck! I'll report back any success/failure... love this little computer
Posts: 3
Joined: Fri Jun 08, 2012 3:34 am
by recantha » Sat Sep 01, 2012 7:05 am
Good luck, Biscuit! Looking fwd to your results!
Posts: 209
Joined: Mon Jun 25, 2012 10:41 am
by biscuitehh » Mon Sep 03, 2012 2:47 am
Good news! I've successfully built pocketsphinx (it wasn't that bad) on the Raspberry Pi (using the wheezy distro). I'm trying to figure out how to build a continuous audio input setup to try it out, but I'll post the specifics in a bit (it's Labor Day for me tomorrow, so another 24 hours to geek out before my new job!).
Posts: 3
Joined: Fri Jun 08, 2012 3:34 am
by kavi96 » Mon Sep 03, 2012 2:54 pm
Your instructions seem to need a USB audio card with both input and output. How would I set up the Pi to use the built-in audio output but have a USB microphone as input?
Posts: 13
Joined: Tue Aug 28, 2012 10:28 am
by observing » Tue Sep 04, 2012 9:37 pm
Just a quick update. Since I last posted in this thread, I have seen greatly improved results by setting the sampling rate to 48000 Hz. This might be specific to the chipset of my audio adapter (C-Media), but note that the default sampling rate is 8000 Hz.
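
Since the sampling rate makes this much difference, it can help to confirm what rate a recording actually has before feeding it to a recognizer. A minimal sketch using Python's standard wave module; the function name describe_wav and the file name in the comment are made up for illustration.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, bits_per_sample) from a WAV header.

    describe_wav is a hypothetical helper name.
    """
    with wave.open(path, "rb") as w:
        return w.getframerate(), w.getnchannels(), w.getsampwidth() * 8


# Example call, assuming a recording named test.wav exists:
# rate, channels, bits = describe_wav("test.wav")
```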
Posts: 43
Joined: Mon Feb 27, 2012 12:18 pm
by letdarri » Fri Nov 02, 2012 5:19 am
Hi,
I do not know Python very well, but I managed to control the GPIO outputs to turn on the LEDs.
I would now like to control the GPIO outputs with voice commands, possibly in Italian, using Python.
I could use a simple example.

thanks
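
A minimal sketch of the kind of example asked for here, assuming the recognizer already hands back the recognized text as a string. Everything in it is hypothetical: the phrases, the pin number 17, and the names COMMANDS, handle_phrase and fake_set_pin. The GPIO call is passed in as a function so the sketch runs without any hardware.

```python
# Hypothetical phrase table: recognized text -> (GPIO pin, desired state).
COMMANDS = {
    "accendi led": (17, True),   # "turn on LED"
    "spegni led": (17, False),   # "turn off LED"
}


def handle_phrase(text, set_pin):
    """Look up recognized text and call set_pin(pin, state).

    Returns True if the phrase matched a command, False otherwise.
    """
    action = COMMANDS.get(text.strip().lower())
    if action is None:
        return False
    set_pin(*action)
    return True


def fake_set_pin(pin, state):
    """Stand-in for real GPIO code; just reports what would happen."""
    print("pin %d -> %s" % (pin, "on" if state else "off"))


handle_phrase("Accendi LED", fake_set_pin)
```

On a Pi, fake_set_pin could be swapped for a function that wraps RPi.GPIO.output(pin, state) after the pin has been set up as an output.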
Posts: 1
Joined: Fri Nov 02, 2012 5:09 am
by snowhite » Sun Dec 09, 2012 3:22 am
mikerr wrote:So I tried using Google's online recognizer with the same setup, and it's near 100% accuracy!
So my mic setup seems OK.


What is the link for the Google online recognizer? Do you mean on an Android device?

Thanks
Posts: 27
Joined: Mon Aug 20, 2012 1:35 am
by Maximus5684 » Sat Dec 15, 2012 8:46 am
Just a tip: If you install the Python development headers (python-dev and/or python2.7-dev) before building sphinxbase and pocketsphinx, the Python API module will be installed by default. This way, you can simply do:

import pocketsphinx as ps

wavfile = 'test.wav'           # path to your recording (example name)
speechRec = ps.Decoder()
wavFile = open(wavfile, 'rb')
wavFile.seek(44)               # skip the canonical 44-byte WAV header so only raw audio is decoded
speechRec.decode_raw(wavFile)
result = speechRec.get_hyp()   # first element is the recognized text
wavFile.close()
print(result[0])
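
One caveat on the seek(44) step: it assumes the plain 44-byte WAV header, and a file with extra chunks has a longer one. A minimal sketch of a more forgiving way to get at the raw samples, using Python's standard wave module (the function name read_pcm is made up for illustration):

```python
import wave


def read_pcm(path):
    """Return the raw PCM bytes of a WAV file, wherever its data chunk starts.

    read_pcm is a hypothetical helper name.
    """
    with wave.open(path, "rb") as w:
        return w.readframes(w.getnframes())
```

How those bytes are then handed to the decoder depends on the version of the pocketsphinx bindings, so that part is left out here.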
Posts: 22
Joined: Sat Dec 15, 2012 8:38 am
by mikerr » Mon Dec 17, 2012 2:39 pm
snowhite wrote:
mikerr wrote:So I tried using Google's online recognizer with the same setup, and it's near 100% accuracy!
So my mic setup seems OK.

What is the link for the Google online recognizer? Do you mean on an Android device?

See ax206geek's posts above.
Posts: 1294
Joined: Thu Jan 12, 2012 12:46 pm
Location: NorthWest, UK
by Defiant » Mon Dec 17, 2012 7:41 pm
Try the GStreamer API: http://cmusphinx.sourceforge.net/wiki/gstreamer

I'm successfully using it on the PandaBoard; it should work as well on the RPi.
Posts: 150
Joined: Tue Oct 30, 2012 6:17 pm
Location: Hamburg, Germany
by tommekevda » Sat Jan 05, 2013 12:10 pm
I'm currently working on my own home automation project and want to include some voice recognition to control the lights and TV.
All of this is handled by a Java tool that runs on the Pi. My Arduino is connected to the Pi as well, to do the actual turning on and off of the lights.

But I find the speech recognition on the Pi rather slow and fairly inaccurate (English isn't my main language), so I find the idea of offloading the recognition to Google rather interesting. It also gives a much better result.
But I'm sure Google won't be happy that I'm sending them a voice sample every 5 seconds.
So I was thinking about using pocketsphinx anyway, but only to capture one keyword ("computer"). My tool could then switch from pocketsphinx to Google recognition once pocketsphinx recognizes the keyword. This brings me back to my original problem: I want only one keyword to be recognized, which is "computer". I'm certainly not a language expert, but I feel that the sheer size of the dictionary might be the limiting factor. I've tried to make my own dictionary, but the software required to do so times out on download.
Can I change things in the already available model? Or do you guys have another solution for just capturing keywords?
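
The two-stage idea described here (listen locally for one keyword, then hand the next utterance to the online recognizer) can be sketched as a small state machine. This is only an illustration under assumptions: it presumes the local recognizer returns its hypothesis as text, and the class name KeywordGate and the return values 'wake', 'forward' and 'ignore' are made up.

```python
KEYWORD = "computer"


class KeywordGate:
    """Two-stage listener: idle until the keyword is heard, then pass on one utterance."""

    def __init__(self, keyword=KEYWORD):
        self.keyword = keyword
        self.armed = False  # True once the keyword has just been heard

    def feed(self, local_hypothesis):
        """Take text from the local recognizer; return 'wake', 'forward' or 'ignore'."""
        if self.armed:
            # The utterance right after the keyword goes to the online recognizer.
            self.armed = False
            return "forward"
        if self.keyword in local_hypothesis.lower().split():
            self.armed = True
            return "wake"
        return "ignore"


gate = KeywordGate()
for heard in ["hello there", "computer", "lights on"]:
    print(heard, "->", gate.feed(heard))
```

With this arrangement only the utterance that follows the keyword would ever be uploaded, rather than a sample every few seconds.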
Posts: 12
Joined: Mon Jan 30, 2012 4:46 pm
by cyrano » Sat Jan 05, 2013 1:57 pm
Could voice recognition in JavaScript help you?

Have a look here:
http://www.aonsquared.co.uk/raspi_voice_control
http://www.aonsquared.co.uk/node/30
Posts: 513
Joined: Wed Dec 05, 2012 11:48 pm
Location: Belgium
by tommekevda » Sat Jan 05, 2013 3:05 pm
Voice recognition in JavaScript isn't the solution for me. I've found some Java code to handle speech recognition, but the library isn't from Oracle; they let third-party developers provide the library.
Before I go in that direction, I just wanted to check whether there's already a working binary available that I could talk to, since I don't know where the Java lib is going to take me :).

All this work, just to be able to say "computer" to my pi :)
Posts: 12
Joined: Mon Jan 30, 2012 4:46 pm
by cyrano » Sat Jan 05, 2013 3:14 pm
tommekevda wrote:But I'm sure Google won't be happy that I'm sending them a voice sample every 5 seconds.


Could it be you mean Apple? Check SiriProxy?

I've had the same thought about Apple's Siri. One day, Apple might block other users and just cater for iOS users. But up until now, they haven't done so. They probably don't care...
Posts: 513
Joined: Wed Dec 05, 2012 11:48 pm
Location: Belgium
by tommekevda » Sat Jan 05, 2013 3:21 pm
I haven't checked SiriProxy yet because I didn't know it existed.
But I meant Google.

Code:
arecord -D plughw:1,0 -f cd -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o out.flac
wget -O - -o /dev/null --post-file out.flac --header="Content-Type: audio/x-flac; rate=16000" "http://www.google.com/speech-api/v1/recognize?lang=en" | sed -e 's/[{}]//g'
Posts: 12
Joined: Mon Jan 30, 2012 4:46 pm
by cyrano » Sat Jan 05, 2013 4:24 pm
tommekevda wrote:I haven't checked SiriProxy yet because I didn't know it existed.


Funny. I didn't know about the Google Voice API. Even Google doesn't seem to know it. :lol:

Python and GV?
http://code.google.com/p/pygooglevoice/
Posts: 513
Joined: Wed Dec 05, 2012 11:48 pm
Location: Belgium