drcaptain
Posts: 12
Joined: Tue Nov 11, 2014 6:12 am

Speech to Text

Wed Nov 12, 2014 9:46 pm

Hello,

I am a newbie to the Pi and am trying to get a speech-to-text project off the ground, and I could use some help troubleshooting.

The problem I'm having seems to be an unusual one. At this point, I only want to get as far as transcribing my speech to text. For the next phase of the project I will run a script that counts the words spoken in our meetings. (Should be a pretty cool way of quantifying our meetings and doing some interesting comparisons based on who is in the room for different meetings.) For now, though, the problem is this: when I run the script below, I do not get an error. It only says "Processing..." Then, without my pressing Ctrl+C to stop recording, it immediately jumps to "You said: pi@drcaptain ~$"

Code: Select all

#1/bin/bash
echo "Recording..."
arecord -D "plughw:0,0" -q -f cd -t wav | ffmpeg - loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1

echo "Processing..."
wget -q -U "Mozilla5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/vs/recognize?lang=en-us&client=chromium&key=<MY KEY> | cut -d\" -f12 >stt.txt

echo -n "You said: "
cat stt.txt

rm file.flac > /dev/null 2>$1
I've also tried running different variations of code from the various tutorials available for the Google Speech API. Most of them, especially the Pi tutorials I've been able to find, are for v1. I've searched all over and haven't seen anybody posting about the behaviour I'm experiencing. Does anybody have any idea what's happening here and how I can get my speech to text?

Thanks in advance!!

BMS Doug
Posts: 3824
Joined: Thu Mar 27, 2014 2:42 pm
Location: London, UK

Re: Speech to Text

Wed Nov 12, 2014 11:11 pm

Break your problem down further: first get a single script that records some speech.

Use a recorded file from the first stage to test your 2nd script that transcribes to text.

When you have both working independently you can try integrating them into a single script. (But I'm not certain that this would be necessary for your needs).
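One way to follow this advice is to put each stage in its own shell function so that either can be run alone. The sketch below is only an illustration: the function names, the `SPEECH_API_KEY` variable, the default file name and the 5-second default duration are assumptions, and the endpoint is simply the v2 URL quoted elsewhere in this thread, which may or may not accept requests.

```shell
#!/bin/bash
# Sketch: two independent stages, each usable on its own.
# Assumed (hypothetical) names: record_stage, transcribe_stage, SPEECH_API_KEY.

# Stage 1: record a fixed number of seconds to a WAV file.
record_stage() {
    local out="${1:-test.wav}" seconds="${2:-5}"
    arecord -D plughw:0,0 -q -f cd -t wav -r 16000 -d "$seconds" "$out"
}

# Stage 2: convert an existing WAV file to FLAC and post it.
# The raw reply is printed unfiltered so it can be inspected.
transcribe_stage() {
    local wav="$1" flac_file
    if [ ! -s "$wav" ]; then
        echo "transcribe_stage: '$wav' is missing or empty" >&2
        return 1
    fi
    flac_file="${wav%.wav}.flac"
    flac -f -s -o "$flac_file" "$wav" || return 1
    wget -q -U "Mozilla/5.0" --post-file "$flac_file" \
        --header "Content-Type: audio/x-flac; rate=16000" -O - \
        "http://www.google.com/speech-api/v2/recognize?lang=en-us&client=chromium&key=${SPEECH_API_KEY}"
}
```

Run `record_stage test.wav 5` and play the result back with `aplay test.wav` first; call `transcribe_stage test.wav` only once the recording sounds right.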
Doug.
Building Management Systems Engineer.

drcaptain
Posts: 12
Joined: Tue Nov 11, 2014 6:12 am

Re: Speech to Text

Wed Nov 19, 2014 1:40 am

Hi Doug,
Thanks for the suggestion! The speech is recording fine, and playback is fine too (if perhaps a lil scratchy).
Here's the code:

Code: Select all

arecord -d 15 -r 48000 FILE.wav
Recording... WAVE 'FILE.wav' : Unsigned 8 bit, Rate 48000 Hz, Mono
aplay FILE.wav
Playing WAVE 'FILE.wav' : Unsigned 8 bit, Rate 48000 Hz, Mono
The problem is that I'm not sure how to get the second one running, or how to test it if it doesn't allow me to record. Can you be a bit more specific about how to test the 2nd script in this manner?

One thing: the microphone I'm using is part of a headset. I'm using a USB adaptor to connect the headset to the RPi. Not sure why this would be an issue with the speech recognition program not running, though.
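Because the headset is on a USB adaptor, it may not be ALSA card 0, which is what `plughw:0,0` refers to. `arecord -l` lists the capture devices, and the sketch below turns that kind of listing into `plughw:CARD,DEVICE` names. The function name and the sample listing are illustrative assumptions, not output from this Pi.

```shell
#!/bin/bash
# Sketch: convert `arecord -l` style lines into plughw:CARD,DEVICE names.
# capture_devices is a hypothetical helper; it reads the listing on stdin.
capture_devices() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/plughw:\1,\2/p'
}

# Illustrative listing (made up for this example, not captured from real hardware).
sample_listing='**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0'

printf '%s\n' "$sample_listing" | capture_devices   # prints plughw:1,0
```

On the Pi itself the equivalent would be `arecord -l | capture_devices`, and the name it prints is what would go after `-D` in the recording command.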

BMS Doug
Posts: 3824
Joined: Thu Mar 27, 2014 2:42 pm
Location: London, UK

Re: Speech to Text

Wed Nov 19, 2014 4:12 pm

Hi again,

I had a look on Google to see if anyone else had done speech to text, and it looks like you are in luck: Dave Conroy has a working version. See his blog page for more details.
Let’s create a shell script to handle this process for us.

Code: Select all

sudo nano stt.sh

Code: Select all

echo "Recording your Speech (Ctrl+C to Transcribe)"
arecord -D plughw:0,0 -q -f cd -t wav -d 0 -r 16000 | flac - -f --best --sample-rate 16000 -s -o daveconroy.flac;
 
echo "Converting Speech to Text..."
wget -q -U "Mozilla/5.0" --post-file daveconroy.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12  > stt.txt
 
echo "You Said:"
value=`cat stt.txt`
echo "$value"

Last edited by BMS Doug on Wed Nov 19, 2014 4:32 pm, edited 1 time in total.

BMS Doug
Posts: 3824
Joined: Thu Mar 27, 2014 2:42 pm
Location: London, UK

Re: Speech to Text

Wed Nov 19, 2014 4:28 pm

So I can see that you have used very similar (but not quite identical) code to Dave Conroy's. The areas requiring close attention are the points of divergence; in the recording section alone I can see several differences. His line comes first, yours second:
arecord -D plughw:0,0 -q -f cd -t wav -d 0 -r 16000 | flac - -f --best --sample-rate 16000 -s -o daveconroy.flac;
arecord -D "plughw:0,0" -q -f cd -t wav | ffmpeg - loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
If you disable the part of your file that is deleting the temporary recording file you will be able to see if it made a successful recording and play it back. (I'd recommend disabling the autodelete of the audio file until you are happy that you have it working).
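A minimal sketch of that check, assuming a hypothetical helper called `check_recording` and an arbitrary 1024-byte threshold for "too small to contain speech":

```shell
#!/bin/bash
# Sketch: confirm a recording exists and is not trivially small before uploading.
# check_recording and the 1024-byte default threshold are assumptions.
check_recording() {
    local file="$1" min_bytes="${2:-1024}" size
    if [ ! -f "$file" ]; then
        echo "check_recording: $file does not exist" >&2
        return 1
    fi
    # wc -c reports the size in bytes; the arithmetic expansion strips padding.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -lt "$min_bytes" ]; then
        echo "check_recording: $file is only $size bytes" >&2
        return 1
    fi
    echo "$file looks usable ($size bytes)"
}
```

Placing `check_recording file.flac || exit 1` between the recording line and the wget line would stop the script before it uploads an empty file. While debugging, it may also help to drop the `> /dev/null 2>&1` redirection from the recording line so that any error messages from the encoder are visible.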

drcaptain
Posts: 12
Joined: Tue Nov 11, 2014 6:12 am

Re: Speech to Text

Sun Jun 07, 2015 2:20 am

Thanks for this Doug! (And sorry for the l o n g delay in responding. Life diverted me from this project for a bit.)

gtucker19 was also helping me work through STT issues over in this thread: viewtopic.php?p=642752#p642752. Coming back to this project, after having successfully gotten my Pi up and running this program, I'm stuck on something new. I run gtucker19's code:

Code: Select all

echo "Recording your Speech (Ctrl+C to Transcribe)"

arecord -D plughw:0,0 -q -f cd -t wav -r 16000 | flac - -f --best --sample-rate 16000 -s -o file.flac
echo "Converting Speech to Text..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v2/recognize?lang=en-us&client=chromium&key=API key" |cut -d\" -f8 >stt.txt
echo "extract recognized text"
cat stt.txt
echo "You Said:"
value=`cat stt.txt`
echo "$value"
And this is what I get:

Code: Select all

pi@raspberrypi ~ $ ./stt2.sh
Recording your Speech (Ctrl+C to Transcribe)
Converting Speech to Text...
extract recognized text
You Said:

pi@raspberrypi ~ $ 
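The blank result above means stt.txt contained no readable text. `cut -d\" -f8` takes the eighth quote-delimited field of each line and prints an empty line for any line that has fewer fields, so a reply such as `{"result":[]}` produces a blank line, and a failed request produces nothing at all. A less position-dependent extraction can be sketched as follows; the function name and the sample reply are illustrative assumptions, not captured output from the API.

```shell
#!/bin/bash
# Sketch: pull the first "transcript" value out of a JSON reply on stdin.
# extract_transcript is a hypothetical helper. It does not handle escaped
# quotes inside the transcript; a real JSON parser would be more robust.
extract_transcript() {
    grep -o '"transcript":"[^"]*"' | head -n 1 | cut -d'"' -f4
}

# Illustrative two-line reply (made up for this example).
sample_reply='{"result":[]}
{"result":[{"alternative":[{"transcript":"hello world","confidence":0.9}],"final":true}],"result_index":0}'

text=$(printf '%s\n' "$sample_reply" | extract_transcript)
if [ -z "$text" ]; then
    echo "No transcript in the reply; inspect the raw response"
else
    echo "You said: $text"
fi
```

When the extraction comes back empty, saving the raw wget output to a file and reading it directly shows whether the service returned an error, an empty result, or nothing at all.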
I'll keep you posted as I get back into this project...

Thanks again for the help last time :-)

Karthik N N
Posts: 3
Joined: Thu Mar 29, 2018 11:13 am

Re: Speech to Text

Thu Mar 29, 2018 5:04 pm

Traceback (most recent call last):
  File "/home/pi/Nandas/stt.py", line 8, in <module>
    audio=r.listen(source)
  File "/usr/local/lib/python3.4/dist-packages/speech_recognition/__init__.py", line 652, in listen
    buffer = source.stream.read(source.CHUNK)
  File "/usr/local/lib/python3.4/dist-packages/speech_recognition/__init__.py", line 161, in read
    return self.pyaudio_stream.read(size, exception_on_overflow=False)
  File "/usr/local/lib/python3.4/dist-packages/pyaudio.py", line 608, in read
    return pa.read_stream(self._stream, num_frames, exception_on_overflow)
KeyboardInterrupt

I am getting this error when executing:
import speech_recognition as sr
#import urllib2

r=sr.Recognizer()
with sr.Microphone(device_index = 4, sample_rate = 48000) as source:
    while True:
        print("say")
        audio=r.listen(source)
        print("Could notdjgh")

        BING_KEY = "36fd56ca622648b2bdc19dfbfb922ef7" # Microsoft Bing Voice Recognition API keys 32-character lowercase hexadecimal strings
        try:
            print("I think you said " + r.recognize_bing(audio, key=BING_KEY))
        except sr.UnknownValueError:
            print("I could not understand audio")
        except sr.RequestError as e:
            print("Could not request results from Microsoft Bing Voice Recognition service; {0}".format(e))

Please Help
Thanks in advance
