Installed speech recognition library: https://pypi.org/project/SpeechRecognition/
Hardware setup for audio and microphone:
Audio Amp: MAX98357 (I2S)
Microphone: SPH0645LM4H (I2S)
Followed the guide at https://www.hackster.io/maremoto/snips- ... afe-c3c178 to set up these two modules
(thanks maremoto for the awesome guide)
I tried recording and playing back audio with arecord and aplay; both work. Even with the volume at maximum, the output from the speaker is still quite soft, but acceptable.
I also tested a voice call with the Linphone console application; the microphone and speaker work well, and the audio through the I2S devices is sharp and clear.
Now I am working on speech recognition, which has to listen to live audio from the I2S microphone. The program only detects my voice when my mouth is directly beside the microphone; speech isn't captured when I talk from farther away. I've raised the microphone gain to about 80%, but still no luck.
I tried to get the list of device_index values and got:
Code:
Microphone with name "sndrpisimplecar: - (hw:0,0)" found for `Microphone(device_index=0)`
Microphone with name "sndrpisimplecar: - (hw:0,1)" found for `Microphone(device_index=1)`
Microphone with name "sysdefault" found for `Microphone(device_index=2)`
Microphone with name "sharedmic_sv" found for `Microphone(device_index=3)`
Microphone with name "sharedmic" found for `Microphone(device_index=4)`
Microphone with name "dsnooped" found for `Microphone(device_index=5)`
Microphone with name "speaker_sv" found for `Microphone(device_index=6)`
Microphone with name "speaker" found for `Microphone(device_index=7)`
Microphone with name "dmixed" found for `Microphone(device_index=8)`
Microphone with name "dmix" found for `Microphone(device_index=9)`
Microphone with name "default" found for `Microphone(device_index=10)`
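For reference, instead of hard-coding an index I use a small helper to pick the capture device by name, since ALSA device ordering can change between boots. `pick_device` is my own helper, and the list below is copied from the enumeration output above (with a live install it would come from `speech_recognition.Microphone.list_microphone_names()`):

```python
# Device names as printed by the enumeration above.
DEVICE_NAMES = [
    "sndrpisimplecar: - (hw:0,0)",
    "sndrpisimplecar: - (hw:0,1)",
    "sysdefault",
    "sharedmic_sv",
    "sharedmic",
    "dsnooped",
    "speaker_sv",
    "speaker",
    "dmixed",
    "dmix",
    "default",
]

def pick_device(names, preferred):
    """Return the index of the first device name containing `preferred`, else None."""
    for index, name in enumerate(names):
        if preferred in name:
            return index
    return None

# Select the softvol capture device from /etc/asound.conf by name.
index = pick_device(DEVICE_NAMES, "sharedmic_sv")
print(index)  # -> 3
```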
Another method I tried is to pre-record a .wav file with arecord and then feed the recorded file into the speech recognition library. This works well and is very accurate, without my having to put my mouth directly beside the microphone.
My /etc/asound.conf (taken from maremoto's guide):
Code:
pcm.!default {
    type asym
    capture.pcm "sharedmic_sv"
    playback.pcm "speaker_sv"
}

pcm.sharedmic_sv {
    type softvol
    slave.pcm "sharedmic"
    control {
        name "Boost Capture Volume"
        card sndrpisimplecar
    }
    min_dB -2.0
    max_dB 20.0
}

pcm.sharedmic {
    type plug
    slave.pcm "dsnooped"
}

pcm.dsnooped {
    type dsnoop
    ipc_key 777777
    ipc_perm 0677
    slave {
        pcm "hw:0,1"
        channels 2
        format S32_LE
    }
}

pcm.speaker_sv {
    type softvol
    slave.pcm "speaker"
    control.name "Speaker Volume"
    control.card 0
}

pcm.speaker {
    type plug
    slave.pcm "dmixed"
}

pcm.dmixed {
    type dmix
    ipc_key 888888
    ipc_perm 0677
    #ipc_key_add_uid false
    slave {
        pcm "hw:0,0"
        period_time 0
        period_size 1024
        buffer_size 8192
        rate 44100
        channels 2
    }
    bindings {
        0 0
        1 1
    }
}
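One thing I may also try on the ALSA side: if capture stays too quiet at 100%, the softvol ceiling could be raised, e.g. max_dB from 20 to 30 (this assumes the codec tolerates the extra digital gain without clipping). Note that ALSA stores softvol controls in its saved state, so the old control may need to be removed (or the state file reset) before a new max_dB takes effect:

```
# Variant of the capture softvol with a higher gain ceiling (assumption:
# 30 dB is tolerable without clipping; tune to taste).
pcm.sharedmic_sv {
    type softvol
    slave.pcm "sharedmic"
    control {
        name "Boost Capture Volume"
        card sndrpisimplecar
    }
    min_dB -2.0
    max_dB 30.0
}
```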
Any ideas where I should look, in the ALSA config or maybe PyAudio, so that the voice recognition program can capture my voice normally at the line (audio = r.listen(source))?