This repository has been archived by the owner on Aug 6, 2019. It is now read-only.
Another personal home assistant working with speech recognition

Smanar/Rhonda

Rhonda (outdated)

Rhonda is yet another personal home automation assistant driven by speech recognition (a Jarvis-like). Developed for the Raspberry Pi, it was designed to be as light and fast as possible, with as little access to the SD card as possible. For example, Rhonda can record sound, convert it to FLAC and send it to Google without writing any file. Most actions are handled entirely in memory or use temporary file storage in RAM. It uses as few resources (power and processor) as possible; I am running it on a Raspberry Pi B.

All configuration is done in a single XML file: https://github.com/Smanar/Rhonda/blob/master/rhonda/config.xml .
Some actions are hard-coded to be faster and to put less stress on the SD card, but you can use special shell scripts for your own actions.
Here are some examples of what it can do internally; you can find a list here .

  • Speech recognition.
  • Voice synthesis.
  • Control some 433 MHz devices (like Chacon plugs).
  • Use an 8*8 matrix to display icons, or a spectrogram of the sound during recording.
  • Check the weather, mail, word definitions, cinema listings, GitHub notifications, ...
  • Play streaming radio.
  • Store reminders.
  • Send requests to another server, so it can be used in a home automation system.

The engine that processes recognized words can handle synonyms, forbidden words and mandatory words, so you do not have to say exactly the same sentence to trigger an event. Take a look at the XML file to see how it works.
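
As an illustration of this kind of matching, here is a minimal C++ sketch. It assumes a rule is a list of mandatory word groups (each group holding synonyms) plus a list of forbidden words. The names `Rule`, `tokenize`, `contains` and `matches` are hypothetical and are not taken from Rhonda's source; the real rule format is the one defined in the XML file.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical rule: every group in `required` must contribute at least one
// word (the words of a group are synonyms), and no word listed in
// `forbidden` may appear in the sentence.
struct Rule {
    std::vector<std::vector<std::string>> required;
    std::vector<std::string> forbidden;
};

// Split a recognized sentence on whitespace.
std::vector<std::string> tokenize(const std::string& sentence) {
    std::istringstream in(sentence);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) words.push_back(w);
    return words;
}

// True if `w` is one of `words`.
bool contains(const std::vector<std::string>& words, const std::string& w) {
    return std::find(words.begin(), words.end(), w) != words.end();
}

// True if the sentence satisfies the rule, whatever the word order.
bool matches(const Rule& rule, const std::string& sentence) {
    const std::vector<std::string> words = tokenize(sentence);
    for (const auto& f : rule.forbidden)
        if (contains(words, f)) return false;
    for (const auto& group : rule.required) {
        const bool found = std::any_of(
            group.begin(), group.end(),
            [&](const std::string& s) { return contains(words, s); });
        if (!found) return false;
    }
    return true;
}
```

With the groups {turn, switch} and {light, lamp} as mandatory and "off" as forbidden, both "turn on the light" and "switch the lamp on" match, while "turn off the light" and "turn on the radio" do not.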

First of all, I know the code is ugly: this project was written in shell, then in C and finally in C++, with a lot of modifications along the way (Unicode, wide char, string). It needs to be entirely rewritten.
The application is not multilingual yet; many parts are hard-coded in French. To change the engine language, or to use or translate the special sentences, edit the XML file.

You can see some pictures here: https://github.com/Smanar/Rhonda/wiki


It uses:


You need:

  • A Raspberry Pi (I am using a B version) with a working microphone and audio output. I ran my tests with a dongle that handles input and output at the same time, like this one: https://www.adafruit.com/product/1475 . I have not tested it with JACK (I think PortAudio needs a special build with JACK enabled).
  • An 8*8 matrix, 3 colors (around 15 to 20 euros).
  • A 433 MHz transmitter (less than 5 euros).

The hardware part:


The software part:

  • Pico TTS

sudo apt-get install libttspico-utils

  • Atlas matrix computing library

sudo apt-get install libatlas-base-dev


At this point you have two options: take the pre-compiled version (generally out of date), or compile it yourself.

sudo apt-get install flac
sudo apt-get install libflac-dev
sudo apt-get install curl uuid-dev
sudo apt-get install libcurl4-openssl-dev
sudo apt-get install libjack-jackd2-dev libsndfile1-dev libasound2-dev

You also need to build the PortAudio library, but if you do not need a special configuration, all the required files are already inside the archive.

There is a Code::Blocks project in the source, so you can install Code::Blocks, open the *cdb file and build the project. Be patient, it takes a long time (the pugixml file alone needs more than 4 to 5 minutes). Later builds only recompile the modified files.

sudo apt-get install codeblocks

It will be really slow, but it can compile the code.
Alternatively, you can use the makefile (TODO):

make

Note that the executable ends up in bin/Release, and you need to copy the "resources" folder and the XML file into the same folder as the executable.


The configuration

Open the XML file and change the key in the <api> tag to your own key.
To get an API key for Google STT: http://www.chromium.org/developers/how-tos/api-keys
To get an API key for Bing STT: https://www.microsoft.com/cognitive-services/en-us/subscriptions
Select the speech-to-text engine you want, then just run the application.

sudo chmod -R 755 shell
sudo chmod 755 rhonda
./rhonda


How to use it (with a single hotword)

  • After startup you will see a short animation with an alien.
  • Say "snowboy".
  • You will hear a "ding" and the matrix will display a microphone (wait for the microphone before speaking).
  • Say "Rappelles moi dans une heure" ("Remind me in one hour"). The matrix displays a spectrogram of the sound, which you can use to troubleshoot problems.
  • The matrix displays an hourglass during processing.
  • The matrix displays a smiley if everything went well, or a "?" if there was a problem.
  • Return to the first step.
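
The cycle above can be read as a small state machine. The following C++ sketch is only an illustration of that reading; the `State` and `Event` names and the `nextState` function are hypothetical and are not taken from Rhonda's source.

```cpp
// Hypothetical states, one per thing shown on the matrix.
enum class State {
    WaitingHotword,  // idle, listening for the hotword
    Listening,       // microphone icon, then spectrogram while recording
    Processing,      // hourglass
    ShowingResult    // smiley or "?"
};

// Hypothetical events that move the cycle forward.
enum class Event { HotwordHeard, RecordingDone, ProcessingDone, ResultShown };

// Returns the next state; an event that does not apply leaves the state as is.
State nextState(State s, Event e) {
    switch (s) {
        case State::WaitingHotword:
            if (e == Event::HotwordHeard) return State::Listening;
            break;
        case State::Listening:
            if (e == Event::RecordingDone) return State::Processing;
            break;
        case State::Processing:
            if (e == Event::ProcessingDone) return State::ShowingResult;
            break;
        case State::ShowingResult:
            if (e == Event::ResultShown) return State::WaitingHotword;
            break;
    }
    return s;
}
```

Speaking before the microphone icon appears corresponds to an event arriving in the wrong state, which this sketch simply ignores.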

Special notes

For better performance I use a sample rate of 8820 Hz with the Bing engine to get a smaller file size, but it also works with other sample rates. If you have problems, you can change it in the files prog.cpp and STTEngine.cpp.
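
To see why the sample rate matters for upload size, here is a small C++ sketch of the arithmetic for uncompressed PCM audio. The function name `pcmBytes` is hypothetical, and 16-bit mono is an assumption made for the example, not a statement about Rhonda's recording format. The audio actually sent is compressed, so real sizes will be smaller, but they scale with the raw size.

```cpp
#include <cstddef>

// Size in bytes of uncompressed PCM audio:
// sample rate * bytes per sample * channels * duration.
constexpr std::size_t pcmBytes(std::size_t sampleRateHz,
                               std::size_t bitsPerSample,
                               std::size_t channels,
                               std::size_t seconds) {
    return sampleRateHz * (bitsPerSample / 8) * channels * seconds;
}
```

Under these assumptions, a 5-second recording takes 88200 bytes at 8820 Hz, against 160000 bytes at 16000 Hz.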

Note for French users: the hotword "snowboy" is quite hard to pronounce, at least for me with my terrible accent. This is one of the big problems with English-language speech recognition. So instead, go to https://snowboy.kitt.ai/ and create two hotwords of your own, such as "Rhonda" and "tu m'ecoutes ?" ("are you listening to me?"). In the XML file, put:

<sound_engine>
    <model>resources/rhonda.pmdl,resources/tumecoutes.pmdl</model>
    <sensibility>0.5,0.5</sensibility>

This is to avoid false positives: once the first hotword is triggered, you have 5 to 6 seconds to trigger the second one, otherwise detection starts over from zero.
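
The two-hotword sequence with its time window can be sketched in C++ as follows. This is a minimal illustration, not Rhonda's implementation: the class `HotwordGate` and its method `onHotword` are hypothetical, and the window length is a constructor parameter because the text only gives it as roughly 5 to 6 seconds.

```cpp
#include <chrono>

// Hypothetical two-stage hotword gate: the second hotword must follow the
// first one within `window`, otherwise the sequence starts over.
class HotwordGate {
public:
    using Clock = std::chrono::steady_clock;

    explicit HotwordGate(std::chrono::seconds window) : window_(window) {}

    // `index` is 1 for the first hotword and 2 for the second.
    // Returns true only when the full sequence was heard in time.
    bool onHotword(int index, Clock::time_point now) {
        if (index == 1) {
            armed_ = true;       // (re)start the sequence
            firstHeard_ = now;
            return false;
        }
        const bool ok =
            armed_ && index == 2 && (now - firstHeard_) <= window_;
        armed_ = false;          // success or failure, start from zero again
        return ok;
    }

private:
    std::chrono::seconds window_;
    bool armed_ = false;
    Clock::time_point firstHeard_{};
};
```

Passing the time in as an argument, rather than reading the clock inside the class, keeps the sketch easy to test with fixed timestamps.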

More information for French speakers here: https://github.com/Smanar/Rhonda/wiki
