dreamdom / jsspeechrecognizer

JavaScript Speech Recognizer

License: Apache License 2.0

JavaScript 72.17% HTML 18.36% CSS 9.48%

jsspeechrecognizer's Introduction

JsSpeechRecognizer

JavaScript Speech Recognizer

Demos

Speech Recognition Demo

Keyword Spotting Demo

Video Interaction Live Demo

Video

Here is a short video of the keyword spotting demo.

And here is a short video of the video interaction demo.

What is It?

JsSpeechRecognizer is a JavaScript-based speech recognizer. You train it on words or phrases, then record new audio that it matches against those trained words or phrases.

At the moment, JsSpeechRecognizer does not include any data model, so you will have to train new words before using it.
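To make the train-then-match idea concrete, here is a minimal sketch of a nearest-template matcher. It is an illustration only: the TemplateMatcher class, its method names, and the mean-absolute-difference distance are assumptions made for this example, not JsSpeechRecognizer's actual API or algorithm.

```javascript
// Hypothetical sketch: NOT JsSpeechRecognizer's real API or algorithm.
// Each training example is a plain array of numbers (a feature vector).
class TemplateMatcher {
  constructor() {
    this.templates = []; // entries look like { label, features }
  }

  // Store one training example under a word or phrase label.
  train(label, features) {
    this.templates.push({ label, features: Array.from(features) });
  }

  // Mean absolute difference over the overlapping part of two vectors.
  static distance(a, b) {
    const n = Math.min(a.length, b.length);
    if (n === 0) return Infinity;
    let total = 0;
    for (let i = 0; i < n; i++) total += Math.abs(a[i] - b[i]);
    return total / n;
  }

  // Return { label, distance } for the closest template,
  // or null when nothing has been trained yet.
  recognize(features) {
    let best = null;
    for (const t of this.templates) {
      const d = TemplateMatcher.distance(features, t.features);
      if (best === null || d < best.distance) {
        best = { label: t.label, distance: d };
      }
    }
    return best;
  }
}
```

Because recognize returns null before any call to train, the sketch also shows why a recognizer without a bundled data model cannot match anything until it has been trained.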

How Does it Work?

WebRTC

JsSpeechRecognizer uses the browser's WebRTC functionality to access the microphone and obtain Fast Fourier Transform (FFT) data. It therefore only works in browsers with WebRTC support.
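As a rough sketch of what this involves in a browser: microphone access can be requested with navigator.mediaDevices.getUserMedia, and frequency data can be read from a Web Audio AnalyserNode. The function names below (openMicrophoneAnalyser, readFrame, averageMagnitude) are made up for this example, and the code is an assumption about one way to obtain such data, not a copy of what JsSpeechRecognizer.js does.

```javascript
// Browser-only: ask for the microphone and wire it to an AnalyserNode.
// Function names here are illustrative, not part of JsSpeechRecognizer.
async function openMicrophoneAnalyser(fftSize = 1024) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const AudioCtx = window.AudioContext || window.webkitAudioContext;
  const audioCtx = new AudioCtx();
  const source = audioCtx.createMediaStreamSource(stream);
  const analyser = audioCtx.createAnalyser();
  analyser.fftSize = fftSize; // must be a power of two
  source.connect(analyser);
  return analyser;
}

// Read one frame of FFT magnitudes (one byte, 0-255, per frequency bin).
function readFrame(analyser) {
  const frame = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(frame);
  return frame;
}

// Average magnitude of a frame; a crude measure of how loud it is.
function averageMagnitude(frame) {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (const value of frame) sum += value;
  return sum / frame.length;
}
```

In a page, openMicrophoneAnalyser() would be called from a click handler, since browsers prompt for microphone permission and may require a user gesture before audio starts.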

The WebRTC adapter script is required to use JsSpeechRecognizer. It is hosted on GitHub: https://github.com/webrtc/adapter

JsSpeechRecognizer.js

This file contains all of the specific speech recognizer logic.

Detailed Write Up

For a more detailed write-up on how JsSpeechRecognizer was built, click here.

Live Demo

Play with the Live Demo here. It has only been tested in Firefox and Chrome.

Tips for the Live Demo

Try training the word "yes", and then training the word "no".
It is recommended that you train and test in a quiet room.
You can (and should) train a word multiple times. This is especially important if you are trying to recognize words that sound very similar such as "no" and "go".
Use the "play" button to hear the audio data that was recorded. You should verify that a recording in the training set is of good quality and is of the correct word.
If a recording is incorrect, of bad quality, or contains too much noise, get rid of it with the "delete" button.

Fun Stuff

Try training phrases like "find sushi" or "show me coffee in San Francisco"
Train and detect laughing or screaming.
Use emoticons like 🐔, instead of words.
Train the recognizer with one person, and test with another person.

More Demos

Find information about more demos here.

I would love to hear more ideas!

Running the Demos on Your Own Machine

The demo speechrec.html lets you train new words and then recognize them.

Running in Firefox

Simply open the file speechrec.html. You should get a popup from the browser asking you if you would like to grant the site permission to use the microphone.

Running in Chrome

If speechrec.html is opened as a local file (with a file:/// prefix), the demo will not work by default because of Chrome's security settings. You can either disable that security check (temporarily) or set up a local server to serve the file.

I recommend using Python's built-in HTTP server. Open a terminal, cd to the folder you want to host, and run the command that matches your Python version:

Python 2

python -m SimpleHTTPServer 8000

Python 3

python -m http.server 8000

Open "localhost:8000" in your browser to see the list of files in the folder being shared. For more details, see the Python documentation (this link covers the Python 2 module): https://docs.python.org/2/library/simplehttpserver.html

Other alternatives include browser-sync or webpack-dev-server.
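If you would rather stay in JavaScript, a small static file server can also be written with Node's built-in http, fs, and path modules. The following is a minimal sketch, not part of this repository: the helper names (contentTypeFor, resolveRequestPath, createStaticServer) are invented for this example, and it only knows a few content types.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Only the file types the demos need; everything else is sent as binary.
const MIME_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
};

function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

// Map a URL path to a file inside rootDir.
// Returns null if the URL is malformed or would escape rootDir.
function resolveRequestPath(rootDir, urlPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(urlPath.split("?")[0]);
  } catch (err) {
    return null; // malformed percent-encoding
  }
  const root = path.resolve(rootDir);
  const target = path.resolve(root, "." + decoded);
  const rel = path.relative(root, target);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return null;
  }
  return target;
}

function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveRequestPath(rootDir, req.url);
    if (filePath === null) {
      res.writeHead(400);
      res.end("Bad request");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        // Missing files and directories both end up here.
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}

// To serve the current folder on port 8000, uncomment:
// createStaticServer(process.cwd()).listen(8000);
```

Unlike the Python servers, this sketch does not produce a directory listing, so you would open "localhost:8000/speechrec.html" directly.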

For more details about running WebRTC locally in Chrome, see the following Stack Overflow question: http://stackoverflow.com/questions/14318319/webrtc-browser-doesnt-ask-for-mic-access-permission-for-local-html-file

Other Browsers

I have not tested other browsers.

jsspeechrecognizer's Issues

SimpleHTTPServer (Python 2) and http.server (Python 3)

README.md minor update to include running SimpleHTTPServer on Python 3

Python 2

python -m SimpleHTTPServer 8000

Python 3

python -m http.server 8000

[Idea] Continuous voice recognition

Hi!

First, your project is awesome :)

Do you think it would be possible to listen continuously without having to click the button? That would enable "keyword spotting", where you wait for a special keyword. Imagine you want to build an Alexa-like assistant: you wait for "Alexa", "Jarvis", "Siri", or whatever you want, and once the keyword is detected locally you start the browser's real voice recognition to detect exactly what the command was.

In fact I'm working on a project called Gladys, an assistant that can control your home, and I'm currently looking for a keyword spotting tool to recognize the word "Gladys" (http://gladysproject.com).

Thank you again for this awesome work :)
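One way to picture the continuous listening described above is a sliding window over the incoming audio frames that is compared against a stored keyword template on every new frame. The sketch below is purely illustrative: the SlidingKeywordSpotter class, the one-number-per-frame representation, and the threshold and cooldown logic are assumptions made for this example, not how the Keyword Spotting Demo is implemented.

```javascript
// Hypothetical sketch of continuous keyword spotting.
// "template" holds one number per frame (for example, a loudness value).
class SlidingKeywordSpotter {
  constructor(template, threshold, onDetect) {
    if (template.length === 0) throw new Error("template must not be empty");
    this.template = Array.from(template);
    this.threshold = threshold; // largest mean absolute difference counted as a match
    this.onDetect = onDetect;   // called with the distance when the keyword is spotted
    this.window = [];           // the most recent frames, oldest first
    this.cooldown = 0;          // frames to skip after a detection
  }

  // Feed one new frame value; returns true if the keyword was detected.
  pushFrame(value) {
    this.window.push(value);
    if (this.window.length > this.template.length) this.window.shift();

    if (this.cooldown > 0) {
      this.cooldown--;
      return false;
    }
    if (this.window.length < this.template.length) return false;

    let total = 0;
    for (let i = 0; i < this.template.length; i++) {
      total += Math.abs(this.window[i] - this.template[i]);
    }
    const distance = total / this.template.length;
    if (distance > this.threshold) return false;

    // Skip the overlapping windows so one utterance fires only once.
    this.cooldown = this.template.length;
    this.onDetect(distance);
    return true;
  }
}
```

In the scenario from this issue, the onDetect callback would be the place to hand over to the browser's full speech recognition once the wake word has been spotted locally.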

Better API Doc

This is a very interesting project, but to be honest, it has terrible API documentation. Maybe add documentation for the API of JsSpeechRecognizer.js? That would be extremely helpful, even more than a demo.

Keyspotting: using localStorage to save the training set?

Hi,

I did some tests with the keyspotting feature and it is really nice!
I would really like to implement this in the web client of Domogik, an open source home automation solution.
Domogik has its own virtual assistant, like the famous Jarvis, that can do various things: control the home, chat a little, learn some things, and so on. Example with the Android client: https://www.youtube.com/watch?v=IXXahef0bNY

In the web application, I implemented TTS and STT features, but the keyspotting feature still needs to be added, and your project seems to be the solution for me.

By the way, for now we need to train again each time we reload the page. Do you have any plans to use localStorage in your library to store the training set?

Once again, good job!

++
Fritz
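A sketch of what persisting a training set could look like, using the standard Web Storage API (setItem and getItem). The storage key and the shape of the trainingSet object (a map from word to arrays of numbers) are assumptions made for this example; JsSpeechRecognizer's internal model may be structured differently.

```javascript
// Hypothetical key name and data shape; not taken from JsSpeechRecognizer.
const STORAGE_KEY = "speechTrainingSet";

// trainingSet is assumed to look like { word: [[numbers...], ...], ... }.
// Typed arrays do not survive JSON as arrays, so convert them with
// Array.from() before calling this.
function saveTrainingSet(storage, trainingSet) {
  storage.setItem(STORAGE_KEY, JSON.stringify(trainingSet));
}

// Returns the stored training set, or an empty object when nothing
// usable has been stored.
function loadTrainingSet(storage) {
  const raw = storage.getItem(STORAGE_KEY);
  if (raw === null) return {};
  try {
    const parsed = JSON.parse(raw);
    const isPlainObject =
      parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
    return isPlainObject ? parsed : {};
  } catch (err) {
    return {}; // stored value was not valid JSON
  }
}

// In a browser:
//   saveTrainingSet(window.localStorage, trainingSet);
//   const restored = loadTrainingSet(window.localStorage);
```

Passing the storage object in as a parameter keeps the functions usable with sessionStorage or a test double as well. Note that browsers cap localStorage at a few megabytes per origin, so raw audio buffers may not fit.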

Awesome project

Opened an issue just to say that this project is wonderful. I haven't had the time to test it, but if it works it's such a great thing. Congratulations!

How do I interpret the output values?

Hi.

Awesome project! One question though, how do I know it knows what I'm saying?
What does the double value represent?

Detecting noise (blowing)

Awesome tool! I now have a case where I want to recognise blowing.
We are making a Christmas donation application where you power a windmill by blowing.

I'm tweaking some of the parameters but I can't quite get it to work.
Does anyone have ideas on how to tweak it so I can record blowing?

I can save the current value of a trained word, but how can I set the initial value of speechRec.recordingBufferArray?

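----------------- save to host --------------------

// Stop recording, then POST the recorded buffer to the server.
var recordingId = speechRec.stopRecording();
$.ajax({
    method: "POST",
    url: "/sound.ins",
    dataType: "script",
    data: {
        PSES: U9_SESSION_ID,
        PLNG: "en",
        // finalPlaybackId is not defined in this snippet; it comes from
        // elsewhere in the poster's code.
        PSCR_C: speechRec.recordingBufferArray[finalPlaybackId]
    }
})
    .fail(function () {
        // Errors are ignored here.
    })
    .done(function (msg) {
        // Nothing to do on success.
    });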

---------------- get from host --------------

I can get the array values back from the host, but how do I set the library up to listen for my keyword with them?

// Fetch the stored buffers and append each one to the recognizer's
// recording buffer array.
var r = "/sound.getjson?PSES=12&PLNG=en&PTS=" + p_pts;
$.getJSON(r, function (jsdata) {
    $.each(jsdata.ROWSET, function (i, l) {
        speechRec.recordingBufferArray.push(l.SCR_C);
        console.log(l.SCR_C);
    });
});

Keyword Spotting does not work with Cyrillic

Hey, seems like a great project! I was banging my head against pocketsphinx.js as keyword spotting (KWS) software, but, well, there were a lot of problems and I won't even go into all of them.

And this one seems to just work. I'll try using it with my voice assistant, which has a web UI and a Python backend.

But there is a minor problem. If I enter non-Latin characters into your Keyword demo, it does not work. However, if I input any word in Latin characters and train it with Russian or any other language, it obviously works. So I think there is some minor problem with the code not being able to process non-ASCII strings here. However, it works in the Speech Recognition demo! So that's a bit weird.