Speech To Code

Code using your voice

Web Demo

You can try a live demo of Speech2Code here: https://pedrooaugusto.github.io/speech-to-code/webapp

You can also check this video on how to solve the FizzBuzz problem using Speech2Code: https://www.youtube.com/watch?v=I71ETEeqa5E

(For this demo, the app was ported to the web so it can run directly in the browser.)

Overview

Speech2Code is an application that enables you to code using just voice commands. Instead of using the keyboard to write code in the code editor like a caveman, you can just express in natural language what you wish to do, and it will be automatically written, as code, in the code editor.

With Speech2Code, instead of using the mouse and keyboard to navigate to line 42 of a file, you can just say: "line 42", "go to line 42" or even "please go to line 42". It's also possible to say things like:

  • new variable answer equals the string john was the eggman string

    •   let answer = "john was the eggman"
  • call function max with arguments variable answer and expression gap plus number 42 on namespace Math

    •   Math.max(answer, gap + 42) // 'gap' can later be replaced by an actual value
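
To make the mapping from a recognized command to code concrete, here is a minimal sketch of the last step: turning an already-parsed command into a line of JavaScript. The object shapes (`type`, `kind`, `namespace` and so on) and the function names are hypothetical, chosen only for illustration; they are not Spoken's real output format.

```javascript
// Hypothetical parsed-command shapes, for illustration only.
// Renders a value node (string, number, variable or binary expression) as JavaScript source.
function renderValue(value) {
  switch (value.kind) {
    case 'string':
      return JSON.stringify(value.text); // adds quotes and escapes special characters
    case 'number':
      return String(value.value);
    case 'variable':
      return value.name;
    case 'expression':
      return `${renderValue(value.left)} ${value.operator} ${renderValue(value.right)}`;
    default:
      throw new Error(`Unknown value kind: ${value.kind}`);
  }
}

// Renders a whole command as one line of JavaScript source.
function renderCommand(command) {
  switch (command.type) {
    case 'declareVariable':
      return `let ${command.name} = ${renderValue(command.value)}`;
    case 'callFunction': {
      const args = command.args.map(renderValue).join(', ');
      const callee = command.namespace
        ? `${command.namespace}.${command.name}`
        : command.name;
      return `${callee}(${args})`;
    }
    default:
      throw new Error(`Unknown command type: ${command.type}`);
  }
}

// "new variable answer equals the string john was the eggman string"
const declaration = renderCommand({
  type: 'declareVariable',
  name: 'answer',
  value: { kind: 'string', text: 'john was the eggman' },
});

// "call function max with arguments variable answer and expression gap plus number 42 on namespace Math"
const call = renderCommand({
  type: 'callFunction',
  name: 'max',
  namespace: 'Math',
  args: [
    { kind: 'variable', name: 'answer' },
    {
      kind: 'expression',
      left: { kind: 'variable', name: 'gap' },
      operator: '+',
      right: { kind: 'number', value: 42 },
    },
  ],
});

console.log(declaration); // let answer = "john was the eggman"
console.log(call); // Math.max(answer, gap + 42)
```

Keeping the parsed command as plain data, separate from the rendering step, is one way to let the same command target different languages later.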

This project can be divided into 3 main modules:

  1. Webapp, Server and Client: responsible for the application UI, capturing audio and transforming audio into text.

  2. Spoken: responsible for testing whether a given phrase is a valid voice command and for extracting the important information from it (parsing).

  3. Spoken VSCode Extension: a Visual Studio Code extension able to receive commands to manipulate VSCode. It is through this extension that Speech2Code is able to control Visual Studio Code.

Those modules interact as follows:

flowchart TB
    A[fab:fa-microsoft MS Azure Speech to Text] <-->|HTTP/Sockets| B(Server)

    B <--> |HTTP| C(Client)
    B --> |Serves| E(Webapp)
    C <--> |Inter Process-Communication| D(VS Code Extension)
    E --- |NPM Dependency| F(Spoken)
    C --- |NPM Dependency| F(Spoken)
    D <--> G(Visual Studio Code)

    style B fill:white,stroke:gold,stroke-width:2px
    style C fill:white,stroke:gold,stroke-width:2px
    style D fill:white,stroke:gold,stroke-width:2px
    style E fill:white,stroke:gold,stroke-width:2px
    style F fill:white,stroke:gold,stroke-width:2px

Voice Commands

Voice commands are transformed into text using the Azure Speech to Text service and later parsed by Spoken, which makes use of several pushdown automata to extract information from the text.
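
As a rough illustration of automaton-based parsing, the sketch below recognizes the "go to line 42" family of phrases. It is a deliberately simplified finite-state recognizer written for this README, not one of Spoken's actual pushdown automata. The function name `parseGoToLine`, the result shape and the list of filler words are assumptions, and it assumes the speech service returns the number as digits.

```javascript
// Filler words allowed before the command (an assumption for this sketch).
const FILLERS = new Set(['please']);

// Returns { command: 'goToLine', line } for a valid phrase, or null otherwise.
function parseGoToLine(phrase) {
  const tokens = phrase.toLowerCase().trim().split(/\s+/);
  let state = 'start';
  let line = null;

  for (const token of tokens) {
    switch (state) {
      case 'start':
        if (FILLERS.has(token)) state = 'start'; // skip polite fillers
        else if (token === 'go') state = 'go';
        else if (token === 'line') state = 'line'; // short form: "line 42"
        else return null;
        break;
      case 'go':
        if (token === 'to') state = 'to';
        else return null;
        break;
      case 'to':
        if (token === 'line') state = 'line';
        else return null;
        break;
      case 'line':
        if (/^\d+$/.test(token)) {
          line = Number(token);
          state = 'done';
        } else {
          return null;
        }
        break;
      default:
        // State 'done': any extra token after the number is rejected.
        return null;
    }
  }

  return state === 'done' ? { command: 'goToLine', line } : null;
}

console.log(parseGoToLine('line 42'));
console.log(parseGoToLine('please go to line 42'));
console.log(parseGoToLine('go line 42')); // null: "to" is missing
```

A real pushdown automaton adds a stack on top of states like these, which is what allows nested constructs such as an expression inside a function call argument.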

Currently, Speech2Code only supports voice commands for the JavaScript language; a list of all those commands can be found here. All commands can be said in both English and Portuguese HU3BR.

Controlling Visual Studio Code

Speech2Code was designed to work with any IDE that implements its interface; this is usually done through plugins and extensions. Currently, it has support for Visual Studio Code and CodeMirror.

For example, the voice command "call function fish with two arguments" will eventually call editor.write(...), where editor can be any IDE or editor, such as VSCode, CodeMirror or Sublime, each with its own implementation of write(...). The only thing they have in common is that calling this function writes something to the currently open file, no matter the IDE. Here is an example of different implementations of the same function: VSCode.write(...) x CodeMirror.write(...)
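
The idea of one write(...) contract with several implementations can be sketched as follows. Both editor classes here are in-memory stand-ins invented for this example, not the project's real VSCode or CodeMirror adapters, and the `arg1, arg2` placeholder names are likewise an assumption about what gets written.

```javascript
// Stand-in editor #1: keeps the whole file as a single string.
class StringBufferEditor {
  constructor() {
    this.content = '';
  }
  write(text) {
    this.content += text;
  }
  getText() {
    return this.content;
  }
}

// Stand-in editor #2: keeps the file as an array of lines.
class LineArrayEditor {
  constructor() {
    this.lines = [''];
  }
  write(text) {
    const parts = text.split('\n');
    // The first part continues the current line; the rest become new lines.
    this.lines[this.lines.length - 1] += parts[0];
    this.lines.push(...parts.slice(1));
  }
  getText() {
    return this.lines.join('\n');
  }
}

// Editor-agnostic command: it only relies on editor.write(text) existing.
function callFunction(editor, name, argCount) {
  const placeholders = Array.from({ length: argCount }, (_, i) => `arg${i + 1}`);
  editor.write(`${name}(${placeholders.join(', ')})`);
}

const editors = [new StringBufferEditor(), new LineArrayEditor()];
for (const editor of editors) {
  callFunction(editor, 'fish', 2); // "call function fish with two arguments"
  console.log(editor.getText()); // fish(arg1, arg2)
}
```

The command code never checks which editor it is talking to, so supporting a new editor only means providing another object with the same methods.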

The connection between VSCode and Speech2Code is done through a custom VSCode extension and Inter-Process Communication.
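
Whatever transport carries the messages between processes, both sides need to agree on where one message ends and the next begins. The sketch below shows one common approach, newline-delimited JSON. This wire format, the message fields and the function names are assumptions made for illustration; the source does not say which format Speech2Code actually uses.

```javascript
// Serializes one message as a single line of JSON.
// JSON.stringify escapes newlines inside strings, so '\n' is a safe delimiter.
function encodeMessage(message) {
  return JSON.stringify(message) + '\n';
}

// Returns a feed(chunk) function that buffers incoming string chunks and
// calls onMessage once for every complete line received so far.
function createDecoder(onMessage) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    let newlineIndex;
    while ((newlineIndex = buffer.indexOf('\n')) !== -1) {
      const raw = buffer.slice(0, newlineIndex);
      buffer = buffer.slice(newlineIndex + 1);
      if (raw.trim() !== '') onMessage(JSON.parse(raw));
    }
  };
}

// Simulate two messages arriving split across arbitrary chunk boundaries.
const received = [];
const feed = createDecoder((message) => received.push(message));
const wire =
  encodeMessage({ type: 'write', text: 'let answer = 42' }) +
  encodeMessage({ type: 'goToLine', line: 42 });

feed(wire.slice(0, 10));
feed(wire.slice(10));

console.log(received.length); // 2
```

In a real setup, `feed` would be called from the `data` event of a socket or pipe, with the chunks decoded to strings first.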

Running this project

First, install all the required dependencies with:

node scripts.js install

Then, you can start the server with:

./run.sh

A web-based demo of Speech2Code will be accessible at: http://localhost:3000/webapp

Finally, if you wish to start the actual application, run the following (make sure that VSCode is running before doing so):

npm --prefix client start

Don't forget to edit server/.env with your Azure Speech to Text API keys.

Resources

Non-code material produced during the creation of this project:

  1. Undergraduate dissertation on this project.
  2. Figma design: application screens, icons and images used in the dissertation.
  3. Trello board used before everything went south.
