LocalSTT

(Jump to Català)

[English]

Note: This application is currently only a proof of concept.

LocalSTT is an Android application that provides automatic speech recognition services without an internet connection, because all processing is done locally on your phone.

This is possible thanks to:

  • a RecognitionService wrapping the Vosk library
  • a RecognitionService wrapping Mozilla's DeepSpeech library
  • an Activity that handles RECOGNIZE_SPEECH intents amongst others
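
One quick way to check that the Activity above picks up RECOGNIZE_SPEECH intents is to fire the intent from a development machine. The sketch below assumes `adb` is installed and a single device with LocalSTT is connected. The function names are hypothetical helpers, not part of LocalSTT. `android.speech.action.RECOGNIZE_SPEECH` is the action string behind `RecognizerIntent.ACTION_RECOGNIZE_SPEECH`; `voice_recognition_service` is assumed to be the secure setting in which Android records the default RecognitionService.

```shell
#!/bin/sh
# Hypothetical helpers for poking at speech recognition over adb.
# Assumes adb is on PATH and exactly one device is connected.

# Launch whichever Activity handles the RECOGNIZE_SPEECH intent.
# If several apps handle it, Android may show a chooser on the device.
launch_speech_prompt() {
    adb shell am start -a android.speech.action.RECOGNIZE_SPEECH
}

# Print the component currently set as the default RecognitionService
# (the one SpeechRecognizer clients get when they do not name a service).
# Assumption: the value lives in the "voice_recognition_service" secure setting.
show_default_recognizer() {
    adb shell settings get secure voice_recognition_service
}
```

Running `launch_speech_prompt` should bring up a speech prompt on the device. If a different app appears, it is probably registered for the same intent.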

The code is currently just a PoC strongly based on:

LocalSTT should work with all keyboards and applications implementing speech recognition through the RECOGNIZE_SPEECH intent or Android's SpeechRecognizer class. It has been successfully tested using the following applications on Android 9:

You can download a pre-built binary with Vosk models for:

and also with DeepSpeech models here:

To use the application with your own language, replace the models in app/src/main/assets/sync/vosk-model/ with a package from https://alphacephei.com/vosk/models and rebuild the application.
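
The model swap can be sketched as a small shell helper. It assumes you have already downloaded a model archive from the page above and unpacked it, and that the unpacked model's files belong directly inside `vosk-model/` (check the layout of the bundled model before relying on this). The function name `replace_vosk_model` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: swap the bundled Vosk model for another unpacked model.
# Usage: replace_vosk_model <unpacked_model_dir> [repo_dir]
# Assumption: the model's files go directly inside vosk-model/.
replace_vosk_model() {
    src_dir=$1
    repo_dir=${2:-.}
    dest_dir="$repo_dir/app/src/main/assets/sync/vosk-model"

    # Refuse to touch the existing model if the new one is missing.
    if [ ! -d "$src_dir" ]; then
        echo "model directory not found: $src_dir" >&2
        return 1
    fi

    # Drop the old model entirely so no stale files end up in the APK.
    rm -rf "$dest_dir"
    mkdir -p "$dest_dir"

    # Copy the contents of the new model (including hidden files).
    cp -R "$src_dir"/. "$dest_dir"/
}
```

After calling the helper from the repository root with the path to the unpacked model, rebuild the application so the new model is packaged into the APK.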

Build notes:

  • git clone https://github.com/ewheelerinc/LocalSTT.git
  • cd LocalSTT
  • ./gradlew build
  • ./repack-n-sign.sh ./app/build/outputs/apk/release/app-release-unsigned.apk
    • You might need to update paths and keys in this script for your use.

BUGS:

  • Does not work with Google's keyboard, Gboard.
  • Not all recording applications read the recognized text properly. There must be another way to deliver it; if you know how, it is probably a trivial fix.
  • DeepSpeech models were removed because they didn't build. Maybe this can be fixed?

Future Work

  • Support querying alphacephei.com and selecting and downloading optional models. The APK could then be packaged without a bundled language model (much smaller!).

Demo

LocalSTT in action

[Català]

Note: For now, this application is only a proof of concept.

LocalSTT is an Android application that provides automatic speech recognition without an internet connection, since all processing happens locally on the phone.

This is possible thanks to:

  • a RecognitionService that uses the Vosk library
  • a RecognitionService that uses Mozilla's DeepSpeech library
  • an Activity that handles RECOGNIZE_SPEECH intents, among others

The code is currently a proof of concept and is strongly based on the following projects:

LocalSTT should work with most keyboards and applications that implement speech recognition through a RECOGNIZE_SPEECH intent or by using Android's SpeechRecognizer class directly. It has been tested successfully with the following applications on an Android 9 device:

You can download an APK that includes Vosk and DeepSpeech models for Catalan here.

Contributors

andreytkachenko, ccoreilly, darthpleurotus, ewheelerinc
