speechDown

This is a speech-to-text and text-to-speech app. It is designed to have a minimal interface and to be as simple as possible to use. It has the following features:

Speech-to-text dictation using Vosk.
Text-to-speech using Google Text-to-Speech.
Optical character recognition using Tesseract.
Markdown preview using Marked.

Compatibility

This app is only available on macOS, Linux, and Windows. It requires an x86-64 CPU and benefits from as much RAM and CPU performance as possible.

Installation

Currently there are no pre-built packages; however, this will change in the future.

To install this app, make sure that Node.js is installed and then clone this repository.

You can do this using the following commands:

git clone https://github.com/MaxAFriedrich/speechDown
cd speechDown
npm install
npm start

After that, you can launch the app at any time by running npm start in the root of the repository. You may wish to create a shortcut for this.
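
One possible shortcut is a small Node.js launcher script that runs npm start from the repository root, wherever it is called from. This is only a sketch: the file name launcher.js and the functions startCommand and launch are hypothetical and are not part of the repository.

```javascript
// launcher.js (hypothetical file name, not part of the repository)
const { spawn } = require('node:child_process');

// Build the spawn arguments for running `npm start` in the repository root.
// `repoDir` is wherever you cloned speechDown.
function startCommand(repoDir, platform = process.platform) {
  return {
    command: 'npm',
    args: ['start'],
    options: {
      cwd: repoDir,                // run from the root of the repository
      stdio: 'inherit',            // show the app's output in this terminal
      shell: platform === 'win32', // npm is a .cmd script on Windows, so it needs a shell
    },
  };
}

// Start the app and return the child process.
function launch(repoDir) {
  const { command, args, options } = startCommand(repoDir);
  return spawn(command, args, options);
}

module.exports = { startCommand, launch };
```

Calling launch('/path/to/speechDown') with your own clone location should then behave like running npm start in that folder.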

Setting up Google text to speech

To use Google Text-to-Speech, you need to set it up through Google Cloud Platform. The instructions below are also available in the app itself.

To use this app's text-to-speech capability, you need to set up authentication so this app can use Google Text-to-Speech. You can find instructions on how to do this here. NOTE: Do not "Set your authentication environment variable". Once you have completed this, tell this app where you have stored the JSON file you downloaded.
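
Before pointing the app at the downloaded JSON file, it can help to confirm that the file looks like a key file at all. The sketch below assumes the download is a Google Cloud service account key, which normally contains the fields type, project_id, private_key, and client_email. The functions keyProblems and checkKeyFile are hypothetical helpers written for this example; the app itself may not perform any such check.

```javascript
const fs = require('node:fs');

// Fields assumed to be present in a Google Cloud service account key file.
const REQUIRED_FIELDS = ['type', 'project_id', 'private_key', 'client_email'];

// Return a list of problems found in a parsed key.
// An empty list means the key has the expected shape.
function keyProblems(key) {
  if (key === null || typeof key !== 'object') {
    return ['file does not contain a JSON object'];
  }
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof key[field] !== 'string' || key[field] === '') {
      problems.push(`missing field: ${field}`);
    }
  }
  if (key.type && key.type !== 'service_account') {
    problems.push(`unexpected type: ${key.type}`);
  }
  return problems;
}

// Read the file at `keyPath`, parse it as JSON, and check its shape.
function checkKeyFile(keyPath) {
  let key;
  try {
    key = JSON.parse(fs.readFileSync(keyPath, 'utf8'));
  } catch (err) {
    return [`could not read key file: ${err.message}`];
  }
  return keyProblems(key);
}
```

For example, checkKeyFile('/path/to/key.json') returns an empty array when the file parses and has the expected fields. This only checks the shape of the file; it does not verify that Google will accept the credentials.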

Usage

Here are some general points on the best way to use this software.

You can only read, scan, or edit text in the code panel.
The preview panel can only preview text and has no editing capabilities whatsoever.

Controls

Once you have opened the app, all of the controls are along the top of the screen.

Control	Shortcut	Function
Save	`Ctrl + S`	Save the current file.
Open	`Ctrl + O`	Open an existing file.
New File	`Ctrl + N`	Create a new file.
Settings	`Ctrl + ,`	Open the settings.
Dictate	`Ctrl + D`	Dictate some text.
Scan	`Ctrl + E`	Scan an image to text.
Read	`Ctrl + R`	Read some text out.
Code View	`Ctrl + 1`	View just the markdown code.
View code and preview	`Ctrl + 2`	View both the markdown code and the preview.
Preview only	`Ctrl + 3`	View just an HTML-rendered preview of the markdown.
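
To illustrate how the shortcuts in the table above relate to the controls, here is a minimal sketch of a lookup from a key press to an action. The action names and the actionFor function are hypothetical and are not taken from the app's source; only the key combinations come from the table.

```javascript
// Shortcut keys copied from the controls table; the action names are hypothetical.
const SHORTCUTS = new Map([
  ['s', 'save'],
  ['o', 'open'],
  ['n', 'newFile'],
  [',', 'settings'],
  ['d', 'dictate'],
  ['e', 'scan'],
  ['r', 'read'],
  ['1', 'codeView'],
  ['2', 'codeAndPreview'],
  ['3', 'previewOnly'],
]);

// Map a keydown-like event ({ ctrlKey, key }) to an action name.
// Returns null when Ctrl is not held or the key has no shortcut.
function actionFor(event) {
  if (!event.ctrlKey) return null;
  const key = String(event.key).toLowerCase();
  return SHORTCUTS.get(key) ?? null;
}
```

With this table, actionFor({ ctrlKey: true, key: 'd' }) gives 'dictate', while the same key without Ctrl gives null.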