
nanotts's Introduction

NanoTTS

A command-line speech synthesizer that improves on pico2wave, the front-end included with SVOX PicoTTS.

Update, December 2018

  • Cleaned up the interface. All outputs must be explicitly specified now.
  • Most inputs must be specified explicitly; the only exception is trailing words. nanotts "trailing words" counts as an input and has the same effect as nanotts -i "trailing words". The only difference is that in the former the input must come last on the command line; switches that follow it may be ignored or handled improperly.
  • Removed the mandatory libao linkage which has caused problems on systems that don't include it in the default distribution packages.
  • Changed the playback module to ALSA.
  • ALSA linkage is optional: make noalsa builds without ALSA, and WAVE output still functions.
  • All outputs can be multiplexed. You can stream the raw bytes, write a WAVE file, and play the audio at the same time; nanotts -w -p -c accomplishes this.

Planned

  • Windows Build

Usage

usage: nanotts [options]
   -h, --help           Displays this help. (overrides other input)
   -v, --voice <voice>  Select voice. (Default: en-GB)
   -l <directory>       Set Lingware voices directory. (defaults: "./lang", "/usr/share/pico/lang/")
   -i <text>            Input. (Text must be correctly quoted)
   -f <filename>        Filename to read input from
   -o <filename>        Write output to WAV/PCM file (enables WAV output)
   -w, --wav            Write output to WAV file, will generate filename if '-o' option not provided
   -p, --play           Play audio output
   -m, --no-play        do NOT play output on PC's soundcard
   -c                   Send raw PCM output to stdout
   --speed <0.2-5.0>    change voice speed
   --pitch <0.5-2.0>    change voice pitch
   --volume <0.0-5.0>   change voice volume (>1.0 may result in degraded quality)

Possible Voices:
   en-US, en-GB, de-DE, es-ES, fr-FR, it-IT

Examples:
   nanotts -f ray_bradbury.txt -o ray_bradbury.wav
   echo "Mary had a little lamb" | nanotts --play
   nanotts -i "Once upon a midnight dreary" -v en-US --speed 0.8 --pitch 1.8 -w -p
   echo "Brave Ulysses" | nanotts -c | play -r 16k -L -t raw -e signed -b 16 -c 1 -
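
The raw stream produced by -c carries no header, so a consumer has to know its format. As a minimal sketch in Python, assuming the format implied by the play example above (16 kHz, 16-bit signed, little-endian, mono), the bytes can be wrapped in a WAV container with the standard wave module. The helper names pcm_to_wav_bytes and synthesize_pcm are illustrative and not part of nanotts.

```python
import io
import subprocess
import wave

# Assumed format, taken from the `play -r 16k -L -e signed -b 16 -c 1` example.
SAMPLE_RATE_HZ = 16000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()


def synthesize_pcm(text: str, nanotts: str = "nanotts") -> bytes:
    """Run `nanotts -c` with the text on stdin and return the raw PCM bytes."""
    result = subprocess.run(
        [nanotts, "-c"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

With nanotts on the PATH, pcm_to_wav_bytes(synthesize_pcm("Brave Ulysses")) should give bytes that can be written straight to a .wav file.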

Goal


Rewrite the pico2wave front-end into something more user-friendly.

Ideally, add features to aid automatic parsing of large text files (50k+ words) into small batches of automatically named .wav or .mp3 files. The goal is to aid the structured digestion of papers, articles, and books, and also to make the tool more versatile for other speech-synthesis uses.
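
The batching idea could be prototyped outside nanotts before it becomes a built-in feature. The Python sketch below is one possible approach, not the project's implementation: it groups sentences into batches under a word limit and produces zero-padded file names. The function names, the 500-word default, and the naming scheme are all assumptions made for illustration.

```python
import re


def split_into_batches(text: str, max_words: int = 500) -> list[str]:
    """Group sentences into batches of at most max_words words each.

    A single sentence longer than max_words becomes its own batch.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    batches, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new batch when adding this sentence would exceed the limit.
        if current and count + n > max_words:
            batches.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        batches.append(" ".join(current))
    return batches


def batch_filename(stem: str, index: int, total: int, ext: str = "wav") -> str:
    """Zero-padded name so that the files sort in reading order."""
    width = len(str(total))
    return f"{stem}_{index:0{width}d}.{ext}"
```

Each batch could then be passed to nanotts with -o set to the corresponding generated name.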

Steps:

  • get a bare-bones working implementation of picotts, sans cruft
  • create cmdline file:
  • implement cmdline switches that do:
    • print detailed help (-h, --help)
    • reads WORDS from stdin (default, if no other input modes detected)
    • reads WORDS from cmdline (-w )
    • reads WORDS from file (-f )
    • writes WAVE to file (-o )
    • silence device pcm playback (--no-play|-m)
    • cleanup printed output
    • select voice (-v )
    • writes PCM-data to stdout (-c)
    • set voice files (lingware) path (-l )
    • set speed
    • set volume
    • set pitch
    • progress meter to stderr
    • playback keys: spacebar, left+right arrow, ESC, +/- (playback speed)
    • run through: gprof, valgrind
    • write man page ; and make install
    • autonaming func
    • -q flag to silence output to {stdout, stderr}
    • catch signals to cancel PCM playback/output cleanly
    • confirm working on both Mac and Linux
  • extra:
    • able to read multiple files at once (-files [file2][file3][..])
    • limit text input to N lines
    • bit-rate, frequency, channel, parms for .wav
    • mp3 output
    • store base settings in a $HOME/.config type file, so you don't have to type language prefs every time
    • advanced feature to carve up large text-file into set of auto-named .mp3, supporting -p
    • search & replace, useful for problem characters such as '-' (pico says "hyphen"), which can ruin the flow of a book; replacing '-' with ',' makes pico pause instead.
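
Until a search & replace switch exists, the last item above can be approximated by preprocessing the text before it reaches nanotts. This is a minimal Python sketch under that assumption; apply_replacements and DEFAULT_RULES are hypothetical names, and the single default rule is the '-' to ',' substitution described above.

```python
# Hypothetical default rule set: (old, new) pairs applied in order.
DEFAULT_RULES = [("-", ",")]


def apply_replacements(text, rules=DEFAULT_RULES):
    """Apply literal search & replace pairs, in order, before synthesis."""
    for old, new in rules:
        text = text.replace(old, new)
    return text
```

A blanket rule like this also rewrites hyphenated words such as "well-known", so a narrower rule may suit some texts better.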

MP3 PIPE example

echo "eenie meany miny moh" | ./nanotts -c | lame -r -s 16 --bitwidth 16 --signed --little-endian -m m -b 32 -h - out.mp3

I know what you're thinking--mp3 is a mess. And you would be right to think that. Basically, because it's raw PCM, you have to tell lame exactly what format to expect. But hey, at least right now mp3 is automatable!

email: greg AT naughton DOT org

nanotts's People

Contributors

gmn


nanotts's Issues

-o option has to be first

When issuing a command from the command line, if the -o option comes last, the filename is used as the spoken text.

nanotts -w "hello" -o 1.wav produces "one dot wave" in the audio file.

nanotts -o 1.wav -w "hello" produces "hello" in the 1.wav audio file, which is what is expected.

Examples

Could someone upload an example here please?

Extra voices?

First of all I wanted to thank you for your work. NanoTTS has been great.

A question I have is: are there other voices to be found? Specifically, other accents or male voices. Or is this just a matter of changing the pitch?

Assertion `pcm' failed.

Some users of the Voco add-on for the Mozilla WebThings Gateway, which uses NanoTTS, are reporting a PCM error that I am unable to reproduce on my own Pi 4.

2020-05-06 21:27:36.677 ERROR : voco: error: opening pcm device failed No such file or directory
2020-05-06 21:27:36.679 ERROR : voco: nanotts: pcm.c:736: snd_pcm_stream: Assertion `pcm' failed.

(note the two different quotation marks by the way)

This is on a Pi 4. Perhaps they are using USB audio output hardware.

This is the code that calls NanoTTS:

    # Assumes `import os` and `import subprocess` at module level.
    def speak(self, voice_message="", site_id="default"):
        try:
            # TODO add an environment variable here to set alsa to the USB output device?
            my_env = os.environ.copy()
            my_env["ALSA_CARD"] = str(self.playback_card_id)

            nanotts_cmd = (
                str(os.path.join(self.snips_path, 'nanotts')),
                '-l', str(os.path.join(self.snips_path, 'lang')),
                '-v', str(self.voice),
                '--speed', '0.9',
                '--pitch', '1.2',
                '-p',
            )

            # Pipe the message into nanotts on stdin, as `echo message | nanotts ...` would.
            ps = subprocess.Popen(('echo', str(voice_message)), stdout=subprocess.PIPE)
            output = subprocess.check_output(nanotts_cmd, stdin=ps.stdout, env=my_env)
            ps.wait()

        except Exception as ex:
            print("Error speaking: " + str(ex))

So it could be that they are using USB audio output devices, and the environment variables aren't set properly for that case.

I noticed there was a way of compiling NanoTTS without ALSA. Would that perhaps make it more universal/robust/agnostic?
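
One way to sidestep nanotts's own ALSA playback, which may or may not address the error above, is to let nanotts write raw PCM with -c and hand the stream to aplay with an explicit device. The Python sketch below assumes the 16 kHz, 16-bit, mono format from the README's play example. The function names are illustrative, and the device string (for example "plughw:1,0") is a placeholder for whatever the affected system reports.

```python
import subprocess


def aplay_args(device):
    """aplay arguments for 16 kHz, 16-bit little-endian, mono raw PCM on stdin."""
    return ["aplay", "-q", "-D", device, "-t", "raw",
            "-f", "S16_LE", "-r", "16000", "-c", "1"]


def speak_via_aplay(text, device, nanotts="nanotts", voice="en-GB"):
    """Equivalent of: echo text | nanotts -v voice -c | aplay -D device ..."""
    tts = subprocess.Popen([nanotts, "-v", voice, "-c"],
                           stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    player = subprocess.Popen(aplay_args(device), stdin=tts.stdout)
    tts.stdout.close()  # only aplay holds the read end of the pipe now
    tts.stdin.write(text.encode("utf-8"))
    tts.stdin.close()   # signal end of input to nanotts
    return tts.wait(), player.wait()
```

Because the playback path no longer touches nanotts's ALSA code, this arrangement should also work with a binary built using make noalsa.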

load pil file

Hello gmn,
First of all I think that this is a really great project.
Is it possible to load .pil files? I own the Linguatec Voice Reader, which is based on SVOX. The included voice should be Petra (German).

I have a gl0co0d22_0.pil, which seems to be the voice file, and a svox.pil that I identified as the lingware file.

The normal name of the file should probably be svox-gl0co0de-DE22.pil. So at the moment I do not know why I have 2 files.

best regards
Icntaann

Build on Ubuntu 22.04

If make fails with

fatal error: alsa/asoundlib.h: No such file or directory

use make noalsa instead. (yes, that's in the readme already)

make issues many warnings

warning: ISO C++17 does not allow ‘register’ storage class specifier [-Wregister]

editing the Makefile to read

CFLAGS = -Wall -Wregister

fixed the warnings.

Trouble executing nanotts via Node.js child_process.exec

Trying to talk to nanotts from Node.js, I get an error regardless of how I format the command string:

exec("nanotts -c -i \"hello\"", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts -c -i "hello"
 **error: multiple inputs
exec("nanotts -c -i hello", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts -c -i hello
 **error: multiple inputs
exec("nanotts -c hello", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts -c hello
 **error: trailing commandline arguments
exec("nanotts hello", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts hello
 **error: trailing commandline arguments
exec("nanotts -i \"hello\" -c", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts hello
 **error: trailing commandline arguments
exec("nanotts -i 'hello' -c", function (err, stdout, stderr) {
...
});
Error: Command failed: nanotts -i 'hello' -c
 **error: multiple inputs

How should I format my command line to avoid this?

Pegging old versions

I use docker to grab and build nanotts along with a few other bits and pieces for a project written in python. This project had originally been written and compiled in a docker container a few years ago.

A few days ago, I had reason to run the build process again, only to find that the wonderful nanotts had broken: switches had changed (-w to -i) and it segfaulted whenever it was used. The change of the spoken-words switch from -w to -i was easy to deal with, but the segmentation faults persisted.

Running nanotts from a command prompt seemed to be fine. Within the python program I used:-

subprocess.Popen(nanotts_command, shell=True, stdout=subprocess.PIPE).stdout.read()

This now always causes a:-

Segmentation fault (core dumped)

Running it by hand from a python command line seems to work, just not within the program. Something has changed to prevent the latest version from functioning as the old one did.

I managed to get it working by going to the previous version https://github.com/gmn/nanotts/tree/e3165556ec2ab26b4f42fe9eab652006704aefd0 and wonder if pegging the urls in the readme for major changes could save some headaches for one or two people in future.

Thanks for your work on nanotts. I get verbal alerts about various things every day as a result of your efforts.

Android release, maybe on F-Droid

Hi!
Have you considered releasing this fork for Android?

It would be perfect to release it on F-Droid (a free and open-source app store).

PicoTTS on AOSP is dead, so it would be very good to have your fork on new Android versions.

Cheers,
Paolo

Doesn't compile on ubuntu 16.04 (missing ao/ao.h)

g++ -D_PICO_LANG_DIR="/usr/share/pico/lang/" -I. -I./svoxpico -Wall -g -O2 -c src/mmfile.cpp -o objs/mmfile.o
g++ -D_PICO_LANG_DIR="/usr/share/pico/lang/" -I. -I./svoxpico -Wall -g -O2 -c src/nanotts.cpp -o objs/nanotts.o
src/nanotts.cpp:32:19: fatal error: ao/ao.h: No such file or directory
