speechsynthesisrecorder's Introduction

SpeechSynthesisRecorder.js

Use navigator.mediaDevices.getUserMedia() and MediaRecorder to capture the audio output of a window.speechSynthesis.speak() call as an ArrayBuffer, AudioBuffer, Blob, MediaSource, ReadableStream, or other object or data type. For background, see "MediaStream, ArrayBuffer, Blob audio result from speak() for recording?".

Install

Add the following script tag

<script type="text/javascript" src="https://unpkg.com/[email protected]/SpeechSynthesisRecorder.js"></script>

or npm install

$ npm install --save speech-synthesis-recorder

Usage

At the navigator.mediaDevices.getUserMedia() prompt, select the "Monitor of Built-in Audio Analog Stereo" option instead of the "Built-in Audio Analog Stereo" option.

let ttsRecorder = new SpeechSynthesisRecorder({
  text: "The revolution will not be televised", 
  utteranceOptions: {
    voice: "english-us espeak",
    lang: "en-US",
    pitch: .75,
    rate: 1
  }
});

ArrayBuffer

ttsRecorder.start()
  // `tts` : `SpeechSynthesisRecorder` instance, `data` : audio as `dataType` or method call result
  .then(tts => tts.arrayBuffer())
  .then(({tts, data}) => {
    // do stuff with `ArrayBuffer`, `AudioBuffer`, `Blob`,
    // `MediaSource`, `MediaStream`, `ReadableStream`
    // `data` : `ArrayBuffer`
    tts.audioNode.src = URL.createObjectURL(new Blob([data], {type:tts.mimeType}));
    tts.audioNode.title = tts.utterance.text;
    tts.audioNode.onloadedmetadata = () => {
      console.log(tts.audioNode.duration);
      tts.audioNode.play();
    }
  })

AudioBuffer

ttsRecorder.start()
  .then(tts => tts.audioBuffer())
  .then(({tts, data}) => {
    // `data` : `AudioBuffer`
    let source = tts.audioContext.createBufferSource();
    source.buffer = data;
    source.connect(tts.audioContext.destination);
    source.start()
  })

Blob

ttsRecorder.start()
  .then(tts => tts.blob())
  .then(({tts, data}) => {
    // `data` : `Blob`
    tts.audioNode.src = URL.createObjectURL(data);
    tts.audioNode.title = tts.utterance.text;
    tts.audioNode.onloadedmetadata = () => {
      console.log(tts.audioNode.duration);
      tts.audioNode.play();
    }
  })

ReadableStream

ttsRecorder.start()
  .then(tts => tts.readableStream())
  .then(({tts, data}) => {
    // `data` : `ReadableStream`
    console.log(tts, data);
    data.getReader().read().then(({value, done}) => {
      tts.audioNode.src = URL.createObjectURL(value[0]);
      tts.audioNode.title = tts.utterance.text;
      tts.audioNode.onloadedmetadata = () => {
        console.log(tts.audioNode.duration);
        tts.audioNode.play();
      }
    })
  })

MediaSource

ttsRecorder.start()
  .then(tts => tts.mediaSource())
  .then(({tts, data}) => {
    console.log(tts, data);
    // `data` : `MediaSource`
    tts.audioNode.src = URL.createObjectURL(data);
    tts.audioNode.title = tts.utterance.text;
    tts.audioNode.onloadedmetadata = () => {
      console.log(tts.audioNode.duration);
      tts.audioNode.play();
    }
  })

MediaStream

let ttsRecorder = new SpeechSynthesisRecorder({
  text: "The revolution will not be televised", 
  utteranceOptions: {
    voice: "english-us espeak",
    lang: "en-US",
    pitch: .75,
    rate: 1
  }, 
  dataType:"mediaStream"
});
ttsRecorder.start()
  .then(({tts, data}) => {
    // `data` : `MediaStream`
    // do stuff with active `MediaStream`
  })
  .catch(err => console.log(err))

Demo

plnkr

speechsynthesisrecorder's People

Contributors

guest271314, yerkopalma


speechsynthesisrecorder's Issues

Not working on latest versions of chrome 71

This does not work on Chrome 71. From Chrome 66 onwards, an AudioContext can only be started after a user gesture, for example a button click. I made that change by moving the call into a button onclick handler, but then I hit a DOMException in the start method.
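A minimal sketch of the gesture requirement described above, assuming a hypothetical helper named createAudioContextOnGesture that is not part of SpeechSynthesisRecorder. It only defers construction of the audio context until the first user gesture; it does not address the DOMException reported in start(). The constructor is passed in as a parameter so the sketch makes no assumption about which global is available.

```javascript
// Hypothetical helper (not part of SpeechSynthesisRecorder): wait for a user
// gesture on `target` before constructing the audio context.
function createAudioContextOnGesture(target, AudioContextCtor, eventName = "click") {
  return new Promise((resolve, reject) => {
    target.addEventListener(eventName, () => {
      try {
        // Constructed inside the event handler, i.e. during the user gesture.
        const context = new AudioContextCtor();
        // If the context still reports "suspended", ask the browser to resume it.
        const ready = context.state === "suspended"
          ? context.resume()
          : Promise.resolve();
        ready.then(() => resolve(context), reject);
      } catch (err) {
        reject(err);
      }
    }, { once: true }); // only the first gesture creates a context
  });
}
```

In a page this might be called as createAudioContextOnGesture(document.querySelector("button"), window.AudioContext), with the recorder started only after the returned promise resolves.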

audioBuffer() not working in chrome

.then(ab => this.audioContext.decodeAudioData(ab))

Uncaught (in promise) TypeError: Failed to execute 'decodeAudioData' on 'BaseAudioContext': parameter 1 is not of type 'ArrayBuffer'.
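The issue does not say what value reached decodeAudioData(), so the following is only a sketch under the assumption that it was a Blob, an array of recorded Blob chunks, or a typed array rather than an ArrayBuffer. The helper name toArrayBuffer is hypothetical and is not part of the library.

```javascript
// Hypothetical helper: normalise common binary containers to an ArrayBuffer
// before handing the result to BaseAudioContext.decodeAudioData().
async function toArrayBuffer(input) {
  if (input instanceof ArrayBuffer) {
    return input;
  }
  if (ArrayBuffer.isView(input)) {
    // Copy only the bytes the view covers, not the whole underlying buffer.
    return input.buffer.slice(input.byteOffset, input.byteOffset + input.byteLength);
  }
  if (input instanceof Blob) {
    return input.arrayBuffer();
  }
  if (Array.isArray(input)) {
    // e.g. chunks collected from MediaRecorder "dataavailable" events
    return new Blob(input).arrayBuffer();
  }
  throw new TypeError("Cannot convert value to ArrayBuffer");
}
```

With such a helper, the failing step could be written as .then(value => toArrayBuffer(value)).then(ab => this.audioContext.decodeAudioData(ab)).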

'audiooutput' does not mean system audio output

        .then(stream => navigator.mediaDevices.enumerateDevices()
        .then(devices => {
          const audiooutput = devices.find(device => device.kind == "audiooutput");
          stream.getTracks().forEach(track => track.stop())
          if (audiooutput) {
            const constraints = {
              deviceId: {
                exact: audiooutput.deviceId
              }
            };
            return navigator.mediaDevices.getUserMedia({
              audio: constraints
            });
          }
          return navigator.mediaDevices.getUserMedia({
            audio: true
          });
        }))

The code above does not actually select an audio output device (see https://bugs.chromium.org/p/chromium/issues/detail?id=1114422#c7).
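One possible alternative is to look for an "audioinput" device whose label marks it as a monitor of the output, since getUserMedia() only captures input devices. This is a sketch, not the library's method: the function names and the "Monitor of" label pattern are assumptions based on the device name mentioned in the Usage section, device labels are only populated after the user has granted permission, and not every browser exposes a monitor device at all.

```javascript
// Hypothetical helper: pick an input device that mirrors the speaker output.
// The label pattern is an assumption taken from the device name in Usage.
function findMonitorInput(devices, pattern = /^Monitor of /i) {
  const match = devices.find(
    (device) => device.kind === "audioinput" && pattern.test(device.label)
  );
  return match || null;
}

// Build getUserMedia() constraints, falling back to the default input device.
function monitorConstraints(devices) {
  const monitor = findMonitorInput(devices);
  return monitor
    ? { audio: { deviceId: { exact: monitor.deviceId } } }
    : { audio: true };
}
```

The result of navigator.mediaDevices.enumerateDevices() can be passed directly to monitorConstraints(), and its return value to navigator.mediaDevices.getUserMedia().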

Firefox Uncaught (in promise) NavigatorUserMediaError {name: "TrackStartError", message: "", constraintName: ""}

Steps to reproduce:

  1. Call navigator.getUserMedia({audio:true})
  2. In the Recording tab of the system Sound Settings, set the RecordStream option to Monitor of Built-in Audio Analog Stereo
  3. Call MediaRecorder with MediaStream from navigator.getUserMedia({audio:true}) call as

Operating System: Linux 4.8.0-54-lowlatency

Firefox version: 53.0.3 (32-bit)

Actual results:

The resulting Blob of recorded media, when played in an HTMLMediaElement, contains reverb and input from the system microphone. When the page is refreshed and permission for user media is granted again, the error Uncaught (in promise) NavigatorUserMediaError {name: "TrackStartError", message: "", constraintName: ""} is thrown.

The RecordStream option that was set to Monitor of Built-in Audio Analog Stereo is removed from the OS Sound Settings GUI.

Refreshing again and repeating the steps above results in Firefox closing.

If both Chromium and Firefox are tried with the above settings, then after Firefox closes Chromium receives the error Uncaught (in promise) NavigatorUserMediaError {name: "TrackStartError", message: "", constraintName: ""}.

Expected results:

In Chromium 58, MediaRecorder records the output to the speakers, without reverb or input from the system microphone, and does not remove the option from the system Sound Settings or close the browser.

https://bugzilla.mozilla.org/show_bug.cgi?id=1373364

Microsoft "Natural" voices are not captured

When setting utteranceOptions.voice to a "Natural" voice, the resulting audio contains only silence.

For example, these are the default voices that exist on an unconfigured installation of Microsoft Edge:

Microsoft Edge Voices
Microsoft David - English (United States)
Microsoft Mark - English (United States)
Microsoft Zira - English (United States)
Microsoft Natasha Online (Natural) - English (Australia)
Microsoft William Online (Natural) - English (Australia)
Microsoft Clara Online (Natural) - English (Canada)
Microsoft Liam Online (Natural) - English (Canada)
Microsoft Sam Online (Natural) - English (Hongkong)
Microsoft Yan Online (Natural) - English (Hongkong)
Microsoft Neerja Online (Natural) - English (India) (Preview)
Microsoft Neerja Online (Natural) - English (India)
Microsoft Prabhat Online (Natural) - English (India)
Microsoft Connor Online (Natural) - English (Ireland)
Microsoft Emily Online (Natural) - English (Ireland)
Microsoft Asilia Online (Natural) - English (Kenya)
Microsoft Chilemba Online (Natural) - English (Kenya)
Microsoft Mitchell Online (Natural) - English (New Zealand)
Microsoft Molly Online (Natural) - English (New Zealand)
Microsoft Abeo Online (Natural) - English (Nigeria)
Microsoft Ezinne Online (Natural) - English (Nigeria)
Microsoft James Online (Natural) - English (Philippines)
Microsoft Rosa Online (Natural) - English (Philippines)
Microsoft Luna Online (Natural) - English (Singapore)
Microsoft Wayne Online (Natural) - English (Singapore)
Microsoft Leah Online (Natural) - English (South Africa)
Microsoft Luke Online (Natural) - English (South Africa)
Microsoft Elimu Online (Natural) - English (Tanzania)
Microsoft Imani Online (Natural) - English (Tanzania)
Microsoft Libby Online (Natural) - English (United Kingdom)
Microsoft Maisie Online (Natural) - English (United Kingdom)
Microsoft Ryan Online (Natural) - English (United Kingdom)
Microsoft Sonia Online (Natural) - English (United Kingdom)
Microsoft Thomas Online (Natural) - English (United Kingdom)
Microsoft Aria Online (Natural) - English (United States)
Microsoft Ana Online (Natural) - English (United States)
Microsoft Christopher Online (Natural) - English (United States)
Microsoft Eric Online (Natural) - English (United States)
Microsoft Guy Online (Natural) - English (United States)
Microsoft Jenny Online (Natural) - English (United States)
Microsoft Michelle Online (Natural) - English (United States)
Microsoft Roger Online (Natural) - English (United States)
Microsoft Steffan Online (Natural) - English (United States)

The first 3 voices record as expected, but none of the subsequent "Natural" voices are captured.

Is there an additional step that must be taken in order for these voices to be captured?
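As a diagnostic aid only, the voice list can be split by the standard SpeechSynthesisVoice.localService flag. This sketch assumes, without confirmation from the issue, that the "Online (Natural)" voices report localService as false; it does not make those voices capturable. The function name splitVoicesByOrigin is hypothetical.

```javascript
// Hypothetical helper: separate locally synthesised voices from remote ones
// using the SpeechSynthesisVoice.localService flag.
function splitVoicesByOrigin(voices) {
  const local = [];
  const remote = [];
  for (const voice of voices) {
    (voice.localService ? local : remote).push(voice.name);
  }
  return { local, remote };
}
```

In a browser, the input would be the array returned by speechSynthesis.getVoices(), read after the voiceschanged event has fired.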

Uncaught TypeError: Failed to set the 'volume' property on 'SpeechSynthesisUtterance': The provided float value is non-finite.

Error in Chrome 83:

SpeechSynthesisRecorder.js:45 Uncaught TypeError: Failed to set the 'volume' property on 'SpeechSynthesisUtterance': The provided float value is non-finite.
    at Function.assign (<anonymous>)
    at new SpeechSynthesisRecorder (SpeechSynthesisRecorder.js:45)
    at <anonymous>:1:1

Code that was run:

new SpeechSynthesisRecorder({
    text: 'The revolution will not be televised',
    utteranceOptions: {
        voice: 'english-us espeak',
        lang: 'en-US',
        pitch: 0.75,
        rate: 1,
    },
})
    .start()
    .then((tts) => tts.blob())
    .then(({ tts, data }) => {
        // `data` : `Blob`
        tts.audioNode.src = URL.createObjectURL(data);
        tts.audioNode.title = tts.utterance.text;
        tts.audioNode.onloadedmetadata = () => {
            console.log(tts.audioNode.duration);
            tts.audioNode.play();
        };
    });
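The stack trace points at an Object.assign call in the constructor, which suggests (as an assumption, since the library source is not shown here) that an option such as volume is assigned while undefined. A minimal sketch of a guard, using a hypothetical helper named assignUtteranceOptions, skips numeric properties whose values are not finite:

```javascript
// Properties of SpeechSynthesisUtterance that only accept finite floats.
const NUMERIC_UTTERANCE_KEYS = ["volume", "rate", "pitch"];

// Hypothetical helper: copy options onto an utterance, skipping numeric
// properties whose values are undefined, NaN, or infinite.
function assignUtteranceOptions(utterance, options = {}) {
  for (const [key, value] of Object.entries(options)) {
    if (NUMERIC_UTTERANCE_KEYS.includes(key) && !Number.isFinite(value)) {
      continue; // leave the utterance's existing value untouched
    }
    utterance[key] = value;
  }
  return utterance;
}
```

Called as assignUtteranceOptions(new SpeechSynthesisUtterance(text), utteranceOptions), this would leave volume untouched when the caller does not supply it.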

Do not add elements to the DOM

Here you are adding an audio element to the DOM. IMO, an API like this should be less opinionated; in fact, the whole audioNode property could be dropped.

This is again recording from microphone, not from audiooutput device

Since this was not working on the latest Chrome 71, I downgraded to Chrome 60. There, the program records from the microphone instead of from speechSynthesis.speak(). I suspect the reason is that both the audioinput and audiooutput devices have the same deviceId, "default". How can I make it record from speak()?
