playht / playht-nodejs-sdk

NodeJS SDK to use PlayHT generative AI text-to-speech APIs

License: Other

JavaScript 4.92% TypeScript 92.91% HTML 0.75% CSS 1.09% SCSS 0.32%

playht-nodejs-sdk's Introduction

AI Powered Voice Generation Platform

The PlayHT SDK provides easy-to-use methods that wrap the PlayHT API.

Table of Contents

Usage
SDK Examples
- Example Server
- ChatGPT Integration Example

Usage

This module is distributed via npm and should be installed as one of your project's dependencies:

npm install --save playht

Or, to install with the yarn package manager:

yarn add playht

Initialising the library

Before using the SDK, you need to initialise the library with your credentials. You will need your API Secret Key and your User ID. If you already have a PlayHT account, navigate to the API access page. For more details see the API documentation.

Important: Keep your API Secret Key confidential. Do not share it with anyone or include it in publicly accessible code repositories.

Import methods from the library and call init() with your credentials to set up the SDK:

import * as PlayHT from 'playht';

PlayHT.init({
  apiKey: '<YOUR API KEY>',
  userId: '<YOUR USER ID>',
});

Note: All the examples below require that you call the init() method with your credentials first.
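
One way to follow the advice above about keeping the key out of source code is to read both values from environment variables. The sketch below is only an illustration: readCredentials is a hypothetical helper, not part of the playht package, and the variable names PLAYHT_API_KEY and PLAYHT_USER_ID are borrowed from the example projects in this repository.

```typescript
// Hypothetical helper (not part of the playht package): reads the credentials
// from an environment-like object instead of hard-coding them in source.
interface PlayHTCredentials {
  apiKey: string;
  userId: string;
}

function readCredentials(env: Record<string, string | undefined>): PlayHTCredentials {
  const apiKey = env['PLAYHT_API_KEY'];
  const userId = env['PLAYHT_USER_ID'];
  // Fail early with a clear message when either value is missing or empty
  if (!apiKey || !userId) {
    throw new Error('PLAYHT_API_KEY and PLAYHT_USER_ID must both be set');
  }
  return { apiKey, userId };
}

// Intended use, assuming the variables are exported in your shell or loaded
// from a .env file before the process starts:
//   PlayHT.init(readCredentials(process.env));
```

Passing the environment in as an argument keeps the helper easy to test and avoids touching process.env at import time.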

When initialising the library, you can also set a default voice and default voice engine to be used for any subsequent speech generation methods when a voice is not defined:

import * as PlayHT from 'playht';

PlayHT.init({
  apiKey: '<YOUR API KEY>',
  userId: '<YOUR USER ID>',
  defaultVoiceId: 's3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json',
  defaultVoiceEngine: 'Play3.0',
});

Generating Speech

To get a URL for the generated audio file using the default settings, call the generate() method with the text you wish to convert.

import * as PlayHT from 'playht';

// Generate audio from text
const generated = await PlayHT.generate('Computers can speak now!');

// Grab the generated file URL
const { audioUrl } = generated;

console.log('The url for the audio file is', audioUrl);

The output also contains a generationId field and an optional message field. generationId is a unique identifier for the generation request, which can be used for tracking and referencing the specific generation job. The optional message field gives additional information about the generation such as status or error messages.
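
As a rough sketch of how these three fields might be consumed together, the snippet below defines a local GenerationOutput interface that mirrors the field names described above. The interface and the summarizeGeneration helper are assumptions made for illustration; the SDK ships its own type definitions, which should be preferred in real code.

```typescript
// Local type for illustration only; it mirrors the fields named in the text.
interface GenerationOutput {
  audioUrl: string;
  generationId: string;
  message?: string;
}

// Hypothetical helper: builds a one-line log entry for a finished generation,
// appending the optional message only when it is present.
function summarizeGeneration(output: GenerationOutput): string {
  const base = `generation ${output.generationId}: ${output.audioUrl}`;
  return output.message ? `${base} (${output.message})` : base;
}

// Intended use:
//   const generated = await PlayHT.generate('Computers can speak now!');
//   console.log(summarizeGeneration(generated));
```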

For more speech generation options, see Generating Speech Options below.

Streaming Speech

The stream() method streams audio from text. It returns a readable stream that emits the audio bytes as soon as they are ready. For example, to use the default settings to convert text into an audio stream and write it to a file:

import * as PlayHT from 'playht';
import fs from 'fs';

// Create a file stream
const fileStream = fs.createWriteStream('hello-playht.mp3');

// Stream audio from text
const stream = await PlayHT.stream('This sounds very realistic.');

// Pipe stream into file
stream.pipe(fileStream);
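
Calling pipe() returns immediately and does not report errors from the source stream. If you need to know when the file is complete, one option is Node's promise-based pipeline(). The sketch below assumes the value returned by stream() behaves like a standard Node Readable; saveToFile is a hypothetical helper, not an SDK method.

```typescript
import { createWriteStream } from 'node:fs';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: resolves once every byte has been written to the file,
// and rejects if either the source stream or the file stream fails.
async function saveToFile(audio: Readable, path: string): Promise<void> {
  await pipeline(audio, createWriteStream(path));
}

// Intended use:
//   const stream = await PlayHT.stream('This sounds very realistic.');
//   await saveToFile(stream, 'hello-playht.mp3');
```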

The stream() method also allows you to stream audio from a text stream input. For example, to convert a text stream into an audio file using the default settings:

import * as PlayHT from 'playht';
import { Readable } from 'stream';
import fs from 'fs';

// Create a test stream
const textStream = new Readable({
  read() {
    this.push('You can stream ');
    this.push('text right into ');
    this.push('an audio stream!');
    this.push(null); // End of data
  },
});

// Stream audio from text
const stream = await PlayHT.stream(textStream);

// Create a file stream
const fileStream = fs.createWriteStream('hello-playht.mp3');
stream.pipe(fileStream);

For a full example of streaming speech from an input text stream, see our ChatGPT Integration Example.
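
When the text arrives incrementally, for instance from a chat model, Node's Readable.from() can wrap an async iterable without a hand-written read() method. This is a minimal sketch: textChunks is a stand-in for a real text source, toTextStream is a hypothetical helper, and whether stream() handles such a stream exactly like the hand-built one above is an assumption.

```typescript
import { Readable } from 'node:stream';

// Stand-in for any incremental text source, such as tokens from a chat model
async function* textChunks(): AsyncGenerator<string> {
  yield 'You can stream ';
  yield 'text right into ';
  yield 'an audio stream!';
}

// Hypothetical helper: wraps an iterable of strings in a Readable.
// objectMode is turned off so the stream carries bytes, like the
// hand-built Readable in the example above.
function toTextStream(chunks: AsyncIterable<string> | Iterable<string>): Readable {
  return Readable.from(chunks, { objectMode: false });
}

// Intended use:
//   const stream = await PlayHT.stream(toTextStream(textChunks()));
```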

For more speech generation options, see Generating Speech Options.

Note: For lowest possible latency, use the streaming API with the Play3.0 model.

Generating Speech Options

All text-to-speech methods above accept an optional options parameter. You can use it to generate audio with different voices, AI models, output file formats and much more.

The options available depend on the AI model that synthesizes the selected voice. The PlayHT API supports different types of models: 'Play3.0', 'PlayHT2.0', 'PlayHT2.0-turbo', 'PlayHT1.0' and 'Standard'. For all available options, see the TypeScript type definitions in the code.
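
If the engine name comes from user input or configuration, it can help to validate it before building the options object. The sketch below lists the five model names mentioned above in a local constant; VoiceEngineName and isVoiceEngineName are illustrative names, not exports of the playht package.

```typescript
// Model names as listed in the text; kept local for illustration
const VOICE_ENGINES = ['Play3.0', 'PlayHT2.0', 'PlayHT2.0-turbo', 'PlayHT1.0', 'Standard'] as const;

type VoiceEngineName = (typeof VOICE_ENGINES)[number];

// Type guard: narrows an arbitrary string to one of the known model names.
// The comparison is case-sensitive.
function isVoiceEngineName(value: string): value is VoiceEngineName {
  return (VOICE_ENGINES as readonly string[]).indexOf(value) !== -1;
}

// Intended use:
//   const engine = process.env.VOICE_ENGINE ?? 'Play3.0';
//   if (!isVoiceEngineName(engine)) throw new Error(`Unknown engine: ${engine}`);
```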

Play3.0 Voices (Recommended)

Our newest conversational voice AI model with added languages, lowest latency, and instant cloning. Compatible with PlayHT2.0 and PlayHT2.0-turbo, our most reliable and fastest model for streaming.

To stream using the Play3.0 model:

import * as PlayHT from 'playht';
import fs from 'fs';

// Create a file stream
const fileStream = fs.createWriteStream('play_3.mp3');

// Stream audio from text
const stream = await PlayHT.stream('Stream realistic voices that say what you want!', {
  voiceEngine: 'Play3.0',
  voiceId: 's3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json',
  outputFormat: 'mp3',
});

// Pipe stream into file
stream.pipe(fileStream);

PlayHT 2.0 Voices

A conversational voice AI model with added emotion direction and instant cloning. Compatible with PlayHT2.0-turbo. Supports English only.

To generate an audio file using a PlayHT 2.0 voice with emotion and other options:

import * as PlayHT from 'playht';

const text = 'Am I a conversational voice with options?';

// Generate audio from text
const generated = await PlayHT.generate(text, {
  voiceEngine: 'PlayHT2.0',
  voiceId: 's3://peregrine-voices/oliver_narrative2_parrot_saad/manifest.json',
  outputFormat: 'mp3',
  temperature: 1.5,
  quality: 'high',
  speed: 0.8,
  emotion: 'male_fearful',
  styleGuidance: 20,
});

// Grab the generated file URL
const { audioUrl } = generated;

console.log('The url for the audio file is', audioUrl);

To stream using the PlayHT2.0-turbo model:

import * as PlayHT from 'playht';
import fs from 'fs';

// Create a file stream
const fileStream = fs.createWriteStream('turbo-playht.mp3');

// Stream audio from text
const stream = await PlayHT.stream('Stream realistic voices that say what you want!', {
  voiceEngine: 'PlayHT2.0-turbo',
  voiceId: 's3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json',
  outputFormat: 'mp3',
  emotion: 'female_happy',
  styleGuidance: 10,
});

// Pipe stream into file
stream.pipe(fileStream);

PlayHT 1.0 Voices

Lifelike voices ideal for expressive and conversational content. Supports English only.

To generate audio with a PlayHT 1.0 voice:

import * as PlayHT from 'playht';

const text = 'Options are never enough.';

// Generate audio from text
const generated = await PlayHT.generate(text, {
  voiceEngine: 'PlayHT1.0',
  voiceId: 'susan',
  outputFormat: 'wav',
  temperature: 0.5,
  quality: 'medium',
  seed: 11,
});

// Grab the generated file URL
const { audioUrl } = generated;

console.log('The url for the audio file is', audioUrl);

Standard Voices

For multilingual text-to-speech generation, changing pitch, and adding pauses. Voices with reliable outputs and support for Speech Synthesis Markup Language (SSML). Supports 100+ languages.

To generate audio with a Standard voice in Spanish:

import * as PlayHT from 'playht';

const text = 'La inteligencia artificial puede hablar español.';

// Generate audio from text
const generated = await PlayHT.generate(text, {
  voiceEngine: 'Standard',
  voiceId: 'Mia',
  quality: 'low',
  speed: 1.2,
});

// Grab the generated file URL
const { audioUrl } = generated;

console.log('The url for the audio file is', audioUrl);

Listing Available Voices

To list all available voices in our platform, including voices you cloned, you can call the listVoices() method with no parameters:

import * as PlayHT from 'playht';

// Fetch all available voices
const voices = await PlayHT.listVoices();

// Output them to the console.
console.log(JSON.stringify(voices, null, 2));

The listVoices() method also takes in an optional parameter to filter the voices by different fields. To get all stock female PlayHT 2.0 voices:

import * as PlayHT from 'playht';

// Fetch stock female PlayHT 2.0 voices
const voices = await PlayHT.listVoices({
  gender: 'female',
  voiceEngine: ['PlayHT2.0'],
  isCloned: false,
});

// Output them to the console.
console.log(JSON.stringify(voices, null, 2));

Instant Clone a Voice

You can use the clone() method to create a cloned voice from audio data. The cloned voice is ready to be used straight away.

import * as PlayHT from 'playht';
import fs from 'fs';

// Load an audio file
const fileBlob = fs.readFileSync('voice-to-clone.mp3');

// Clone the voice
const clonedVoice = await PlayHT.clone('dolly', fileBlob, 'male');

// Display the cloned voice information in the console
console.log('Cloned voice info\n', JSON.stringify(clonedVoice, null, 2));

// Use the cloned voice straight away to generate an audio file
const fileStream = fs.createWriteStream('hello-dolly.mp3');
const stream = await PlayHT.stream('Cloned voices sound realistic too.', {
  voiceEngine: clonedVoice.voiceEngine,
  voiceId: clonedVoice.id,
});
stream.pipe(fileStream);

The clone() method can also take a URL string as input:

import * as PlayHT from 'playht';
import fs from 'fs';

// Audio file url
const fileUrl = 'https://peregrine-samples.s3.amazonaws.com/peregrine-voice-cloning/Neil-DeGrasse-Tyson-sample.wav';

// Clone the voice
const clonedVoice = await PlayHT.clone('neil', fileUrl, 'male');

// Display the cloned voice information in the console
console.log('Cloned voice info\n', JSON.stringify(clonedVoice, null, 2));

// Use the cloned voice straight away to generate an audio file
const fileStream = fs.createWriteStream('hello-neil.mp3');
const stream = await PlayHT.stream('Cloned voices are pure science.', {
  voiceEngine: clonedVoice.voiceEngine,
  voiceId: clonedVoice.id,
});
stream.pipe(fileStream);

Deleting a Cloned Voice

Use the deleteClone() method to delete cloned voices.

import * as PlayHT from 'playht';

const cloneId = 's3://voice-cloning-zero-shot/abcdefgh-01d3-4613-asdf-9a8b7774dbc2/my-clone/manifest.json';

const message = await PlayHT.deleteClone(cloneId);

console.log('deleteClone result message is', message);

Keep in mind, this action cannot be undone.

SDK Examples

This repository contains an example implementation of the API and an example of integrating with the ChatGPT API.

To authenticate requests for the examples, you need to generate an API Secret Key and obtain your User ID. If you already have a PlayHT account, navigate to the API access page. For more details see the API documentation.

Before running the examples, build the SDK:

cd packages/playht
yarn install
yarn build

Example Server

Create a new .env file in the packages/sdk-example folder by copying the .env.example file provided. Then edit the file with your credentials.

To run it locally:

cd packages/sdk-example
yarn
yarn install:all
yarn start

Navigate to http://localhost:3000/ to see the example server.

ChatGPT Integration Example

Create a new .env file in the packages/gpt-example/server folder by copying the .env.example file provided. Then edit the file with your credentials. This example requires your OpenAI credentials too; see the example .env file for details.

To run it locally:

cd packages/gpt-example
yarn
yarn install:all
yarn start

See the full ChatGPT Integration Example documentation.

playht-nodejs-sdk's Issues

Cannot handle errors with init

We can't gracefully handle errors thrown by .init, as the error occurs outside of any flow we have control over.

For example, I would expect this code to handle the error:

try {
  PlayHT.init({
    apiKey: PLAYHT_API_KEY,
    userId,
  });
} catch (e) {
  log.warn({ msg: `PlayHT Quota has elapsed!` });
}

But while PlayHT.init causes the error, the try/catch won't handle it.

As a result, if there is a server running PlayHT, and the PlayHT client cannot be initialized (ex: Quota exceeded), the entire server crashes because there is no way to gracefully handle the initialization. Ideally it could be gracefully handled, so the rest of a server (minus code using PlayHT functions) could still function.

Some suggestions:

Make PlayHT.init asynchronous so its errors can be manually handled gracefully
Do not throw an error on init when Quota has elapsed. Give us 402 errors when we call PlayHT APIs, sure, but don't block init from being handled successfully

Error: Cannot find module 'tls'

Hello, I am following the steps in the documentation.

I am trying to add PlayHT to a project using Next.js and webpack 5, but calling the listVoices method fails with this error:

Error: Cannot find module 'tls'
...
 ⚠ ./node_modules/playht/dist/esm/index.js
Critical dependency: require function is used in a way in which dependencies cannot be statically extracted

I tried this config but it didn't work:

const nextConfig = {
    webpack: (config) => {
        config.resolve.fallback = { tls: false };
        return config;
    },
};
export default nextConfig;

Do you guys have any ideas what I am doing wrong?

Edit:

I just found this link from someone with the same problem
https://stackoverflow.com/questions/77830195/playht-integration-with-nextjs

Invalid json response body at https://api.play.ht/api/v2/leases

This error is thrown every day and crashes my app:

2024-01-08 13:29:59.141 +00:00: FetchError: invalid json response body at https://api.play.ht/api/v2/leases reason: Unexpected token < in JSON at position 0
2024-01-08 13:29:59.141 +00:00:     at /home/ubuntu/test-server/node_modules/playht/dist/cjs/index.js:23923:40
2024-01-08 13:29:59.141 +00:00:     at processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-01-08 13:29:59.141 +00:00:     at Client.getLease (/home/ubuntu/test-server/node_modules/playht/dist/cjs/index.js:37091:28)
2024-01-08 13:29:59.141 +00:00:     at Client.refreshLease (/home/ubuntu/test-server/node_modules/playht/dist/cjs/index.js:37110:18)

I'm not sure why it happened and how to handle it. Can someone please help?

Link does not work in main md file for the repo

https://github.com/playht/playht-nodejs-sdk/blob/main/chatgpt-integration-example

Feature: include wordCount, trimSilence and audioDuration

Please include wordCount, trimSilence and audioDuration in response of generated audio, that would be helpful!

Audio Transcription

When will audio transcription "https://docs.play.ht/reference/api-transcribe-audio" be added to this package? I've found this package to be more reliable than using the api directly in my app and need this feature.

generateV2Speech ping event throws json error, aborts generation

The server returns ping events whose data is not JSON, and the SDK then throws when it attempts to parse them:

event: generating
data: {"id":"--","progress":0,"stage":"queued"}

event: generating
data: {"id":"--","progress":0.01,"stage":"active"}

event: generating
data: {"id":"--","progress":0.01,"stage":"preload","stage_progress":0}

event: generating
data: {"id":"--","progress":0.11,"stage":"preload","stage_progress":0.5}

event: generating
data: {"id":"--","progress":0.21,"stage":"preload","stage_progress":1}

event: ping
data: 2023-10-03T04:11:32.127Z

{ message: 'Unexpected number in JSON at position 4', status: 200 }
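
To illustrate the failure mode, here is a minimal sketch of a tolerant parser for an event stream shaped like the log above. It is not the SDK's implementation: parseEvents and parseProgress are hypothetical names, and the approach simply skips ping events and any data that is not valid JSON instead of throwing.

```typescript
interface ServerEvent {
  event: string;
  data: string;
}

// Split a raw event-stream payload into events; blocks are separated by blank lines
function parseEvents(raw: string): ServerEvent[] {
  return raw
    .split(/\r?\n\r?\n/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      let event = 'message';
      const data: string[] = [];
      for (const line of block.split(/\r?\n/)) {
        if (line.indexOf('event:') === 0) event = line.slice(6).trim();
        else if (line.indexOf('data:') === 0) data.push(line.slice(5).trim());
      }
      return { event, data: data.join('\n') };
    });
}

// Parse JSON only where it is expected. A ping event, or any data that is
// not valid JSON, yields undefined instead of throwing.
function parseProgress(e: ServerEvent): unknown {
  if (e.event === 'ping') return undefined;
  try {
    return JSON.parse(e.data);
  } catch (_err) {
    return undefined;
  }
}
```

With this approach the ping timestamp is ignored and the generation can continue to its next progress event.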

Cannot iterate over a consumed stream while running the gpt-example

I am trying to run the gpt-example, but when requesting at the /say-prompt endpoint, the server throws following error:

file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/node_modules/openai/streaming.mjs:34
                throw new Error('Cannot iterate over a consumed stream, use `.tee()` to split the stream.');
                      ^

Error: Cannot iterate over a consumed stream, use `.tee()` to split the stream.
    at Stream.iterator (file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/node_modules/openai/streaming.mjs:34:23)
    at iterator.next (<anonymous>)
    at Readable.<anonymous> (file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/dist/streamGptText.js:40:194)
    at Generator.next (<anonymous>)
    at file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/dist/streamGptText.js:7:71
    at new Promise (<anonymous>)
    at __awaiter (file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/dist/streamGptText.js:3:12)
    at Readable.read [as _read] (file:///{path}/playht-nodejs-sdk/packages/gpt-example/server/dist/streamGptText.js:38:24)
    at Readable.read (node:internal/streams/readable:507:12)
    at maybeReadMore_ (node:internal/streams/readable:661:12)

Node.js v18.18.2
error: script "start" exited with code 1 (SIGHUP)
error: script "start" exited with code 1 (SIGHUP)

Note: I had to update playht to "^0.9.3" from "0.9.0-beta", else it was showing error .stream is not a method

import * as PlayHT from 'playht';
import { Readable } from 'stream';
import fs from 'fs';

PlayHT.init({
  apiKey:
    process.env.PLAYHT_API_KEY ||
    (function () {
      throw new Error('PLAYHT_API_KEY not found in .env file. Please read .env.example to see how to create it.');
    })(),
  userId:
    process.env.PLAYHT_USER_ID ||
    (function () {
      throw new Error('PLAYHT_USER_ID not found in .env file. Please read .env.example to see how to create it.');
    })(),
});

// Create a test stream
const textStream = new Readable({
  read() {
    this.push('You can stream ');
    this.push('text right into ');
    this.push('an audio stream!');
    this.push(null); // End of data
  },
});

// Stream audio from text
const stream = await PlayHT.stream(textStream);

// Create a file stream
const fileStream = fs.createWriteStream('hello-playht.mp3');
stream.pipe(fileStream);

The audio output is not working, seems it is corrupted.

Package Versions:
"openai": "^4.13.0", "playht": "^0.9.3"
