
auto-subtitles's Introduction

Subtitle Generator and API

Automatically generate subtitles from an input audio or video file using OpenAI Whisper.


Setup

Requirements

The following environment variables can be set:

OPENAI_API_KEY=<your-openai-api-key>
AWS_REGION=<your-aws-region> (optional, can also be provided in the payload)
AWS_ACCESS_KEY_ID=<your-aws-access-key-id> (optional, only needed when uploading to S3)
AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key> (optional, only needed when uploading to S3)

Using an .env file is supported. Just rename .env.example to .env and insert your values.
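
As an illustration of how these variables fit together, the following TypeScript sketch checks a set of environment values before the service would start. The names loadConfig and Config are hypothetical and not taken from the project. Treating OPENAI_API_KEY as mandatory is an assumption, based on it being the only variable not marked optional, and so is requiring the two S3 credentials to be set as a pair.

```typescript
// Hypothetical config shape; not part of the auto-subtitles codebase.
interface Config {
  openAiApiKey: string;
  awsRegion: string | undefined;
  awsAccessKeyId: string | undefined;
  awsSecretAccessKey: string | undefined;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): Config {
  const openAiApiKey = env["OPENAI_API_KEY"];
  // Assumption: the key is required because it is not marked optional.
  if (!openAiApiKey) {
    throw new Error("OPENAI_API_KEY is not set");
  }

  const awsAccessKeyId = env["AWS_ACCESS_KEY_ID"];
  const awsSecretAccessKey = env["AWS_SECRET_ACCESS_KEY"];
  // Assumption: the S3 credentials are only useful when both are present.
  if (Boolean(awsAccessKeyId) !== Boolean(awsSecretAccessKey)) {
    throw new Error("Set both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or neither");
  }

  return {
    openAiApiKey,
    awsRegion: env["AWS_REGION"],
    awsAccessKeyId,
    awsSecretAccessKey,
  };
}
```

In a Node.js process this could be called as loadConfig(process.env) once the .env file has been loaded.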

FFmpeg

FFmpeg is required to convert the input file or URL to a format that OpenAI Whisper can process. You can download it from the official FFmpeg website (https://ffmpeg.org/download.html).

Installation / Usage

Starting the service is as simple as running:

npm install
npm start

A Docker image and a docker-compose file are also available:

docker-compose up --build -d

The transcribe service is now up and running and available on port 8000.

Endpoints

Available endpoints are:

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Heartbeat endpoint of service |
| /transcribe | POST | Create a new transcribe job. Provide url in body |
| /transcribe/s3 | POST | Create a new transcribe job and upload result to S3 |

Example requests

To start a new transcribe job, send a POST request to the /transcribe endpoint with a JSON body such as:

{
  "url": "https://example.net/vod-audio_en=128000.aac",
  "language": "en",
  "format": "vtt"
}

language is an optional ISO 639-1 language code (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). format is also optional; the supported formats are json, text, srt, verbose_json, and vtt.
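
A request like this could be built and sent from TypeScript roughly as follows. This is a minimal sketch, not the project's own client: buildTranscribeBody and createTranscribeJob are hypothetical helper names, the base URL assumes the service runs locally on port 8000, the Content-Type header assumes the service expects JSON, and the global fetch assumes Node.js 18 or later.

```typescript
type SubtitleFormat = "json" | "text" | "srt" | "verbose_json" | "vtt";

interface TranscribeBody {
  url: string;
  language?: string;
  format?: SubtitleFormat;
}

// Builds the JSON body, leaving out optional fields that were not provided.
function buildTranscribeBody(
  url: string,
  language?: string,
  format?: SubtitleFormat,
): TranscribeBody {
  const body: TranscribeBody = { url };
  if (language !== undefined) {
    // ISO 639-1 codes are two lowercase letters, e.g. "en" or "sv".
    if (!/^[a-z]{2}$/.test(language)) {
      throw new Error(`Expected a two-letter ISO 639-1 code, got "${language}"`);
    }
    body.language = language;
  }
  if (format !== undefined) {
    body.format = format;
  }
  return body;
}

// Sends the job to the service and returns the parsed JSON response.
// Assumption: the service is reachable at baseUrl and accepts a JSON body.
async function createTranscribeJob(
  body: TranscribeBody,
  baseUrl = "http://localhost:8000",
): Promise<unknown> {
  const response = await fetch(`${baseUrl}/transcribe`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Transcribe request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the body construction separate from the network call makes the validation easy to test without a running service.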

The response will look like this, where result is the WebVTT file as a string:

{
  "workerId": "BFabbcCi3IYuWOj6LfsgK",
  "result": "WEBVTT\n\n00:00:00.000 --> 00:00:04.180\nor into transcoding I mean, I could probably add just the keyframe in the start and just\n\n00:00:04.180 --> 00:00:06.920\nskip I-frames and the rest of that.\n\n"
}

The same result, formatted:

WEBVTT

00:00:00.000 --> 00:00:04.180
or into transcoding I mean, I could probably add just the keyframe in the start and just

00:00:04.180 --> 00:00:06.920
skip I-frames and the rest of that.
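
Because result arrives as a single string, a client may want to split it into individual cues. The sketch below shows one way to do that in TypeScript. parseWebVtt and Cue are hypothetical names, and the parser only handles the simple layout shown in the example response (a WEBVTT header followed by timing lines and text); cue settings, NOTE blocks and styling are not interpreted.

```typescript
interface Cue {
  start: string;
  end: string;
  text: string;
}

// Splits a simple WebVTT string into cues.
function parseWebVtt(vtt: string): Cue[] {
  const cues: Cue[] = [];
  // Blocks are separated by one or more blank lines; the first block is the header.
  for (const block of vtt.split(/(?:\r?\n){2,}/)) {
    const lines = block.split(/\r?\n/).filter((line) => line.trim() !== "");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) {
      continue; // header or empty block, no cue here
    }
    const timing = lines[timingIndex] ?? "";
    const [start = "", end = ""] = timing.split("-->").map((part) => part.trim());
    // Everything after the timing line is the cue text.
    cues.push({ start, end, text: lines.slice(timingIndex + 1).join("\n") });
  }
  return cues;
}
```

Calling parseWebVtt on the result field of the example response yields two cues, one per timing line.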

Contributing

See contributing

Support

Join our community on Slack where you can post any questions regarding any of our open source projects. Eyevinn's consulting business can also offer you:

  • Further development of this component
  • Customization and integration of this component into your platform
  • Support and maintenance agreement

Contact [email protected] if you are interested.

About Eyevinn Technology

Eyevinn Technology is an independent consultant firm specialized in video and streaming, independent in the sense that we are not commercially tied to any platform or technology vendor. As our way to innovate and push the industry forward, we develop proof-of-concepts and tools. We share what we learn and the code we write with the industry through blogs and by open sourcing our code.

Want to know more about Eyevinn and what it is like to work here? Contact us at [email protected]!

auto-subtitles's People

Contributors

dependabot[bot], oscnord, saelmala


auto-subtitles's Issues

Add to video

Great work!
I see that it takes audio as input for generating the subtitles.

Would it be possible to use this to subtitle a full video?
The workflow would be to convert the video to audio, generate the subtitles, and then add them to the original video.
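
One possible way to sketch the FFmpeg side of this workflow in TypeScript is shown below. Nothing here is provided by the project: extractAudioArgs and muxSubtitlesArgs are hypothetical helpers that only build argument lists, the file names are placeholders, and the output is assumed to be an MP4 file, which is why the subtitles are converted to mov_text.

```typescript
// Hypothetical helpers; they only build argument lists and do not run FFmpeg.

// Arguments for extracting the audio track: -vn drops the video stream
// and the audio is encoded as AAC.
function extractAudioArgs(videoPath: string, audioPath: string): string[] {
  return ["-i", videoPath, "-vn", "-c:a", "aac", audioPath];
}

// Arguments for adding a subtitle file as a soft subtitle track.
// Video and audio are copied as-is; the subtitles are converted to mov_text,
// the subtitle codec used in MP4 containers.
function muxSubtitlesArgs(
  videoPath: string,
  subtitlePath: string,
  outputPath: string,
): string[] {
  return ["-i", videoPath, "-i", subtitlePath, "-c", "copy", "-c:s", "mov_text", outputPath];
}
```

In Node.js, each list could be passed to child_process.spawn("ffmpeg", args), with the subtitle file produced by the /transcribe endpoint in between the two steps.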
