gpt-subb's Introduction

GPT Subb

gpt-subb is a command-line tool that translates and converts subtitles using OpenAI's ChatGPT language model.

First things first

  • This tool is not intended to be used commercially.
  • Only translate subtitles of files/videos that you legally own.
  • Do not use this tool to publish unauthorized translations of media you don't have rights to.
  • Yes, a portion of this code (and also the documentation) was written by Chat-GPT itself.

I made this tool partly as a joke and partly as an experiment, based on an idea from a friend. Feel free to contact me if you want more professional software for this purpose.

We also won't refuse a couple of beer donations if you find this useful somehow. 🍻

Installation

To install gpt-subb, you need to have Node.js and npm (Node Package Manager) installed on your system. Once you have these installed, you can run the following command in your terminal:

npm install -g gpt-subb

This will install the tool globally on your system, making it available for use in any directory.

Usage

Prerequisites

To use the tool, you will need an OpenAI API key, which can be obtained at https://platform.openai.com/account/api-keys.

Command

The gpt-subb command requires an input file to be translated as a mandatory argument and an output file as an optional argument. The tool will create the output file with the translated subtitles. If no output file is specified, the tool will create a new file in the same directory with the same name as the input file and the language code appended to the basename.

gpt-subb <input-file> [output-file]
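
The default naming rule can be illustrated with a small sketch. This is not the tool's actual code: `defaultOutputPath` is a hypothetical helper, and the extension mapping (SRT to .srt, WebVTT to .vtt) is an assumption inferred from the movie.fr.vtt example in this README.

```typescript
// Hypothetical helper, not part of gpt-subb: derives the default output path
// by appending the language code to the input basename.
type OutputFormat = "SRT" | "WebVTT";

// Assumed mapping from output format to file extension.
const EXTENSIONS: Record<OutputFormat, string> = { SRT: ".srt", WebVTT: ".vtt" };

function defaultOutputPath(input: string, language: string, format: OutputFormat): string {
  // Find where the file name starts so dots in directory names are ignored.
  const nameStart = Math.max(input.lastIndexOf("/"), input.lastIndexOf("\\")) + 1;
  const dot = input.lastIndexOf(".");
  // Strip the extension only if the dot belongs to the file name itself.
  const stem = dot > nameStart ? input.slice(0, dot) : input;
  return `${stem}.${language}${EXTENSIONS[format]}`;
}

const example = defaultOutputPath("movie.srt", "fr", "WebVTT"); // "movie.fr.vtt"
```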

Options

  • -k, --key <key>: OpenAI API Key. You can get one at https://platform.openai.com/account/api-keys. Required.
  • -b, --batch-size <number>: Maximum number of messages to be sent in a single prompt. Default is 15.
  • -l, --language <language>: Language code to be used in the translation. Default is en-us.
  • -p, --prompt <prompt>: Prompt format to be sent to the OpenAI API. Default format is Translate the following text into [lang] but keep the 6 digit codes between < > intact:\n\n[text].
  • -f, --format <output-format>: Format of the output file. Supported formats are SRT and WebVTT. Default is SRT.
  • -m, --model <model>: OpenAI Model to be used for the translation. Default is gpt-3.5-turbo.
  • --temperature <number>: OpenAI Temperature to be used for the translation. Default is 0.4.
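
As a rough illustration of the prompt format, the sketch below fills the [lang] and [text] placeholders of the default template. `renderPrompt` is a hypothetical name, and the way gpt-subb performs the substitution internally is an assumption.

```typescript
// Default prompt format as documented in the options list above.
const DEFAULT_PROMPT =
  "Translate the following text into [lang] but keep the 6 digit codes between < > intact:\n\n[text]";

// Hypothetical helper, not part of gpt-subb: substitutes both placeholders.
function renderPrompt(template: string, lang: string, text: string): string {
  // split/join replaces every occurrence and inserts the values literally,
  // so "$" sequences in subtitle text are not treated as replacement patterns.
  return template.split("[lang]").join(lang).split("[text]").join(text);
}

const prompt = renderPrompt(DEFAULT_PROMPT, "tr", "<YRD613>\nA-A devil?");
```

A custom template passed with -p would presumably need to keep both placeholders for the substitution to have any effect.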

Environment variables

Any option of this tool can be defined through an environment variable or a .env file placed in the directory from which you run the command.

Example:

GPTSUBB_KEY="ABC-123-XYZ"
GPTSUBB_LANGUAGE="en-us"
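
A minimal sketch of how such a lookup could work. Only GPTSUBB_KEY and GPTSUBB_LANGUAGE appear in this README; the naming rule for other options and the precedence (command line first, then environment, then default) are assumptions, and `envName` and `resolveOption` are hypothetical helpers.

```typescript
type Env = Record<string, string | undefined>;

// Assumed naming rule: "batch-size" -> "GPTSUBB_BATCH_SIZE".
function envName(option: string): string {
  return "GPTSUBB_" + option.replace(/-/g, "_").toUpperCase();
}

// Assumed precedence: command-line value, then environment, then default.
function resolveOption(option: string, cli: Env, env: Env, fallback?: string): string | undefined {
  return cli[option] ?? env[envName(option)] ?? fallback;
}

// With no -l flag given, the environment value wins over the default.
const language = resolveOption("language", {}, { GPTSUBB_LANGUAGE: "pt-br" }, "en-us");
```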

Examples

Translate the input.srt file to Portuguese (pt-br) and save the result to output.srt file:

gpt-subb -k YOUR_API_KEY -l pt-br input.srt output.srt 

Translate the movie.srt file to French (fr) and save the result to movie.fr.vtt file:

gpt-subb -l fr -f WebVTT movie.srt

gpt-subb's People

Contributors

skyatura

gpt-subb's Issues

Use completion api over chat api

The Chat API is not the best fit for translation-only tasks. The tool should use the Completion API, which has more fine-grained options for getting consistent results.

There is also a new candidate API that could replace it: the Edit API. It is still in beta, but we may end up using it, since this is exactly its use case.

"Continue" mode

There must be an option to start from where a previous run stopped.

This is a follow up feature of #8
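
One possible shape for this, as a sketch only: skip the cues that a previous run already finished and batch the rest. `remainingBatches` and `chunk` are hypothetical helpers; how the tool would learn the number of finished cues (for example, by counting cues in a partially written output file) is left open here.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("batch size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical resume step: drop the first `alreadyDone` cues, batch the rest.
// The default of 15 mirrors the documented --batch-size default.
function remainingBatches<T>(cues: T[], alreadyDone: number, batchSize = 15): T[][] {
  return chunk(cues.slice(alreadyDone), batchSize);
}

const resumed = remainingBatches([1, 2, 3, 4, 5, 6, 7], 3, 2); // [[4, 5], [6, 7]]
```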

sometimes getting prompt failed.

Using Arch Linux, node 16, command gpt-subb -m gpt-3.5-turbo -l tr -f SRT tesst/[Judas]\ High\ School\ DxD\ S1\ -\ 01.srt

  prompt: 'Translate the following text into tr but keep the 6 digit codes between < > intact:\n' +
    '\n' +
    '<YRD613>\n' +
    '<font face="ITC Stone Sans Std Medium" size="52"><b>A-A devil?</b></font>\n' +
    '<000000>\n' +
    '<XJK848>\n' +
    '<font face="ITC Stone Sans Std Medium" size="52"><b>And your master.</b></font>\n' +
    '<000000>\n' +
    '<JIQ772>\n' +
    '<font face="ITC Stone Sans Std Medium" size="52"><b>Nice to have you, Hyoudou Issei-kun.</b></font>\n' +
    '<000000>\n' +
    '<IFM795>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#fdfdfd"><font color="#f91600">High School DxD</font></font></font>\n' +
    '<000000>\n' +
    '<EQX360>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#ff5169"><font color="#f91600">High School DxD</font></font></font>\n' +
    '<000000>\n' +
    '<ZBX292>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#ff5064">High School DxD</font></font>\n' +
    '<000000>\n' +
    '<IBN377>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#ff5300">High School DxD</font></font>\n' +
    '<000000>\n' +
    '<LUI665>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#f19498">High School DxD</font></font>\n' +
    '<000000>\n' +
    '<EAA572>\n' +
    '<font face="Boopee" size="50" color="#d0d52a"><font color="#dcd0f8">High School DxD</font></font>\n' +
    '<000000>\n' +
    '<XGY981>\n' +
    '<font face="Aurulent Sans" size="50">The lines come together with a distant voice calling in the skies,</font>\n' +
    '<000000>\n' +
    '<ROG759>\n' +
    '<font face="Baar Sophia" size="50">{\\an8}</font><font face="ITC Stone Sans Std Medium" size="52"><b><font color="#78595b"><font color="#ffffff">ma</font></font></b></font>\n' +
    '<000000>\n' +
    '<OYX736>\n' +
    '<font face="Baar Sophia" size="50">{\\an8}</font><font face="ITC Stone Sans Std Medium" size="52"><b><font color="#78595b"><font color="#ffffff">ji</font></font></b></font>\n' +
    '<000000>\n' +
    '<FLZ693>\n' +
    '<font face="Baar Sophia" size="50">{\\an8}</font><font face="ITC Stone Sans Std Medium" size="52"><b><font color="#78595b"><font color="#ffffff">wa</font></font></b></font>\n' +
    '<000000>\n' +
    '<OUH089>\n' +
    '<font face="Baar Sophia" size="50">{\\an8}</font><font face="ITC Stone Sans Std Medium" size="52"><b><font color="#78595b"><font color="#ffffff">ri</font></font></b></font>\n' +
    '<000000>\n' +
    '<OJI981>\n' +
    '<font face="Baar Sophia" size="50">{\\an8}</font><font face="ITC Stone Sans Std Medium" size="52"><b><font color="#78595b"><font color="#ffffff">a</font></font></b></font>',
  error: Error: Request failed with status code 502
      at createError (/home/tbb/.nvm/versions/node/v16.20.0/lib/node_modules/gpt-subb/node_modules/openai/node_modules/axios/lib/core/createError.js:16:15)
      at settle (/home/tbb/.nvm/versions/node/v16.20.0/lib/node_modules/gpt-subb/node_modules/openai/node_modules/axios/lib/core/settle.js:17:12)
      at IncomingMessage.handleStreamEnd (/home/tbb/.nvm/versions/node/v16.20.0/lib/node_modules/gpt-subb/node_modules/openai/node_modules/axios/lib/adapters/http.js:322:11)
      at IncomingMessage.emit (node:events:525:35)
      at endReadableNT (node:internal/streams/readable:1358:12)
      at processTicksAndRejections (node:internal/process/task_queues:83:21) {
    config: {
      transitional: [Object],
      adapter: [Function: httpAdapter],
      transformRequest: [Array],
      transformResponse: [Array],
      timeout: 0,
      xsrfCookieName: 'XSRF-TOKEN',
      xsrfHeaderName: 'X-XSRF-TOKEN',
      maxContentLength: -1,
      maxBodyLength: -1,
      validateStatus: [Function: validateStatus],
      headers: [Object],
      method: 'post',
      data: '{"model":"gpt-3.5-turbo","messages":[{"content":"Translate the following text into tr but keep the 6 digit codes between < > intact:\\n\\n<YRD613>\\n<font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b>A-A devil?</b></font>\\n<000000>\\n<XJK848>\\n<font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b>And your master.</b></font>\\n<000000>\\n<JIQ772>\\n<font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b>Nice to have you, Hyoudou Issei-kun.</b></font>\\n<000000>\\n<IFM795>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#fdfdfd\\"><font color=\\"#f91600\\">High School DxD</font></font></font>\\n<000000>\\n<EQX360>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#ff5169\\"><font color=\\"#f91600\\">High School DxD</font></font></font>\\n<000000>\\n<ZBX292>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#ff5064\\">High School DxD</font></font>\\n<000000>\\n<IBN377>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#ff5300\\">High School DxD</font></font>\\n<000000>\\n<LUI665>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#f19498\\">High School DxD</font></font>\\n<000000>\\n<EAA572>\\n<font face=\\"Boopee\\" size=\\"50\\" color=\\"#d0d52a\\"><font color=\\"#dcd0f8\\">High School DxD</font></font>\\n<000000>\\n<XGY981>\\n<font face=\\"Aurulent Sans\\" size=\\"50\\">The lines come together with a distant voice calling in the skies,</font>\\n<000000>\\n<ROG759>\\n<font face=\\"Baar Sophia\\" size=\\"50\\">{\\\\an8}</font><font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b><font color=\\"#78595b\\"><font color=\\"#ffffff\\">ma</font></font></b></font>\\n<000000>\\n<OYX736>\\n<font face=\\"Baar Sophia\\" size=\\"50\\">{\\\\an8}</font><font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b><font color=\\"#78595b\\"><font 
color=\\"#ffffff\\">ji</font></font></b></font>\\n<000000>\\n<FLZ693>\\n<font face=\\"Baar Sophia\\" size=\\"50\\">{\\\\an8}</font><font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b><font color=\\"#78595b\\"><font color=\\"#ffffff\\">wa</font></font></b></font>\\n<000000>\\n<OUH089>\\n<font face=\\"Baar Sophia\\" size=\\"50\\">{\\\\an8}</font><font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b><font color=\\"#78595b\\"><font color=\\"#ffffff\\">ri</font></font></b></font>\\n<000000>\\n<OJI981>\\n<font face=\\"Baar Sophia\\" size=\\"50\\">{\\\\an8}</font><font face=\\"ITC Stone Sans Std Medium\\" size=\\"52\\"><b><font color=\\"#78595b\\"><font color=\\"#ffffff\\">a</font></font></b></font>","role":"user"}],"temperature":0.4}',
      url: 'https://api.openai.com/v1/chat/completions'
    },
    request: ClientRequest {
      _events: [Object: null prototype],
      _eventsCount: 7,
      _maxListeners: undefined,
      outputData: [],
      outputSize: 0,
      writable: true,
      destroyed: false,
      _last: true,
      chunkedEncoding: false,
      shouldKeepAlive: false,
      maxRequestsOnConnectionReached: false,
      _defaultKeepAlive: true,
      useChunkedEncodingByDefault: true,
      sendDate: false,
      _removedConnection: false,
      _removedContLen: false,
      _removedTE: false,
      strictContentLength: false,
      _contentLength: 2478,
      _hasBody: true,
      _trailer: '',
      finished: true,
      _headerSent: true,
      _closed: false,
      socket: [TLSSocket],
      _header: 'POST /v1/chat/completions HTTP/1.1\r\n' +
        'Accept: application/json, text/plain, */*\r\n' +
        'Content-Type: application/json\r\n' +
        'User-Agent: OpenAI/NodeJS/3.2.1\r\n' +
        'Authorization: Bearer (they were key )n' +
        'Content-Length: 2478\r\n' +
        'Host: api.openai.com\r\n' +
        'Connection: close\r\n' +
        '\r\n',
      _keepAliveTimeout: 0,
      _onPendingData: [Function: nop],
      agent: [Agent],
      socketPath: undefined,
      method: 'POST',
      maxHeaderSize: undefined,
      insecureHTTPParser: undefined,
      path: '/v1/chat/completions',
      _ended: true,
      res: [IncomingMessage],
      aborted: false,
      timeoutCb: null,
      upgradeOrConnect: false,
      parser: null,
      maxHeadersCount: null,
      reusedSocket: false,
      host: 'api.openai.com',
      protocol: 'https:',
      _redirectable: [Writable],
      [Symbol(kCapture)]: false,
      [Symbol(kBytesWritten)]: 0,
      [Symbol(kEndCalled)]: true,
      [Symbol(kNeedDrain)]: false,
      [Symbol(corked)]: 0,
      [Symbol(kOutHeaders)]: [Object: null prototype],
      [Symbol(kUniqueHeaders)]: null
    },
    response: {
      status: 502,
      statusText: 'Bad Gateway',
      headers: [Object],
      config: [Object],
      request: [ClientRequest],
      data: '<html>\r\n' +
        '<head><title>502 Bad Gateway</title></head>\r\n' +
        '<body>\r\n' +
        '<center><h1>502 Bad Gateway</h1></center>\r\n' +
        '<hr><center>cloudflare</center>\r\n' +
        '</body>\r\n' +
        '</html>\r\n'
    },
    isAxiosError: true,
    toJSON: [Function: toJSON]
  }
}
Translating queue 25 of 35... failed! (skipping)
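
The log above shows a 502 Bad Gateway response, after which the batch was skipped. Gateway errors like this are usually transient, so retrying the batch before skipping it may help. Below is a minimal sketch of retry with exponential backoff. None of it is gpt-subb code: `withRetry`, `isRetryable` and `backoffDelay` are hypothetical, and the list of retryable status codes is an assumption.

```typescript
// Assumed set of transient HTTP status codes worth retrying.
const RETRYABLE_STATUSES = [429, 500, 502, 503, 504];

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.includes(status);
}

// Delay doubles on each attempt (1s, 2s, 4s, ...) up to a cap.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs `run`, retrying on retryable statuses; rethrows anything else.
// `statusOf` extracts the HTTP status from an error; for the error shape
// in the log above that would be `err.response.status`.
async function withRetry<T>(
  run: () => Promise<T>,
  statusOf: (err: unknown) => number | undefined,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      const status = statusOf(err);
      if (attempt >= maxRetries || status === undefined || !isRetryable(status)) {
        throw err;
      }
      await sleep(backoffDelay(attempt));
    }
  }
}
```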

[1.1.x] Refactor the entire tool

Since I first created this project, my understanding of OpenAI's API and LLMs has improved significantly. That being said, I think this deserves an update to implement things I didn't know back then, along with better error handling.

Goals for the new version

Multiline support is missing

When a subtitle spans multiple lines, only the first line is written back to the subtitle file.

Translation queues:

  '<VPE875>\n- I like you.\n- I like you.',
  '<STX511>\nSome unexpected costs have come up.',
  '<YVT450>\n- What do you need?\n- $50,000.',
  '<GHN797>\nYou bet the money that my\nfather gave us to pay the lease?',
  "<VMJ674>\nRANJIT: It's gone, man. It's all gone.",
  "<DVU397>\nMALIKA: Why did Jack Hauss\nvote against the women's center",
  '<BPI409>\nwhen he told me, to my face,\nthat he would support it?',
[...]

srt file:

1
00:00:00,658 --> 00:00:02,704
- I like you.

2
00:00:08,927 --> 00:00:11,582
Some unexpected costs have come up.

3
00:00:11,626 --> 00:00:14,150
- What do you need?

4
00:00:14,413 --> 00:00:17,155
You bet the money that my

5
00:00:17,199 --> 00:00:18,766
RANJIT: It's gone, man. It's all gone.

6
00:00:18,809 --> 00:00:20,724
MALIKA: Why did Jack Hauss

7
00:00:20,768 --> 00:00:22,595
when he told me, to my face,

[...]
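
The queue entries above still contain every line, so the loss likely happens when an entry is split back into a cue. The sketch below shows one way to keep all lines. It is not the tool's actual parser: `parseQueueEntry` is hypothetical, and the code pattern (three uppercase letters followed by three digits) is an assumption based on the entries shown.

```typescript
interface TranslatedCue {
  code: string;
  text: string;
}

// Hypothetical parser: splits "<CODE>\nline 1\nline 2" into code and full text.
function parseQueueEntry(entry: string): TranslatedCue | undefined {
  // [\s\S]* matches across newlines, so every line after the code is kept.
  const match = /^<([A-Z]{3}\d{3})>\n([\s\S]*)$/.exec(entry);
  if (!match) return undefined;
  return { code: match[1], text: match[2] };
}

const cue = parseQueueEntry("<GHN797>\nYou bet the money that my\nfather gave us to pay the lease?");
// cue.text keeps both lines, separated by "\n".
```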

Budget limit

TL;DR: OpenAI is draining my money while I try to design better prompts, and I need to pause for a while.


I finally took some time to improve this project, and I have many things in mind for it, as indicated by the issues I've tracked earlier today.

However, even though this is an open source project and I don't intend to charge for it, developing with OpenAI's API can become really expensive in just a few minutes, especially because subtitles are large files that consume many tokens on each run, and designing consistent prompts requires a lot of trials.

That being said, since I have already reached my monthly budget, I will need to pause development until the next billing cycle.

Better default prompt

A well-designed prompt helps to get consistent results over many iterations.

The prompt must guarantee:

  • The cue reference must not be modified
  • The whole cue will be used in the translation
  • The cue line count must match the original one whenever possible
  • Surrounding cues could be used as context for better translation
  • The prompt could allow users to add additional instructions, like
    • Avoid bad words
    • Explain technical terms
    • Explain abbreviations
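
Some of these guarantees can also be checked after the model responds, which makes it easier to compare prompt candidates. A minimal sketch, assuming a hypothetical `checkTranslation` helper and a simple cue shape; it covers only the reference and line-count rules from the list above.

```typescript
interface Cue {
  code: string; // cue reference, e.g. the code between < >
  text: string; // cue text, lines separated by "\n"
}

// Hypothetical check: returns a list of problems, empty when the cue looks fine.
function checkTranslation(original: Cue, translated: Cue): string[] {
  const problems: string[] = [];
  if (translated.code !== original.code) {
    problems.push("cue reference was modified");
  }
  if (translated.text.trim() === "") {
    problems.push("translation is empty");
  }
  const lineCount = (s: string) => s.split("\n").length;
  if (lineCount(translated.text) !== lineCount(original.text)) {
    problems.push("line count differs from the original");
  }
  return problems;
}

// A two-line cue translated as a single line is flagged.
const problems = checkTranslation(
  { code: "YVT450", text: "- What do you need?\n- $50,000." },
  { code: "YVT450", text: "- What do you need?" },
);
```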

Bonus features

I will also experiment with things like adding cultural context for jokes and references that may be lost in translation.

TypeError

I can't seem to get this working properly, either on Windows or in Colab.
The output is the same in both; any pointers in the right direction are greatly appreciated.

Converting translated cues to objects...
TypeError: undefined is not iterable (cannot read property Symbol(Symbol.iterator))
    at /usr/local/lib/node_modules/gpt-subb/dist/index.js:167:33
    at Array.map (<anonymous>)
    at translatedToCues (/usr/local/lib/node_modules/gpt-subb/dist/index.js:166:29)
    at /usr/local/lib/node_modules/gpt-subb/dist/index.js:202:28
    at Generator.next (<anonymous>)
    at fulfilled (/usr/local/lib/node_modules/gpt-subb/dist/index.js:29:58)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Only the first line of multiline subtitles is translated.

Input:

12
00:03:06,733 --> 00:03:11,832
- ¿Está a 2000 metros de altitud?
- ¡No, doctor!

13
00:03:11,852 --> 00:03:15,871
- ¡Quizá 1950!
- ¿Se puede llegar allí con Jeep?

Output:

12
00:03:06,733 --> 00:03:11,832
- 2000 metre yükseklikte mi?

13
00:03:11,852 --> 00:03:15,871
- Belki 1950!

Output file stream

There must be a flag to write changes to the output file as soon as each cue is translated.

This allows us to inspect the results in real time, and also avoids the need to process the whole file again if some line needs to be fixed or something went wrong.
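
A rough sketch of the idea, under assumptions: `formatSrtCue` and `writeAsTranslated` are hypothetical helpers, and `write` stands in for whatever appends to the output file (in Node.js that could be a call to `fs.appendFileSync`).

```typescript
interface TimedCue {
  index: number;
  start: string; // e.g. "00:00:00,658"
  end: string;   // e.g. "00:00:02,704"
  text: string;  // may span several lines
}

// Formats one cue as an SRT block: index, timing line, text, blank separator.
function formatSrtCue(cue: TimedCue): string {
  return `${cue.index}\n${cue.start} --> ${cue.end}\n${cue.text}\n\n`;
}

// Writes each cue as soon as it is produced instead of buffering the file.
// Returns how many cues were written.
function writeAsTranslated(cues: Iterable<TimedCue>, write: (chunk: string) => void): number {
  let written = 0;
  for (const cue of cues) {
    write(formatSrtCue(cue));
    written++;
  }
  return written;
}

// Example with an in-memory sink in place of a file.
const chunks: string[] = [];
const count = writeAsTranslated(
  [{ index: 1, start: "00:00:00,658", end: "00:00:02,704", text: "- I like you.\n- I like you." }],
  (chunk) => chunks.push(chunk),
);
```

Because each cue is flushed on its own, a failed run leaves a valid partial file behind, which is also what a resume option would need.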
