k-rt-dev / vgt

68.0 3.0 5.0 6.76 MB

Program to translate Japanese text through image recognition and GPT 3.5

License: MIT License

Python 12.66% JavaScript 86.95% CSS 0.39%

japanese manga ocr translator visual-recognition davinci-003 gpt-3 visual-novel python antd

vgt's People

ovpn-dev rdarge nguyenquivinhquang iveskins hashim-k

vgt's Issues

In a dual-monitor setup, the screenshot grabs the other monitor

In my setup, my main monitor is on the right. I press Ctrl+T to grab and highlight text on my main monitor, but what shows up on the mode 1 page is actually my left monitor.

Edit raw text translation

In practice I find that the manga-ocr sometimes fudges the transcription if there's slight glare or blur in the scanned characters. I'd like to be able to edit the text it outputs to make simple corrections before submitting it for translation.

This is what I've whipped up in my fork, which I think would be useful for others too:
translation-edit.webm

GPT Error

In the box with the translated text, it just says "Error".
I have tried using different API keys, but it still does not work.

Can't capture screen shot

Hi, when I try to capture a screenshot, it shows an error like this:

[1] Error occurred in handler for 'captureScreenshot': TypeError: Error processing argument at index 2, conversion failure from 
[1]     at node:electron/js2c/browser_init:177:945
[1]     at new Promise (<anonymous>)
[1]     at Object.t.getSourcesImpl (node:electron/js2c/browser_init:177:697)
[1]     at Object.getSources (node:electron/js2c/browser_init:45:277)
[1]     at C:\Users\LENOVO\Desktop\ImageDetect\TranslateManga\VGT-master\electron\handlers\ipcHandler.js:138:45
[1]     at node:electron/js2c/browser_init:201:579
[1]     at Object.<anonymous> (node:electron/js2c/browser_init:165:10272)
[1]     at Object.emit (node:events:394:28)

Can you help me?

Feature request: Linux Compatibility

As referenced in #1, some changes are needed to allow non-Windows platforms to compile and run VGT. I briefly had it working on both Linux and Mac, but since I no longer have access to Apple hardware, I'd like to focus on Linux compatibility for now.

From my own experimentation, it seems like Wayland has some incompatibilities with Electron's/Chromium's desktopCapturer module, but Xorg was able to take screenshots without any additional tinkering.

Additionally, some helper functions for determining screen point location are not available cross-platform, so a different way to calculate the screenshot positions/dimensions will be necessary.
See changes here: rDarge@e44e925
and the corresponding scale factor fix here: rDarge@579a13e
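
One platform-independent way to compute the screenshot position and dimensions is to convert the selection rectangle from device-independent pixels into physical pixels using the display's bounds and scale factor. The sketch below is only an illustration of that idea, not the code in the commits linked above; `toPhysicalRect` and its parameter names are hypothetical, and the `display` argument is assumed to have the same shape as Electron's `Display` object (`bounds`, `scaleFactor`).

```javascript
// Hypothetical helper: map a selection given in device-independent pixels
// (screen coordinates) to physical pixels relative to the display's origin.
// `display` is assumed to look like { bounds: { x, y, width, height }, scaleFactor }.
function toPhysicalRect(selection, display) {
  const { bounds, scaleFactor } = display;
  return {
    // Shift into the display's own coordinate space, then scale.
    x: Math.round((selection.x - bounds.x) * scaleFactor),
    y: Math.round((selection.y - bounds.y) * scaleFactor),
    // Round so the crop size is always a whole number of pixels.
    width: Math.round(selection.width * scaleFactor),
    height: Math.round(selection.height * scaleFactor),
  };
}
```

The resulting rectangle could then be used to crop the full-display capture, with rounding keeping every value an integer.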

Lastly, the window names differ slightly from platform to platform, so the backend needs to recognize the default window name for each OS to prevent the app from self-destructing:
rDarge@7f9ee01

Add a license?

This is really cool! I'd like to contribute, but there's no license established in the repo so far. Is collaboration something you're open to, or are you not sure where you'd like to go with the project?

Feature Request: Multiple Monitors

It would be nice to be able to have VGT open on one monitor and the manga open on another monitor - currently VGT only supports one screen (the primary screen) for screenshotting.

Either specifying a particular monitor to focus on or, preferably, creating a capture window on each monitor when the capture shortcut is pressed (so that no additional configuration is necessary) would be a great addition to the features VGT currently offers.
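
A configuration-free variant of this could pick the monitor under the cursor at the moment the shortcut fires. The following is a minimal sketch under stated assumptions, not VGT's implementation: `pickDisplayForPoint` and `pickSourceForDisplay` are hypothetical helpers, the display objects are assumed to have the shape of Electron's `Display` (`id`, `bounds`), and the sources are assumed to carry a `display_id` string as Electron's `desktopCapturer` screen sources do (it may be empty on some platforms).

```javascript
// Hypothetical helper: choose the display whose bounds contain a point.
// `displays` is assumed to be an array of { id, bounds: { x, y, width, height } }.
function pickDisplayForPoint(displays, point) {
  const hit = displays.find(({ bounds }) =>
    point.x >= bounds.x && point.x < bounds.x + bounds.width &&
    point.y >= bounds.y && point.y < bounds.y + bounds.height);
  // Fall back to the first display if the point lies outside every display.
  return hit || displays[0];
}

// Hypothetical helper: find the capture source that belongs to a display.
// Returns null when no source reports a matching display_id.
function pickSourceForDisplay(sources, display) {
  return sources.find((s) => s.display_id === String(display.id)) || null;
}
```

In an Electron main process, the inputs would plausibly come from `screen.getAllDisplays()`, `screen.getCursorScreenPoint()`, and `desktopCapturer.getSources({ types: ['screen'] })`.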

Feature Request: Resubmit only translation

Occasionally OpenAI's API will be down. When this happens and I submit a new clip to be translated, it just returns "Error". I'd like to be able to hit a button to resubmit just the last step of the translation process to the API, rather than having to make a new sample and run OCR again.

Something like the following:
output.webm

You mentioned you would be overhauling the options for translation services here, so maybe you've already done something like this; otherwise I think it would be a good addition even across different translation services.
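
The behaviour described here could be sketched as caching the last OCR result and re-running only the translation call. This is an illustration under assumptions, not the code shown in the video: `createTranslationSession`, `ocr`, and `translate` are hypothetical names, and both steps are injected as functions so the sketch assumes nothing about VGT's backend or any particular translation service.

```javascript
// Hypothetical sketch: keep the most recent OCR text so the translation
// step can be retried on its own after a failed API call.
function createTranslationSession({ ocr, translate }) {
  let lastText = null; // raw text from the most recent OCR pass

  return {
    // Full pipeline: OCR the image, remember the text, then translate it.
    async submit(image) {
      lastText = await ocr(image);
      return translate(lastText);
    },
    // Retry only the translation step. An optional replacement text
    // overrides the cached OCR output before it is sent again.
    async resubmit(editedText) {
      if (editedText !== undefined) lastText = editedText;
      if (lastText === null) throw new Error('Nothing to resubmit yet');
      return translate(lastText);
    },
  };
}
```

Because the text is cached before `translate` is called, a failed translation still leaves something to resubmit, and the same path works regardless of which translation service sits behind `translate`.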

On first startup, OpenAI key input field is 1 character too short

I've observed that when starting VGT for the first time, one of the characters of my key will get truncated after I enter it on the pop-up modal, and subsequent translation requests will fail until this is corrected. I've fixed this locally on my fork (rDarge@bcc1929), and it'd be nice to have it fixed in the OG repo too.

Extending to manhwa

Hi~ I was about to embark on creating something similar for manhwa, and then I found this very nice project.
Backend-wise it should be really easy to extend this, for example using pytesseract.
I would love to create anything you need for the backend. I've created a local version of manga-ocr that uses pytesseract and it's simple enough; I just don't really know how to embed it into your project, as that is a bit more involved.

Using pytesseract, it should be easy to extend to other languages as well. Of course, it's not as good as manga-ocr's OCR (I couldn't find good databases that I could use to copy their approach), but setting pytesseract as the default when there isn't anything better should be great :)

So, basically, tell me how to contribute and I'll have a pull request ready in no time ;p

How do I cleanly delete it?

The program does not offer an uninstall option. Where were the files saved during installation?
