k-rt-dev / vgt
Program to translate Japanese text through image recognition and GPT 3.5
License: MIT License
In my specific setup, my main monitor is on the right. When I press Ctrl + T to grab and highlight text on my main monitor, what shows up on the mode 1 page is actually my left monitor.
In practice I find that manga-ocr sometimes fudges the transcription if there's slight glare or blur in the scanned characters. I'd like to be able to edit the text it outputs to make simple corrections before submitting it for translation.
This is what I've whipped up in my fork, which I think would be useful for others too:
translation-edit.webm
Hi, when I try to capture a screenshot, it shows an error like this:
[1] Error occurred in handler for 'captureScreenshot': TypeError: Error processing argument at index 2, conversion failure from
[1] at node:electron/js2c/browser_init:177:945
[1] at new Promise (<anonymous>)
[1] at Object.t.getSourcesImpl (node:electron/js2c/browser_init:177:697)
[1] at Object.getSources (node:electron/js2c/browser_init:45:277)
[1] at C:\Users\LENOVO\Desktop\ImageDetect\TranslateManga\VGT-master\electron\handlers\ipcHandler.js:138:45
[1] at node:electron/js2c/browser_init:201:579
[1] at Object.<anonymous> (node:electron/js2c/browser_init:165:10272)
[1] at Object.emit (node:events:394:28)
Can you help me?
As referenced in #1, some changes need to be made to allow non-Windows platforms to compile and run VGT. I briefly had it working on both Linux and Mac, but since I no longer have access to Apple hardware, I'd like to focus on Linux compatibility for now.
From my own experimentation, it seems that Wayland has some incompatibilities with Electron's/Chromium's desktopCapturer module, but Xorg was able to take screenshots without any additional tinkering.
Additionally, some helper functions for determining screen point location are not available cross-platform, so a different way to calculate the screenshot positions/dimensions will be necessary.
See changes here: rDarge@e44e925
and the corresponding scale factor fix here: rDarge@579a13e
Lastly, the window names differ slightly from platform to platform, so the backend needs to recognize the default window name for each OS to prevent the app from self-destructing:
rDarge@7f9ee01
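A minimal sketch of the screenshot position and scale-factor calculation mentioned above, written in Python for illustration only. It assumes the selection rectangle is reported in logical (scaled) desktop coordinates while the captured image is in physical pixels. `Rect`, `to_physical_pixels`, and the example numbers are hypothetical and are not taken from the linked commits.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def to_physical_pixels(
    selection: Rect, display_origin: Tuple[int, int], scale_factor: float
) -> Rect:
    """Map a selection in logical desktop coordinates onto the pixel grid
    of a screenshot taken of the display the selection was drawn on."""
    origin_x, origin_y = display_origin
    # Make the selection relative to the top-left corner of its display.
    local_x = selection.x - origin_x
    local_y = selection.y - origin_y
    # The screenshot is in physical pixels, so scale every logical value.
    return Rect(
        x=round(local_x * scale_factor),
        y=round(local_y * scale_factor),
        width=round(selection.width * scale_factor),
        height=round(selection.height * scale_factor),
    )


# Hypothetical example: a display whose origin is at (1920, 0), scaled 150%.
crop = to_physical_pixels(
    Rect(2020, 150, 400, 80), display_origin=(1920, 0), scale_factor=1.5
)
print(crop)
```

Keeping the arithmetic in one pure function avoids relying on platform-specific screen-point helpers; only the display origin and scale factor need to come from the windowing layer.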
This is really cool! I'd like to contribute, but there's no license established in the repo so far. Is collaboration something you're open to, or are you not sure where you'd like to go with the project?
It would be nice to be able to have VGT open on one monitor and the manga open on another. Currently VGT only supports one screen (the primary screen) for screenshotting.
Either specifying a particular monitor to focus on or, preferably, creating a capture window on each screen when the capture shortcut is pressed (so that no additional configuration is necessary) would be a great addition to the features VGT currently offers.
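One way to pick the monitor to capture, sketched here in Python under stated assumptions, is to look up which display's bounds contain the point where the selection started. The `Display` class, `display_at` function, and the two-monitor layout are hypothetical; in VGT itself the bounds would come from the windowing layer rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Display:
    id: int
    x: int        # left edge in desktop coordinates (may be negative)
    y: int        # top edge in desktop coordinates
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Right and bottom edges are exclusive so adjacent displays never overlap.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def display_at(displays: List[Display], px: int, py: int) -> Optional[Display]:
    """Return the display containing the point, or None if it is off-screen."""
    for display in displays:
        if display.contains(px, py):
            return display
    return None


# Hypothetical layout: primary monitor on the right, secondary to its left.
displays = [
    Display(id=1, x=-1920, y=0, width=1920, height=1080),
    Display(id=2, x=0, y=0, width=2560, height=1440),
]
target = display_at(displays, 500, 300)
print(target.id if target else "no display")
```

Note that a monitor placed to the left of the primary one has negative x coordinates, which is an easy case to get wrong when only the primary screen is considered.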
Occasionally OpenAI's API will be down. When this happens, submitting a new clip to be translated just returns "Error". I'd like to be able to hit a button to resubmit just the last step of the translation process to the API, rather than having to make a new sample and run OCR again.
Something like the following:
output.webm
You mentioned you would be overhauling the options for translation services here, so maybe you've already done something like this; otherwise I think it would be a good addition even across different translation services.
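The resubmit idea can be sketched as keeping the last OCR result around so that only the translation call is repeated. This is a minimal Python illustration, not the code from the fork: `TranslationSession`, `flaky_translate`, and the injected `translate` callable are hypothetical stand-ins for whatever translation service is configured.

```python
from typing import Callable, Optional


class TranslationSession:
    """Remembers the most recent OCR text so translation can be retried."""

    def __init__(self, translate: Callable[[str], str]):
        self._translate = translate
        self.last_text: Optional[str] = None

    def submit(self, ocr_text: str) -> str:
        # Store the OCR output before calling the API, so a failure keeps it.
        self.last_text = ocr_text
        return self._run()

    def retry(self) -> str:
        # Repeat only the translation step; no new screenshot or OCR pass.
        if self.last_text is None:
            raise RuntimeError("nothing to retry yet")
        return self._run()

    def _run(self) -> str:
        try:
            return self._translate(self.last_text)
        except Exception:
            return "Error"


# Stand-in translator that fails on the first call, as if the API were down.
calls = {"count": 0}


def flaky_translate(text: str) -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("API unavailable")
    return f"translated({text})"


session = TranslationSession(flaky_translate)
first = session.submit("こんにちは")
second = session.retry()
print(first, second)
```

Because the translator is passed in as a plain callable, the same retry logic would apply regardless of which translation service sits behind it.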
I've observed that when starting VGT for the first time, one of the characters of my key will get truncated after I enter it on the pop-up modal, and subsequent translation requests will fail until this is corrected. I've fixed this locally on my fork (rDarge@bcc1929), and it'd be nice to have it fixed in the OG repo too.
Hi~ I was about to embark on creating something similar for manhwa, and then I found this very nice project.
Backend-wise it should be really easy to extend this, for example using pytesseract.
I would love to create anything you need for the backend. I've created a local version of manga-ocr that uses pytesseract and it's simple enough; I just don't really know how to embed it into your project, as that is a bit more involved.
Using pytesseract, it should be possible to extend to other languages easily as well. Of course, it's not as good as manga-ocr's OCR (I couldn't find good databases that I could use to copy their approach), but setting pytesseract as the default when there isn't anything better should be great :)
So, basically, tell me how to contribute and I'll have a pull request ready in no time ;p
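The "default when there isn't anything better" idea above could be sketched as a small lookup that prefers a specialised engine per language and falls back to a general one otherwise. Everything here is hypothetical: `select_engine` is not part of VGT, and the two engines are fakes standing in for manga-ocr and a pytesseract wrapper so that the example runs without either library installed.

```python
from typing import Callable, Dict

# An OCR engine is modelled as a function from image bytes to recognised text.
OcrEngine = Callable[[bytes], str]


def select_engine(
    language: str, specialised: Dict[str, OcrEngine], fallback: OcrEngine
) -> OcrEngine:
    """Use the specialised engine for a language if one is registered,
    otherwise use the general-purpose fallback."""
    return specialised.get(language, fallback)


def fake_manga_ocr(image: bytes) -> str:
    # Stand-in for a manga-ocr call.
    return "manga-ocr result"


def fake_tesseract(image: bytes) -> str:
    # Stand-in for a pytesseract wrapper; a real one might call
    # pytesseract.image_to_string(image, lang=...) on a PIL image.
    return "tesseract result"


specialised = {"ja": fake_manga_ocr}

japanese_engine = select_engine("ja", specialised, fake_tesseract)
korean_engine = select_engine("ko", specialised, fake_tesseract)
print(japanese_engine(b""), "|", korean_engine(b""))
```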
There is no way to uninstall the program. Where were the files saved during installation?