MangaQuick: Automatic Manga Translator

Description

MangaQuick is a Streamlit-powered web application for the automatic translation of manga. The tool is part of my Final Degree Project, Diseño y desarrollo de un traductor de comics ("Design and development of a comic translator"; UPM, written in Spanish). It translates manga pages either one at a time or in batches. The application integrates Manga Text Segmentation for text segmentation and detection, LaMa for image inpainting, and manga-ocr for optical character recognition.

Installation

Prerequisites

It's highly recommended to use a virtual environment to manage dependencies and isolate the project. conda is a great tool for this purpose:

Create a new conda environment named 'MangaQuick' with Python 3.11

conda create --name MangaQuick python=3.11

Activate the 'MangaQuick' environment

conda activate MangaQuick

Step-by-Step Installation

  1. Clone the MangaQuick repository:

    git clone https://github.com/yourusername/MangaQuick.git
  2. Navigate to the MangaQuick directory:

    cd MangaQuick
  3. Install the required dependencies:

    pip install -r requirements.txt

GPU Support

To use a GPU, install the PyTorch build that matches your system and CUDA setup. The appropriate installation commands are listed at:

https://pytorch.org/get-started/locally/

This application has been tested on an RTX 3080 GPU with 10GB of VRAM, and it uses nearly all of that memory. For smooth operation, a GPU with at least 10GB of VRAM is recommended.

The application supports CPU usage as well, with options to select either CPU or GPU for each different model within the web interface. The Text Segmentation model is the most resource-intensive component.
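As a rough way to decide between "cuda" and "cpu" before launching the app, the following sketch checks whether PyTorch can see a GPU with enough memory. The function name `pick_device` and the 9.5 GB default threshold are assumptions made for this example and are not part of MangaQuick; the threshold sits slightly below 10 because the driver may report a little less than the nominal VRAM.

```python
def pick_device(min_vram_gb: float = 9.5) -> str:
    """Return "cuda" when a GPU with enough memory is visible, else "cpu".

    Hypothetical helper for illustration; MangaQuick itself lets you choose
    the device per model in the web interface.
    """
    try:
        import torch  # only available once PyTorch is installed
    except ImportError:
        return "cpu"

    if not torch.cuda.is_available():
        return "cpu"

    # total_memory is reported in bytes for the first visible GPU.
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    total_gb = total_bytes / 1024**3
    return "cuda" if total_gb >= min_vram_gb else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with a GPU, the installed PyTorch build most likely does not match the local CUDA setup.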

Text segmentation model

To download the Text Segmentation model, visit the Manga Text Segmentation GitHub repository. The repository offers 5 model variants; download one or all of them to switch between them in the web application.

Create a models folder inside components/text_detection and place the downloaded .pkl model file(s) inside it following this directory structure:

components/text_detection/models/fold.0.-.final.refined.model.2.pkl
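To confirm that the model files ended up in the right place, a small check like the one below can list the .pkl files in that folder. The helper `list_segmentation_models` is hypothetical and only illustrates the expected layout; it is not how the application itself discovers models.

```python
from pathlib import Path

# Folder described above, relative to the repository root.
MODELS_DIR = Path("components/text_detection/models")


def list_segmentation_models(models_dir: Path = MODELS_DIR) -> list[str]:
    """Return the sorted names of the .pkl files in models_dir.

    An empty list means the folder is missing or holds no model yet.
    """
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.glob("*.pkl"))


print(list_segmentation_models())
```

Run it from the MangaQuick directory so the relative path resolves correctly.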

LaMa model

Download the LaMa inpainting model (big-lama) with the following commands, which fetch the archive from Hugging Face:

curl -LJO https://huggingface.co/smartywu/big-lama/resolve/main/big-lama.zip
unzip big-lama.zip

Create a models folder inside components/image_inpainting and move the big-lama folder into it, resulting in the following path: components/image_inpainting/models/big-lama
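The folder creation and move can also be scripted. The sketch below assumes the archive has already been unzipped into a big-lama folder; `install_lama` is a hypothetical helper written for this example, not part of the repository.

```python
import shutil
from pathlib import Path


def install_lama(unzipped: Path, repo_root: Path) -> Path:
    """Move the unzipped big-lama folder under components/image_inpainting/models."""
    target_dir = repo_root / "components" / "image_inpainting" / "models"
    target_dir.mkdir(parents=True, exist_ok=True)

    destination = target_dir / "big-lama"
    if destination.exists():
        # Already in place: leave the existing copy untouched.
        return destination

    shutil.move(str(unzipped), str(destination))
    return destination
```

For example, `install_lama(Path("big-lama"), Path("."))` run from the MangaQuick directory would produce the path shown above.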

Usage

To start using MangaQuick, follow these steps:

  1. Launch the application:
    streamlit run MangaQuick.py

Upon launching, you will see the MangaQuick web interface in your browser:

Streamlit page (source: manga109, © Yagami Ken)

Main Features

  • Text segmentation: Select the preferred model and the processing unit, either GPU ("cuda") or CPU ("cpu"), to fit your hardware capabilities.
  • Text block detection: Set the mask dilation and remove unnecessary text blocks, which is particularly useful for reducing false positives.
  • OCR: Select either GPU ("cuda") or CPU ("cpu").
  • Translation: Enter your DeepL API key and select the target language for the translation.
  • Inpainting: Select either GPU ("cuda") or CPU ("cpu").
  • Text injection: Choose the appropriate font size and style. The font style should match the target language for a coherent look.
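To give an idea of what the mask dilation option does, here is a minimal pure-Python sketch of binary dilation with a 3x3 neighbourhood: every text pixel grows by one pixel in each direction per iteration, so the mask covers the edges of the lettering. This is only an illustration; it is not MangaQuick's implementation, which may rely on an image-processing library and a different kernel.

```python
def dilate(mask: list[list[int]], iterations: int = 1) -> list[list[int]]:
    """Binary dilation of a non-empty 0/1 mask with a 3x3 square neighbourhood."""
    height, width = len(mask), len(mask[0])
    for _ in range(iterations):
        grown = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    # Switch on the pixel and its neighbours, clipped to the image.
                    for ny in range(max(0, y - 1), min(height, y + 2)):
                        for nx in range(max(0, x - 1), min(width, x + 2)):
                            grown[ny][nx] = 1
        mask = grown
    return mask


# A single text pixel in the middle of a 5x5 mask.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1
for row in dilate(mask):
    print(row)
```

A larger dilation hides more of the original text before inpainting, at the cost of erasing more of the surrounding artwork.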

DeepL

To store your DeepL key, create a .env file and include the following line:

DEEPL_KEY=<your_deepl_key>
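For reference, this is roughly how a key stored that way can be read back. The sketch uses only the standard library and a hypothetical helper, `read_deepl_key`; the application itself may load the file differently (for example with a dotenv package), so treat this as an assumption rather than the actual code path.

```python
import os
from pathlib import Path
from typing import Optional


def read_deepl_key(env_path: Path = Path(".env")) -> Optional[str]:
    """Return DEEPL_KEY from the environment, falling back to a .env file."""
    key = os.environ.get("DEEPL_KEY")
    if key:
        return key
    if not env_path.is_file():
        return None

    for line in env_path.read_text(encoding="utf-8").splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.strip() == "DEEPL_KEY":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("\"'") or None
    return None
```

Keep the .env file out of version control so the key is not published with the repository.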

Modifying Detection Boxes

  • Activate the Modify text boxes option to enable editing.
  • Double-click any detection box you want to exclude. This is particularly useful for eliminating unnecessary or incorrect detections.
  • Only removal is supported; boxes cannot be moved, resized, or otherwise modified.

Streamlit modify (source: manga109, © Yagami Ken)

  1. All processing steps run in one go once you click the "Process Files" button. Select the detection boxes to remove and make any other changes before clicking it.

  2. When multiple files are uploaded, they are processed as a batch, stage by stage: every image goes through text segmentation, then every image goes through text block detection, and so on, rather than each image being processed from start to finish before the next one begins. As a result, you can adjust the text boxes for all uploaded images at once.

  3. Once the images are processed, you can download the translated manga as a zip file, ready for reading in your chosen language.
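The stage-by-stage batch order described in step 2 can be sketched as follows. The stage names and the `run_batch` and `traced` helpers are placeholders invented for this example; they only show the order of operations and do not reflect the application's real functions.

```python
from typing import Callable, Sequence

Stage = Callable[[str], str]


def run_batch(pages: Sequence[str], stages: Sequence[Stage]) -> list[str]:
    """Apply each stage to every page before moving on to the next stage."""
    current = list(pages)
    for stage in stages:
        current = [stage(page) for page in current]
    return current


def traced(name: str, log: list[str]) -> Stage:
    """Build a placeholder stage that records the order of calls."""
    def stage(page: str) -> str:
        log.append(f"{name}:{page}")
        return page
    return stage


calls: list[str] = []
run_batch(["page1", "page2"], [traced("segment", calls), traced("ocr", calls)])
print(calls)  # ['segment:page1', 'segment:page2', 'ocr:page1', 'ocr:page2']
```

Both pages finish segmentation before OCR starts on either of them, which is why box edits for the whole batch can be made at the same point.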

Acknowledgments
