voxelcubes / panelcleaner


An AI-powered tool to clean manga panels.

License: GNU General Public License v3.0

Python 99.27% Makefile 0.46% Dockerfile 0.04% Batchfile 0.22%

ai cli image-processing machine-learning manga text-cleaner

panelcleaner's People

master117 eberamp eric013 lucarioman puzzledshark amoehentai kianmeng userjhansen onatom1 kokica18

panelcleaner's Issues

Spanish OCR support

Please add Spanish OCR support so that I can help.

BallonsTranslator x PanelCleaner

You could also combine your project with BallonsTranslator, the program by the author of comic-text-detector. Here is the link; you may be interested: https://github.com/dmMaze/BallonsTranslator

Failed to load OCR model on Windows Prebuilt

This error always shows up whenever I run the prebuilt Windows binary. I have downloaded the OCR model and copied it to the cache location.

Here's the log:

- Program Information -
Program: Panel Cleaner 2.3.0
Executing from C:\Users\<username>\AppData\Local\Temp\_MEI154802\pcleaner\gui\launcher.pyc
Log file is C:\Users\<username>\AppData\Roaming\pcleaner\cache\pcleaner.log
Config file is C:\Users\<username>\AppData\Roaming\pcleaner\pcleanerconfig.ini
Cache directory is C:\Users\<username>\AppData\Roaming\pcleaner\cache
- System Information -
Operating System: Windows 10
Machine: AMD64
Python Version: 3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)]
PySide (Qt) Version: 6.6.0
Available Qt Themes: windowsvista, Windows, Fusion
System locale: en_US
CPU Cores: 16
GPU: None (CUDA not available)

2024-03-01 15:27:40.095 | INFO     | pcleaner.gui.launcher:launch:73 - Using locale en_US.
2024-03-01 15:27:40.098 | DEBUG    | pcleaner.gui.launcher:launch:80 - Loaded built-in Qt translations for en_US.
2024-03-01 15:27:40.098 | DEBUG    | pcleaner.gui.launcher:launch:88 - Loaded built-in Qt base translations for en_US.
2024-03-01 15:27:40.099 | DEBUG    | pcleaner.gui.launcher:launch:98 - Loaded App translations for en_US.
2024-03-01 15:27:40.697 | DEBUG    | pcleaner.gui.mainwindow_driver:ensure_models_downloaded:318 - Text detector model already downloaded.
2024-03-01 15:27:40.697 | DEBUG    | pcleaner.gui.mainwindow_driver:start_initialization_worker:502 - Worker Thread cleaning cache
2024-03-01 15:27:40.697 | DEBUG    | pcleaner.gui.mainwindow_driver:start_initialization_worker:509 - Worker Thread loading OCR model.
2024-03-01 15:27:40.698 | DEBUG    | pcleaner.gui.mainwindow_driver:initialize_ui:185 - Purging missing profiles.
2024-03-01 15:27:40.698 | INFO     | pcleaner.gui.mainwindow_driver:initialize_profiles:735 - Found profiles: [('Default', None)]
2024-03-01 15:27:40.699 | DEBUG    | pcleaner.config:load_profile:961 - Loading profile None...
2024-03-01 15:27:40.699 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2024-03-01 15:27:40.699 | DEBUG    | pcleaner.config:load_profile:968 - Loading builtin default profile
2024-03-01 15:27:40.722 | DEBUG    | pcleaner.gui.mainwindow_driver:load_current_profile:893 - Loading current profile.
2024-03-01 15:27:40.723 | DEBUG    | pcleaner.gui.profile_parser:set_profile_values:393 - Setting profile values
2024-03-01 15:27:40.729 | DEBUG    | pcleaner.gui.mainwindow_driver:initialize_analytics_view:568 - Loading included font from C:\Users\<username>\AppData\Local\Temp\_MEI154802\pcleaner\data\NotoMono-Regular.ttf
2024-03-01 15:27:40.730 | DEBUG    | pcleaner.gui.mainwindow_driver:initialize_analytics_view:571 - Loaded included font
2024-03-01 15:27:40.732 | DEBUG    | pcleaner.gui.mainwindow_driver:save_default_palette:125 - Placeholder color: #000000
2024-03-01 15:27:40.733 | INFO     | pcleaner.gui.mainwindow_driver:set_theme:147 - Using theme: breeze-dark
2024-03-01 15:27:40.765 | INFO     | pcleaner.gui.mainwindow_driver:set_theme:159 - Theme is dark: True
2024-03-01 15:27:40.945 | DEBUG    | pcleaner.gui.mainwindow_driver:post_init:367 - Char width: 6, columns: 74, required width: 444
2024-03-01 15:27:40.947 | DEBUG    | pcleaner.gui.mainwindow_driver:post_init:398 - Splitter sizes: [400, 918, 460]
2024-03-01 15:27:51.073 | CRITICAL | pcleaner.gui.mainwindow_driver:generic_worker_error:536 - Failed to load OCR model. OCR impossible, moderate cleaning impact.

Encountered error:
Traceback (most recent call last):

  File "urllib3\connectionpool.py", line 791, in urlopen

  File "urllib3\connectionpool.py", line 492, in _make_request

  File "urllib3\connectionpool.py", line 468, in _make_request

  File "urllib3\connectionpool.py", line 1097, in _validate_conn

  File "urllib3\connection.py", line 642, in connect

  File "urllib3\connection.py", line 783, in _ssl_wrap_socket_and_match_hostname

  File "urllib3\util\ssl_.py", line 471, in ssl_wrap_socket

  File "urllib3\util\ssl_.py", line 515, in _ssl_wrap_socket_impl

  File "ssl.py", line 517, in wrap_socket

  File "ssl.py", line 1075, in _create

  File "ssl.py", line 1346, in do_handshake

ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "requests\adapters.py", line 486, in send

  File "urllib3\connectionpool.py", line 845, in urlopen

  File "urllib3\util\retry.py", line 470, in increment

  File "urllib3\util\util.py", line 38, in reraise

  File "urllib3\connectionpool.py", line 791, in urlopen

  File "urllib3\connectionpool.py", line 492, in _make_request

  File "urllib3\connectionpool.py", line 468, in _make_request

  File "urllib3\connectionpool.py", line 1097, in _validate_conn

  File "urllib3\connection.py", line 642, in connect

  File "urllib3\connection.py", line 783, in _ssl_wrap_socket_and_match_hostname

  File "urllib3\util\ssl_.py", line 471, in ssl_wrap_socket

  File "urllib3\util\ssl_.py", line 515, in _ssl_wrap_socket_impl

  File "ssl.py", line 517, in wrap_socket

  File "ssl.py", line 1075, in _create

  File "ssl.py", line 1346, in do_handshake

urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "huggingface_hub\file_download.py", line 1232, in hf_hub_download

  File "huggingface_hub\utils\_validators.py", line 118, in _inner_fn

  File "huggingface_hub\file_download.py", line 1599, in get_hf_file_metadata

  File "huggingface_hub\file_download.py", line 417, in _request_wrapper

  File "huggingface_hub\file_download.py", line 452, in _request_wrapper

  File "huggingface_hub\utils\_http.py", line 258, in http_backoff

  File "requests\sessions.py", line 589, in request

  File "requests\sessions.py", line 703, in send

  File "huggingface_hub\utils\_http.py", line 63, in send

  File "requests\adapters.py", line 501, in send

requests.exceptions.ConnectionError: (ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)), '(Request ID: 42b065a9-e571-4bfe-8539-7706262c7eda)')


The above exception was the direct cause of the following exception:


Traceback (most recent call last):

  File "transformers\utils\hub.py", line 430, in cached_file

  File "huggingface_hub\utils\_validators.py", line 118, in _inner_fn

  File "huggingface_hub\file_download.py", line 1349, in hf_hub_download

huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.


The above exception was the direct cause of the following exception:


Traceback (most recent call last):

> File "pcleaner\gui\worker_thread.py", line 141, in run

  File "pcleaner\gui\mainwindow_driver.py", line 523, in load_ocr_model

  File "manga_ocr\ocr.py", line 14, in __init__

  File "transformers\models\auto\feature_extraction_auto.py", line 339, in from_pretrained

  File "transformers\feature_extraction_utils.py", line 498, in get_feature_extractor_dict

  File "transformers\utils\hub.py", line 470, in cached_file

OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like kha-white/manga-ocr-base is not the path to a directory containing a file named preprocessor_config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
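
The final OSError above says the file preprocessor_config.json was not found "in the cached files". A minimal sketch for checking that, assuming the model is looked up in the standard Hugging Face Hub cache layout (models--<org>--<name>/snapshots/<revision>/) rather than in Panel Cleaner's own cache directory. The helper name find_cached_file and the default cache path are illustrative assumptions, not part of pcleaner:

```python
from pathlib import Path
from typing import Optional


def find_cached_file(hub_cache: Path, repo_id: str, filename: str) -> Optional[Path]:
    """Return the path of `filename` inside a local Hub cache, or None if absent."""
    # Assumed layout: <hub_cache>/models--<org>--<name>/snapshots/<revision>/<filename>
    repo_dir = hub_cache / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    for revision in sorted(snapshots.iterdir()):
        candidate = revision / filename
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Assumed default location; it differs if HF_HOME or a similar variable is set.
    default_cache = Path.home() / ".cache" / "huggingface" / "hub"
    print(find_cached_file(default_cache, "kha-white/manga-ocr-base", "preprocessor_config.json"))
```

If this prints None, the manually copied model is probably not where the loader looks for it. The offline-mode page linked in the error message describes how to stop the library from contacting the Hub once the files are in place.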

Is it possible to export the text as a txt file? (Yes, I'm really lazy.)

memory usually not enough when batch processing more than 10 images

I have 64 GB of RAM. When processing reaches the Masker step, the "committed memory" exceeds 65 GB, an error is thrown, and my PC becomes unresponsive for a while.

I can't run on cuda

I have the newest CUDA installed (12.2) and my GPU is supported, but it seems I can't run on CUDA. Please help me.
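
A quick way to narrow this kind of report down is to ask PyTorch directly what it sees. The sketch below assumes the text detector and OCR model run through PyTorch (the logs elsewhere on this page mention CUDA availability); the function name describe_cuda_support is illustrative only:

```python
def describe_cuda_support() -> str:
    """Summarize whether the installed PyTorch build can use CUDA."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."
    if torch.cuda.is_available():
        return f"CUDA is available: {torch.cuda.get_device_name(0)}"
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    if torch.version.cuda is None:
        return "This PyTorch build is CPU-only; a CUDA-enabled build is needed."
    return (
        f"PyTorch was built for CUDA {torch.version.cuda}, "
        "but no usable GPU or driver was found."
    )


if __name__ == "__main__":
    print(describe_cuda_support())
```

A system-wide CUDA toolkit does not by itself make a CPU-only PyTorch build use the GPU, so the second message is a common outcome.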

Add a gui

Not everyone is comfortable with the CLI or config files, so an accessible GUI should make all the available options more discoverable. The CLI shall remain fully functional for automation purposes, and a quiet flag could be a nice addition, in that case.

[DUPLICATE] [Windows] Unicode error with non-ascii letters in a file path

Processing error message after bulk

I got an error message after putting about a thousand images into Panel Cleaner. Here are my specs:
i5-8300H
Intel UHD Graphics 630
GTX 1050ti
24 GB ram
about 400 GB HDD free storage.

Everything looked fine while it was running, except that the CPU went to 100% (I had Task Manager open beside it). RAM usage stayed at about 12 GB, I think, but the error says "no space left on device". So what happened here?
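
"No space left on device" refers to disk space, not RAM, and it applies to whichever drive the intermediate files are written to. The report does not say which directory that is, so the sketch below simply checks the temporary directory and the home directory as assumed candidates. The helper name free_gib is illustrative:

```python
import shutil
import tempfile
from pathlib import Path


def free_gib(path: Path) -> float:
    """Return the free space, in GiB, on the drive that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


if __name__ == "__main__":
    # Assumed candidate locations; the actual cache directory may differ.
    for location in (Path(tempfile.gettempdir()), Path.home()):
        print(f"{location}: {free_gib(location):.1f} GiB free")
```

If the cache lives on a small system drive, the 400 GB free on a separate HDD would not help.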

pip install error

I get the following error when installing pcleaner on Fedora 39:

Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/usr/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-2mwq_b_4/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-2mwq_b_4/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-2mwq_b_4/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 480, in run_setup
super().run_setup(setup_script=setup_script)
File "/tmp/pip-build-env-2mwq_b_4/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 15, in <module>
File "/tmp/pip-install-dz2ghw9u/fugashi_7cbb0607515249439cc2f16d9ebb1123/fugashi_util.py", line 58, in check_libmecab
raise RuntimeError("Could not configure working env. Have you installed MeCab?")
RuntimeError: Could not configure working env. Have you installed MeCab?
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.

I installed MeCab and now get the following error:
param.cpp(69) [ifs] no such file or directory: /usr/lib64/mecab/dic/ipadic/dicrc

[Windows] Unicode error with non-ascii letters in a file path

Encountered an error while processing files.

<class 'UnicodeDecodeError'>: Traceback (most recent call last):
File "pcleaner\gui\worker_thread.py", line 141, in run
File "pcleaner\gui\mainwindow_driver.py", line 1064, in generate_output
File "pcleaner\gui\processing.py", line 228, in generate_output
File "pcleaner\preprocessor.py", line 74, in prep_json_file
File "pathlib.py", line 1059, in read_text
File "encodings\cp1252.py", line 23, in decode
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 352: character maps to <undefined>

'charmap' codec can't decode byte 0x9d in position 352: character maps to <undefined>
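
The traceback shows pathlib's read_text falling back to the cp1252 codec, which is the locale default on many Windows setups. A minimal sketch of the usual remedy, passing the encoding explicitly, assuming the JSON file was written as UTF-8. The function name read_json_text is illustrative and not taken from pcleaner:

```python
import tempfile
from pathlib import Path


def read_json_text(path: Path) -> str:
    """Read a text file as UTF-8 regardless of the platform's default encoding."""
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.json"
        # U+201D encodes to the bytes E2 80 9D in UTF-8; 0x9D has no mapping in cp1252.
        sample.write_text('{"text": "\u201d"}', encoding="utf-8")
        try:
            sample.read_text(encoding="cp1252")
        except UnicodeDecodeError as err:
            print("cp1252 failed:", err)
        print("utf-8 ok:", read_json_text(sample))
```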

Issue with docopt

When I try to run pcleaner (pcleaner --help), I get the following docopt related error message:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\myuser\Desktop\python\Scripts\pcleaner.exe\__main__.py", line 7, in <module>
  File "C:\Users\myuser\Desktop\python\Lib\site-packages\pcleaner\main.py", line 142, in main
    args = docopt(__doc__, version=f"{__version__}", more_magic=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\myuser\Desktop\python\Lib\site-packages\docopt\__init__.py", line 877, in docopt
    assert instr.opname.startswith("CALL_")

Environment
Windows 10 (no WSL)
Python 3.11.3
docopt 0.6.2
pcleaner 1.7.3
python-magic-bin 0.4.14 (can't use python-magic because of this issue)

The python version of docopt doesn't seem to be maintained anymore. The latest release is from 2014 and the last commit is from 2018 so the issue might be related to my python version.

I'm using this workaround for now.

line 116
from docopt import docopt

line 142
args = docopt(__doc__, version=f"Panel Cleaner {__version__}")

EDIT 2023-05-29:
Also change pcleaner/structures.py

Replace lines 78-80 below:
metadata = magic.from_file(self.image_path)
size_str = re.search(r"(\d+) x (\d+)", metadata).groups()
self._image_size = (int(size_str[0]), int(size_str[1]))

With:
self._image_size = Image.open(self.image_path).size

mac os metal

Can you add support for Metal?
https://developer.apple.com/metal/pytorch/

Error Traceback (most recent call last)

Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Scripts\pcleaner.exe\__main__.py", line 4, in <module>
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\pcleaner\main.py", line 124, in <module>
import pcleaner.masker as cl
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\pcleaner\masker.py", line 6, in <module>
import pcleaner.image_ops as ops
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\pcleaner\image_ops.py", line 11, in <module>
import pcleaner.structures as st
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\pcleaner\structures.py", line 9, in <module>
import magic
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\magic\__init__.py", line 209, in <module>
libmagic = loader.load_lib()
^^^^^^^^^^^^^^^^^
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\site-packages\magic\loader.py", line 49, in load_lib
raise ImportError('failed to find libmagic. Check your installation')
ImportError: failed to find libmagic. Check your installation

Error while downloading models

I'm using Fedora 39 with the provided binaries (I could not get it working with pip or Docker; both gave dependency errors). Everything works until I click OK and proceed with the model download.

Logs after launching:

[salt@salt PanelCleaner]$ ./PanelCleaner 
torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
torch/_jit_internal.py:857: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x7f740c992700>.
  warnings.warn(
torch/_jit_internal.py:857: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x7f740c993740>.
  warnings.warn(
[I 231129 13:25:39 launcher:32] 
    ---- Starting up ----
[I 231129 13:25:39 launcher:51] 
    - Program Information -
    Program: Panel Cleaner 2.1.0
    Log file is /home/salt/.cache/pcleaner/pcleaner.log
    Config file is /home/salt/.config/pcleaner/pcleanerrc
    Cache directory is /home/salt/.cache/pcleaner
    - System Information -
    Operating System: Linux 6.5.11-300.fc39.x86_64
    Machine: x86_64
    Python Version: 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801]
    PySide (Qt) Version: 6.5.3
    Available Qt Themes: Windows, Fusion
    CPU Cores: 8
    GPU: None (CUDA not available)
    
Gtk-Message: 13:25:39.641: Failed to load module "canberra-gtk-module"
Gtk-Message: 13:25:39.641: Failed to load module "pk-gtk-module"
[E 231129 13:25:44 model_downloader_driver:181] HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /zyddnys/manga-image-translator/releases/download/beta-0.3/comictextdetector.pt.onnx (Caused by SSLError(SSLError(0, 'unknown error (_ssl.c:3098)')))
    NoneType: None
[E 231129 13:25:44 model_downloader_driver:117] Failed to download all models.
[C 231129 13:25:44 mainwindow_driver:310] Failed to download models. Aborting.
[I 231129 13:25:44 launcher:72] ---- Shutting down ----

Problems running on Windows

I can't use the program in normal use; something breaks along the way.
I installed it from PyPI and added the extras with pip install -r requirements.txt from the original GitHub repo, but it just doesn't work.

GUI:

Normal use case (?):

Magic version (in case it matters):

Permission denied when i try to use models, also cuda don't work

I tried granting full permissions to all users and also tried launching as administrator.

I also can't use CUDA. I have an RTX 40-series card and have installed the CUDA toolkit and PyTorch.

Models

Running Denoiser... (Stuck)

Hi. After testing this on Windows and not getting it to work, I moved to Linux. After installing it there I had some problems, all related to the versions of the requirements. I found the right versions and everything seemed to go fine until the last step, when it does this:

abder@abder-X75A1:~/Downloads$ /home/abder/.local/bin/pcleaner clean index.jpeg
Found 1 image.
Running text detection AI model...
Using device for text detection model: cpu
Using 1 processes for text detection.
100%|█████████████████████████████████████████████| 1/1 [00:13<00:00, 13.91s/it]


Running box data Pre-Processor...
2023-05-31 15:51:41.961 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2023-05-31 15:51:53.992 | INFO     | manga_ocr.ocr:__init__:22 - Using CPU
2023-05-31 15:51:57.097 | INFO     | manga_ocr.ocr:__init__:29 - OCR ready
100%|█████████████████████████████████████████████| 1/1 [00:03<00:00,  3.72s/it]

OCR Analytics
-------------
Number of boxes: 2 | Number of small boxes: 2 (100%)
Number of removed boxes: 0 (0% total, 0% of small boxes)

Small box sizes:
   0- 500:  0 / 0
 500-1000:  0 / 0
1000-1500:  0 / 0
1500-2000:  0 / 0
2000-2500: ███████████████████████████████████████████████████████ 0 / 2
2500-3000:  0 / 0

█ Small boxes | █ Removed boxes

Removed bubbles:


Running Masker...
100%|█████████████████████████████████████████████| 1/1 [00:01<00:00,  1.64s/it]

Mask Fitment Analytics
----------------------
Total boxes: 2 | Masks succeeded: 0 (0%) | Masks failed: 2
Perfect masks: 0 (N/A) | Average border deviation: N/A

Mask usage by mask size (smallest to largest):
Box mask:  0 / 0

█ Perfect | █ Total

Pages with failures / total:
index.png: 2 / 2


Running Denoiser...
  0%|                                                     | 0/1 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/denoiser.py", line 29, in denoise_page
    cleaned_image.paste(mask_image, (0, 0), mask_image)
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 1557, in paste
    self.im.paste(im, box, mask.im)
ValueError: bad transparency mask
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/abder/.local/bin/pcleaner", line 8, in <module>
    sys.exit(main())
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/main.py", line 219, in main
    run_cleaner(
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/main.py", line 418, in run_cleaner
    for analytic in tqdm(pool.imap(dn.denoise_page, data), total=len(data)):
  File "/home/abder/.local/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
ValueError: bad transparency mask

And here it is with another image:

abder@abder-X75A1:/media/abder/Stuffs/Manga/Blade of the Immortal (1997-2015) (Digital) (danke-Empire)/Blade of the Immortal v01 - Blood of a Thousand (1997) (Digital) (danke-Empire)$ /home/abder/.local/bin/pcleaner clean im7.jpg
Found 1 image.
Running text detection AI model...
Using device for text detection model: cpu
Using 1 processes for text detection.
100%|█████████████████████████████████████████████| 1/1 [00:12<00:00, 12.52s/it]


Running box data Pre-Processor...
2023-05-31 15:54:55.353 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2023-05-31 15:55:00.636 | INFO     | manga_ocr.ocr:__init__:22 - Using CPU
2023-05-31 15:55:02.815 | INFO     | manga_ocr.ocr:__init__:29 - OCR ready
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 478.36it/s]

OCR Analytics
-------------
Number of boxes: 2 | Number of small boxes: 0 (0%)
Number of removed boxes: 0 (0% total, N/A of small boxes)
No not-removed small boxes found.


Removed bubbles:


Running Masker...
100%|█████████████████████████████████████████████| 1/1 [00:02<00:00,  2.02s/it]

Mask Fitment Analytics
----------------------
Total boxes: 2 | Masks succeeded: 2 (100%) | Masks failed: 0
Perfect masks: 1 (50%) | Average border deviation: 0.06

Mask usage by mask size (smallest to largest):
Mask 0  :  0 / 0
Mask 1  :  0 / 0
Mask 2  :  0 / 0
Mask 3  :  0 / 0
Mask 4  :  0 / 0
Mask 5  :  0 / 0
Mask 6  :  0 / 0
Mask 7  :  0 / 0
Mask 8  :  0 / 0
Mask 9  :  0 / 0
Mask 10 : ████████████████████████████████████████████████████████ 1 / 1
Box mask: ████████████████████████████████████████████████████████ 0 / 1

█ Perfect | █ Total

Pages with failures / total:


Running Denoiser...
  0%|                                                     | 0/1 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/denoiser.py", line 29, in denoise_page
    cleaned_image.paste(mask_image, (0, 0), mask_image)
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 1557, in paste
    self.im.paste(im, box, mask.im)
ValueError: bad transparency mask
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/abder/.local/bin/pcleaner", line 8, in <module>
    sys.exit(main())
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/main.py", line 219, in main
    run_cleaner(
  File "/home/abder/.local/lib/python3.10/site-packages/pcleaner/main.py", line 418, in run_cleaner
    for analytic in tqdm(pool.imap(dn.denoise_page, data), total=len(data)):
  File "/home/abder/.local/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
ValueError: bad transparency mask

fail to load profile

AttributeError: module 'scipy' has no attribute 'signal'

I have python 3.10.6 and scipy 1.8.0 installed.

`pcleaner clean 01.png

Running text detection AI model...
Using device for text detection model: cuda
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.15s/it]

Running box data Pre-Processor...
2023-02-25 14:20:22.003 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2023-02-25 14:20:25.343 | INFO | manga_ocr.ocr:__init__:19 - Using CUDA
2023-02-25 14:20:25.475 | INFO | manga_ocr.ocr:__init__:29 - OCR ready
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 888.62it/s]

OCR Analytics

Number of boxes: 11 | Number of small boxes: 0 (0%)
Number of removed boxes: 0 (0% total, N/A of small boxes)
No not-removed small boxes found.

Removed bubbles:

Running Cleaner...
0%| | 0/1 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/cleaner.py", line 53, in clean_page
mask_fitments: list[st.MaskFittingResults] = [
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/cleaner.py", line 54, in <listcomp>
ops.pick_best_mask(
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/image_ops.py", line 321, in pick_best_mask
masks = make_mask_steps_convolution(
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/image_ops.py", line 171, in make_mask_steps_convolution
padded_mask = scipy.signal.convolve2d(padded_mask, kernel, mode="same")
AttributeError: module 'scipy' has no attribute 'signal'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/argyll/.local/bin/pcleaner", line 8, in <module>
sys.exit(main())
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/main.py", line 193, in main
run_cleaner(
File "/home/argyll/.local/lib/python3.10/site-packages/pcleaner/main.py", line 333, in run_cleaner
for analytic in tqdm(pool.imap(cl.clean_page, data), total=len(data)):
File "/home/argyll/.local/lib/python3.10/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
raise value
AttributeError: module 'scipy' has no attribute 'signal'
`
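
The AttributeError above is what happens when code calls scipy.signal after only `import scipy`: in some SciPy versions the top-level import does not load subpackages. A minimal sketch of the explicit-import remedy, written with importlib so that it also runs where SciPy is absent. The helper name load_convolve2d is illustrative, and whether pcleaner fixed the issue this way is not stated here:

```python
import importlib


def load_convolve2d():
    """Return scipy.signal.convolve2d, importing the submodule explicitly."""
    try:
        # Equivalent to `import scipy.signal`; it does not rely on `import scipy` alone.
        signal = importlib.import_module("scipy.signal")
    except ImportError:
        return None  # SciPy is not installed in this environment.
    return signal.convolve2d


if __name__ == "__main__":
    convolve2d = load_convolve2d()
    print("convolve2d available:", convolve2d is not None)
```

In ordinary code the same effect comes from writing `import scipy.signal` (or `from scipy import signal`) at the top of the module that uses it.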

Add Docker support

I'd like to contribute by adding Docker support. Can I get permission to push?

The advantage of running the app in a container is that you install only the dependencies the application requires, and you avoid conflicting dependencies and Python versions (for tools that need a specific Python installation).

Unable to use the program (Windows)

It's probably a problem on my end, since I'm not very familiar with using code from GitHub.
I downloaded pcleaner and installed it with the pip command, but when I try to use a command in the cmd prompt it says "invalid syntax". Then, if I try to import it or use it in Python, it says "ModuleNotFoundError: No module named 'pcleaner'". If I try to install it again, it says "Requirement already satisfied" a bunch of times. I'm probably missing something obvious, but I don't know what it is.
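
One possible explanation for this combination of symptoms, offered as an assumption, is that pip installed the package into a different Python installation than the one being started. The sketch below reports which interpreter is running and whether it can see a given module; the helper name locate_module is illustrative:

```python
import importlib.util
import sys


def locate_module(name: str) -> dict:
    """Report the running interpreter and where (if anywhere) it finds `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    print(locate_module("pcleaner"))
```

If the location is None, installing with `python -m pip install pcleaner` using that same interpreter ties the install to it. An "invalid syntax" message usually means a shell command was typed at the Python prompt rather than in cmd itself.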

Add CSV OCR Output

Hi,

I think it would be useful to get a csv file out of the ocr mode for identifying where the text is and to make it easier to work with.

I will work on implementing this, and there should be a pull request soon. The CSV output should follow this pattern:

filename	startx	starty	endx	endy	text
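
A minimal sketch of what writing that column pattern could look like with Python's csv module. The column order comes from the pattern above; the OcrBox type, the comma delimiter, and the sample coordinates are assumptions made for illustration, not the format pcleaner ended up with:

```python
import csv
import io
from typing import Iterable, NamedTuple


class OcrBox(NamedTuple):
    """Hypothetical record for one detected text box."""
    filename: str
    startx: int
    starty: int
    endx: int
    endy: int
    text: str


def write_ocr_csv(boxes: Iterable[OcrBox], stream) -> None:
    """Write a header row followed by one row per text box."""
    writer = csv.writer(stream)
    writer.writerow(OcrBox._fields)
    writer.writerows(boxes)


if __name__ == "__main__":
    buffer = io.StringIO()
    # The coordinates here are made up for the example.
    write_ocr_csv([OcrBox("102-2.png", 10, 20, 110, 60, "そんな殺生な！？")], buffer)
    print(buffer.getvalue())
```

Using the csv module rather than joining strings by hand means text containing commas or quotes is escaped correctly.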

Error: dataclass

@dataclass
^^^^^^^^^
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 1223, in dataclass
return wrap(cls)
^^^^^^^^^
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 1213, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 958, in _process_class
cls_fields.append(_get_field(cls, name, type, kw_only))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ANIX\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 815, in _get_field
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'pcleaner.config.TextDetectorConfig'> for field text_detector is not allowed: use default_factory
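
The ValueError names its own remedy: use default_factory. A minimal sketch of that pattern follows, with hypothetical stand-in classes (TextDetectorSettings and ProfileSettings are not pcleaner's real classes). Python 3.11 rejects any unhashable default value, which includes instances of ordinary dataclasses:

```python
from dataclasses import dataclass, field


@dataclass
class TextDetectorSettings:
    """Hypothetical stand-in for a nested config section."""
    model_path: str = ""


@dataclass
class ProfileSettings:
    # Writing `text_detector: TextDetectorSettings = TextDetectorSettings()` here
    # raises "mutable default ... is not allowed" on Python 3.11.
    # default_factory builds a fresh instance for every ProfileSettings object.
    text_detector: TextDetectorSettings = field(default_factory=TextDetectorSettings)


if __name__ == "__main__":
    first, second = ProfileSettings(), ProfileSettings()
    first.text_detector.model_path = "custom.pt"
    print(repr(second.text_detector.model_path))
```

Because each instance gets its own nested object, changing one profile's settings does not leak into another.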

I hope to have a Chinese version. Thank you for your efforts.

Additional text added outside of text inside of images.

Panel Cleaner's OCR sometimes outputs text that does not exist in the original image file.

It seems to append the extra text at the end of the sentence sequence.

This is Panel Cleaner's output, straight from the text file:

102-2.png:
まとめてーぶっ飛ばせー！
こんななんでもありバトルロイヤルビーチフラッグだとは聞いてませんよ！？
あはは！だから面白いんじゃない♪
ちなみに賞品はなんです？
なんと～！指揮官とのデート券！！
そうなんですか！チーム戦だから３枚？
残念～！１枚！お宝は早い者勝ちだかんね！
そんな殺生な！？
優勝チーム内で更にバトルです．．．？
ありがとうございました。はいはい。ですが、そのためにこの時期があります。

This should be the expected output:

102-2.png:
まとめてーぶっ飛ばせー！
こんななんでもありバトルロイヤルビーチフラッグだとは聞いてませんよ！？
あはは！だから面白いんじゃない♪
ちなみに賞品はなんです？
なんと～！指揮官とのデート券！！
そうなんですか！チーム戦だから３枚？
残念～！１枚！お宝は早い者勝ちだかんね！
そんな殺生な！？
優勝チーム内で更にバトルです．．．？

That last sentence string shouldn't even be there. I verified the text count using the isolated text function. It shows nine text boxes but has a tenth sentence added to it.

Support more image formats

The world doesn't always work with JPEG or PNG; support for a few more formats would be in order.

pip install error on Windows

I'm trying to install from pip, but I always run into this error:

  Building wheel for fugashi (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for fugashi (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      WARNING setuptools_scm.pyproject_reading toml section missing 'pyproject.toml does not contain a tool.setuptools_scm section'
      running bdist_wheel
      running build
      running build_py
      creating build\lib.win-amd64-cpython-312
      creating build\lib.win-amd64-cpython-312\fugashi
      copying fugashi\cli.py -> build\lib.win-amd64-cpython-312\fugashi
      copying fugashi\__init__.py -> build\lib.win-amd64-cpython-312\fugashi
      running build_ext
      cythoning fugashi/fugashi.pyx to fugashi\fugashi.c
      building 'fugashi.fugashi' extension
      creating build\temp.win-amd64-cpython-312
      creating build\temp.win-amd64-cpython-312\Release
      creating build\temp.win-amd64-cpython-312\Release\fugashi
      "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.39.33519\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\mecab -IC:\Users\kekes\AppData\Local\Programs\Python\Python312\include -IC:\Users\kekes\AppData\Local\Programs\Python\Python312\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.39.33519\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /Tcfugashi\fugashi.c /Fobuild\temp.win-amd64-cpython-312\Release\fugashi\fugashi.obj
      fugashi.c
      fugashi\fugashi.c(750): fatal error C1083: Cannot open include file: 'mecab.h': No such file or directory
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.39.33519\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for fugashi
Failed to build fugashi
ERROR: Could not build wheels for fugashi, which is required to install pyproject.toml-based projects

I have installed mecab and mecab-python3, and when I tried to install fugashi with pip install fugashi, I ran into the same error.

can you set up crowdin/weblate for localisation

As the title says, I'd like to translate this, so could you set up something like that?

Support Inpainting

How about replacing cleaning with inpainting, like lama-cleaner?
It would be very useful for removing text that is not in a bubble, and even text in a bubble if the bubble is colored or textured.

English OCR support

Is it possible to add support for English OCR? Tesseract or EasyOCR.

Issue with Docopt

Discussed in #25

Originally posted by R3ck1e November 7, 2023
Can't run on Mac. Full trace:
Last login: Tue Nov 7 18:57:25 on ttys001
/Users/romangoncarov/Library/Python/3.9/bin/pcleaner ; exit;
romangoncarov@mbp-roman ~ % /Users/romangoncarov/Library/Python/3.9/bin/pcleaner ; exit;
/Users/romangoncarov/Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: urllib3/urllib3#3020
warnings.warn(
Traceback (most recent call last):
File "/Users/romangoncarov/Library/Python/3.9/bin/pcleaner", line 5, in <module>
from pcleaner.main import main
File "/Users/romangoncarov/Library/Python/3.9/lib/python/site-packages/pcleaner/main.py", line 107, in <module>
from docopt import magic_docopt
ImportError: cannot import name 'magic_docopt' from 'docopt' (/Users/romangoncarov/Library/Python/3.9/lib/python/site-packages/docopt/__init__.py)

Saving session...
...copying shared history...
...saving history...truncating history files...
...completed.

//
Can't figure out what's wrong. The same error on Python 3.12. pip version is 23.2.1
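
One thing worth checking for this ImportError is which distribution provides the installed docopt module. The sketch below works on the assumption that magic_docopt comes from the docopt-ng distribution, and that the older docopt distribution installs a module of the same name that lacks it; that assumption is not confirmed by the report. The helper name installed_versions is illustrative:

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print(installed_versions(["docopt", "docopt-ng", "pcleaner"]))
```

If both docopt and docopt-ng show a version, the two may be overwriting each other's files in site-packages.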

I could not use Cuda

I have an RTX 3060, but I couldn't use CUDA for the core text recognizer. Please help me if you can.