kha-white / manga-ocr Goto Github PK

View Code? Open in Web Editor NEW

1.4K 15.0 72.0 1.07 MB

Optical character recognition for Japanese text, with the main focus being Japanese manga

License: Apache License 2.0

Python 100.00%

ocr japanese manga transformers computer-vision deep-learning comics

manga-ocr's People

Contributors

Stargazers

Watchers

manga-ocr's Issues

error: legacy-install-failure

everytime i try to install this package, i get this one error

  """C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\Felix\AppData\LocildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" /EHsc /Tpsrc/sentencepiece/sentencepiece_wrap.cxx /Fobuild\temp.win-amd64-cpython-311\Release\src/sentencepiece/sentencepiece_wrap.obj /std:c++17 /MT /I..\build\root\include
  cl : Befehlszeile warning D9025 : "/MD" wird durch "/MT" \x81berschrieben
  sentencepiece_wrap.cxx
  src/sentencepiece/sentencepiece_wrap.cxx(2822): fatal error C1083:

File (include) cannot be opened "sentencepiece_processor.h": No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> sentencepiece

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure."""

i already saw the "File (include) cannot be opened "sentencepiece_processor.h" warning but ill get the same error when i try to reinstall sentencepiece

has someone and idea how i fix this problem and get this repo running?

caching of downloaded models

It would be nice if the downloaded models could be cached somewhere (with an optional argument).

All three AutoFeatureExtractor , AutoTokenizer and VisionEncoderDecoderModel classes have the same from_pretrained() method that allow the setting of a cache_dir optional argument.

This way, it wouldnt be necessary to re-download the model for each usage. Mokuro could also benefit from such a caching logic.

PS : Thanks for this amazing software :)

Does it work on other languages?

Like korean chinese etc.

[offtopic] Implementation of MangaOCR in a translation software that uses GTP3.5

Hello, I wanted to inform the team that I have implemented their system in a new open source software called Visual GPT Translator (VGT). This software provides us with a GUI to take screenshots, extract Japanese text from them (using MangaOCR), and translate it using the GPT 3.5 API.

I thought you could list it in the related projects section of your readme.

The GUI is based on ElectronJs and ReactJs, while also running a local server based on FastAPI. It is in this server where Manga-Ocr is implemented. The project also demonstrates how it is possible to use PyInstaller/Electron-Packager to compile both the GUI and server with the OCR into an executable.

Thank you very much for your work :)

stop after recognizing the code

Hello im trying to write a code to automate translation. I'm using the shell command to do this. The problem is, that since this program keeps running in the background, it puts my JS program on a hold until it hears something back from exited shell command. Would it be possible to put a flag that exits after recognizing the code?

training own data

Could you please provide details about the training and testing recognition module?

Manga_ocr OFFLINE (enhancement)

If I stand incorrect and the framework e.c.t is stored in the cache please still read this * part. Thanks.
Can't we all not just download the necessary transformers and such, Poricom can do it and utilizes exactly the same ingredients (manga_ocr) I love the simplicity of manga_ocr, Poricom is horrid UI really terrible; so I really appreciate manga_ocr just don't see the point of connecting to a sever to access this "Transformers' [Vision Encoder Decoder] framework" why can't all this jazz be put local. It makes more sense it should be "built in" but it's not because we all have to connect to K White's sever.

Would this also mean we could put special fonts and more power behind the transformers/framework, such as to handle handwriting?
Look forward to hearing from you. Thanks for manga_ocr long time fan.

Is it possible to release a binary with the models?

As titled.Planning to write an adaptation plugin of your amazing engine to an OCR tools.And I found installing and seperating the project itself from it's dependencies (which is essential to reduce the disk usage of the plugin) difficult.So I was wondering whether this is possible because I noticed the Running in Background in README.

error importing MangaOcr

Hi there, total noob here, I got to the point where I have to import MangaOcr but Terminal response is:

from: can't read /var/mail/manga_ocr

I've also tried with: import PIL.Image ... but no luck either.

Any clue?

Thanks a lot in advance

BTW I'm on an Intel PB

[Question] How long does it take to process on CPU cs GPU ?

Hi, thanks a lot for this nice repo. I starred it :)

I have a question that came up to my mind by looking at it :
Would it be ok to use this tool on a recent CPU (I7-8XXX) ? Have you a sort of mini benchmark that compares the inference speed between GPU(s) and CPU(s) ?

Thanks again !

Please help, how to make it work without internet connection

I am using Manga-ocr in Poricom, but constantly having issues with internet connection. I am getting error "Please try again or make sure your internet connection is on" with MangaOCR model.

Poricom / MangaOCR saves offline copy of the pretrained model in %UserProfile%.cache\huggingface\transformers. I checked the model is indeed there.

As I understand from issues#9 this is an issue with HuggingFace Transformers, even if the model is actually being stored in cache off-line, it still needs internet connection to load that model from it.

From what I understand I should set TRANSFORMERS_OFFLINE=1 and HF_DATASETS_OFFLINE=1 to run image recognition in a firewalled or offline environment by only using local files.

Can someone please help me to implement it in local instance of Manga-ocr such that I can then use it in local instance of Poricom offline. I have little experience and have no idea where else to ask. Would appreciate if someone helps me to figure out which files should I change to add those values and implement offline option.

Failed to initialize NumPy

Requirement already satisfied: numpy in c:\users\lenovo\appdata\local\programs\python\python310\lib\site-packages (1.22.3)
UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xf

Output not copied to clipboard on linux

Title says it all.
I am on fedora 37 with kde and wayland.

ViTFeatureExtractor is deprecated

Hi,

When the OCR model is being loaded on the first run, the following warning is shown:
FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.

Example of text which does not get recognized correctly

When I try to get the text from the following image:

manga-ocr gives me わたさんあたしが未満しててもそんなのおだまいないこいに山田太くってきちゃうところかわいい.

Using https://2ocr.com/online-ocr-japanese I get れなさんわたしが未読しててもそんなのおかまいなしにLINEおくってきちゃうところかわいい, which seems to be correct, or at least more correct than what manga-ocr gives me.

Feel welcome to change the title of this ticket to something more appropriate.

ImportError: DLL load failed while importing fugashi

I try to uninstall and reinstall 3.913version on the python official website, but still encounter this issue.

M1 GPU Support (MPS)

Add MPS device type so that M1/M2 gpus can be utilised

See: https://pytorch.org/docs/stable/notes/mps.html

if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("MPS not available because the current PyTorch install was not "
              "built with MPS enabled.")
    else:
        print("MPS not available because the current MacOS version is not 12.3+ "
              "and/or you do not have an MPS-enabled device on this machine.")

else:
    mps_device = torch.device("mps")

    # Create a Tensor directly on the mps device
    x = torch.ones(5, device=mps_device)
    # Or
    x = torch.ones(5, device="mps")

    # Any operation happens on the GPU
    y = x * 2

    # Move your model to mps just like any other device
    model = YourFavoriteNet()
    model.to(mps_device)

    # Now every call runs on the GPU
    pred = model(x)

AttributeError: can't set attribute 'feature_extractor'

I get this error when running the train.py module.

[/content/manga_ocr/manga_ocr_dev/training/get_model.py](https://localhost:8080/#) in __init__(self, feature_extractor, tokenizer)
      7 
      8     def __init__(self, feature_extractor, tokenizer):
----> 9         self.feature_extractor = feature_extractor
     10         self.tokenizer = tokenizer
     11         self.current_processor = self.feature_extractor

AttributeError: can't set attribute 'feature_extractor'

How to deploy GPU for execution?

If you want to run with GPU, install PyTorch as described here, otherwise this step can be skipped.

I only see this sentence, but I don't understand how to install and run on an NVIDIA GPU. Can you teach me?

I don't have a coding background, so I don't know how to install it or how to run it after installation

Can you provide a version that directly utilizes GPU?

Can you please provide me with the specific steps for the operation?

Getting terrible results on M1 mac. And I'm not sure why.

I've successfully installed manga-ocr on my mac, but the output doesn't make any sense and does not resemble the input image in any way.
For example, I gave below image as a input, which was an official example in the readme, and the output text I'm getting is just "あ．．．"

Am I doing something wrong?

Degraded performace after reinstall.

Because conda gave me problems, i had to reinstall every package i had.

Before uninstalling, everything was fine. The model loaded in about 2 seconds and took less than a second to OCR most images (see the issue that i opened a while ago).

Now, after re installing, the model takes again 10 seconds to load, which could be passable, but it takes another 10 seconds or so for any image now, no matter how small.

I've tried both setting the environment variable again and pass the model with the pretrained model path argument (see above issue), but both don't solve the problem. Reinstalling and installing again won't change anything. What can I try?

ROCm support

I use a AMD GPU and am not able to run it just getting this error:

Traceback (most recent call last):
  File "/home/noah/Documents/AI/test/manga.py", line 1, in <module>
    from manga_ocr import MangaOcr
  File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/manga_ocr/__init__.py", line 3, in <module>
    from manga_ocr.ocr import MangaOcr
  File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/manga_ocr/ocr.py", line 5, in <module>
    import torch
  File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 234, in <module>
    _load_global_deps()
  File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 193, in _load_global_deps
    raise err
  File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 174, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/run/current-system/sw/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libstdc++.so.6: cannot open shared object file: No such file or directory

I think this may have to do with it trying to use Cuda, I also saw it trying to install Cuda dependencies when I pip installed. It would be nice to either be able to use CPU since the model is so small it likely would work good enough anyway, or use ROCm

Cannot find the requested files in the cached path.

Without internet connection, running MangaOcr returns this error:

File "manga_ocr\ocr.py", line 15, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
  File "transformers\models\auto\tokenization_auto.py", line 528, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "transformers\tokenization_utils_base.py", line 1732, in from_pretrained
    user_agent=user_agent,
  File "transformers\file_utils.py", line 1929, in cached_path
    local_files_only=local_files_only,
  File "transformers\file_utils.py", line 2178, in get_from_cache
    "Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

I'm sure the first time I ran MangaOcr, it downloaded the pretrained model, but it seems it can't find it. Where was the model saved at? I've search around but no luck. I'm using python with anaconda on Windows 10.

And anyway, thanks for the work. Does its job flawlessly.

FIX: Troubleshooting for M1 MacOs users

I own two MacBooks and I was able to install manga-ocr without any problems on the Intel One. However, in the Mac that has a M1 (ARM) processor, I was getting an error when running pip3 install manga-ocr. Reading the error message, I noticed that there was a dependency that was having problems with the ARM architecture, it was mecab-python3.

I searched for some issues on mecab's repository and it seems that several users with the same setup, a M1 Mac, were facing a similar issue. This happens because mecab-python3 doesn't have a wheel for ARM architectures, so users with a M1 processor must build that pip package by themselves. This sounds hard, but in practice is sooooo easy:

cd ~

pip3 download mecab-python3

tar xfv mecab-python3-1.0.5.tar.gz

cd mecab-python3-1.0.5    

brew install mecab

python3 setup.py build

python3 setup.py install

After this, you can run pip3 list to verify that mecab package is installed in your system.

Now, if you run pip3 install manga-ocr again, it will be installed as expected.

Hopefully you can add this workaround in the README, so any user that encounter this can fix it.

P.S. All of this was done using python 3.9.13 installed and managed by pyenv

Regards!!!!

error: legacy-install-failure

It downloads the model every time I run it in command line

OS: windows 10
In your tutorial is writes: "When running for the first time, downloading the model (~400 MB) might take a few minutes. "
It's natural that once the model is download it don't need to download anymore. I found that it says in the help document that there's a flag "-p, --pretrained_model_name_or_path=PRETRAINED_MODEL_NAME_OR_PATH", so should I add a local folder path which links to the model already downloaded?
Besides that everything works perfectly. Thanks a lot for your excellent work.

Temporarily disabling manga-ocr

My workflow currently looks like this:

Copy image to clipboard
Let the OCR do its job
Let Yomichan detect the text from the clipboard
Create Anki card via Yomichan

What I want is a workflow which looks like this:

Copy image to clipboard
Let the OCR do its job
Let Yomichan detect the text from the clipboard
Create a screenshot of the manga panel/context
Create Anki card via Yomichan which automatically inserts the screenshot created in step 4 into the card

Since manga-ocr constantly runs in the background and parses those images and replaces the clipboard's content with whatever it recognised, it gets in the way of the last two steps. Is there some way to still maintain being able to copy screenshots to the clipboard while manga-ocr is running?

My idea how to circumvent this issue: A hotkey that temporarily halts manga-ocr. I could just press that hotkey just before I take the screenshot for the card.

Can this be implemented into the software? Do you have another solution to this problem which doesn't even require this feature?

Is there a way to optimize transfomers backend binary size?

Hello..
Currently I am trying to bundle a deb package for Poricom which uses manga_ocr.
However, the torch dependencies of the transformers backend are making the deb excessively large.

$ find . -type f -size +50M -exec du -sh {} \;

318M    ./opt/poricom/libcublasLt.so.11
145M    ./opt/poricom/libcublas.so.11
179M    ./opt/poricom/unidic_lite/dicdir/sys.dic
69M     ./opt/poricom/unidic_lite/dicdir/matrix.bin
438M    ./opt/poricom/libcudnn_cnn_infer.so.8
88M     ./opt/poricom/libcudnn_ops_infer.so.8
57M     ./opt/poricom/assets/languages/chi_tra.traineddata
526M    ./opt/poricom/torch/lib/libtorch_cpu.so
253M    ./opt/poricom/torch/lib/libtorch_cuda_cpp.so
728M    ./opt/poricom/torch/lib/libtorch_cuda_cu.so

I wonder if you could advise or help regarding reducing the size of the bundle by using another backend such as TF or JAX.

[Feauture request] Add training doc

I would like to train MangaOCR on a set of manga but in English and other languages. Yes, this contradicts the very logic of mangaOCR, but in times when it is faster to find works on English sites, but need to be translated to some other site, this would be very convenient. If possible, I could provide a dataset from such a manga.

There is also a big problem with mangaOCR fonts. It can collect a couple of dozen fonts and make an automatic dataset from them for recognition, as well as make “distortions” so that the neural network learns to recognize even in conditions of very incomprehensible text

OCR Issue

Hi I was trying to ocr some lightnovel screenshots, however none of the images I used came back with decent results, instead a random Japanese sentence that wasn't in the text came up. Any way to fix this? Thanks.

How did you Train your Model

Hi can you say how you did train the model like to train a model.

'NoneType' object has no attribute 'shape'

I get this error when cropping the manga images. More specifically when defining the parameters of the cropping:
xmax = max(box.xmax + margin, img.shape[0])

I would appreciate it if someone could help me.
Thanks!

Is it possible to convert the pretrained model into onnx format?

Hi, after reading the code, it seems that the pretrained weights need to be used in conjunction with a tokenizer and some other libraries:

x = self._preprocess(img)
x = self.model.generate(x[None].to(self.model.device), max_length=300)[0].cpu()
x = self.tokenizer.decode(x, skip_special_tokens=True)
x = post_process(x)

def post_process(text):
    text = ''.join(text.split())
    text = text.replace('…', '...')
    text = re.sub('[・.]{2,}', lambda x: (x.end() - x.start()) * '.', text)
    text = jaconv.h2z(text, ascii=True, digit=True)

    return text

This makes it seem like there's no way to convert the model to the ONNX format for use in other languages. Do you have any thoughts on how to achieve this? I don't know too much about it, thank you very much!

Stuck on 'Reading from directory'

I'm running python 3.10.13 on mac os and it never seems to start the ocr.

Never progresses beyond the following line:
manga_ocr.run:run:110 - Reading from directory /Users/user/Documents/漫画/薬屋のひとりごと 01-12巻/薬屋のひとりごと 01巻

(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

Hi, I'm new at all of this! I'm a Mac user and when I type manga_ocr I get the following error:

manga_ocr
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/bin/manga_ocr", line 5, in <module>
    from manga_ocr.__main__ import main
  File "/Users/julisco/Library/Python/3.10/lib/python/site-packages/manga_ocr/__init__.py", line 3, in <module>
    from manga_ocr.ocr import MangaOcr
  File "/Users/julisco/Library/Python/3.10/lib/python/site-packages/manga_ocr/ocr.py", line 5, in <module>
    import torch
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 217, in <module>
    _load_global_deps()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 177, in _load_global_deps
    raise err
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 172, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: dlopen(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib, 0x000A): tried: '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (no such file), '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

Can someone help? Thanks

MangaOcr tries to load an example for some reason

I've installed manga-ocr from the sources using python setup.py install --local. Now running manga_ocr I get the following error:

2023-07-18 22:51:12.411 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
/home/user/.local/lib/python3.11/site-packages/transformers-4.31.0-py3.11.egg/transformers/models/vit/feature_extraction_vit.py:28: FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.
  warnings.warn(
2023-07-18 22:51:20.711 | INFO     | manga_ocr.ocr:__init__:25 - Using CPU
Traceback (most recent call last):
  File "/home/user/.local/bin/manga_ocr", line 33, in <module>
    sys.exit(load_entry_point('manga-ocr==0.1.10', 'console_scripts', 'manga_ocr')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/__main__.py", line 7, in main
    fire.Fire(run)
  File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/run.py", line 64, in run
    mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/ocr.py", line 30, in __init__
    self(example_path)
  File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/ocr.py", line 36, in __call__
    img = Image.open(img_or_path)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/Pillow-10.0.0-py3.11-linux-x86_64.egg/PIL/Image.py", line 3218, in open
    fp = builtins.open(filename, "rb")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/assets/example.jpg'

It seems like some missing example.jpg is causing a problem, I don't understand why.

Doesn't recognize at all

from manga_ocr import MangaOcr

mocr = MangaOcr()
text = mocr('Shingeki No Kyojin - Raw/_Chapter 6/003.jpg')
print(text)

Got そして、日本では

I can't get what's wrong it's like common image + others are return something wrong too
Python 3.9.11
Windows 10

"Error while reading from clipboard" when running in background

I've successfully set up clipboard monitoring with manga_ocr as per the docs and it successfully parses the images that I copy while it's running.

However, I notice that every minute or so after a successful copy, it prints out the error:

manga_ocr.run:run:83 - Error while reading from clipboard

This seems harmless but it does add noise to the output and may hide other real problems so posting here for tracking after not finding any similar disucssion.

Running Windows 10

TypeError: image must be numpy array type

Thank you for your open source.

when I train code with train.py, error occured.

ile "/data/anaconda3/envs/imt_ocr/lib/python3.6/site-packages/albumentations/core/composition.py", line 251, in _check_args
raise TypeError("{} must be numpy array type".format(data_name))
TypeError: image must be numpy array type

how can I do?

plus, I set path correctly but, one more error occured.
findDecoder imread_ can't open/read file: check file path/integrity.

manga-ocr can't read file - permision denied

After installing both manga-ocr and shareX (Windows 10) using the recommended settings on shareX I get the following error after capturing a region :
WARNING | manga_ocr.run:run:112 - Error while reading file C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03: [Errno 13] Permission denied: 'C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03'

I run manga-ocr from the command prompt as my user. installed python 3.9 from their website.

[Feature Request] Allow the API to return the captured text mask

Is it possible to return the detected mask in another method? It should look something like this:

Class MangaOcr(...):
   ...
def __call__(...):
    ...
def text(...):
    # same as __call__
    __call__(...)
def mask(...):
    # modified __call__ where text mask is the output (same size with input)

Numbers are not being recognized

I have noticed that the OCR is not capable of recognizing numerals anymore. I even tried with the images that the repository has an example and it omits the number in the text that it returns back.

It's not a big deal as the important thing is to recognize the kanji, but I just wanted to report it just in case.

Btw, when I start manga_ocr now I am receiving this message:

UserWarning: Neither max_length nor max_new_tokens has been set, max_length will default to 300 (self.config.max_length). Controlling max_length via the config is deprecated and max_length will be removed from the config in v5 of Transformers -- we recommend using max_new_tokens to control the maximum length of the generation.

Value Error

ValueError: Invalid value of img_or_path: [[254 254 254 ... 254 254 254]
 [254 254 254 ... 254 254 254]
 [254 254 254 ... 254 254 254]
 ...
 [254 254 254 ... 254 254 254]
 [254 254 254 ... 254 254 254]
 [254 254 254 ... 254 254 254]]

Simplified Linux clipboard support

According to the README,

Reading images from clipboard works only on Windows and macOS, on Linux you should read from a directory instead.

Checking the logic, there's two distinct ways manga_ocr handles clipboard:
On Wayland, the pyperclip library is used to get clipboard contents

manga-ocr/manga_ocr/run.py

Line 72 in b0cac91

pyperclip.set_clipboard("wl-clipboard")

while on Windows and MacOS it's through Pillow's PIL.ImageGrab.grabclipboard

manga-ocr/manga_ocr/run.py

Line 93 in b0cac91

img = ImageGrab.grabclipboard()

Although the documentation for grabclipboard() says

Only macOS and Windows are currently supported.

it's undocumented but the Pillow source shows that Linux support was added in 9.4.0 (using wl-paste and xclip).

Thus one can change the existing code from

manga-ocr/manga_ocr/run.py

Lines 66 to 86 in b0cac91

 if sys.platform not in ('darwin', 'win32') and write_to == 'clipboard': 

 # Check if the system is using Wayland 

 import os 

 if os.environ.get('WAYLAND_DISPLAY'): 

 # Check if the wl-clipboard package is installed 

 if os.system("which wl-copy > /dev/null") == 0: 

 pyperclip.set_clipboard("wl-clipboard") 

 else: 

 msg = 'Your session uses wayland and does not have wl-clipboard installed. ' \ 

 'Install wl-clipboard for write in clipboard to work.' 

 raise NotImplementedError(msg) 

 if read_from == 'clipboard': 

 if sys.platform not in ('darwin', 'win32'): 

 msg = 'Reading images from clipboard works only on macOS and Windows. ' \ 

 'On Linux, run "manga_ocr /path/to/screenshot/folder" to read images from a folder instead.' 

 raise NotImplementedError(msg) 

 from PIL import ImageGrab 

 logger.info('Reading from clipboard')

        # if sys.platform not in ('darwin', 'win32') and write_to == 'clipboard':
        #     # Check if the system is using Wayland
        #     import os
        #     if os.environ.get('WAYLAND_DISPLAY'):
        #         # Check if the wl-clipboard package is installed
        #         if os.system("which wl-copy > /dev/null") == 0:
        #             pyperclip.set_clipboard("wl-clipboard")
        #         else:
        #             msg = 'Your session uses wayland and does not have wl-clipboard installed. ' \
        #                 'Install wl-clipboard for write in clipboard to work.'
        #             raise NotImplementedError(msg)

        if read_from == 'clipboard':

            # if sys.platform not in ('darwin', 'win32'):
            #     msg = 'Reading images from clipboard works only on macOS and Windows. ' \
            #         'On Linux, run "manga_ocr /path/to/screenshot/folder" to read images from a folder instead.'
            #     raise NotImplementedError(msg)

            from PIL import ImageGrab
            logger.info('Reading from clipboard')

and preserve support for Wayland while adding support for Linux systems running X.
This requires Pillow>=9.4.0 and pyperclip can probably be removed.
(see master...stephen-huan:manga-ocr:linux-clipboard for an implementation)

If neither wl-paste nor clip are available, the NotImplementedError should propagate through the try catch.

Note that the implementation of grabclipboard() on Linux is currently a bit wonky, see python-pillow/Pillow#7147.
In particular the current implementation will (a) generate a lot of unnecessary temporary files and (b) raise either ChildProcessError from xclip reporting Error: target image/png not available or UnidentifiedImageError from when Pillow tries to parse non-image data, e.g. plaintext when the other operating systems return None on invalid data.

It may be worthwhile to temporarily patch

            try:
                img = ImageGrab.grabclipboard()
            except OSError:
                logger.warning('Error while reading from clipboard')

           try:
               img = ImageGrab.grabclipboard()
           except (ChildProcessError, UnidentifiedImageError) as error:
               logger.trace(error)
           except OSError:
               logger.warning('Error while reading from clipboard')

since the default level of loguru is debug, using trace for the parse errors that occur every delay_secs prevents the user from being spammed when running with the default command manga_ocr. However, using the LOGURU_LEVEL environmental variable allows seeing every logging message, for debugging reasons.

LOGURU_LEVEL=TRACE manga_ocr

It may also be worthwhile to temporarily maintain a patched implementation of grabclipboard() from Pillow, at least until the next quarterly release of Pillow comes out (on July 1st). One should also be sure to copy the license.

Got an error when trying to install on debian

I did python3 -m pip install manga-ocr, that flagged a numpy error so I removed the system numpy and reinstalled it with pip into ~/.local/lib/site-packages. Then when running python I got this error when creating a MangaOcr object. Any suggestions? System is aarch64 linux raspberrypi 4.

pi@raspberrypi:/tmp $ python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.

from manga_ocr import MangaOcr
mocr = MangaOcr()
2022-12-11 10:57:31.915 | INFO | manga_ocr.ocr:init:13 - Loading OCR model from kha-white/manga-ocr-base
2022-12-11 10:57:42.731 | INFO | manga_ocr.ocr:init:22 - Using CPU
Traceback (most recent call last):
File "", line 1, in
File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 27, in init
self(example_path)
File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 33, in call
img = Image.open(img_or_path)
File "/usr/lib/python3/dist-packages/PIL/Image.py", line 2904, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/home/pi/.local/lib/python3.9/site-packages/assets/example.jpg'

Create a Nix package

It would be nice if a nix package was created so it's easier to install on NixOS systems

I keep getting this error when trying to install via powershell

File "C:\Program Files\Python312\Lib\subprocess.py", line 389, in call
with Popen(*popenargs, **kwargs) as p:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python312\Lib\subprocess.py", line 1026, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Program Files\Python312\Lib\subprocess.py", line 1538, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 11, in <module>
    main()
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 7, in main
    fire.Fire(run)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\run.py", line 64, in run
    mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\ocr.py", line 15, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\auto\tokenization_auto.py", line 514, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1773, in from_pretrained
    return cls._from_pretrained(
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1908, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 151, in __init__
    self.word_tokenizer = MecabTokenizer(
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 231, in __init__
    import fugashi
  File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fugashi\__init__.py", line 1, in <module>
    from .fugashi import *
ImportError: DLL load failed while importing fugashi: The specified module could not be found.```

Also, possibly unrelated, but when I first installed it with pip install manga-ocr it had the following error:

```Installing collected packages: urllib3, pyparsing, idna, colorama, charset-normalizer, certifi, typing-extensions, tqdm, six, requests, regex, pyyaml, packaging, joblib, filelock, click, win32-setctime, tokenizers, termcolor, sacremoses, numpy, huggingface-hub, unidic-lite, transformers, torch, pyperclip, Pillow, loguru, jaconv, fugashi, fire, manga-ocr
  WARNING: The script normalizer.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tqdm.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
    Running setup.py install for termcolor ... done
  WARNING: The script sacremoses.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script f2py.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script huggingface-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
    Running setup.py install for unidic-lite ... done
  WARNING: The script transformers-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\Lubbs\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python38\\site-packages\\caffe2\\python\\serialized_test\\data\\operator_test\\piecewise_linear_transform_test.test_multi_predictions_params_from_arg.zip'```

but I tried it again and it installed successfully.

	if sys.platform not in ('darwin', 'win32') and write_to == 'clipboard':
	# Check if the system is using Wayland
	import os
	if os.environ.get('WAYLAND_DISPLAY'):
	# Check if the wl-clipboard package is installed
	if os.system("which wl-copy > /dev/null") == 0:
	pyperclip.set_clipboard("wl-clipboard")
	else:
	msg = 'Your session uses wayland and does not have wl-clipboard installed. ' \
	'Install wl-clipboard for write in clipboard to work.'
	raise NotImplementedError(msg)

	if read_from == 'clipboard':

	if sys.platform not in ('darwin', 'win32'):
	msg = 'Reading images from clipboard works only on macOS and Windows. ' \
	'On Linux, run "manga_ocr /path/to/screenshot/folder" to read images from a folder instead.'
	raise NotImplementedError(msg)

	from PIL import ImageGrab
	logger.info('Reading from clipboard')

kha-white / manga-ocr Goto Github PK

manga-ocr's People

Contributors

Stargazers

Watchers

Forkers

manga-ocr's Issues

Recommend Projects

Recommend Topics

Recommend Org