kha-white / manga-ocr
Optical character recognition for Japanese text, with the main focus being Japanese manga
License: Apache License 2.0
Every time I try to install this package, I get this error:
"""C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\Felix\AppData\LocildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" /EHsc /Tpsrc/sentencepiece/sentencepiece_wrap.cxx /Fobuild\temp.win-amd64-cpython-311\Release\src/sentencepiece/sentencepiece_wrap.obj /std:c++17 /MT /I..\build\root\include
cl : Command line warning D9025 : overriding '/MD' with '/MT'
sentencepiece_wrap.cxx
src/sentencepiece/sentencepiece_wrap.cxx(2822): fatal error C1083:
File (include) cannot be opened "sentencepiece_processor.h": No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> sentencepiece
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure."""
I already saw the "File (include) cannot be opened: sentencepiece_processor.h" error, but I get the same error when I try to reinstall sentencepiece.
Does anyone have an idea how I can fix this problem and get this repo running?
It would be nice if the downloaded models could be cached somewhere (with an optional argument).
The AutoFeatureExtractor, AutoTokenizer and VisionEncoderDecoderModel classes all have the same from_pretrained() method, which allows setting an optional cache_dir argument.
This way, it wouldn't be necessary to re-download the model for each usage. Mokuro could also benefit from such caching logic.
PS : Thanks for this amazing software :)
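A minimal sketch of what the request above could look like, assuming the three from_pretrained() calls accept cache_dir as described. The helper names resolve_cache_dir and load_components are hypothetical and not part of manga-ocr; transformers is imported lazily so the path helper can be used on its own.

```python
from pathlib import Path


def resolve_cache_dir(cache_dir=None):
    """Return an absolute cache directory, or None to keep the library default."""
    if cache_dir is None:
        return None
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)  # create it on first use
    return str(path)


def load_components(name="kha-white/manga-ocr-base", cache_dir=None):
    """Load the three components, passing the same cache_dir to each call."""
    # Lazy import: transformers is heavy and only needed when loading.
    from transformers import (
        AutoFeatureExtractor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    cache_dir = resolve_cache_dir(cache_dir)
    feature_extractor = AutoFeatureExtractor.from_pretrained(name, cache_dir=cache_dir)
    tokenizer = AutoTokenizer.from_pretrained(name, cache_dir=cache_dir)
    model = VisionEncoderDecoderModel.from_pretrained(name, cache_dir=cache_dir)
    return feature_extractor, tokenizer, model
```

Calling load_components(cache_dir="~/manga_ocr_cache") (an example path) would then reuse files already present in that directory instead of the default cache location.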
Like Korean, Chinese, etc.
Hello, I wanted to inform the team that I have implemented their system in a new open source software called Visual GPT Translator (VGT). This software provides us with a GUI to take screenshots, extract Japanese text from them (using MangaOCR), and translate it using the GPT 3.5 API.
I thought you could list it in the related projects section of your readme.
The GUI is based on ElectronJs and ReactJs, while also running a local server based on FastAPI. It is in this server where Manga-Ocr is implemented. The project also demonstrates how it is possible to use PyInstaller/Electron-Packager to compile both the GUI and server with the OCR into an executable.
Thank you very much for your work :)
Hello, I'm trying to write code to automate translation, and I call manga-ocr through a shell command. The problem is that manga-ocr keeps running in the background, so my JS program is blocked until the shell command exits. Would it be possible to add a flag that makes it exit after a single recognition?
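Until such a flag exists, one possible workaround is a small script that recognizes a single image and returns. This is only a sketch: ocr_once and main are hypothetical names, and it assumes a MangaOcr instance can be called with an image path. Note that the model is loaded on every invocation, which is slow.

```python
import sys


def ocr_once(image_path, ocr=None):
    """Recognize one image and return the text.

    `ocr` is any callable taking a path; by default a MangaOcr instance
    is created, which loads the model each time the script runs.
    """
    if ocr is None:
        from manga_ocr import MangaOcr  # lazy import: pulls in torch

        ocr = MangaOcr()
    return ocr(image_path)


def main(argv, ocr=None):
    """Print the recognized text for argv[1] and return an exit code."""
    if len(argv) != 2:
        print("usage: ocr_once.py IMAGE", file=sys.stderr)
        return 2
    print(ocr_once(argv[1], ocr))
    return 0  # the process can end here instead of polling in the background
```

Saved as a script with sys.exit(main(sys.argv)) as its last line, the calling program gets the text on stdout and the process ends as soon as it is printed.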
Could you please provide details about the training and testing recognition module?
If I am incorrect and the framework etc. is stored in the cache, please still read this part. Thanks.
Can't we all just download the necessary transformers files once? Poricom can do it, and it uses exactly the same ingredients (manga_ocr). I love the simplicity of manga_ocr; Poricom's UI is really terrible, so I really appreciate manga_ocr. I just don't see the point of connecting to a server to access this "Transformers [Vision Encoder Decoder] framework". Why can't all of this be kept local? It would make more sense for it to be built in, but it isn't, because we all have to connect to K White's server.
As titled. I am planning to write a plugin that adapts your amazing engine to an OCR tool, and I found it difficult to install the project and separate it from its dependencies (which is essential to reduce the plugin's disk usage). I was wondering whether this is possible, because I noticed the Running in Background section in the README.
Hi there, total noob here, I got to the point where I have to import MangaOcr but Terminal response is:
from: can't read /var/mail/manga_ocr
I've also tried with: import PIL.Image ... but no luck either.
Any clue?
Thanks a lot in advance
BTW I'm on an Intel PB
Hi, thanks a lot for this nice repo. I starred it :)
I have a question that came up to my mind by looking at it :
Would it be OK to use this tool on a recent CPU (i7-8XXX)? Do you have some kind of mini benchmark that compares inference speed between GPUs and CPUs?
Thanks again !
I am using Manga-ocr in Poricom, but constantly having issues with internet connection. I am getting error "Please try again or make sure your internet connection is on" with MangaOCR model.
Poricom / MangaOCR saves an offline copy of the pretrained model in %UserProfile%\.cache\huggingface\transformers. I checked that the model is indeed there.
As I understand from issue #9, this is an issue with HuggingFace Transformers: even if the model is stored in the offline cache, it still needs an internet connection to load the model from it.
From what I understand, I should set TRANSFORMERS_OFFLINE=1 and HF_DATASETS_OFFLINE=1 to run image recognition in a firewalled or offline environment using only local files.
Can someone please help me implement this in a local instance of Manga-ocr, so that I can then use it offline in a local instance of Poricom? I have little experience and no idea where else to ask. I would appreciate help figuring out which files I should change to add those values and implement an offline option.
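One way to try this without editing any library files is to set the two variables from Python before manga_ocr (and therefore transformers) is imported. The sketch below assumes the flags are read when the libraries are imported; enable_offline_mode and load_ocr_offline are hypothetical helper names.

```python
import os


def enable_offline_mode(env=os.environ):
    """Set the offline flags named above in the given environment mapping."""
    env["TRANSFORMERS_OFFLINE"] = "1"
    env["HF_DATASETS_OFFLINE"] = "1"
    return env


def load_ocr_offline():
    """Create a MangaOcr instance with the offline flags already set."""
    enable_offline_mode()
    # Import only after the flags are set (assumption: they are read at import).
    from manga_ocr import MangaOcr

    return MangaOcr()
```

Setting the same variables in the shell or in the system environment before launching Poricom should have the same effect, provided the model is already in the cache.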
Requirement already satisfied: numpy in c:\users\lenovo\appdata\local\programs\python\python310\lib\site-packages (1.22.3)
UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xf
Title says it all.
I am on fedora 37 with kde and wayland.
Hi,
When the OCR model is being loaded on the first run, the following warning is shown:
FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.
When I try to get the text from the following image:
manga-ocr gives me わたさんあたしが未満しててもそんなのおだまいないこいに山田太くってきちゃうところかわいい.
Using https://2ocr.com/online-ocr-japanese I get れなさんわたしが未読しててもそんなのおかまいなしにLINEおくってきちゃうところかわいい, which seems to be correct, or at least more correct than what manga-ocr gives me.
Feel welcome to change the title of this ticket to something more appropriate.
I tried uninstalling and reinstalling Python 3.9.13 from the official Python website, but I still encounter this issue.
Add MPS device type so that M1/M2 gpus can be utilised
See: https://pytorch.org/docs/stable/notes/mps.html
import torch

if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("MPS not available because the current PyTorch install was not "
              "built with MPS enabled.")
    else:
        print("MPS not available because the current MacOS version is not 12.3+ "
              "and/or you do not have an MPS-enabled device on this machine.")
else:
    mps_device = torch.device("mps")

    # Create a Tensor directly on the mps device
    x = torch.ones(5, device=mps_device)
    # Or
    x = torch.ones(5, device="mps")

    # Any operation happens on the GPU
    y = x * 2

    # Move your model to mps just like any other device
    model = YourFavoriteNet()  # placeholder model class from the PyTorch notes
    model.to(mps_device)

    # Now every call runs on the GPU
    pred = model(x)
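For manga-ocr itself, the check above could be folded into a small device-selection helper. This is a sketch, not the project's actual code: pick_device is a hypothetical name, and the getattr guard is there because older PyTorch builds have no torch.backends.mps attribute at all.

```python
def pick_device(torch_module, force_cpu=False):
    """Return "cuda", "mps" or "cpu" for the given torch-like module."""
    if force_cpu:
        return "cpu"
    if torch_module.cuda.is_available():
        return "cuda"
    # Older PyTorch builds lack torch.backends.mps, so look it up defensively.
    mps = getattr(torch_module.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

With PyTorch installed, usage would be along the lines of import torch followed by model.to(pick_device(torch)).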
I get this error when running the train.py module.
/content/manga_ocr/manga_ocr_dev/training/get_model.py in __init__(self, feature_extractor, tokenizer)
      7
      8     def __init__(self, feature_extractor, tokenizer):
----> 9         self.feature_extractor = feature_extractor
     10         self.tokenizer = tokenizer
     11         self.current_processor = self.feature_extractor
AttributeError: can't set attribute 'feature_extractor'
If you want to run with GPU, install PyTorch as described here, otherwise this step can be skipped.
I only see this sentence, but I don't understand how to install and run on an NVIDIA GPU. Can you teach me?
I don't have a coding background, so I don't know how to install it or how to run it after installation
Can you provide a version that directly utilizes GPU?
Can you please provide me with the specific steps for the operation?
I've successfully installed manga-ocr on my mac, but the output doesn't make any sense and does not resemble the input image in any way.
For example, I gave the image below as input, which is an official example from the README, and the output text I'm getting is just "あ..."
Am I doing something wrong?
Because conda gave me problems, I had to reinstall every package I had.
Before uninstalling, everything was fine. The model loaded in about 2 seconds and took less than a second to OCR most images (see the issue that I opened a while ago).
Now, after reinstalling, the model again takes 10 seconds to load, which would be acceptable, but it also takes another 10 seconds or so for every image, no matter how small.
I've tried both setting the environment variable again and passing the pretrained model path argument (see the issue above), but neither solves the problem. Uninstalling and installing again doesn't change anything. What can I try?
I use an AMD GPU and am not able to run it; I just get this error:
Traceback (most recent call last):
File "/home/noah/Documents/AI/test/manga.py", line 1, in <module>
from manga_ocr import MangaOcr
File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/manga_ocr/__init__.py", line 3, in <module>
from manga_ocr.ocr import MangaOcr
File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/manga_ocr/ocr.py", line 5, in <module>
import torch
File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 234, in <module>
_load_global_deps()
File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 193, in _load_global_deps
raise err
File "/home/noah/Documents/AI/test/manga_ocr/lib/python3.11/site-packages/torch/__init__.py", line 174, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/run/current-system/sw/lib/python3.11/ctypes/__init__.py", line 376, in __init__
self._handle = _dlopen(self._name, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libstdc++.so.6: cannot open shared object file: No such file or directory
I think this may have to do with it trying to use CUDA; I also saw it trying to install CUDA dependencies when I ran pip install. It would be nice to be able to use either the CPU (the model is so small that it would likely work well enough anyway) or ROCm.
Without an internet connection, running MangaOcr returns this error:
File "manga_ocr\ocr.py", line 15, in __init__
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
File "transformers\models\auto\tokenization_auto.py", line 528, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "transformers\tokenization_utils_base.py", line 1732, in from_pretrained
user_agent=user_agent,
File "transformers\file_utils.py", line 1929, in cached_path
local_files_only=local_files_only,
File "transformers\file_utils.py", line 2178, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
I'm sure that the first time I ran MangaOcr, it downloaded the pretrained model, but it seems it can't find it. Where was the model saved? I've searched around but had no luck. I'm using Python with Anaconda on Windows 10.
And anyway, thanks for the work. Does its job flawlessly.
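To look for the downloaded files, a sketch like the following lists the usual Hugging Face cache locations. It assumes the default layout under the home directory (.cache/huggingface/transformers in older versions, .cache/huggingface/hub in newer ones) and the TRANSFORMERS_CACHE and HF_HOME overrides; the function names are hypothetical.

```python
import os
from pathlib import Path


def candidate_cache_dirs(env=None, home=None):
    """List directories where Hugging Face libraries may have cached models."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    candidates = []
    if env.get("TRANSFORMERS_CACHE"):
        candidates.append(Path(env["TRANSFORMERS_CACHE"]))
    if env.get("HF_HOME"):
        hf_home = Path(env["HF_HOME"])
    else:
        hf_home = home / ".cache" / "huggingface"
    candidates.append(hf_home / "hub")           # newer layout (assumption)
    candidates.append(hf_home / "transformers")  # older layout (assumption)
    return candidates


def existing_cache_dirs(env=None, home=None):
    """Return only the candidate directories that exist on this machine."""
    return [p for p in candidate_cache_dirs(env, home) if p.is_dir()]
```

Printing existing_cache_dirs() shows which of these directories are present; on Windows the home directory corresponds to %UserProfile%.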
I own two MacBooks, and I was able to install manga-ocr without any problems on the Intel one. However, on the Mac with an M1 (ARM) processor, I was getting an error when running pip3 install manga-ocr. Reading the error message, I noticed that one dependency was having problems with the ARM architecture: mecab-python3.
I searched the issues on mecab's repository, and it seems that several users with the same setup, an M1 Mac, were facing a similar issue. This happens because mecab-python3 doesn't have a wheel for ARM architectures, so users with an M1 processor must build that pip package themselves. This sounds hard, but in practice it is very easy:
cd ~
pip3 download mecab-python3
tar xfv mecab-python3-1.0.5.tar.gz
cd mecab-python3-1.0.5
brew install mecab
python3 setup.py build
python3 setup.py install
After this, you can run pip3 list to verify that the mecab package is installed on your system.
Now, if you run pip3 install manga-ocr again, it will be installed as expected.
Hopefully you can add this workaround to the README, so any user who encounters this can fix it.
P.S. All of this was done using python 3.9.13 installed and managed by pyenv
Regards!!!!
OS: windows 10
Your tutorial says: "When running for the first time, downloading the model (~400 MB) might take a few minutes."
Naturally, once the model is downloaded, it shouldn't need to be downloaded again. The help text mentions a flag "-p, --pretrained_model_name_or_path=PRETRAINED_MODEL_NAME_OR_PATH", so should I pass the path of a local folder that contains the already downloaded model?
Besides that everything works perfectly. Thanks a lot for your excellent work.
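From Python, the same idea would be to pass the local folder as pretrained_model_name_or_path. The sketch below is an assumption-laden illustration: check_model_dir and load_local_ocr are hypothetical helpers, and the config.json check assumes the folder uses the usual Hugging Face saved-model layout.

```python
from pathlib import Path


def check_model_dir(model_dir):
    """Return the directory as a string if it looks like a saved model."""
    path = Path(model_dir).expanduser()
    # Assumption: a saved Hugging Face model directory contains config.json.
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"no config.json in {path}")
    return str(path)


def load_local_ocr(model_dir):
    """Create MangaOcr from a local model folder instead of the hub name."""
    from manga_ocr import MangaOcr  # lazy import: pulls in torch/transformers

    return MangaOcr(pretrained_model_name_or_path=check_model_dir(model_dir))
```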
My workflow currently looks like this:
What I want is a workflow which looks like this:
Since manga-ocr constantly runs in the background and parses those images and replaces the clipboard's content with whatever it recognised, it gets in the way of the last two steps. Is there some way to still maintain being able to copy screenshots to the clipboard while manga-ocr is running?
My idea how to circumvent this issue: A hotkey that temporarily halts manga-ocr. I could just press that hotkey just before I take the screenshot for the card.
Can this be implemented into the software? Do you have another solution to this problem which doesn't even require this feature?
Hello..
Currently I am trying to bundle a deb package for Poricom which uses manga_ocr.
However, the torch dependencies of the transformers backend are making the deb excessively large.
$ find . -type f -size +50M -exec du -sh {} \;
318M ./opt/poricom/libcublasLt.so.11
145M ./opt/poricom/libcublas.so.11
179M ./opt/poricom/unidic_lite/dicdir/sys.dic
69M ./opt/poricom/unidic_lite/dicdir/matrix.bin
438M ./opt/poricom/libcudnn_cnn_infer.so.8
88M ./opt/poricom/libcudnn_ops_infer.so.8
57M ./opt/poricom/assets/languages/chi_tra.traineddata
526M ./opt/poricom/torch/lib/libtorch_cpu.so
253M ./opt/poricom/torch/lib/libtorch_cuda_cpp.so
728M ./opt/poricom/torch/lib/libtorch_cuda_cu.so
I wonder if you could advise or help regarding reducing the size of the bundle by using another backend such as TF or JAX.
I would like to train MangaOCR on a set of manga in English and other languages. Yes, this contradicts the very premise of MangaOCR, but it is sometimes faster to find works on English sites and then translate them for some other site, so this would be very convenient. If possible, I could provide a dataset from such manga.
There is also a big problem with fonts in MangaOCR. One could collect a couple of dozen fonts and automatically generate a recognition dataset from them, and also apply "distortions" so that the neural network learns to recognize even very hard-to-read text.
Hi, I was trying to OCR some light novel screenshots. However, none of the images I used came back with decent results; instead, a random Japanese sentence that wasn't in the text came up. Is there any way to fix this? Thanks.
Hi, can you explain how you trained the model, so that I can train a model myself?
I get this error when cropping the manga images. More specifically when defining the parameters of the cropping:
xmax = max(box.xmax + margin, img.shape[0])
I would appreciate it if someone could help me.
Thanks!
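The error itself is not shown above, but the quoted line looks suspicious: clamping an upper bound normally uses min rather than max, and for a NumPy image the width is img.shape[1], not img.shape[0]. A minimal sketch of a clamped crop box, with clamp_box as a hypothetical helper name:

```python
def clamp_box(xmin, ymin, xmax, ymax, width, height, margin=0):
    """Expand a box by `margin` pixels and clamp it to the image bounds."""
    return (
        max(xmin - margin, 0),       # lower bounds never go below 0
        max(ymin - margin, 0),
        min(xmax + margin, width),   # min, not max: stay inside the image
        min(ymax + margin, height),
    )
```

With a NumPy image, height and width would come from img.shape[:2], in that order.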
Hi, after reading the code, it seems that the pretrained weights need to be used in conjunction with a tokenizer and some other libraries:
# Inside MangaOcr.__call__:
x = self._preprocess(img)
x = self.model.generate(x[None].to(self.model.device), max_length=300)[0].cpu()
x = self.tokenizer.decode(x, skip_special_tokens=True)
x = post_process(x)

# Module-level helper (needs these imports):
import re
import jaconv

def post_process(text):
    text = ''.join(text.split())
    text = text.replace('…', '...')
    text = re.sub('[・.]{2,}', lambda x: (x.end() - x.start()) * '.', text)
    text = jaconv.h2z(text, ascii=True, digit=True)
    return text
This makes it seem like there's no way to convert the model to the ONNX format for use in other languages. Do you have any thoughts on how to achieve this? I don't know too much about it, thank you very much!
I'm running python 3.10.13 on mac os and it never seems to start the ocr.
Never progresses beyond the following line:
manga_ocr.run:run:110 - Reading from directory /Users/user/Documents/漫画/薬屋のひとりごと 01-12巻/薬屋のひとりごと 01巻
Hi, I'm new to all of this! I'm a Mac user, and when I type manga_ocr I get the following error:
manga_ocr
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/bin/manga_ocr", line 5, in <module>
from manga_ocr.__main__ import main
File "/Users/julisco/Library/Python/3.10/lib/python/site-packages/manga_ocr/__init__.py", line 3, in <module>
from manga_ocr.ocr import MangaOcr
File "/Users/julisco/Library/Python/3.10/lib/python/site-packages/manga_ocr/ocr.py", line 5, in <module>
import torch
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 217, in <module>
_load_global_deps()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 177, in _load_global_deps
raise err
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/__init__.py", line 172, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: dlopen(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib, 0x000A): tried: '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (no such file), '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))
Can someone help? Thanks
I've installed manga-ocr from source using python setup.py install --local. Now, running manga_ocr, I get the following error:
2023-07-18 22:51:12.411 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
/home/user/.local/lib/python3.11/site-packages/transformers-4.31.0-py3.11.egg/transformers/models/vit/feature_extraction_vit.py:28: FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.
warnings.warn(
2023-07-18 22:51:20.711 | INFO | manga_ocr.ocr:__init__:25 - Using CPU
Traceback (most recent call last):
File "/home/user/.local/bin/manga_ocr", line 33, in <module>
sys.exit(load_entry_point('manga-ocr==0.1.10', 'console_scripts', 'manga_ocr')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/__main__.py", line 7, in main
fire.Fire(run)
File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/fire-0.5.0-py3.11.egg/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/run.py", line 64, in run
mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/ocr.py", line 30, in __init__
self(example_path)
File "/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/manga_ocr/ocr.py", line 36, in __call__
img = Image.open(img_or_path)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/Pillow-10.0.0-py3.11-linux-x86_64.egg/PIL/Image.py", line 3218, in open
fp = builtins.open(filename, "rb")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.local/lib/python3.11/site-packages/manga_ocr-0.1.10-py3.11.egg/assets/example.jpg'
It seems like a missing example.jpg is causing the problem, but I don't understand why.
I've successfully set up clipboard monitoring with manga_ocr as per the docs, and it successfully parses the images that I copy while it's running.
However, I notice that every minute or so after a successful copy, it prints the error:
manga_ocr.run:run:83 - Error while reading from clipboard
This seems harmless, but it adds noise to the output and may hide other real problems, so I'm posting here for tracking after not finding any similar discussion.
Running Windows 10
Thank you for open-sourcing this.
When I run training with train.py, an error occurs:
File "/data/anaconda3/envs/imt_ocr/lib/python3.6/site-packages/albumentations/core/composition.py", line 251, in _check_args
raise TypeError("{} must be numpy array type".format(data_name))
TypeError: image must be numpy array type
What can I do?
Also, although I set the path correctly, one more error occurred:
findDecoder imread_ can't open/read file: check file path/integrity.
After installing both manga-ocr and ShareX (Windows 10) with the recommended ShareX settings, I get the following error after capturing a region:
WARNING | manga_ocr.run:run:112 - Error while reading file C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03: [Errno 13] Permission denied: 'C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03'
I run manga-ocr from the command prompt as my user. I installed Python 3.9 from the official website.
Is it possible to return the detected mask in another method? It should look something like this:
class MangaOcr:
    ...

    def __call__(self, img_or_path):
        ...

    def text(self, img_or_path):
        # same as __call__
        return self(img_or_path)

    def mask(self, img_or_path):
        # modified __call__ where the text mask is the output (same size as the input)
        ...
I have noticed that the OCR is no longer capable of recognizing numerals. I even tried the example images from the repository, and the numbers are omitted from the returned text.
It's not a big deal as the important thing is to recognize the kanji, but I just wanted to report it just in case.
Btw, when I start manga_ocr now I am receiving this message:
UserWarning: Neither max_length nor max_new_tokens has been set, max_length will default to 300 (self.config.max_length). Controlling max_length via the config is deprecated and max_length will be removed from the config in v5 of Transformers -- we recommend using max_new_tokens to control the maximum length of the generation.
ValueError: Invalid value of img_or_path: [[254 254 254 ... 254 254 254]
[254 254 254 ... 254 254 254]
[254 254 254 ... 254 254 254]
...
[254 254 254 ... 254 254 254]
[254 254 254 ... 254 254 254]
[254 254 254 ... 254 254 254]]
According to the README,
Reading images from clipboard works only on Windows and macOS, on Linux you should read from a directory instead.
Checking the logic, there are two distinct ways manga_ocr handles the clipboard:
On Wayland, the pyperclip library is used to get clipboard contents (line 72 in b0cac91).
The other is PIL.ImageGrab.grabclipboard (line 93 in b0cac91).
Although the documentation for grabclipboard() says "Only macOS and Windows are currently supported.", the Pillow source shows that (undocumented) Linux support was added in 9.4.0, using wl-paste and xclip.
Thus one can change the existing code (lines 66 to 86 in b0cac91) to
# if sys.platform not in ('darwin', 'win32') and write_to == 'clipboard':
#     # Check if the system is using Wayland
#     import os
#     if os.environ.get('WAYLAND_DISPLAY'):
#         # Check if the wl-clipboard package is installed
#         if os.system("which wl-copy > /dev/null") == 0:
#             pyperclip.set_clipboard("wl-clipboard")
#         else:
#             msg = 'Your session uses wayland and does not have wl-clipboard installed. ' \
#                   'Install wl-clipboard for write in clipboard to work.'
#             raise NotImplementedError(msg)

if read_from == 'clipboard':
    # if sys.platform not in ('darwin', 'win32'):
    #     msg = 'Reading images from clipboard works only on macOS and Windows. ' \
    #           'On Linux, run "manga_ocr /path/to/screenshot/folder" to read images from a folder instead.'
    #     raise NotImplementedError(msg)
    from PIL import ImageGrab
    logger.info('Reading from clipboard')
and preserve support for Wayland while adding support for Linux systems running X.
This requires Pillow>=9.4.0, and pyperclip can probably be removed.
(see master...stephen-huan:manga-ocr:linux-clipboard for an implementation)
If neither wl-paste nor xclip is available, the NotImplementedError should propagate through the try/except.
Note that the implementation of grabclipboard() on Linux is currently a bit wonky, see python-pillow/Pillow#7147.
In particular, the current implementation will (a) generate a lot of unnecessary temporary files and (b) raise either ChildProcessError (from xclip reporting "Error: target image/png not available") or UnidentifiedImageError (when Pillow tries to parse non-image data, e.g. plaintext), whereas the other operating systems return None on invalid data.
It may be worthwhile to temporarily patch
try:
    img = ImageGrab.grabclipboard()
except OSError:
    logger.warning('Error while reading from clipboard')
to
try:
    img = ImageGrab.grabclipboard()
except (ChildProcessError, UnidentifiedImageError) as error:
    logger.trace(error)
except OSError:
    logger.warning('Error while reading from clipboard')
since the default level of loguru is debug, using trace for the parse errors that occur every delay_secs prevents the user from being spammed when running the default command manga_ocr. However, setting the LOGURU_LEVEL environment variable makes every log message visible for debugging:
LOGURU_LEVEL=TRACE manga_ocr
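The order of the except clauses in the proposed patch matters: ChildProcessError is a built-in subclass of OSError, and Pillow's UnidentifiedImageError derives from OSError as well, so the specific clause has to come first or it would never be reached. The sketch below shows that dispatch with stand-ins so it runs without Pillow or loguru; read_clipboard, grab and log are hypothetical names.

```python
class UnidentifiedImageError(OSError):
    """Stand-in for PIL.UnidentifiedImageError, which also derives from OSError."""


def read_clipboard(grab, log):
    """Call `grab()` and classify failures; `log(level, message)` records them."""
    try:
        return grab()
    except (ChildProcessError, UnidentifiedImageError) as error:
        # Expected when the clipboard holds no image: keep it at trace level.
        log("trace", str(error))
    except OSError:
        # Any other OS-level failure is still reported as a warning.
        log("warning", "Error while reading from clipboard")
    return None
```

In the real code, grab would be ImageGrab.grabclipboard and log would map onto logger.trace and logger.warning.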
It may also be worthwhile to temporarily maintain a patched implementation of grabclipboard() from Pillow, at least until the next quarterly release of Pillow comes out (on July 1st). One should also be sure to copy the license.
I ran python3 -m pip install manga-ocr, which flagged a numpy error, so I removed the system numpy and reinstalled it with pip into ~/.local/lib/site-packages. Then, when running Python, I got this error when creating a MangaOcr object. Any suggestions? The system is an aarch64 Linux Raspberry Pi 4.
pi@raspberrypi:/tmp $ python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from manga_ocr import MangaOcr
>>> mocr = MangaOcr()
2022-12-11 10:57:31.915 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2022-12-11 10:57:42.731 | INFO | manga_ocr.ocr:__init__:22 - Using CPU
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 27, in __init__
self(example_path)
File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 33, in __call__
img = Image.open(img_or_path)
File "/usr/lib/python3/dist-packages/PIL/Image.py", line 2904, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/home/pi/.local/lib/python3.9/site-packages/assets/example.jpg'
It would be nice if a nix package was created so it's easier to install on NixOS systems
File "C:\Program Files\Python312\Lib\subprocess.py", line 389, in call
with Popen(*popenargs, **kwargs) as p:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python312\Lib\subprocess.py", line 1026, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Program Files\Python312\Lib\subprocess.py", line 1538, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
How does ShareX call manga-ocr? Can you tell me more about it? Thank you.
I'm not a programmer and I don't understand the program, so I couldn't follow the demonstration in the video you posted. How should ShareX and manga-ocr be set up beforehand? Please tell me, thank you.
I tried to test on some of my images and was not able to execute MangaOcr(). I am getting "AttributeError: module 'torch.backends' has no attribute 'mps'". I tried to upgrade torch but am still facing the same issue.
This error came up when I ran python -m manga_ocr for the first time, after it downloaded the data.
Here's the full error:
```
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 11, in <module>
main()
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 7, in main
fire.Fire(run)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\run.py", line 64, in run
mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\ocr.py", line 15, in __init__
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\auto\tokenization_auto.py", line 514, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1773, in from_pretrained
return cls._from_pretrained(
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1908, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 151, in __init__
self.word_tokenizer = MecabTokenizer(
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 231, in __init__
import fugashi
File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fugashi\__init__.py", line 1, in <module>
from .fugashi import *
ImportError: DLL load failed while importing fugashi: The specified module could not be found.
```
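The traceback ends inside fugashi's compiled extension, so the failure can be reproduced in isolation without loading the OCR model. A small sketch of such a check follows; `try_import` is a hypothetical helper, and `unidic_lite` is included only because unidic-lite appears in the install log. A missing Microsoft Visual C++ runtime is one commonly reported cause of this kind of DLL error, but that is an assumption the log does not confirm.

```python
import importlib


def try_import(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:  # also covers "DLL load failed" on Windows
        return False, f"{type(exc).__name__}: {exc}"
    # Report where the module was loaded from, which helps spot a stray install.
    return True, getattr(module, "__file__", None) or "built-in"


for name in ("fugashi", "unidic_lite"):
    ok, detail = try_import(name)
    print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
```

If `fugashi` alone fails here, the problem is in that package's native library rather than in manga-ocr or transformers.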
Also, possibly unrelated, but when I first installed it with pip install manga-ocr it had the following error:
```
Installing collected packages: urllib3, pyparsing, idna, colorama, charset-normalizer, certifi, typing-extensions, tqdm, six, requests, regex, pyyaml, packaging, joblib, filelock, click, win32-setctime, tokenizers, termcolor, sacremoses, numpy, huggingface-hub, unidic-lite, transformers, torch, pyperclip, Pillow, loguru, jaconv, fugashi, fire, manga-ocr
WARNING: The script normalizer.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script tqdm.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Running setup.py install for termcolor ... done
WARNING: The script sacremoses.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script f2py.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script huggingface-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Running setup.py install for unidic-lite ... done
WARNING: The script transformers-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\Lubbs\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python38\\site-packages\\caffe2\\python\\serialized_test\\data\\operator_test\\piecewise_linear_transform_test.test_multi_predictions_params_from_arg.zip'
```
but I tried it again and it installed successfully.
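One possible explanation for the OSError, which the log does not confirm, is the classic 260-character Windows path limit: the Microsoft Store Python keeps site-packages deep under AppData, and the file that failed has a very long name. A rough sketch for checking a path against that limit; `exceeds_max_path` is a hypothetical helper, and the 260 figure is the traditional MAX_PATH value, which does not apply when long path support is enabled in Windows:

```python
# Classic Windows path limit (MAX_PATH), which counts the terminating NUL.
MAX_PATH = 260


def exceeds_max_path(path, limit=MAX_PATH):
    """True if the path cannot fit in a MAX_PATH buffer (length + NUL > limit)."""
    return len(path) + 1 > limit


# Path copied from the pip error above.
failed = (
    r"C:\Users\Lubbs\AppData\Local\Packages"
    r"\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0"
    r"\LocalCache\local-packages\Python38\site-packages"
    r"\caffe2\python\serialized_test\data\operator_test"
    r"\piecewise_linear_transform_test.test_multi_predictions_params_from_arg.zip"
)
print(len(failed), exceeds_max_path(failed))
```

If the check reports that the limit is exceeded, installing into a virtual environment in a shorter directory is one way to keep such paths under it.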