Comments (6)
The issue with fugashi is unrelated; it keeps happening randomly. I have an idea for how to remove this dependency, so I might try that in a future release.
from manga-ocr.
Are you sure the model is running on GPU? Can you post the logs?
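For anyone who wants to answer this without digging through logs, here is a minimal sketch that asks PyTorch directly whether CUDA is visible. The helper name `describe_torch_device` is made up for illustration, and it only reports what PyTorch can see in the current environment, not which device manga-ocr actually picked.

```python
import importlib.util


def describe_torch_device():
    """Return 'cuda', 'cpu', or 'torch not installed' for this environment.

    Hypothetical helper, not part of manga-ocr.
    """
    # find_spec checks for the package without importing it
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_torch_device())
```

Run it inside the same conda environment or virtualenv that runs manga-ocr, since the answer can differ between environments.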
The model has always been running on CPU; no CUDA is available. I'm not sure which logs you want, since I haven't seen any besides "loading model", "using cpu", and "model ready".
I just wanted to see whether it's using CUDA, but in that case there's no need for logs. I'm not sure what could cause this slowdown. One thing you could try is to ditch conda and set it up with virtualenv instead.
While I was playing around with conda environments, something else happened: I got "ModuleNotFoundError: You need to install fugashi to use MecabTokenizer" (see below), but fugashi is already installed! Uninstalling and reinstalling solves this issue, but the model is still slow, as I said above. It's possible that it's somehow the conda environment's fault, but that's weird because the environment is basically the same as before. Thank you anyway.
2023-09-07 14:57:24.654 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from \.cache\huggingface\hub\models--kha-white--manga-ocr-base\snapshots\aa6573bd10b0d446cbf622e29c3e084914df9741
\lib\site-packages\transformers\models\vit\feature_extraction_vit.py:28: FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.
  warnings.warn(
Exception in thread Thread-1:
Traceback (most recent call last):
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 458, in __init__
    import fugashi
  File "\lib\site-packages\fugashi\__init__.py", line 1, in <module>
    from .fugashi import *
ModuleNotFoundError: No module named 'fugashi.fugashi'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "myfile.py", line 104, in startup_mocr
    self.mocr = MangaOcr(pretrained_model_name_or_path=r'\huggingface\hub\models--kha-white--manga-ocr-base\snapshots\aa6573bd10b0d446cbf622e29c3e084914df9741')
  File "C\lib\site-packages\manga_ocr\ocr.py", line 15, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
  File "\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 727, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "\lib\site-packages\transformers\tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "\lib\site-packages\transformers\tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 211, in __init__
    self.word_tokenizer = MecabTokenizer(
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 460, in __init__
    raise error.__class__(
ModuleNotFoundError: You need to install fugashi to use MecabTokenizer. See https://pypi.org/project/fugashi/ for installation.
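The first exception in the traceback above, "No module named 'fugashi.fugashi'", suggests the fugashi package directory is present but its compiled submodule is not importable, which would explain why pip reports it as installed. A minimal sketch for telling a missing package apart from a half-installed one might look like this; `diagnose_package` is a hypothetical helper, not something from manga-ocr or transformers.

```python
import importlib
import importlib.util


def diagnose_package(package, submodule):
    """Classify an install as 'missing', 'broken: ...', or 'ok'.

    Hypothetical helper for illustration only.
    """
    # A package that is not installed at all has no import spec
    if importlib.util.find_spec(package) is None:
        return "missing"
    try:
        # Importing the submodule also runs the package's __init__
        importlib.import_module(f"{package}.{submodule}")
    except ImportError as exc:
        # The package directory exists but part of it fails to import
        return f"broken: {exc}"
    return "ok"


print(diagnose_package("fugashi", "fugashi"))
```

A "broken" result would match the behaviour described here, where uninstalling and reinstalling fugashi made the error go away.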
Installing every package with pip solved the issue, so I'll close it. If I ever figure out what happened, I'll update this.
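If someone wants to check whether an environment mixes pip-installed and conda-installed packages, the standard library can read the `INSTALLER` record that installers may leave in a distribution's metadata. This is a rough sketch: `installer_of` is a hypothetical helper, the package list is only an example, and I'm assuming the installer actually wrote that record (not every one does).

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the recorded installer name, 'unknown', or None if not installed.

    Hypothetical helper for illustration only.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER is an optional metadata file; read_text returns None if absent
    text = dist.read_text("INSTALLER")
    return text.strip() if text else "unknown"


# Example package list; adjust to whatever your environment uses
for name in ("manga-ocr", "transformers", "fugashi", "torch"):
    print(name, "->", installer_of(name))
```

A mix of different installer names in one environment would at least be consistent with the fix described above, though it does not prove that was the cause.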