Comments (6)

kha-white commented on June 1, 2024

The fugashi issue is unrelated; it keeps happening randomly. I have an idea for removing this dependency, so I might try that in a future release.

from manga-ocr.

kha-white commented on June 1, 2024

Are you sure the model is running on GPU? Can you post the logs?

blunderedbishop commented on June 1, 2024

The model has always been running on CPU; no CUDA is available. I'm not sure which logs you want, since I haven't seen any besides the loading model, using CPU, and model ready messages.
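
For anyone who wants to confirm which device is available before posting logs, here is a minimal sketch. It assumes PyTorch is the backend in use; the helper name `describe_device` is made up for this example and is not part of manga-ocr.

```python
import importlib.util


def describe_device():
    """Return 'cuda', 'cpu', or 'torch-missing' for the current environment."""
    # Hypothetical helper: check that torch is installed before importing it.
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    import torch

    # torch.cuda.is_available() is False on CPU-only builds or without a usable GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_device())
```

If this prints `cpu`, the slowdown is probably not a GPU fallback problem, which matches what was reported here.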

kha-white commented on June 1, 2024

I just wanted to see whether it's using CUDA, but in that case there's no need. I'm not sure what could cause this slowdown. One thing you could try is to ditch conda and set it up with virtualenv instead.
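
When switching between conda and virtualenv, it can help to confirm which kind of environment the interpreter is actually running in. The sketch below is only a heuristic: it assumes that a conda environment contains a `conda-meta` directory under `sys.prefix` and that a virtual environment has `sys.prefix` different from `sys.base_prefix`. The function name `environment_kind` is hypothetical.

```python
import os
import sys


def environment_kind():
    """Classify the running interpreter as 'venv', 'conda', or 'system'."""
    # venv and recent virtualenv releases point sys.prefix at the environment
    # while sys.base_prefix still points at the base interpreter.
    if sys.prefix != sys.base_prefix:
        return "venv"
    # Heuristic: conda environments keep package metadata in <prefix>/conda-meta.
    if os.path.isdir(os.path.join(sys.prefix, "conda-meta")):
        return "conda"
    return "system"


print(environment_kind(), sys.executable)
```

Printing `sys.executable` alongside the result shows which Python binary is in use, which is worth checking when several environments exist on the same machine.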

blunderedbishop commented on June 1, 2024

While playing around with conda environments, something else happened: ModuleNotFoundError: You need to install fugashi to use MecabTokenizer (see below), even though fugashi is already installed! Uninstalling and reinstalling it solves this issue, but the model is still slow, as I said above. It's possible that it's somehow the conda environment's fault, but that's weird because it's basically the same as before. Thank you anyway.

2023-09-07 14:57:24.654 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from \.cache\huggingface\hub\models--kha-white--manga-ocr-base\snapshots\aa6573bd10b0d446cbf622e29c3e084914df9741
\lib\site-packages\transformers\models\vit\feature_extraction_vit.py:28: FutureWarning: The class ViTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ViTImageProcessor instead.   
  warnings.warn(
Exception in thread Thread-1:
Traceback (most recent call last):
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 458, in __init__
    import fugashi
  File "\lib\site-packages\fugashi\__init__.py", line 1, in <module>
    from .fugashi import *
ModuleNotFoundError: No module named 'fugashi.fugashi'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "myfile.py", line 104, in startup_mocr
    self.mocr = MangaOcr(pretrained_model_name_or_path=r'\huggingface\hub\models--kha-white--manga-ocr-base\snapshots\aa6573bd10b0d446cbf622e29c3e084914df9741')
  File "C\lib\site-packages\manga_ocr\ocr.py", line 15, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
  File "\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 727, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "\lib\site-packages\transformers\tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "\lib\site-packages\transformers\tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 211, in __init__
    self.word_tokenizer = MecabTokenizer(
  File "\lib\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 460, in __init__
    raise error.__class__(
ModuleNotFoundError: You need to install fugashi to use MecabTokenizer. See https://pypi.org/project/fugashi/ for installation.
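
The first traceback above fails on `fugashi.fugashi`, not on `fugashi` itself, which suggests the package directory is present but its inner module could not be imported. A small sketch for telling the two cases apart follows; `diagnose_package` is a hypothetical helper written for this thread, and it only reports what Python's import system sees in the current environment.

```python
import importlib
import importlib.util


def diagnose_package(name, submodule):
    """Report whether `name` is findable and whether `name.submodule` imports."""
    # find_spec locates a top-level package without executing its __init__.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not installed in this environment"
    try:
        # Importing the submodule also runs the package's __init__.
        importlib.import_module(f"{name}.{submodule}")
    except ImportError as exc:
        return f"{name}: found at {spec.origin}, but {name}.{submodule} failed: {exc}"
    return f"{name}: OK ({spec.origin})"


print(diagnose_package("fugashi", "fugashi"))
```

A "found at ... but ... failed" result would be consistent with a partial or broken install, the kind of state that uninstalling and reinstalling the package can clear.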

blunderedbishop commented on June 1, 2024

Installing every package with pip solved the issue, so I'll close it. If I ever figure out what happened, I'll update this.
