
Comments (5)

lukas-blecher commented on May 17, 2024

Hi,
there is a misunderstanding. The image_resizer weights correspond to an entirely different network, which is also why the file sizes are significantly different.
If you place the image_resizer.pth file in the same directory as weights.pth, the script automatically initializes the classification model used for resizing. For the code, see here:

LaTeX-OCR/pix2tex.py

Lines 57 to 61 in fadb042

    if 'image_resizer.pth' in os.listdir(os.path.dirname(args.checkpoint)) and not arguments.no_resize:
        image_resizer = ResNetV2(layers=[2, 3, 3], num_classes=max(args.max_dimensions)//32, global_pool='avg', in_chans=1, drop_rate=.05,
                                 preact=True, stem_type='same', conv_layer=StdConv2dSame).to(args.device)
        image_resizer.load_state_dict(torch.load(os.path.join(os.path.dirname(args.checkpoint), 'image_resizer.pth'), map_location=args.device))
        image_resizer.eval()
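
To check beforehand whether the resizer weights are where the snippet above looks for them, something like the following could be used. This is only a minimal sketch and not part of pix2tex: the function name resizer_checkpoint_path is hypothetical, and unlike the snippet it resolves the checkpoint to an absolute path first, so a bare file name is looked up in the current working directory.

```python
import os


def resizer_checkpoint_path(checkpoint, name="image_resizer.pth"):
    """Return the path of `name` located next to `checkpoint`, or None.

    Hypothetical helper: it mirrors the directory lookup in the snippet
    above, but does not load or initialize any model.
    """
    # Resolve to an absolute path so a bare file name such as "weights.pth"
    # is looked up in the current working directory.
    directory = os.path.dirname(os.path.abspath(checkpoint))
    candidate = os.path.join(directory, name)
    return candidate if os.path.isfile(candidate) else None
```

If this returns None for the path passed as the checkpoint, the resizer file is not in the same directory as the main weights, and the script would skip the resizing model.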

For very large images the resizer might still not work perfectly because I only rendered images of a set resolution range. If you find that the resizer does not help very much, consider training your own using the train_resizer.py script.

from latex-ocr.

pvannyamelia commented on May 17, 2024

Thank you for the help


pvannyamelia commented on May 17, 2024

By the way, would you share the resolution range that the model has been trained on?


lukas-blecher commented on May 17, 2024

Sure. The images I trained on were at most 672x192 pixels (wxh), and I rendered them at random resolutions between 110 and 170 dpi.
Hope that answers your question.
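
For input that is larger than the 672x192 maximum mentioned above, one option is to scale it down before running the model. The following is only a rough sketch of the size arithmetic, not something pix2tex does itself: the function name fit_within_training_size is hypothetical, and it accounts only for pixel dimensions, not for the dpi the formula was rendered at.

```python
MAX_WIDTH, MAX_HEIGHT = 672, 192  # maximum training size (w x h) stated above


def fit_within_training_size(width, height,
                             max_width=MAX_WIDTH, max_height=MAX_HEIGHT):
    """Return (new_width, new_height) that fits inside the training bounds.

    Hypothetical helper: keeps the aspect ratio, never upscales, and
    returns images that are already small enough unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the more restrictive of the two ratios; cap at 1.0 to avoid upscaling.
    scale = min(1.0, max_width / width, max_height / height)
    new_width = min(max_width, max(1, round(width * scale)))
    new_height = min(max_height, max(1, round(height * scale)))
    return new_width, new_height
```

For example, a 1344x192 image would be mapped to 672x96, while a 600x100 image would be left as it is. The resulting size can then be passed to whatever image library is used for the actual resizing.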


pvannyamelia commented on May 17, 2024

thank you very much 😄
