Marginalia and Machine Learning

A PyTorch implementation of a Handwritten Text Recognition (HTR) system for the automatic detection and recognition of handwritten marginalia, i.e., text written in the margins or other handwritten notes. A Faster R-CNN network detects the marginalia, and AttentionHTR [1] recognizes the words. The data comes from early printed book collections with handwritten marginalia held at the Uppsala University Library.

This is a work in progress. For more details, refer to our paper on arXiv.

Dependencies

To set up a virtual environment and install the dependencies, run the following commands:

python3 -m venv marginalia-env
source marginalia-env/bin/activate
pip install --upgrade pip
python3 -m pip install -r Project-Marginalia/requirements.txt

Demo of our pre-trained model

Marginalia detection

  • Download the pre-trained model faster_r_cnn_weights.pt from here and place it in Project-Marginalia/model/.
  • Create the folder Project-Marginalia/model/data/test_images/ and place the test images in it.
  • Create the folder Project-Marginalia/model/results/.
  • To detect and visualize the marginalia, run python3 model/test.py
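The folder setup above can also be scripted. The following is a minimal sketch, assuming the repository has been cloned into a directory that is passed in as `repo_root`. The helper name `prepare_demo_layout` is hypothetical and not part of the repository. It creates the two folders and reports whether the weights file is already in place.

```python
from pathlib import Path


def prepare_demo_layout(repo_root):
    """Create the demo folders and report whether the weights file exists.

    `repo_root` is assumed to be the cloned Project-Marginalia directory.
    This helper is illustrative only and is not part of the repository.
    """
    model_dir = Path(repo_root) / "model"
    test_images = model_dir / "data" / "test_images"
    results = model_dir / "results"

    # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
    test_images.mkdir(parents=True, exist_ok=True)
    results.mkdir(parents=True, exist_ok=True)

    weights = model_dir / "faster_r_cnn_weights.pt"
    return {
        "test_images": test_images,
        "results": results,
        "weights": weights,
        "weights_present": weights.is_file(),
    }
```

Calling `prepare_demo_layout("Project-Marginalia")` from the parent directory of the clone would create both folders. If `weights_present` is false, the weights file still needs to be downloaded.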

Marginalia segmentation

  • To run the model on your own set of images, use the image_to_bboxes.py script. In the script, set the path to your model, the folder containing your dataset, and the location where the predicted marginalia should be saved. Then run 'python3 image_to_bboxes.py'.
  • To segment a set of marginalia into individual words, use the marginalia_to_words.py script. In the script, set the path to a folder containing images of predicted marginalia and the folder where the results should be saved. Then run 'python3 marginalia_to_words.py'.
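Turning detector output into saved marginalia crops usually means keeping only confident boxes and clamping them to the image bounds before cropping. The sketch below illustrates that step in plain Python. It is not taken from image_to_bboxes.py: the function name `select_crop_boxes`, the (x_min, y_min, x_max, y_max) box layout, and the 0.5 score threshold are all assumptions made for illustration.

```python
import math


def select_crop_boxes(boxes, scores, image_width, image_height, min_score=0.5):
    """Return integer crop boxes for detections scoring at least `min_score`.

    Each box is assumed to be (x_min, y_min, x_max, y_max) in pixels.
    Hypothetical helper; the threshold default is an illustrative choice.
    """
    crops = []
    for (x0, y0, x1, y1), score in zip(boxes, scores):
        if score < min_score:
            continue
        # Round outwards, then clamp so a crop never reaches outside the page.
        left = max(0, math.floor(x0))
        top = max(0, math.floor(y0))
        right = min(image_width, math.ceil(x1))
        bottom = min(image_height, math.ceil(y1))
        # Skip boxes that collapse to zero area after clamping.
        if right > left and bottom > top:
            crops.append((left, top, right, bottom))
    return crops
```

Each returned tuple could then be passed to an image library's crop call, for example Pillow's `Image.crop`, which accepts a (left, upper, right, lower) box.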

Word recognition using AttentionHTR

  • To recognise the words with AttentionHTR, follow the instructions from here.

Acknowledgements

  • This work has been partially supported by the Uppsala-Durham Strategic Development Fund: "Marginalia and Machine Learning: a Study of Durham University and Uppsala University Marginalia Collections".
  • The computations/data handling were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at Chalmers Centre for Computational Science and Engineering (C3SE) partially funded by the Swedish Research Council through grant agreement no. 2018-05973, project Dnr: SNIC 2022/22-1084.
  • The authors would like to thank the Centre for Digital Humanities Uppsala (CDHU) and Uppsala University Library (Alvin) for offering the dataset.
  • This project was carried out as part of the Data Science project course by Adam Axelsson, Liang Cheng, and Jonas Frankemölle, supervised by Ekta Vats.

References

[1]: Dmitrijs Kass and Ekta Vats. "AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks." International Workshop on Document Analysis Systems. Springer, Cham, 2022. Link Code

Contact

Ekta Vats

Adam Axelsson

Liang Cheng

Jonas Frankemölle
