
adapting-ocr's Introduction

Adapting-OCR

PyTorch implementation of our paper Adapting OCR with Limited Labels

Qualitative results of our base, self-trained, and hybrid models on English (left) and Hindi (right) datasets. Here ST+FT refers to the model trained with the proposed hybrid approach.

Dependency

  • This work was tested with PyTorch 1.2.0, CUDA 9.0, Python 3.6, and Ubuntu 16.04.
  • The requirements can be found in env.txt.
  • Also run pip install pytorch-pretrained-bert, as one of our kind contributors pointed out :)
  • To create the environment from the file: conda create -n pytorch1.4 --file env.txt
  • To activate the environment: source activate pytorch1.4

Training

  • Supervised training

python -m train --name exp1 --path path/to/data

  • Main arguments

    • --name: creates a directory where checkpoints will be stored
    • --path: path to dataset
    • --imgdir: dir name of dataset
  • Semi-supervised training

python -m train_semi_supervised --name exp1 --path path --source_dir src_dirname --target_dir tgt_dirname --schedule --noise --alpha=1

  • Main arguments
    • --name: creates a directory where checkpoints will be stored
    • --path: path to datasets
    • --source_dir: labeled data directory on which the OCR was trained
    • --target_dir: unlabeled data directory to which we want to adapt the OCR
    • --percent: percentage of unlabeled data to include in self-training
    • --schedule: will include STLR scheduler while training
    • --train_on_pred: will treat top-predictions as targets
    • --noise: will add gaussian noise to images while training
    • --alpha: set to 1 to include the mixup criterion
    • --combine_scoring: will also take into account the scores output by a language model

Note: --combine_scoring works only with line images, not word images.
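The --noise and --alpha options correspond to additive Gaussian input noise and the mixup criterion. A minimal sketch of both, assuming the standard mixup recipe (function names and the noise standard deviation are illustrative, not this repo's actual code):

```python
import numpy as np
import torch

def mixup_data(x, alpha=1.0):
    """Mix a batch of images with a shuffled copy of itself.

    Returns the mixed inputs, the shuffle permutation, and the mixing
    weight lam drawn from Beta(alpha, alpha). The loss is then combined
    as lam * loss(pred, y) + (1 - lam) * loss(pred, y[index]).
    """
    lam = float(np.random.beta(alpha, alpha)) if alpha > 0 else 1.0
    index = torch.randperm(x.size(0))
    mixed_x = lam * x + (1 - lam) * x[index]
    return mixed_x, index, lam

def add_gaussian_noise(x, std=0.05):
    """Additive Gaussian noise, as enabled by --noise (std is assumed)."""
    return x + torch.randn_like(x) * std
```

With --alpha=1 the mixing weight is sampled from Beta(1, 1), i.e. uniformly on [0, 1].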

  • Data
    • Use trdg to generate synthetic data. The script for data generation is included in scripts/generate_data.sh.
    • Download two different fonts and keep the data for each font in the source and target directories.
    • Use one of the fonts to train a model from scratch in a supervised manner.
    • Then fine-tune the trained model on the target data using semi-supervised learning.
    • A sample lexicon is provided in words.txt. Download a different lexicon as needed.

References

  • The OCR architecture is a CNN-LSTM model borrowed from here
  • The mixup criterion code is borrowed from here
  • STLR is borrowed from this paper
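The STLR schedule enabled by --schedule (slanted triangular learning rates, from ULMFiT) ramps the learning rate up linearly for a short fraction of training and then decays it linearly. A sketch with the paper's suggested defaults (these are assumptions, not necessarily this repo's settings):

```python
def stlr(t, max_lr=0.01, num_steps=1000, cut_frac=0.1, ratio=32):
    """Slanted triangular learning rate at step t.

    Linear warm-up for the first cut_frac of num_steps, then a long
    linear decay; ratio controls how far the minimum lr sits below max_lr.
    """
    cut = int(num_steps * cut_frac)
    if t < cut:
        p = t / cut                                   # warm-up phase
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # decay phase
    return max_lr * (1 + p * (ratio - 1)) / ratio
```

In PyTorch this shape can be plugged into training via torch.optim.lr_scheduler.LambdaLR.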


adapting-ocr's Issues

Missing “English.data.pkl” from data/train.

Hi. I am trying to train the model on my custom dataset to read numbers only from ROIs. The dataset is a folder of images named like “ocr_reading_img.jpg”. I am getting a missing-file error for “English.Data.pkl”. Could anyone tell me what this file is and where to get it? @Deepayan137

error while running the data generation script

I am a Windows user. I tried running the following command for data generation in the terminal

trdg -i words.txt -c 1000 --output_dir source_dir -ft fonts/Arimo-Regular.ttf

and got the following error

OSError: [Errno 22] Invalid argument: 'source_dir\\you?_379.jpg'
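The error comes from the ? in the word “you?”: trdg embeds each word in the generated image's filename, and Windows forbids ? in filenames. A possible workaround (an assumption, not something from the repo) is to filter such words out of words.txt before running trdg:

```python
# Characters Windows forbids in filenames; lexicon entries containing
# any of them would produce invalid trdg output paths on Windows.
INVALID = set('<>:"/\\|?*')

def filter_lexicon(words):
    """Drop lexicon entries that contain Windows-invalid filename characters."""
    return [w for w in words if not (set(w) & INVALID)]
```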


Missing dependency: pytorch_pretrained_bert

After setting everything up, training fails because pytorch_pretrained_bert is required but not listed in the dependencies.

How to train for RTL languages?

I'm trying to use the code from this repo to train OCR for Arabic. Additionally, the dataset contains handwritten text, which makes it even more complicated, but this is just for learning purposes.

I trained a model without any modifications to the workflow or the model architecture except for the dataset class. The model is giving me 0 WA (word accuracy) and CA (character accuracy). So I wanted to ask what changes are needed to make the model learn.

So far, I'm trying to reverse the targets, since Arabic is an RTL language unlike English, and to pad the images on the left instead of the right. I'd like to hear your thoughts on this. Thanks.
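The two modifications described here (reversing targets, left-padding images) could be sketched as follows; this is illustrative code, not taken from the repository:

```python
import numpy as np

def reverse_target(label: str) -> str:
    """Reverse an RTL label string so a left-to-right decoder
    reads it in visual order."""
    return label[::-1]

def pad_left(img: np.ndarray, target_w: int, pad_value: int = 255) -> np.ndarray:
    """Pad a (H, W) grayscale image on the left (instead of the right)
    up to target_w columns; crop if the image is already wider."""
    h, w = img.shape
    if w >= target_w:
        return img[:, :target_w]
    pad = np.full((h, target_w - w), pad_value, dtype=img.dtype)
    return np.concatenate([pad, img], axis=1)
```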

How to use this for Custom Dataset?

Hi,
I would like to use your work on my dataset. Please guide me on how to prepare my dataset so that I can use this algorithm on it.
I have a receipt dataset in which amounts and product names are handwritten. How should I prepare my data to extract this handwritten text? Your suggestions would help.

Thanks.
