SE_ASTER

Introduction

This is the implementation of the paper "SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition". The code is based on aster.pytorch; we sincerely thank ayumiymk for his awesome repo and help.

How to use

Env

PyTorch == 1.1.0
torchvision == 0.3.0
fasttext == 0.9.1

Details can be found in requirements.txt

Train

Prepare your data

Download the pretrained language model (bin) from here
Update the path in lib/tools/create_all_synth_lmdb.py
Run lib/tools/create_all_synth_lmdb.py
Note: this may take a large amount of storage space. To avoid that, you can modify datasets/dataset.py to generate the word embeddings online.
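The online option mentioned in the note could be sketched roughly as below. This is not the code in datasets/dataset.py; it is a minimal illustration that assumes the dataset looks up each label's embedding on the fly and caches it in memory instead of storing it in the LMDB. The names make_cached_embedder and embed_fn are hypothetical.

```python
from functools import lru_cache
from typing import Callable, Sequence, Tuple


def make_cached_embedder(embed_fn: Callable[[str], Sequence[float]],
                         max_size: int = 100_000):
    """Wrap embed_fn so that each distinct word is embedded only once.

    embed_fn is any callable mapping a word to a vector (hypothetical name).
    """

    @lru_cache(maxsize=max_size)
    def embed(word: str) -> Tuple[float, ...]:
        # Return a tuple so cached vectors cannot be mutated by callers.
        return tuple(float(x) for x in embed_fn(word))

    return embed
```

With fasttext installed, embed_fn could be the get_word_vector method of a model returned by fasttext.load_model, pointed at the .bin file you downloaded. The dataset's item getter would then call the wrapped function on each label rather than reading a stored vector.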

Run

Update the path in train.sh, then

sh train.sh

Test

Update the path in the test.sh, then

sh test.sh

Experiments

Evaluation on benchmarks

You can download the benchmark datasets from BaiduYun (key: nphk) shared by clovaai in this repo.

Checkpoint	IIIT5K	IC13-1015	IC13-857	IC15-1811	IC15-2077	SVT	SVTP	CUTE
OneDrive BaiduYun(key: x54e)	93.4	93.5	94.5	79.8	75.8	88.4	82.0	84.0

Evaluation with lexicons

Existing methods replace the predicted word with the nearest lexicon word under edit distance (ED). With the semantic information, we can choose the most semantically similar (SS) word among the candidates with the nearest edit distance.

Methods	IIIT5K-50	IIIT5K-1K	SVT-50	IC13	IC15
ED	99.06	97.87	96.36	97.44	87.76
ED + SS	99.27	97.93	96.45	97.64	88.07
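The ED + SS selection can be illustrated with a small sketch. It assumes that SS is used to break ties among the lexicon words that share the minimum edit distance, and that similarity is the cosine between the predicted semantic vector and each candidate's word embedding; both are assumptions, and pick_lexicon_word and its parameters are hypothetical names, not functions from this repo.

```python
import math
from typing import Dict, List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # substitute
        prev = cur
    return prev[-1]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def pick_lexicon_word(pred: str,
                      pred_vec: Sequence[float],
                      lexicon: List[str],
                      embeddings: Dict[str, Sequence[float]]) -> str:
    """ED first, then SS among the words tied at the minimum distance."""
    dists = {w: edit_distance(pred, w) for w in lexicon}
    best = min(dists.values())
    candidates = [w for w in lexicon if dists[w] == best]
    return max(candidates, key=lambda w: cosine(pred_vec, embeddings[w]))
```

For example, a prediction "bat" against the lexicon ["cat", "hat", "dog"] is one edit away from both "cat" and "hat"; plain ED would have to pick arbitrarily, while the semantic vector decides between the two here.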

About the word embedding

Here the word embedding from the pre-trained LM is used directly during training and inference.

IIIT5K	IC13	IC15-1811	IC15-2077	SVT	SVTP	CUTE
94.6	93.8	85.0	79.6	90.9	84.2	85.4

Exploration on global information

We try using Aggregation Cross-Entropy as the global information instead of the semantics. This part of the code will be released in the next few days.

IIIT5K	IC13	IC15-1811	IC15-2077	SVT	SVTP	CUTE
93.8	91.3	78.7	-	90.1	81.6	81.9
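For readers unfamiliar with Aggregation Cross-Entropy (ACE), the sketch below shows the generic loss in plain Python. It is not the unreleased code from this experiment and says nothing about how ACE is wired into the model here. It assumes per-step class probabilities with a blank class at index 0, and compares the normalized character counts of the label with the time-averaged predicted probabilities.

```python
import math
from collections import Counter
from typing import List, Sequence


def ace_loss(probs: Sequence[Sequence[float]],
             label: List[int],
             blank: int = 0,
             eps: float = 1e-10) -> float:
    """Generic ACE loss for one sequence.

    probs[t][k] is the predicted probability of class k at time step t.
    label is the list of target class indices (without blanks).
    """
    num_steps = len(probs)
    num_classes = len(probs[0])
    assert len(label) <= num_steps, "label must not be longer than the sequence"

    # Target counts per class; the blank absorbs the unused time steps.
    counts = Counter(label)
    counts[blank] = num_steps - len(label)

    loss = 0.0
    for k in range(num_classes):
        y_bar = sum(step[k] for step in probs) / num_steps  # aggregated prediction
        n_bar = counts.get(k, 0) / num_steps                # normalized target count
        loss -= n_bar * math.log(y_bar + eps)
    return loss
```

Because only counts are compared, the loss ignores character order, which is why it acts as a global signal rather than a per-position one.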

Citation

@inproceedings{qiao2020seed,
  title={{SEED}: Semantics enhanced encoder-decoder framework for scene text recognition},
  author={Qiao, Zhi and Zhou, Yu and Yang, Dongbao and Zhou, Yucan and Wang, Weiping},
  booktitle={CVPR},
  year={2020},
}
@article{shi2018aster,
  title={{ASTER}: An attentional scene text recognizer with flexible rectification},
  author={Shi, Baoguang and Yang, Mingkun and Wang, Xinggang and Lyu, Pengyuan and Yao, Cong and Bai, Xiang},
  journal={TPAMI},
  volume={41},
  number={9},
  pages={2035--2048},
  year={2018},
  publisher={IEEE}
}
