tpgsr's Introduction

Text Prior Guided Scene Text Image Super-Resolution (TIP 2023)

https://arxiv.org/abs/2106.15368

Jianqi Ma, Shi Guo, Lei Zhang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China

Recovering TextZoom samples

TPGSR visualization

Environment:

Python, PyTorch, CUDA, NumPy; MIT license

Other Python packages that may be needed: pyyaml, cv2, Pillow and imgaug

Main idea

Single stage with loss

Multi-stage version

Configure your training

Download the pretrained recognizer from:

Aster: https://github.com/ayumiymk/aster.pytorch  
MORAN:  https://github.com/Canjie-Luo/MORAN_v2  
CRNN: https://github.com/meijieru/crnn.pytorch

Unzip the code, change into 'TPGSR_ROOT/', and place the pretrained recognizer weights in 'TPGSR_ROOT/'.

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Train the corresponding model (e.g. TPGSR-TSRN):

chmod a+x train_TPGSR-TSRN.sh
./train_TPGSR-TSRN.sh
or
# --arch         the architecture
# --batch_size   the batch size
# --STN          use the STN net for alignment
# --mask         use the contour mask
# --use_distill  use the TP loss
# --gradient     use the Gradient Prior Loss
# --sr_share     share weights for the SR module
# --stu_iter     the number of iterations in the multi-stage version
# --vis_dir      the checkpoint directory
python3 main.py --arch="tsrn_tl_cascade" \
                --batch_size=48 \
                --STN \
                --mask \
                --use_distill \
                --gradient \
                --sr_share \
                --stu_iter=1 \
                --vis_dir='vis_TPGSR-TSRN'
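
To make the meaning of the flags concrete, here is a minimal sketch of how options like these could be declared with Python's argparse. It is an illustration only: `build_parser` is a hypothetical helper, the defaults are assumptions, and the repository's own main.py may define these options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags of the training command (not the repo's code)."""
    parser = argparse.ArgumentParser(description="TPGSR training options (illustrative)")
    parser.add_argument("--arch", default="tsrn_tl_cascade", help="architecture name")
    parser.add_argument("--batch_size", type=int, default=48, help="batch size")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--STN", action="store_true", help="use the STN net for alignment")
    parser.add_argument("--mask", action="store_true", help="use the contour mask")
    parser.add_argument("--use_distill", action="store_true", help="use the TP loss")
    parser.add_argument("--gradient", action="store_true", help="use the Gradient Prior Loss")
    parser.add_argument("--sr_share", action="store_true", help="share weights for the SR module")
    parser.add_argument("--stu_iter", type=int, default=1, help="number of stages")
    parser.add_argument("--vis_dir", default="vis_TPGSR-TSRN", help="checkpoint directory")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--arch=tsrn_tl_cascade", "--batch_size=48", "--STN", "--mask", "--sr_share", "--stu_iter=3"]
)
print(args.arch, args.batch_size, args.STN, args.use_distill, args.stu_iter)
```

With `store_true` switches, leaving a flag such as `--use_distill` off the command line silently disables that option, so it is worth printing the parsed values before a long training run.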

Run the shell script whose name starts with 'test' to test the corresponding model.

Add '--go_test' to the command in the shell file.

Cite this paper:

@article{ma2021text,
title={Text Prior Guided Scene Text Image Super-resolution},
author={Ma, Jianqi and Guo, Shi and Zhang, Lei},
journal={IEEE Transactions on Image Processing},
year={2023}
}

tpgsr's Issues

trained model

Hi,
Would you please publish your trained model (.pth file)?
Due to GPU constraints, I cannot train the model.
Thanks a lot.

Error when running the demo?

I am using a GPU for inference, but I get the error below and do not know why. Which model is running on the CPU?

loading pretrained crnn model from crnn.pth
0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/main.py", line 76, in
main(config, args, opt_TPG=opt)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/main.py", line 16, in main
Mission.demo()
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/interfaces/super_resolution.py", line 1480, in demo
images_sr = model(images_lr)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/model/tsrn.py", line 195, in forward
spatial_t_emb = self.infoGen(text_emb)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/model/tsrn.py", line 103, in forward
x = F.relu(self.bn1(self.tconv1(t_embedding)))
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 916, in forward
return F.conv_transpose2d(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument weight in method wrapper_slow_conv_transpose2d)
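
This error means that a layer's weights and its input live on different devices. In the traceback the failing call is the transposed convolution inside `self.infoGen`. A minimal sketch for locating tensors on the wrong device follows. `TinySR`, `info_gen` and `find_device_mismatches` are hypothetical stand-ins rather than code from this repository, and this is a diagnostic aid, not a confirmed fix.

```python
import torch
from torch import nn


def find_device_mismatches(model: nn.Module, device: torch.device) -> list:
    """Return the names of parameters and buffers whose device type differs from `device`."""
    wrong = []
    for name, tensor in list(model.named_parameters()) + list(model.named_buffers()):
        if tensor.device.type != device.type:
            wrong.append(name)
    return wrong


class TinySR(nn.Module):
    """Stand-in model (hypothetical) with a transposed-conv branch like the one in the traceback."""

    def __init__(self):
        super().__init__()
        self.body = nn.Conv2d(4, 8, kernel_size=3, padding=1)
        self.info_gen = nn.ConvTranspose2d(8, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.info_gen(self.body(x))


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinySR()
print("before:", find_device_mismatches(model, device))

model = model.to(device)                          # moves every registered submodule
images_lr = torch.rand(1, 4, 16, 64).to(device)   # inputs must be moved as well
images_sr = model(images_lr)
print("after:", find_device_mismatches(model, device), tuple(images_sr.shape))
```

`model.to(device)` only reaches modules registered as attributes or held in `nn.ModuleList`/`nn.ModuleDict`. Modules kept in a plain Python list, or tensors produced by a separate network such as the recognizer, have to be moved explicitly.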

TPGSR-3

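When training TPGSR-3 I found that the results were very poor. I did not modify the code; I only changed --stu_iter to 3. What configuration did you use when training TPGSR-3?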

Where is the final model?

Hi there,

I'd like to reproduce your amazing work but I can only find the pretrained models and not the final fine-tuned model. Am I correct?

Could you please upload the final model?

Thank you.

Error in the visualization code during training?

Thanks for your work. The paper is great. I read the paper and am training the model to understand it. I do not know after how many epochs the model performs best, so I want to visualize some images during training. But the code raises an error. Can you show me how to fix it?

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

Hi @mjq11302010044,

I was able to train the model successfully using the code in the repository. But when I run the Test.sh script, I get the following error:
"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_slow_conv_transpose2d)".

I have spent more than 2 days debugging it, but cannot get past this error. Can you please help me resolve the issue if you have a solution for it?

Regards,
Nakul

Issues about TSRN derived structures!

Hi, Ma, thanks for your nice work! I have run into some issues and am hoping for an early reply.

  1. There are several TSRN-derived structures mentioned in the code, like 'sem_tsrn', 'tsrn_c2f', 'tsrn_tl', 'tsrn_tl_cascade', 'tsrn_tl_wmask', etc. So far I have only reproduced the 'tsrn_tl_cascade' arch successfully. The 'sem_tsrn' arch should be the core arch, shouldn't it? Then why is there no 'sem_tsrn' in the 'args.arch' choices? I still failed to reproduce it when I added 'sem_tsrn' to the choices of args.arch and set args.arch='sem_tsrn'. I guess there may be something wrong in the released code.

  2. Can you explain the differences between these derived structures, like 'tsrn_c2f', 'tsrn_tl_cascade' and 'tsrn_tl_wmask', except for the 'data difference' between archs? Or could you please give some detailed instructions in the README.md? It is a bit hard to understand the purpose of these structures from reading the code.

Thx again!

Request to add a license

Hi Ma,
Great work on the paper and the implementation! I noticed that the repo does not have a license. I was wondering if you could add one so that I can understand the scope of use for the code.

Best,
Jeswin James

about arch

Amazing work! Hello, what is the difference between 'sem_tsrn', 'tsrn_c2f', 'tsrn_tl', 'tsrn_tl_cascade' and 'tsrn_tl_wmask'?
I want to reproduce your work. Which one should I select? Thanks!

Training time

Hello, I have read your paper and code. The paper states: "The batch size is set to 48 and the model is trained for 500 epochs with one NVIDIA RTX 2080Ti GPU". Roughly how long does it take to run 500 epochs?

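Shared SR and non-shared TP

For multi-stage training, the paper proposes using a shared SR and non-shared TP, but the code is written as non-shared SR and shared TP.
(two screenshots attached)
According to the training command you provide:
python3 -u main.py --arch="tsrn_tl_cascade" --batch_size=48 --STN --mask --use_distill --gradient --sr_share --stu_iter=3 --vis_dir='vis_TPGSR-TSRN'
--sr_share defaults to False, but is True during training.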
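
To illustrate what sharing versus not sharing a module across stages means in PyTorch, here is a minimal sketch. `build_stages` and `count_unique_params` are hypothetical helpers, and a single convolution stands in for an SR module; how the repository actually wires `--sr_share` is not shown here.

```python
import copy

from torch import nn


def build_stages(block: nn.Module, num_stages: int, share: bool) -> nn.ModuleList:
    """Return `num_stages` modules that either share one set of weights or each own a copy."""
    if share:
        # The same object is reused, so every stage updates the same weights.
        return nn.ModuleList([block for _ in range(num_stages)])
    # Independent deep copies: each stage learns its own weights.
    return nn.ModuleList([copy.deepcopy(block) for _ in range(num_stages)])


def count_unique_params(module: nn.Module) -> int:
    # parameters() skips duplicates, so shared stages are counted once.
    return sum(p.numel() for p in module.parameters())


sr_block = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for an SR module
shared = build_stages(sr_block, num_stages=3, share=True)
separate = build_stages(sr_block, num_stages=3, share=False)
print(count_unique_params(shared), count_unique_params(separate))
```

Comparing the unique parameter count of a model built with and without a sharing flag is a quick way to check which components are actually shared.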

RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels

Hello, thank you for your excellent work. I have trained a model and want to process several images with blurred text. The command I use is as follows:
python main.py --arch="tsrn_tl_cascade" --test_model="CRNN" --batch_size=4 --STN --mask --sr_share --gradient --demo --stu_iter=1 --vis_dir='default' --resume=ckpt/vis_TPGSR-TSRN/model_best_0.pth --demo_dir demo

The blurred images (four jpg images) are in the demo folder. Running the command gives this error:
RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels, but got 5 channels instead
Why?

loading pre-trained model from ckpt/vis_TPGSR-TSRN/model_best_0.pth
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\container.py", line 204, in forward
input = module(input)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels, but got 5 channels instead
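
The weight shape [64, 4, 9, 9] says the first convolution expects 4 input channels, while the input has 5. One possible cause (an assumption, not verified against the repository) is an image that was decoded with four channels, for example RGBA or CMYK, before a mask channel was appended. A minimal sketch of normalizing the channel count follows. `to_expected_channels` is a hypothetical helper, and its brightness-threshold mask is only a placeholder for the repository's own mask computation.

```python
import torch


def to_expected_channels(image: torch.Tensor, expected: int = 4) -> torch.Tensor:
    """Make a (C, H, W) image match an RGB or RGB+mask layout.

    Assumption: the first three channels are RGB and the fourth, if requested,
    is a binary mask. The mask here is a simple brightness threshold.
    """
    if image.dim() != 3:
        raise ValueError("expected a (C, H, W) tensor")
    channels = image.shape[0]
    if channels == 1:
        image = image.repeat(3, 1, 1)      # grayscale -> three identical channels
    elif channels < 3:
        raise ValueError("unsupported channel count: %d" % channels)
    rgb = image[:3]                        # drop alpha or any other extra channels
    if expected == 3:
        return rgb
    gray = rgb.mean(dim=0, keepdim=True)
    mask = (gray < gray.mean()).float()    # darker-than-average pixels marked as 1
    return torch.cat([rgb, mask], dim=0)


four_channel = torch.rand(4, 32, 256)      # e.g. an image loaded with an extra channel
fixed = to_expected_channels(four_channel)
print(tuple(four_channel.shape), "->", tuple(fixed.shape))
```

Printing the tensor shape right after loading each demo image shows whether the extra channel comes from the file itself or from a later preprocessing step.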

Add new TP Generator Model

Hi, how can I add a new TP generator and train it? Which lines do I have to change to swap the TP generator model, and what kind of changes do I have to make? Can I obtain label_vecs_final from other text recognition models and feed it to the model?

RuntimeError: Tensor for argument #1 'input' is on CPU, Tensor for argument #2 'output' is on CPU, but expected them to be on GPU (while checking arguments for slow_conv_transpose2d_out_cuda)

Hi, when I try to test your model with this command:
python main.py --arch="tsrn_tl_cascade" --test_model="CRNN" --test_data_dir=../TPGSR-main/dataset/TextZoom/test/hard --batch_size=48 --STN --mask --sr_share --gradient --test --stu_iter=1 --vis_dir='hard'
at this line (TPGSR-main\interfaces\super_resolution.py", line 1382, in test):
images_sr = model(images_lr)
I receive this error:
RuntimeError: Tensor for argument #1 'input' is on CPU, Tensor for argument #2 'output' is on CPU, but expected them to be on GPU (while checking arguments for slow_conv_transpose2d_out_cuda)
What should I do?
