Giter Club home page Giter Club logo

act's Introduction

ACT for SR

This repository provide the code and model of our work:

Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution
Jinsu Yoo1, Taehoon Kim2, Sihaeng Lee2, Seung Hwan Kim2, Honglak Lee2, Tae Hyun Kim1
1Hanyang University, 2LG AI Research
WACV 2023

arXiv

Recent transformer-based super-resolution (SR) methods have achieved promising results against conventional CNN-based methods. However, these approaches suffer from essential shortsightedness created by only utilizing the standard self-attention-based reasoning. In this paper, we introduce an effective hybrid SR network to aggregate enriched features, including local features from CNNs and long-range multi-scale dependencies captured by transformers. Specifically, our network comprises transformer and convolutional branches, which synergetically complement each representation during the restoration procedure. Furthermore, we propose a cross-scale token attention module, allowing the transformer branch to exploit the informative relationships among tokens across different scales efficiently. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.

Table of Contents

Concept

Effective two branch architecture

act

Efficient cross-scale attention

csta

Fusion block to mix representations

fb

How to run

Installation

# Clone this repo
git clone https://github.com/jinsuyoo/act.git
cd act

# Create and activate conda environment
conda env create -f environments.yaml
conda activate act

Prepare dataset

For training, please download imagenet dataset. The dataset should be like:

act
|--- datasets
      |--- imagenet
            |--- train
                  |--- n0xxxxxxx_x.JPEG
                  |--- n0xxxxxxx_0x.JPEG
                  |--- n0xxxxxxx_00x.JPEG
                  |--- ...
            |--- val
                  |--- ILSVR2012_val_000xxxxx.JPEG
                  |--- ILSVR2012_val_000xxxxx.JPEG
                  |--- ILSVR2012_val_000xxxxx.JPEG
                  |--- ...

For test with conventional datasets, some benchmark SR datasets can be downloaded from this repo.

Please place the datasets under './datasets' directory.

Ouick test with pre-trained weights

Following commands will automatically download the pretrained weights. The test results will be saved under './experiments/test/[save_path]'.

python test.py --release
               --task sr 
               --scale [2|3|4]
               --data_test [Set5|Set14|B100|Urban100|Manga109]
               --save_path [PATH TO SAVE THE RESULTS]

# Example) test x2 SR on Set5 dataset with pretrained weight
python test.py --release --task sr --scale 2 --data_test Set5 --save_path act_x2_set5

# Example) test x3 SR on Set14 dataset with pretrained weight
python test.py --release --task sr --scale 3 --data_test Set14 --save_path act_x3_set14

# Example) test x4 SR on B100 dataset with pretrained weight
python test.py --release --task sr --scale 4 --data_test B100 --save_path act_x4_b100

Train model

python train.py --gpus [NUM GPUS]
                --task [sr] 
                --scale [2|3|4]
                --batch_size [BATCH_SIZE_PER_GPU]
                --data_train [ImageNet]
                --data_test [Set14]
                --save_path [PATH TO SAVE THE RESULTS]

# Example) ddp training of x2 SR with 8 gpus
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py --gpus 8 --task sr --scale 2 --batch_size 64 --data_test Set14 --save act_sr_x2 

Test model

python test.py --task sr 
               --scale [2|3|4]
               --data_test [Set5|Set14|B100|Urban100|Manga109]
               --ckpt_path [PATH TO YOUR CHECKPOINT]
               --save_path [PATH TO SAVE THE RESULTS]

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{yoo2023act,
  title={Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution},
  author={Yoo, Jinsu and Kim, Taehoon and Lee, Sihaeng and Kim, Seung Hwan and Lee, Honglak and Kim, Tae Hyun},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2023}
}

Acknowledgement

The codes are based on:

Thanks for open sourcing such a wonderful works!

act's People

Contributors

jinsuyoo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

act's Issues

About Head Module

Thank you for your great work, but the Head structure you used is mentioned in the IPT 、EDSR, and what is the difference between this head and just a 3*3 convolutional layer?

test problem

QQ图片20240311092459

I encountered this problem during testing and I don’t know how to solve it.

为什么测试结果和paper不一样

100%|█████████████████████████████████████████████| 5/5 [00:01<00:00, 2.91it/s]
[Set5 x4] PSNR: 30.190
100%|███████████████████████████████████████████| 14/14 [00:01<00:00, 8.81it/s]
[Set14 x4] PSNR: 27.732
100%|█████████████████████████████████████████| 100/100 [00:21<00:00, 4.62it/s]
[Urban100 x4] PSNR: 26.643

How to resume the training?

My experiment have saved two checkpoint files, one named last-v1.ckpt and one named last.ckpt. Now I am wondering about which one I should to choose for the resume train. Thanks a lot.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.