
apisr's Introduction

APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)

APISR is an image & video upscaler that aims to restore and enhance low-quality, low-resolution anime images and video sources with various real-world degradations.

Arxiv | HF Demo | Open In Colab | HF Demo

🔥 Update | 👀 Visualization | 🔧 Installation | 🏰 Model Zoo | Inference | 🧩 Dataset Curation | 💻 Train

Update 🔥🔥🔥

  • Release the paper-version implementation of APISR
  • Release weights for different upscale factors (2x, 4x, and more)
  • Gradio demo (with online version)
  • Provide weights with a different architecture (DAT-Small)
  • Add results combined with Toon Crafter
  • Release weights trained with diffusion-generated images
  • Create a project page
  • Some online demos for Chinese users && a README in Chinese

If you like APISR, please help star this repo. Thanks! 🤗

Visualization (click the images for the best view!) 👀

Toon Crafter Examples Upscale

Please check toon_crafter_upscale

Installation 🔧

git clone git@github.com:Kiteretsu77/APISR.git
cd APISR

# Create conda env
conda create -n APISR python=3.10
conda activate APISR

# Install PyTorch and other required packages
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt


# To make absolutely sure that TensorBoard can run, I recommend the following commands (from https://github.com/pytorch/pytorch/issues/22676#issuecomment-534882021):
pip uninstall tb-nightly tensorboard tensorflow-estimator tensorflow-gpu tf-estimator-nightly
pip install tensorflow

# Install FFmpeg [only needed for the training and dataset curation stages; inference alone does not need ffmpeg] (the following is for Linux; Windows users can download ffmpeg from https://ffmpeg.org/download.html)
sudo apt install ffmpeg

Gradio Fast Inference ⚡⚡⚡

The Gradio option doesn't require users to prepare weights themselves, but it can only process one image at a time.

Online demo can be found at https://huggingface.co/spaces/HikariDawn/APISR (HuggingFace) or https://colab.research.google.com/github/camenduru/APISR-jupyter/blob/main/APISR_jupyter.ipynb (Colab)

Local Gradio can be created by running the following:

python app.py

Note: Gradio is designed for fast inference, so we automatically download existing weights and downsample inputs to 720P to ease VRAM consumption. For full-resolution inference, please check the regular inference section below.

Regular Inference ⚡⚡

  1. Download the model weights from the model zoo and put them in the "pretrained" folder.

  2. Then execute (a single image/video, or a directory mixing images & videos, are all fine!):

    python test_code/inference.py --input_dir XXX  --weight_path XXX  --store_dir XXX

    If the weights you downloaded are the paper weights, the default arguments of test_code/inference.py can process the sample images in the "assets" folder; a concrete example invocation follows.
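
    For example (the weight filename and output folder here are illustrative; use whatever you downloaded from the model zoo):

    python test_code/inference.py --input_dir assets --weight_path pretrained/4x_APISR_GRL_GAN_generator.pth --store_dir results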

Dataset Curation 🧩

Our dataset curation pipeline is under the dataset_curation_pipeline folder.

You can build your own dataset by feeding videos (mp4 or other formats) into the pipeline, which extracts the least-compressed and most informative frames from the video sources.

  1. Download the IC9600 weights (ck.pth) from https://drive.google.com/drive/folders/1N3FSS91e7FkJWUKqT96y_zcsG9CRuIJw and place them in the "pretrained/" folder (otherwise, pass a different --IC9600_pretrained_weight_path when executing collect.py below).

  2. With a folder of video sources, you can execute the following to get a basic dataset (with ffmpeg installed; an example invocation follows):

    python dataset_curation_pipeline/collect.py --video_folder_dir XXX --save_dir XXX
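
    For instance, with hypothetical folder names and the weight-path flag from step 1:

    python dataset_curation_pipeline/collect.py --video_folder_dir videos/ --save_dir datasets/raw --IC9600_pretrained_weight_path pretrained/ck.pth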
  3. Once you have an image dataset with various aspect ratios and resolutions, you can run the following script (a sketch of the related opt.py paths follows this step).

    Be careful to check uncropped_hr && degrade_hr_dataset_path && train_hr_dataset_path (we will use these paths in the opt.py settings during the training stage).

    To decrease memory utilization and increase training efficiency, we pre-process all time-consuming pseudo-GT (train_hr_dataset_path) at the dataset preparation stage.

    However, to create a natural input for prediction-oriented compression, in every epoch the degradation starts from the uncropped GT (uncropped_hr), and the synthetic LR images are stored concurrently. The cropped HR GT dataset (degrade_hr_dataset_path) and the cropped pseudo-GT (train_hr_dataset_path) are fixed at the dataset preparation stage and won't be modified during training.

    Be careful to check for OOM errors; if one occurs, the dataset preparation cannot complete correctly. Usually this is because num_workers in scripts/anime_strong_usm.py is too big!

    bash scripts/prepare_datasets.sh
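
    For orientation, here is a minimal sketch of how the three dataset paths might look in the opt.py settings (the folder values are hypothetical; the key names follow the description above, but verify them against your own opt.py):

    opt['uncropped_hr']            = 'datasets/uncropped_hr'   # uncropped GT, re-degraded every epoch
    opt['degrade_hr_dataset_path'] = 'datasets/degrade_hr'     # cropped GT, fixed after preparation
    opt['train_hr_dataset_path']   = 'datasets/train_hr'       # cropped pseudo-GT, fixed after preparation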

Train 💻

The whole training process can be done on one RTX 3090/4090!

  1. Prepare a dataset (AVC / API) that is preprocessed by steps 2 & 3 in Dataset Curation.

    In total, you will have 3 folders prepared before executing the following commands:

    --> uncropped_hr: uncropped GT

    --> degrade_hr_dataset_path: cropped GT

    --> train_hr_dataset_path: cropped Pseudo-GT

  2. Train: Please check opt.py carefully and set the hyperparameters you want (modifying the Frequently Changed Setting section is usually enough).

    Note 1: When you execute the following, we create a "tmp" folder to hold the generated LR images for sanity checking. You can modify the code to delete it if you want.

    Note 2: If you have a strong CPU and want to accelerate, you can increase parallel_num in opt.py.

    Step1 (Net L1 loss training): Run

    python train_code/train.py 

    The trained model weights will be inside the 'saved_models' folder (these serve as checkpoints).

    Step2 (GAN Adversarial Training):

    1. Change opt['architecture'] in opt.py to "GRLGAN" and change the batch size if you need. BTW, I don't think that 300K iterations of GAN training are needed for personal training; I did that to follow the same setting as AnimeSR and VQDSR, but 100K ~ 130K iterations should already give a decent visual result.

    2. Following previous works, GAN training should start from the L1-loss pre-trained network, so please pass a pretrained_path (the default path below should be fine):

    python train_code/train.py --pretrained_path saved_models/grl_best_generator.pth 

Citation

Please cite us if our work is useful for your research.

@inproceedings{wang2024apisr,
  title={APISR: Anime Production Inspired Real-World Anime Super-Resolution},
  author={Wang, Boyang and Yang, Fengyu and Yu, Xihang and Zhang, Chao and Zhao, Hanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={25574--25584},
  year={2024}
}

Disclaimer

This project is released for academic use only. We disclaim responsibility for the distribution of the model weight and sample images. Users are solely liable for their actions. The project contributors are not legally affiliated with, nor accountable for, users' behaviors.

License

This project is released under the GPL 3.0 license. Also, check the disclaimer.

📧 Contact

If you have any questions, please feel free to contact me at [email protected] or [email protected].

🧩 Projects that use APISR

If you develop/use APISR in your projects, please let me know; I will list them all here. Thanks!

🤗 Acknowledgement

  • VCISR: My code base is built on my previous paper (WACV 2024).
  • IC9600: The dataset curation pipeline uses the IC9600 code to score image complexity.
  • danbooru-pretrained: Our anime dataset (Danbooru) pretrained ResNet50 model.
  • Jupyter Demo: The Jupyter notebook demo is from camenduru.
  • AVIF&HEIF: The AVIF and HEIF degradation comes from pillow_heif.
  • DAT: The DAT architecture we use for 4x scaling in the model zoo comes from this link.


apisr's Issues

About data curation

Dear author:

Please tell me the correct way to set prepare_dataset.sh.

Here I set data/processed which contains the I-frames from the previous step.
However, I got:

scripts/prepare_datasets.sh: line 4: ./data/processed/: Is a directory

Moreover, could you tell me the torchvision version you used? The following message shows a version conflict:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.3.0+cu121 requires torch==2.3.0, but you have torch 2.3.1 which is incompatible.
Successfully installed torch-2.3.1 torchvision-0.18.1 triton-2.3.1

and

Traceback (most recent call last):
  File "/content/drive/MyDrive/GitClone/APISR/scripts/crop_images.py", line 16, in <module>
    from degradation.ESR.usm_sharp import USMSharp
  File "/content/drive/MyDrive/GitClone/APISR/degradation/ESR/usm_sharp.py", line 11, in <module>
    from degradation.ESR.utils import filter2D, np2tensor, tensor2np
  File "/content/drive/MyDrive/GitClone/APISR/degradation/ESR/utils.py", line 18, in <module>
    from degradation.ESR.degradations_functionality import *
  File "/content/drive/MyDrive/GitClone/APISR/degradation/ESR/degradations_functionality.py", line 10, in <module>
    from torchvision.transforms.functional_tensor import rgb_to_grayscale
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'
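
(For reference: newer torchvision releases removed the private torchvision.transforms.functional_tensor module. A common workaround, offered here as an assumption rather than an official APISR fix, is to import the public function instead:)

from torchvision.transforms.functional import rgb_to_grayscale  # public replacement for the removed import (assumption)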

Require a paired dataset

Could you provide a paired dataset (including both low-resolution and high-resolution images)?
A Google Drive link or a BaiduDisk link would both be fine.
You can also send it to [email protected].
Thanks a lot!

Seeking advice (请教)

In the preview interface, I feel the DAT model gives better results, but my relevant knowledge is limited, and I get the following error when running it locally:

D:\Anaconda\envs\apisr\lib\site-packages\timm\models\layers\__init__.py:49: DeprecationWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", DeprecationWarning)
Traceback (most recent call last):
  File "D:\Project\Local\APISR-0.3.0\test_code\inference.py", line 117, in <module>
    generator = load_grl(weight_path, scale=scale)  # GRL for Real-World SR only support 4x upscaling
  File "D:\Project\Local\APISR-0.3.0\test_code\test_utils.py", line 163, in load_grl
    generator.load_state_dict(weight)
  File "D:\Anaconda\envs\apisr\lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GRL:

From a quick look at the code, it seems only the GRL and RRDB models can be loaded? Some modification is probably needed; please help me out, thank you.

AssertionError

Hi, I'm running into this error: when I run train.py, there are multiple input and output folders in the tmp folder, but the assertion complains that only one entry can exist.

This is strange
Process Process-1:
Traceback (most recent call last):
  File "/root/miniconda3/envs/APISR/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/APISR/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/root/autodl-tmp/APISR/scripts/generate_lr_esr.py", line 100, in single_process
    obj_img.degradate_process(out, opt, store_path, process_id, verbose = False)
  File "/root/miniconda3/envs/APISR/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/APISR/degradation/degradation_esr.py", line 90, in degradate_process
    self.H264_instance.compress_and_store(np_frame, store_path, process_id)
  File "/root/autodl-tmp/APISR/degradation/video_compression/h264.py", line 52, in compress_and_store
    assert(len(os.listdir(temp_store_path)) == 1)
AssertionError
(The same traceback then repeats for Process-2 through Process-6, hitting the identical `assert(len(os.listdir(temp_store_path)) == 1)` in h264.py, h265.py, and mpeg2.py.)

about GPU

Hello, thank you very much for sharing.
Can a single RTX 4080 complete the training?

datasets

Can you send me the dataset you used, or provide a download link? Thanks a lot.

API Dataset release

Hello, I've had the pleasure of reading your work, and I must say I found it deeply inspiring. Your efforts are truly commendable!
Would you kindly consider releasing the API Dataset soon? I'm eagerly looking forward to it.
Thank you very much!

A Proposal for Inferencing High-Resolution Images with Limited GPU VRAM (Less Than 6GB)

We can split the high-resolution image into multiple fixed-size patches without overlap, run inference on each patch, and finally merge the upscaled patches to obtain the full high-resolution image. I have already implemented this, and it is indeed feasible for enabling low-VRAM GPUs, like an RTX 3060 Laptop with 6GB VRAM, to upscale 1080P images. Notably, it seems to have no apparent negative effect on the quality of the upscaled image.
The motivation comes from Vision Transformers and from your paper: in a Vision Transformer, the image is split into multiple patches for tokenization, and your paper actually trains on crops of the high-resolution image instead of the whole image.
Moreover, I suppose this approach can also work for accelerating inference with multiple GPUs.
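
Below is a minimal sketch of the patch-splitting idea described above (the patch size, the reflect padding, and the toy 4x model are assumptions for illustration; APISR's real generator would replace the stand-in):

import torch
import torch.nn.functional as F

def upscale_by_patches(img, model, patch=256, scale=4):
    # img: CHW float tensor; model: callable mapping (1,C,p,p) -> (1,C,p*scale,p*scale)
    c, h, w = img.shape
    pad_h, pad_w = (-h) % patch, (-w) % patch            # pad up to a multiple of the patch size
    x = F.pad(img.unsqueeze(0), (0, pad_w, 0, pad_h), mode="reflect")
    _, _, H, W = x.shape
    out = torch.zeros(1, c, H * scale, W * scale)
    for top in range(0, H, patch):
        for left in range(0, W, patch):
            tile = x[:, :, top:top + patch, left:left + patch]
            with torch.no_grad():
                sr = model(tile)                         # upscale one patch at a time
            out[:, :, top * scale:(top + patch) * scale,
                    left * scale:(left + patch) * scale] = sr
    return out[:, :, :h * scale, :w * scale].squeeze(0)  # crop the padding back off

# Stand-in "model" (nearest-neighbour 4x) just to show the flow end to end:
toy = lambda t: F.interpolate(t, scale_factor=4.0, mode="nearest")
print(upscale_by_patches(torch.rand(3, 720, 1280), toy).shape)  # torch.Size([3, 2880, 5120])

Since the patches don't overlap, seams are possible in principle; the author above reports none in practice, and an overlapping-tile variant with blending is a common refinement if they appear.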

About training (关于训练)

Thank you for sharing. What hardware did you train on, and how long did the whole training take?

Multi-GPU support?

The DAT model can be very heavy, even on a 3090, when a lot of images need to be upscaled. Is there any chance you could implement multi-GPU support so that a second card can be active?

I have no clue how to use torch multi-GPU myself.

Thanks.
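
For reference, a minimal sketch of one way to shard per-image inference across GPUs (this is not APISR's API; the stand-in model and placeholder image loading are assumptions to illustrate torch.multiprocessing):

import torch
import torch.multiprocessing as mp

def worker(rank, world, files):
    device = torch.device(f"cuda:{rank}")
    model = torch.nn.Upsample(scale_factor=4).to(device)   # stand-in for the real generator
    for path in files[rank::world]:                        # every world-th file goes to this GPU
        img = torch.rand(1, 3, 256, 256, device=device)    # placeholder for actually loading `path`
        with torch.no_grad():
            sr = model(img)                                # save `sr` for `path` as needed

if __name__ == "__main__":
    world = torch.cuda.device_count()
    mp.spawn(worker, args=(world, ["a.png", "b.png", "c.png"]), nprocs=world)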

4x quality not good

This upscaler changes the original image for the worse. The 4x result is on the right; lips, eyes, and flower details got worse. [comparison image]

Help Needed with Missing `ck.pth` File in APISR Project

Hi APISR Team,

I've been diving into APISR and it's awesome.

Here's a little snag I've hit, though. While running dataset_curation_pipeline/collect.py, I bumped into this:

FileNotFoundError: [Errno 2] No such file or directory: 'pretrained/ck.pth'

I can't find this checkpoint in the repo. Can you provide it?

The code seems to be looking for it right here:

class video_scoring:
    def __init__(self) -> None:

        # Init the model
        self.scorer = ICNet()
        self.scorer.load_state_dict(torch.load('pretrained/ck.pth',map_location=torch.device('cpu')))
        self.scorer.eval().cuda()

Thanks a ton for the help! Can't wait to get back to exploring APISR.

Cheers!

About GPU inference

Why is inference with this model slow on my RTX 4090? The model used is DAT, and inference takes about five seconds per picture. Checking GPU usage confirms that the GPU is indeed being called.
