ccsr's Introduction

Improving the Stability of Diffusion Models for Content Consistent Super-Resolution

Lingchen Sun1,2 | Rongyuan Wu1,2 | Zhengqiang Zhang1,2 | Hongwei Yong1 | Lei Zhang1,2

1The Hong Kong Polytechnic University, 2OPPO Research Institute

⏰ Update

  • 2024.1.17: Add Replicate demo.
  • 2024.1.16: Add Gradio demo.
  • 2024.1.14: Integrate tile_diffusion and tile_vae into inference_ccsr_tile.py to save GPU memory during inference.
  • 2024.1.10: Update CCSR Colab demo. ❤ Thanks to camenduru for the implementation!
  • 2024.1.4: Code and the model for real-world SR are released.
  • 2024.1.3: Paper is released.
  • 2023.12.23: Repo is released.

⭐ If CCSR is helpful to your images or projects, please help star this repo. Thanks! 🤗

🌟 Overview Framework

[Figure: overview of the CCSR framework]

😍 Visual Results

Demo on Real-World SR

Comparisons on Real-World SR

For diffusion model-based methods, the two restored images with the best and worst PSNR values over 10 runs are shown for a more comprehensive and fair comparison.

[Figure: visual comparisons on real-world SR]

Comparisons on Bicubic SR

[Figure: visual comparisons on bicubic SR]

For more comparisons, please refer to our paper.

📝 Quantitative comparisons

We propose two new stability metrics, global standard deviation (G-STD) and local standard deviation (L-STD), which measure the image-level and pixel-level variations, respectively, of the SR results of diffusion-based methods.

More details about G-STD and L-STD can be found in our paper.
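As a rough illustration of the two ideas (the exact definitions are in the paper and may differ), the sketch below assumes that G-STD is the standard deviation of an image-level IQA score over N restored results of the same input, and that L-STD is the per-pixel standard deviation across those results, averaged over all pixels. The function names `g_std` and `l_std` are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import pstdev


def g_std(scores):
    """Std of an image-level score (e.g. PSNR) over N runs on one input.

    Assumption: population standard deviation; the paper may use another form.
    """
    return pstdev(list(scores))


def l_std(images):
    """Mean over pixels of the per-pixel std across N same-sized images.

    `images` is a list of N grayscale images, each a list of rows of numbers.
    """
    height, width = len(images[0]), len(images[0][0])
    per_pixel = [
        pstdev([img[y][x] for img in images])
        for y in range(height)
        for x in range(width)
    ]
    return sum(per_pixel) / len(per_pixel)
```

A perfectly stable method would give 0 for both values, since every run would produce the same score and the same pixels.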

[Figure: quantitative comparison results]

⚙ Dependencies and Installation

# git clone this repository
git clone https://github.com/csslc/CCSR.git
cd CCSR

# create an environment with python >= 3.9
conda create -n ccsr python=3.9
conda activate ccsr
pip install -r requirements.txt
pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers

🍭 Quick Inference

Step 1: Download the pretrained models

  • Download the CCSR models from:
| Model Name | Description | GoogleDrive | BaiduNetdisk |
| --- | --- | --- | --- |
| real-world_ccsr.ckpt | CCSR model for real-world image restoration. | download | download (pwd: CCSR) |
| bicubic_ccsr.ckpt | CCSR model for bicubic image restoration. | download | download |

Step 2: Prepare testing data

Put the testing images in preset/test_datasets.

Step 3: Run the testing command

python inference_ccsr.py \
--input preset/test_datasets \
--config configs/model/ccsr_stage2.yaml \
--ckpt weights/real-world_ccsr.ckpt \
--steps 45 \
--sr_scale 4 \
--t_max 0.6667 \
--t_min 0.3333 \
--color_fix_type adain \
--output experiments/test \
--device cuda \
--repeat_times 1 

inference_ccsr_tile.py integrates tile_diffusion and tile_vae to reduce GPU memory usage during inference. Adjust the tile size and stride according to the VRAM of your device.

python inference_ccsr_tile.py \
--input preset/test_datasets \
--config configs/model/ccsr_stage2.yaml \
--ckpt weights/real-world_ccsr.ckpt \
--steps 45 \
--sr_scale 4 \
--t_max 0.6667 \
--t_min 0.3333 \
--tile_diffusion \
--tile_diffusion_size 512 \
--tile_diffusion_stride 256 \
--tile_vae \
--vae_decoder_tile_size 224 \
--vae_encoder_tile_size 1024 \
--color_fix_type adain \
--output experiments/test \
--device cuda \
--repeat_times 1
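To get a feel for how tile size and stride trade off, the sketch below counts overlapping tiles along each axis. It is a generic sliding-window layout written for illustration, and it is an assumption that inference_ccsr_tile.py places tiles in a comparable way; the function names are hypothetical, and whether the sizes refer to pixels or latent units should be checked in the script itself.

```python
def tile_starts(length, tile_size, stride):
    """Start offsets of overlapping tiles that cover [0, length) on one axis."""
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    if length <= tile_size:
        return [0]  # a single tile already covers the whole axis
    starts = list(range(0, length - tile_size, stride))
    starts.append(length - tile_size)  # align the last tile with the border
    return starts


def count_tiles(height, width, tile_size, stride):
    """Number of tiles needed for a height x width input."""
    rows = tile_starts(height, tile_size, stride)
    cols = tile_starts(width, tile_size, stride)
    return len(rows) * len(cols)
```

Under this layout, a smaller tile size lowers peak memory per tile but increases the number of tiles, and a smaller stride increases overlap and therefore run time.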

To test the stability of CCSR, set repeat_times to N to obtain N different SR results. The output folder should look like this:

 experiments/test
 ├── sample0   # the first group of SR results
 ├── sample1   # the second group of SR results
 │   ...
 └── sampleN   # the N-th group of SR results
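When comparing the repeated results, it can help to gather the outputs that belong to the same input image. The sketch below is a hypothetical helper, not part of the repository; it assumes each sample* folder contains the restored images directly and that an image keeps the same file name in every folder.

```python
from pathlib import Path


def group_by_image(output_dir):
    """Map each image file name to its paths across the sample* subfolders."""
    groups = {}
    # Note: folders are sorted lexicographically, so sample10 precedes sample2.
    for sample_dir in sorted(Path(output_dir).glob("sample*")):
        if not sample_dir.is_dir():
            continue
        for image_path in sorted(sample_dir.iterdir()):
            if image_path.is_file():
                groups.setdefault(image_path.name, []).append(image_path)
    return groups
```

For example, `group_by_image("experiments/test")` would return one list of paths per input image, which can then be fed to a per-image stability measurement.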

Gradio Demo

Download the model real-world_ccsr.ckpt and put it in weights/, then run the following command to launch the Gradio web interface.

python gradio_ccsr.py \
--ckpt weights/real-world_ccsr.ckpt \
--config configs/model/ccsr_stage2.yaml \
--device cuda

📏 Evaluation

  1. Calculate the Image Quality Assessment for each restored group.

    Fill in the required information in cal_iqa.py and run, then you can obtain the evaluation results in the folder like this:

     log_path
     ├── log_name_npy  # save the IQA values of each restored group as the npy files
     └── log_name.log   # log record
    
  2. Calculate the G-STD value for the diffusion-based SR method.

    Fill in the required information in iqa_G-STD.py and run, then you can obtain the mean IQA values of N restored groups and G-STD value.

  3. Calculate the L-STD value for the diffusion-based SR method.

    Fill in the required information in iqa_L-STD.py and run, then you can obtain the L-STD value.

🚋 Train

Step 1: Prepare training data

  1. Generate file list of training set and validation set.

    python scripts/make_file_list.py \
    --img_folder [hq_dir_path] \
    --val_size [validation_set_size] \
    --save_folder [save_dir_path] \
    --follow_links

    This script collects all image files in img_folder and automatically splits them into a training set and a validation set. It writes two file lists to save_folder; each line of a file list contains the absolute path of an image file:

    save_dir_path
    ├── train.list # training file list
    └── val.list   # validation file list
    
  2. Configure training set and validation set.

    For real-world image restoration, fill in the training-set and validation-set configuration files with appropriate values.
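Returning to the file-list step (item 1 above), the behaviour described there can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the code of scripts/make_file_list.py: the set of image extensions, the fixed random seed, and the function names are all hypothetical.

```python
import random
from pathlib import Path

# Assumption: which suffixes count as images; the real script may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def split_file_list(img_folder, val_size, seed=0):
    """Collect image files recursively and split them into (train, val)."""
    paths = sorted(
        str(p.resolve())
        for p in Path(img_folder).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    if val_size > len(paths):
        raise ValueError("val_size exceeds the number of images found")
    random.Random(seed).shuffle(paths)  # seeded for a reproducible split
    return paths[val_size:], paths[:val_size]


def write_list(paths, list_path):
    """Write one absolute image path per line."""
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
```

Calling `write_list` once for each returned list would produce files in the same one-path-per-line layout as train.list and val.list.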

Step 2: Train Stage1 Model

  1. Download pretrained Stable Diffusion v2.1 to provide generative capabilities.

    wget https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt --no-check-certificate
  2. Create the initial model weights.

    python scripts/make_stage2_init_weight.py \
    --cldm_config configs/model/ccsr_stage1.yaml \
    --sd_weight [sd_v2.1_ckpt_path] \
    --output weights/init_weight_ccsr.ckpt
  3. Configure training-related information.

    Fill in the stage1 training configuration file with appropriate settings.

  4. Start training.

    python train.py --config configs/train_ccsr_stage1.yaml

Step 3: Train Stage2 Model

  1. Configure training-related information.

    Fill in the stage2 training configuration file with appropriate settings.

  2. Start training.

     python train.py --config configs/train_ccsr_stage2.yaml

Citations

If our code helps your research or work, please consider citing our paper. The following are BibTeX references:

@article{sun2023ccsr,
  title={Improving the Stability of Diffusion Models for Content Consistent Super-Resolution},
  author={Sun, Lingchen and Wu, Rongyuan and Zhang, Zhengqiang and Yong, Hongwei and Zhang, Lei},
  journal={arXiv preprint arXiv:2401.00877},
  year={2024}
}

License

This project is released under the Apache 2.0 license.

Acknowledgement

This project is based on ControlNet, BasicSR and DiffBIR. Some code is borrowed from StableSR. Thanks for their awesome work.

Contact

If you have any questions, please contact: [email protected]

ccsr's Issues

Delete

delete. Wrong repo. Sorry!

library pytorch_lightning.utilities.distributed problem

Issue Description

Hi,

After creating the ccsr virtual environment and running python3 inference_ccsr.py, I encountered the following issue:

ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

Environment Information

  • pytorch-lightning Version: 2.1.3
  • torch Version: 2.0.1
  • Python Version: 3.9.18

Resolution

To resolve the issue, I made the following modification in the code:

In CCSR/ldm/models/diffusion/ddpm_ccsr_stage2.py and CCSR/ldm/models/diffusion/ddpm_ccsr_stage1.py, I changed:

from pytorch_lightning.utilities.distributed import rank_zero_only

to:

from pytorch_lightning.utilities.rank_zero import rank_zero_only

This modification allowed me to make it work.
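If the same code has to run against both older and newer pytorch-lightning releases, one option is to try the candidate module paths in order instead of hard-coding one. The helper below is a generic sketch using only the standard library; `import_first` is a hypothetical name, and the two module paths in the usage comment are the ones quoted in this issue.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first listed module that can provide it."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of {list(module_names)}")


# Usage in the affected files (requires pytorch-lightning to be installed):
# rank_zero_only = import_first(
#     "rank_zero_only",
#     [
#         "pytorch_lightning.utilities.rank_zero",
#         "pytorch_lightning.utilities.distributed",
#     ],
# )
```

A plain try/except around the two import statements achieves the same effect; the helper only makes the fallback order explicit.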

Steps to Reproduce

  1. Create the ccsr virtual environment.
  2. Run python3 inference_ccsr.py.

Colab run on T4

Hello, this is great work!
I can't run the inference code on Colab with a T4.
Could you help me, please?

Use case of CCSR compared to SeeSR

Please can you comment on when one would use CCSR vs SeeSR?

Both appear to have similar objectives? How does CCSR perform vs SeeSR?

Thanks

Xt_min -> X0

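Excellent work!

In the paper, the step Xt_min -> X0 is said to use the truncated output. What exactly does this process look like? It does not seem to be described in detail in the paper, and I am very curious about this part.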

training decoder during Stage 2

Thanks for the wonderful work! I noticed you only train 100 iterations for decoder, while 25k iterations for SD.
Is it enough to train decoder for only 100 iterations?
BTW, did you use combination of L1 loss, perceptual loss, and GAN loss to train decoder?

Thanks in advance!

Training with your own lr data

Hello, thanks a lot for your work!

I currently work on grayscale CT-images super-resolution. As I went through both steps of the training process using pre-trained stable diffusion the results turned out to be far from desired.
I have a dataset of paired lq and hq images that I want to use to train stage1 and stage2 models.
Is it possible to do this in the current design with minor additions to the code?
Will the use of a SwinIR model that was pretrained with my dataset, improve the quality of generation?

train model CUDA out of memory

Is there any way to train on a 24 GB RTX 3090, even with a batch size of one?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 3; 23.69 GiB total capacity; 23.03 GiB already allocated; 21.69 MiB free; 23.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Epoch 0: 0%| | 2/35135 [00:29<144:16:07, 14.78s/it, loss=0.389, v_num=0, train/loss_simple_step=0.131, train/loss_vlb_step=0.000475, train/loss_step=0.131, global_step=0.000, train/loss_x0_step=0.335, train/loss_x0_from_tao_step=0.366, train/loss_noise_from_tao_step=0.00291, train/loss_net_step=0.704]

Question About Figure.1

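Hello!
I saw the left plot of Figure 1 in the paper. The relationship between the timestep and PSNR/LPIPS is very interesting and an insightful finding.

However, the whole iterative process is complex. Over timesteps 1000 to 0, it goes from pure noise to the SR result. Each intermediate result actually contains two parts: image and noise. Over the whole process, the image quality gradually improves and the noise gradually decreases. We want to evaluate the change in image quality, but the noise interferes with this measurement.

When you evaluated the intermediate-step results, how did you reduce the influence of noise on the metrics?

When I measured it myself, both curves appeared to be monotonic. Could you provide more details?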

Does Colab work? Error occurred.

When I run the CCSR Colab demo linked in readme.md, the following error occurs.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xformers 0.0.22.post7 requires torch==2.1.0, but you have torch 2.2.1 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.2.1+cu121 requires torch==2.2.1, but you have torch 2.1.0 which is incompatible.
torchtext 0.17.1 requires torch==2.2.1, but you have torch 2.1.0 which is incompatible.
torchvision 0.17.1+cu121 requires torch==2.2.1, but you have torch 2.1.0 which is incompatible.

Am I missing something?

Issues with `smallkF` in xFormers: CUDA Support and Operator Build Errors

I am encountering multiple issues with smallkF in the xFormers library. The problems seem to arise from a combination of CUDA support, operator build, and embed per head size. Below are the specific error messages and my current setup:

`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 512

why use one-step sampling

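Great work, thanks to the authors for sharing!

I have a small question: during inference, is x_T pure noise, and what is the purpose of the step from x_T to x_tmax? Why not obtain x_tmax directly by adding the corresponding noise to the LR image, or denoise step by step from x_T down to x_tmin?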

Will this work on dual GPU?

Having VRAM issues with 24GB even when tiling by a lot. If I buy another GPU, can I use them both to fix this and if so, how do I enable it?

Or can this code be modified to use FP16 instead of FP32, if that will reduce memory usage?

I resolved the issue by executing the following commands:

pip uninstall pytorch-lightning torch torchvision torchmetrics

pip install torch==2.0.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html
pip install torchvision==0.15.2+cu118 -f https://download.pytorch.org/whl/torch_stable.html
pip install torchmetrics==0.6.0
pip install pytorch-lightning==1.4.2

pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu118

Originally posted by @Limbicnation in #10 (comment)

questions about metrics

Hi author, thanks for your team's contribution.

I would like to ask about calculating metrics during training. Training is usually interspersed with validation steps. Do you compute the evaluation metrics during the validation step? That seems time-consuming, so I'm wondering how you schedule evaluation during training.

ModuleNotFoundError: No module named 'utils.devices'

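Thanks for your work!
When I deployed the latest scripts (2024-1-15), I ran into the error in the title. I looked at the script structure and indeed could not find utils.devices.
I am not sure whether I missed something.
Thanks!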

Stage2 training: errors while loading ckpt from Stage1 training

Got the following error when training stage2
Missing key(s) in state_dict: "betas_inter", "alphas_cumprod_inter", "alphas_cumprod_prev_inter", "sqrt_alphas_cumprod_inter", "sqrt_one_minus_alphas_cumprod_inter", "log_one_minus_alphas_cumprod_inter", "sqrt_recip_alphas_cumprod_inter", "sqrt_recipm1_alphas_cumprod_inter", "posterior_variance_inter", "posterior_log_variance_clipped_inter", "posterior_mean_coef1_inter", "posterior_mean_coef2_inter", "decoder_loss.logvar", "decoder_loss.perceptual_loss.scaling_layer.shift", "decoder_loss.perceptual_loss.scaling_layer.scale", "decoder_loss.perceptual_loss.net.slice1.0.weight", "decoder_loss.perceptual_loss.net.slice1.0.bias", "decoder_loss.perceptual_loss.net.slice1.2.weight", "decoder_loss.perceptual_loss.net.slice1.2.bias...

After checking the difference between stage1 and stage2 training, it seems this error is expected, because these modules appear in ddpm_ccsr_stage2.py but not in ddpm_ccsr_stage1.py. I'm wondering whether I missed anything, or whether load_state_dict() for stage2 training should simply be called with strict=False.

Great work! I really appreciate you sharing the details!
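For context, PyTorch's `load_state_dict(..., strict=False)` skips keys that do not match and returns an object with `missing_keys` and `unexpected_keys`. Whether that is the intended fix for stage2 training is not confirmed here. The sketch below mimics that bookkeeping with plain Python sets so the mismatch can be inspected before loading; `diff_state_dict_keys` is a hypothetical helper and does not depend on torch.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, like a non-strict load reports.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected
```

If every missing key belongs to modules that exist only in stage2 (such as the `*_inter` buffers and `decoder_loss.*` entries listed in the error), a non-strict load would leave those at their freshly initialised values.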

Line at bottom

I have noticed a line at the bottom of the images.

I am unsure why this happens. I am using the default settings.

No module named Taming found

How can I fix this error?

File "importlib\__init__.py", line 126, in import_module
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "D:\ComfyUI1\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-CCSR\ldm\modules\losses\__init__.py", line 1, in <module>
from ....ldm.modules.losses.contperceptual import LPIPSWithDiscriminator
File "D:\ComfyUI1\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-CCSR\ldm\modules\losses\contperceptual.py", line 4, in <module>
from taming.modules.losses.vqperceptual import * # TODO: taming dependency yes/no?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'taming'
