
cycle-diffusion's Introduction

CycleDiffusion



Official PyTorch implementation of our paper:
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
Chen Henry Wu, Fernando De la Torre
Carnegie Mellon University
Preprint, Oct 2022

A modified version of this paper was accepted to ICCV 2023:
A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance
Chen Henry Wu, Fernando De la Torre
Carnegie Mellon University
ICCV 2023

[Paper link] | [ICCV version] | [Diffusers 🧨 implementation] | [HuggingFace 🤗 demo]

Updates

[Oct 13 2022] Code released. Section 4.3 of the earliest ArXiv version is open-sourced at Unified Generative Zoo.

[Nov 9 2022] CycleDiffusion is now available as a pipeline on HuggingFace 🤗 Diffusers 🧨. Please check the pipeline doc.

[Nov 10 2022] A demo built with HuggingFace 🤗 Spaces is available at Stable CycleDiffusion.

Overview

We think the randomness in diffusion models is like magic! Accumulated evidence has shown that fixing the "random seed" lets diffusion models generate images from two related image distributions with minimal differences. Our paper is about how to formalize this "random seed" and how to infer it from a given real image.
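As a quick, hedged illustration of this observation (the model ID, prompts, and seed below are just examples, not the paper's setup): reusing the same generator seed for two related prompts in a text-to-image model tends to preserve the overall layout of the result.

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
# Fix the "random seed" (the initial latent noise) and sample two related prompts;
# the two outputs tend to share composition and layout.
for prompt in ["a photo of a cat sitting on a bench", "a photo of a dog sitting on a bench"]:
    generator = torch.Generator(device="cuda").manual_seed(42)
    pipe(prompt, generator=generator).images[0].save(prompt.replace(" ", "_") + ".png")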

Our formalization and derivation are purely by definition, yet some amazing consequences follow! This repository contains code for CycleDiffusion, an embarrassingly simple method capable of:

  1. Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
  2. Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
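Conceptually, CycleDiffusion first infers the "random seed" (the initial noise plus the per-step Gaussian noises) that would make a source diffusion model reproduce the given image, and then replays that seed through a target model. The snippet below is a schematic Python sketch of this encode-then-decode idea, not the repository's implementation; every helper it calls (the forward samplers, reverse-process means, and noise schedule) is hypothetical and must be supplied by the two pretrained DPMs.

import torch

@torch.no_grad()
def cycle_diffusion_sketch(x0, q_sample_T, q_posterior_sample, mu_source, mu_target, sigma, T):
    # Encode: recover the latent code z = (x_T, eps_T, ..., eps_1) w.r.t. the source model.
    x_t = q_sample_T(x0)                          # x_T ~ q(x_T | x_0)
    x_T, eps = x_t, {}
    for t in range(T, 0, -1):
        x_prev = q_posterior_sample(x0, x_t, t)   # x_{t-1} ~ q(x_{t-1} | x_t, x_0)
        eps[t] = (x_prev - mu_source(x_t, t)) / sigma(t)
        x_t = x_prev
    # Decode: replay the same "random seed" through the target model.
    x_t = x_T
    for t in range(T, 0, -1):
        x_t = mu_target(x_t, t) + sigma(t) * eps[t]
    return x_t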

Check our results on zero-shot image-to-image translation below! We formulate the task input as a triplet $(\boldsymbol{x}, \boldsymbol{t}, \hat{\boldsymbol{t}})$:

  1. $\boldsymbol{x}$ is the source image, displayed with a purple margin.
  2. $\boldsymbol{t}$ is the source text, with text spans marked in purple.
  3. $\hat{\boldsymbol{t}}$ is the target text, with text spans abbreviated as $[\ldots]$ when they overlap with the source text.

We used Stable Diffusion in our experiments. Notably, all source images $\boldsymbol{x}$ are real images! You may find that some of them were generated by DALL·E 2, but these images can be treated as real for Stable Diffusion :)
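For zero-shot editing with Stable Diffusion, the Diffusers pipeline mentioned above maps this triplet onto its arguments: the source image $\boldsymbol{x}$ is image, the source text $\boldsymbol{t}$ is source_prompt, and the target text $\hat{\boldsymbol{t}}$ is prompt. Below is a minimal sketch; the image URL, prompts, and hyperparameters are placeholders, and you should check the pipeline doc for the exact arguments of your installed Diffusers version.

import torch, requests
from io import BytesIO
from PIL import Image
from diffusers import CycleDiffusionPipeline, DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")  # stochastic DDIM (eta > 0)
pipe = CycleDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16).to("cuda")

url = "https://example.com/source.png"  # placeholder source image x
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((512, 512))
image = pipe(
    prompt="an aerial view of autumn scene",          # target text
    source_prompt="an aerial view of summer scene",   # source text
    image=init_image,
    num_inference_steps=100, eta=0.1, strength=0.8,
    guidance_scale=2.0, source_guidance_scale=1.0,
).images[0]
image.save("translated.png")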


Here are some comparisons with baselines.


Dependencies

  1. Create an environment by running
conda env create -f environment.yml
conda activate generative_prompt
pip install git+https://github.com/openai/CLIP.git
  2. Install torch and torchvision based on your CUDA version.
  3. Install taming-transformers by running
cd ../
git clone [email protected]:CompVis/taming-transformers.git
cd taming-transformers/
pip install -e .
cd ../
  4. Set up wandb for logging (registration is required). You should modify the setup_wandb function in main.py to accommodate your wandb credentials. You may want to run something like
wandb login
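As a hedged sketch only (the actual setup_wandb in main.py may look different), the function essentially just needs to point W&B at your own entity and project:

import os

def setup_wandb():
    # Hypothetical sketch only; adapt it to the actual function in main.py.
    os.environ["WANDB_ENTITY"] = "your-username"     # assumption: your W&B entity
    os.environ["WANDB_PROJECT"] = "cycle-diffusion"  # assumption: your project name
    # You can also set WANDB_API_KEY here instead of running `wandb login`,
    # but keep the key out of version control.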

Evaluation data

  1. Most data for zero-shot image-to-image translation are already included in data/. Some images are from the AFHQ validation set, detailed below.
  2. Prepare the AFHQ validation set for unpaired image-to-image translation (also for some images used by zero-shot image-to-image translation) by running
git clone [email protected]:clovaai/stargan-v2.git
cd stargan-v2/
bash download.sh afhq-v2-dataset

Pre-trained diffusion models

  1. Stable Diffusion
cd ckpts/
mkdir stable_diffusion
cd stable_diffusion/
# Download pre-trained checkpoints for Stable Diffusion here.
# You should download this version: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original
# Due to license issues, we cannot share the pre-trained checkpoints directly.
  2. Latent Diffusion Model
cd ckpts/
wget https://www.dropbox.com/s/9lpdgs83l7tjk6c/ldm_models.zip
unzip ldm_models.zip
cd ldm_models/
mkdir text2img-large
cd text2img-large/
wget https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
wget https://www.dropbox.com/s/7pdttimz78ll0km/txt2img-1p4B-eval.yaml
  3. DDPM (AFHQ-Dog and FFHQ are from ILVR; CelebA-HQ is from SDEdit; AFHQ-Cat and -Wild are trained by ourselves)
cd ckpts/
mkdir ddpm
cd ddpm/
# Update Aug 4, 2023: it seems that the link below, originally from SDEdit, is broken. Please find other sources for CelebA-HQ (cf. issue #24)
wget https://image-editing-test-12345.s3-us-west-2.amazonaws.com/checkpoints/celeba_hq.ckpt
wget https://www.dropbox.com/s/g4h8sv07i3hj83d/ffhq_10m.pt
wget https://www.dropbox.com/s/u74w8vaw1f8lc4k/afhq_dog_4m.pt
wget https://www.dropbox.com/s/8i5aznjwdl3b5iq/cat_ema_0.9999_050000.pt
wget https://www.dropbox.com/s/tplximipy8zxaub/wild_ema_0.9999_050000.pt
wget https://www.dropbox.com/s/vqm6bxj0zslrjxv/configs.zip
unzip configs.zip

Usage

Zero-shot image-to-image translation with text-to-image diffusion models

  1. Zero-shot image-to-image translation with Stable Diffusion v1-4. We divided the 128 test samples into 8 groups (16 samples in each group), and the reported metrics are averaged over the groups.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1405 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_2
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1424 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=2
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_3
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1423 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=3
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_4
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1422 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=4
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_5
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1429 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_6
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1428 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=6
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_7
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1427 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=7
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_8
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1426 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
  2. Zero-shot image-to-image translation with the LDM text-to-image checkpoint. We divided the 128 test samples into 8 groups, and the reported metrics are averaged over the groups.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_latentdiff_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1465 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_text2img256_latentdiff_stochastic_2
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1485 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=2
export RUN_NAME=translate_text2img256_latentdiff_stochastic_3
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1486 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=3
export RUN_NAME=translate_text2img256_latentdiff_stochastic_4
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1487 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=4
export RUN_NAME=translate_text2img256_latentdiff_stochastic_5
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1488 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_text2img256_latentdiff_stochastic_6
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1489 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=6
export RUN_NAME=translate_text2img256_latentdiff_stochastic_7
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1411 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=7
export RUN_NAME=translate_text2img256_latentdiff_stochastic_8
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1412 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Customized use for zero-shot image-to-image translation

  1. Add your own image path and source-target text pairs at the end of this JSON file. You can add as many as you want.
  2. Suggested hyperparameters to tune in this config file:
  • decoder_unconditional_guidance_scales: a larger value puts more weight on the target text.
  • skip_steps: a larger value keeps the output more similar to the original image.
  • random seed: different random seeds generate different results.
  3. Note that every combination of decoder_unconditional_guidance_scales $\times$ skip_steps is enumerated, and the best one is returned (see the sketch after the command below).
  4. Run the following command to generate the images. Outputs are saved in the output folder.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_custom
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1426 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
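To make the enumeration in step 3 concrete, here is a schematic sketch of the grid search; the values and the energy function are placeholders, the real grid lives in the experiment .cfg file, and the real metric is the CLIPEnergy computed by the evaluator.

from itertools import product

scales = [1.0, 2.0, 5.0]  # placeholder decoder_unconditional_guidance_scales
skips = [10, 25, 40]      # placeholder skip_steps

def clip_energy(scale, skip):
    # Stand-in for the CLIPEnergy of the output generated with (scale, skip); lower is better.
    return abs(scale - 2.0) + abs(skip - 25) / 10.0

best_scale, best_skip = min(product(scales, skips), key=lambda cfg: clip_energy(*cfg))
print("best combination:", best_scale, best_skip)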

Unpaired image-to-image translation with diffusion models trained on two domains

  1. AFHQ-Cat to AFHQ-Dog with DDIM $\eta=0.1$
export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_afhqcat256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1446 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
  2. AFHQ-Wild to AFHQ-Dog with DDIM $\eta=0.1$
export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_afhqwild256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1498 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Citation

If you find this repository helpful, please cite it as

@article{cyclediffusion,
  title={Unifying Diffusion Models' Latent Space, with Applications to {CycleDiffusion} and Guidance},
  author={Chen Henry Wu and Fernando De la Torre},
  journal={arXiv preprint},
  year={2022},
}

or

@inproceedings{cyclediffusion,
  title={A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance},
  author={Chen Henry Wu and Fernando De la Torre},
  booktitle={ICCV},
  year={2023},
}

License

We use the X11 License. This license is identical to the MIT License, but with an extra sentence that prohibits using the copyright holders' names (Carnegie Mellon University in our case) for advertising or promotional purposes without written permission.

Contact

Issues are welcome if you have any questions about the code. If you would like to discuss the method, please contact Chen Henry Wu.

cycle-diffusion's Issues

ldm_models.zip: corrupt file

An error occurred while decompressing:

(ldm) home/InST-main/cycle-diffusion-main/ckpts$ unzip ldm_models.zip
Archive: ldm_models.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of ldm_models.zip or
ldm_models.zip.zip, and cannot find ldm_models.zip.ZIP, period.

Install environment issue

C:\Github_Code\GAN\cycle-diffusion>conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • pathlib2==2.3.6=py39h06a4308_2
  • cairo==1.16.0=hf32fb01_1
  • spyder-kernels==2.1.3=py39h06a4308_0
  • _ipyw_jlab_nb_ext_conf==0.1.0=py39h06a4308_0
  • yaml==0.2.5=h7b6447c_0
  • intel-openmp==2021.4.0=h06a4308_3561
  • locket==0.2.1=py39h06a4308_1
  • pywavelets==1.1.1=py39h6323ea4_4
  • openssl==1.1.1l=h7f8727e_0
  • conda-package-handling==1.7.3=py39h27cfd23_1
  • libev==4.33=h7f8727e_1
  • pluggy==0.13.1=py39h06a4308_0
  • ca-certificates==2021.10.26=h06a4308_2
  • importlib-metadata==4.8.1=py39h06a4308_0
  • libuv==1.40.0=h7b6447c_0
  • markupsafe==1.1.1=py39h27cfd23_0
  • statsmodels==0.12.2=py39h27cfd23_0
  • ld_impl_linux-64==2.35.1=h7274673_9
  • zope==1.0=py39h06a4308_1
  • xz==5.2.5=h7b6447c_0
  • pep8==1.7.1=py39h06a4308_0
  • mpfr==4.0.2=hb69a4c5_1
  • pyjwt==2.1.0=py39h06a4308_0
  • unixodbc==2.3.9=h7b6447c_0
  • fontconfig==2.13.1=h6c09931_0
  • widgetsnbextension==3.5.1=py39h06a4308_0
  • pytest==6.2.4=py39h06a4308_2
  • gstreamer==1.14.0=h28cd5cc_2
  • mistune==0.8.4=py39h27cfd23_1000
  • conda==4.10.3=py39h06a4308_0
  • scipy==1.7.1=py39h292c36d_2
  • sqlite==3.36.0=hc218d9a_0
  • navigator-updater==0.2.1=py39h06a4308_0
  • astroid==2.6.6=py39h06a4308_0
  • libstdcxx-ng==9.3.0=hd4cf53a_17
  • wurlitzer==2.1.1=py39h06a4308_0
  • pytables==3.6.1=py39h77479fe_1
  • py-lief==0.10.1=py39h2531618_1
  • libcurl==7.78.0=h0b77cf5_0
  • argon2-cffi==20.1.0=py39h27cfd23_1
  • cryptography==3.4.8=py39hd23ed53_0
  • typed-ast==1.4.3=py39h7f8727e_1
  • libxml2==2.9.12=h03d6c58_0
  • secretstorage==3.3.1=py39h06a4308_0
  • certifi==2021.10.8=py39h06a4308_0
  • zlib==1.2.11=h7b6447c_3
  • lerc==3.0=h295c915_0
  • chardet==4.0.0=py39h06a4308_1003
  • terminado==0.9.4=py39h06a4308_0
  • tbb4py==2021.4.0=py39hd09550d_0
  • setuptools==58.0.4=py39h06a4308_0
  • python==3.9.7=h12debd9_1
  • expat==2.4.1=h2531618_2
  • debugpy==1.4.1=py39h295c915_0
  • graphite2==1.3.14=h23475e2_0
  • kiwisolver==1.3.1=py39h2531618_0
  • boto==2.49.0=py39h06a4308_0
  • mpich==3.3.2=hc856adb_0
  • zfp==0.5.5=h2531618_6
  • krb5==1.19.2=hac12032_0
  • libtool==2.4.6=h7b6447c_1005
  • regex==2021.8.3=py39h7f8727e_0
  • jedi==0.18.0=py39h06a4308_1
  • sphinxcontrib==1.0=py39h06a4308_1
  • scikit-image==0.18.3=py39h51133e4_0
  • mkl_fft==1.3.1=py39hd3c417c_0
  • bzip2==1.0.8=h7b6447c_0
  • cycler==0.10.0=py39h06a4308_0
  • webencodings==0.5.1=py39h06a4308_1
  • arrow==0.13.1=py39h06a4308_0
  • ply==3.11=py39h06a4308_0
  • libarchive==3.4.2=h62408e4_0
  • bottleneck==1.3.2=py39hdd57654_1
  • mkl_random==1.2.2=py39h51133e4_0
  • msgpack-python==1.0.2=py39hff7bd54_1
  • wrapt==1.12.1=py39he8ac12f_1
  • giflib==5.2.1=h7b6447c_0
  • pyqt==5.9.2=py39h2531618_6
  • libpng==1.6.37=hbc83047_0
  • pyyaml==6.0=py39h7f8727e_1
  • zstd==1.4.9=haebb681_0
  • cffi==1.14.6=py39h400218f_0
  • libgfortran-ng==7.5.0=ha8ba4b0_17
  • pandas==1.3.4=py39h8c16a72_0
  • pyzmq==22.2.1=py39h295c915_1
  • charls==2.2.0=h2531618_0
  • libaec==1.0.4=he6710b0_1
  • pandocfilters==1.4.3=py39h06a4308_1
  • brotli==1.0.9=he6710b0_2
  • xlwt==1.3.0=py39h06a4308_0
  • zeromq==4.3.4=h2531618_0
  • anyio==2.2.0=py39h06a4308_1
  • libwebp==1.2.0=h89dd481_0
  • docutils==0.17.1=py39h06a4308_1
  • jxrlib==1.1=h7b6447c_2
  • libnghttp2==1.41.0=hf8bcb03_2
  • daal4py==2021.3.0=py39hae6d005_0
  • mypy_extensions==0.4.3=py39h06a4308_0
  • greenlet==1.1.1=py39h295c915_0
  • sniffio==1.2.0=py39h06a4308_1
  • matplotlib==3.4.3=py39h06a4308_0
  • pylint==2.9.6=py39h06a4308_1
  • ujson==4.0.2=py39h2531618_0
  • pyrsistent==0.18.0=py39heee7806_0
  • clyent==1.2.2=py39h06a4308_1
  • libssh2==1.9.0=h1ba5d50_1
  • tbb==2021.4.0=hd09550d_0
  • openjpeg==2.4.0=h3ad879b_0
  • mpmath==1.2.1=py39h06a4308_0
  • tornado==6.1=py39h27cfd23_0
  • gmpy2==2.0.8=py39h8083e48_3
  • watchdog==2.1.3=py39h06a4308_0
  • patchelf==0.13=h295c915_0
  • pycosat==0.6.3=py39h27cfd23_0
  • keyring==23.1.0=py39h06a4308_0
  • ncurses==6.3=heee7806_1
  • dbus==1.13.18=hb2f20db_0
  • jupyter==1.0.0=py39h06a4308_7
  • readline==8.1=h27cfd23_0
  • pip==21.2.4=py39h06a4308_0
  • spyder==5.1.5=py39h06a4308_1
  • numpy-base==1.20.3=py39h74d4b33_0
  • icu==58.2=he6710b0_3
  • ipykernel==6.4.1=py39h06a4308_1
  • psutil==5.8.0=py39h27cfd23_1
  • jupyter_server==1.4.1=py39h06a4308_0
  • sip==4.19.13=py39h2531618_0
  • libzopfli==1.0.3=he6710b0_0
  • qt==5.9.7=h5867ecd_1
  • libgfortran4==7.5.0=ha8ba4b0_17
  • blosc==1.21.0=h8c45485_0
  • pyerfa==2.0.0=py39h27cfd23_0
  • matplotlib-base==3.4.3=py39hbbc1b5f_0
  • cfitsio==3.470=hf0d0db6_6
  • llvmlite==0.37.0=py39h295c915_1
  • libffi==3.3=he6710b0_2
  • fastcache==1.1.0=py39he8ac12f_0
  • bitarray==2.3.0=py39h7f8727e_1
  • snappy==1.1.8=he6710b0_0
  • freetype==2.10.4=h5ab3b9f_0
  • jbig==2.1=hdba287a_0
  • libxslt==1.1.34=hc22bd24_0
  • libtiff==4.2.0=h85742a9_0
  • libxcb==1.14=h7b6447c_0
  • multipledispatch==0.6.0=py39h06a4308_0
  • brotlipy==0.7.0=py39h27cfd23_1003
  • glib==2.69.1=h5202010_0
  • unicodecsv==0.14.1=py39h06a4308_0
  • brunsli==0.1=h2531618_0
  • ipython==7.29.0=py39hb070fc8_0
  • tk==8.6.11=h1ccaba5_0
  • nbconvert==6.1.0=py39h06a4308_0
  • pycurl==7.44.1=py39h8f2d780_1
  • future==0.18.2=py39h06a4308_1
  • mkl==2021.4.0=h06a4308_640
  • lcms2==2.12=h3be6417_0
  • h5py==3.3.0=py39h930cdd6_0
  • inflection==0.5.1=py39h06a4308_0
  • get_terminal_size==1.0.0=haa9412d_0
  • imagecodecs==2021.8.26=py39h4cda21f_0
  • zope.interface==5.4.0=py39h7f8727e_0
  • numpy==1.20.3=py39hf144106_0
  • ruamel_yaml==0.15.100=py39h27cfd23_0
  • numba==0.54.1=py39h51133e4_0
  • curl==7.78.0=h1ccaba5_0
  • lxml==4.6.3=py39h9120a33_0
  • libdeflate==1.8=h7f8727e_5
  • libedit==3.1.20210910=h7f8727e_0
  • anaconda-client==1.9.0=py39h06a4308_0
  • harfbuzz==2.8.1=h6f93f22_0
  • path==16.0.0=py39h06a4308_0
  • cython==0.29.24=py39hdbfa776_0
  • lz4-c==1.9.3=h295c915_1
  • sympy==1.9=py39h06a4308_0
  • bkcharts==0.2=py39h06a4308_0
  • argh==0.26.2=py39h06a4308_0
  • mpi==1.0=mpich
  • numexpr==2.7.3=py39h22e1b3c_1
  • hdf5==1.10.6=hb1b8bf9_0
  • pillow==8.4.0=py39h5aabda8_0
  • scikit-learn==0.24.2=py39ha9443f7_0
  • cytoolz==0.11.0=py39h27cfd23_0
  • mkl-service==2.4.0=py39h7f8727e_0
  • pysocks==1.7.1=py39h06a4308_0
  • libgcc-ng==9.3.0=h5101ec6_17
  • distributed==2021.10.0=py39h06a4308_0
  • et_xmlfile==1.1.0=py39h06a4308_0
  • gmp==6.2.1=h2531618_2
  • gst-plugins-base==1.14.0=h8213a91_2
  • liblief==0.10.1=h2531618_1
  • notebook==6.4.5=py39h06a4308_0
  • libsodium==1.0.18=h7b6447c_0
  • dal==2021.3.0=h06a4308_557
  • libspatialindex==1.9.3=h2531618_0
  • libuuid==1.0.3=h7f8727e_2
  • zope.event==4.5.0=py39h06a4308_0
  • sqlalchemy==1.4.22=py39h7f8727e_0
  • fribidi==1.0.10=h7b6447c_0
  • pixman==0.40.0=h7f8727e_1
  • mpc==1.1.0=h10f8cd9_1
  • astropy==4.3.1=py39h09021b7_0
  • mccabe==0.6.1=py39h06a4308_1
  • entrypoints==0.3=py39h06a4308_0
  • lazy-object-proxy==1.6.0=py39h27cfd23_0
  • pyodbc==4.0.31=py39h295c915_0
  • bokeh==2.4.1=py39h06a4308_0
  • conda-build==3.21.5=py39h06a4308_0
  • pcre==8.45=h295c915_0
  • libgomp==9.3.0=h5101ec6_17
  • libllvm11==11.1.0=h3826bc1_0
  • gevent==21.8.0=py39h7f8727e_1
  • simplegeneric==0.8.1=py39h06a4308_2
  • scikit-learn-intelex==2021.3.0=py39h06a4308_0
  • jupyter_core==4.8.1=py39h06a4308_0
  • jpeg==9d=h7f8727e_0
  • pkginfo==1.7.1=py39h06a4308_0
  • patsy==0.5.2=py39h06a4308_0
  • rtree==0.9.7=py39h06a4308_1
  • libwebp-base==1.2.0=h27cfd23_0
  • _openmp_mutex==4.5=1_gnu
  • pango==1.45.3=hd140c19_0
  • c-ares==1.17.1=h27cfd23_0
  • lzo==2.10=h7b6447c_2
    C:\Github_Code\GAN\cycle-diffusion>

My Anaconda version is conda 23.7.4.
OS: Windows 11
GPU: RTX 2070

What version of PyTorch to use?

run

python -m torch.distributed.launch --nproc_per_node 1 --master_port 1498 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true

An error occurred:

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

This seems to be caused by the PyTorch version; could you tell me which version of PyTorch you are using?

complete logs:

833 256823.1875
834 262594.46875
835 269648.78125
836 278420.9375
837 289375.3125
838 301356.34375
839 319525.6875
840 336861.6875
841 366365.5
842 400465.03125
843 446937.90625
844 522279.5
845 637578.5
846 847171.375
847 1310254.0
848 2956609.0
at tensor([[[[0.7883]]]], device='cuda:0')
1it [00:50, 50.43s/it]
1it [00:00,  4.20it/s]
100%|██████████| 1/1 [00:00<00:00,  1.02it/s]
Traceback (most recent call last):
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/main.py", line 160, in <module>
    main()
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/main.py", line 128, in main
    metrics = trainer.evaluate(
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/trainer/trainer.py", line 1047, in evaluate
    metrics, num_samples = eval_loop(
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/trainer/trainer.py", line 872, in evaluation_loop
    metrics = self.compute_metrics(images,
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/evaluation/multi_task.py", line 65, in evaluate
    summary_tmp = evaluator.evaluate(**eval_kwargs, split=split)
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/evaluation/translate_to_dog.py", line 81, in evaluate
    kid_score = fid.compute_kid(
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/fid.py", line 356, in compute_kid
    feat_model = build_feature_extractor(mode, device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/features.py", line 42, in build_feature_extractor
    feat_model = feature_extractor(name="torchscript_inception", resize_inside=False, device=device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/features.py", line 21, in feature_extractor
    model = InceptionV3W(path, download=True, resize_inside=resize_inside).to(device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/inception_torchscript.py", line 35, in __init__
    self.base = torch.jit.load(path).eval()
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/jit/_serialization.py", line 162, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb:
wandb: Synced translate_afhqwild256_to_afhqdog256_ddim_eta0142: https://wandb.ai/xxcc/graphql/runs/13afq6e4
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230807_152834-13afq6e4/logs
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2859870) of binary: /ssd/xiedong/miniconda3/envs/generative_prompt/bin/python
Traceback (most recent call last):
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-08-07_15:29:47
  host      : gpu20
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2859870)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
  

train the unpaired image-to-image translation on one GPU

Thanks for sharing the great work!

How to train the unpaired image-to-image translation on one GPU?

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_afhqcat256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1446 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Is there any training in Cycle Diffusion

As we are training two diffusion models independently on two related domains, I've been wondering whether there are any specific training techniques involved in the CycleDiffusion process.
Is there any loss function or weight update in the CycleDiffusion process?

If such techniques exist, could you please direct me to the section of your paper where they are explained?
Thanks!

Which part of Algorithm 1 stands for "cycle"? Why do you call it CycleDiffusion?

Hi, thanks for your answer. I understand you use the pretrained diffusion model, but I'm still confused about why you call Algorithm 1 CycleDiffusion. Is it similar to the idea of CycleGAN? I can't find the cycle consistency inside. Do we need extra training to hold the cycle besides the pretrained model?
Thanks again!

Can I use unpaired images of different sizes for image-to-image translation?

I am interested in implementing super resolution using cycle diffusion on images.

My low-resolution images are 64x64, while the high-resolution ones are 512x512.

One solution could be to resize the low-resolution images to match the high-resolution ones, but doing so would increase the model parameters for the low-resolution images.

Therefore, I am considering using the original image size and wondering if that is possible.

Thanks!

Several researchers and practitioners have found that sampling with the same "random seed" leads to similar images (Nichol et al., 2022)

Dear @ChenWu98 ,

Given two stochastic DPMs G1 and G2 that model two distributions D1 and D2, several researchers and practitioners have found that sampling with the same "random seed" leads to similar images (Nichol et al., 2022).

For the above claim, would it be possible to point out the corresponding results in this paper? https://arxiv.org/pdf/2112.10741.pdf

It seems that all the compared models are trained on the same domain in the GLIDE paper.

How to write config.yaml for custom datasets

Thanks for sharing your work.

I want to train my own unpaired image-to-image tasks, and I found this tutorial. But I have a question about how to write a custom YAML like afhq.yaml:

data:
    dataset: "AFHQ"
    category: "dog"
    image_size: 256
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 0

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

Is it automatically generated or can I customize it?

Looking forward to your reply. Thanks.

Custom dataset load Issue

Hello. I am running CycleDiffusion on a custom dataset, but there seems to be a problem with specifying the data.
I modified "/data/translate-text.json" as you said; translate_text512.cfg in the "config" folder was also modified.

The code below is the modified "/config/tasks/translate_text512.cfg"


[raw_data]
data_program = ./raw_data/empty.py
data_cache_dir = ./data/dataset/
use_cache = True

[preprocess]
preprocess_program = translate_text512
expansion = 1

[evaluation]
evaluator_program = translate_text


Q1. It looks like I need to modify the data_program variable to suit me, is that correct?

However, the following error occurs due to incorrect modification.
Below is the text of the error that occurred

Q2. Please see the error below and let me know if you have a solution.


Rank 0 Trainer build successfully.
INFO:main:*** Evaluate ***
INFO:trainer.trainer:***** Running eval *****
INFO:trainer.trainer: Num examples = 0
INFO:trainer.trainer: Batch size = 1
0it [00:00, ?it/s] 0it [00:00, ?it/s]
Traceback (most recent call last):
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/main.py", line 153, in
main()
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/main.py", line 127, in main
metrics = trainer.evaluate(
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/trainer/trainer.py", line 1047, in evaluate
metrics, num_samples = eval_loop(
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/trainer/trainer.py", line 867, in evaluation_loop
images, weighted_loss, losses = all_prediction_outputs
TypeError: cannot unpack non-iterable NoneType object

Please explain how to train image-to-image translation with a custom dataset

I find that the training code seems incomplete. If we want to train on a custom dataset with your code, what should be done? I also can't find your cycle-consistency code; is it not complete yet? Thanks for sharing your code, but please explain further.

Thanks in advance

len(dataset_splits['dev'])=0

When I checked the data loading, I noticed len(dataset_splits['dev'])=0. It doesn't look like the data is loaded. Why is that?
Where else should I set the path to load the data?

Colab to reproduce Fig. 2

Hi @ChenWu98 ,

Thanks for sharing the awesome work.

Would it be possible to share a Jupyter notebook to reproduce the results in Fig. 2?

Time consumption is too long

Hi Chen, thanks for your great work.
I tried to run zero-shot image-to-image translation with Stable Diffusion v1-4 on a single A100 GPU, and it took 14.5 hours to finish the task. Is this expected? The running time seems too long.

export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1405 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Unpaired image-to-image translation

Hi~
I use translate_afhqwild256_to_afhqdog256_ddim_eta01.cfg to execute unpaired image-to-image translation with diffusion models trained on two domains. I have some problems:
1. I use sample_type = ddim, but I cannot understand what "custom_steps = 1000, refine_steps = 100, es_steps = 850" stand for.
2. Are the denoising steps used when training the two models in the source and target domains equal to custom_steps here?
3. What should I change if I want to use fewer diffusion steps in this code, as in DDIM? I can't find parameters similar to "--timestep_respacing ddim250" in this code.

Thanks a lot!

Questions with Plug-and-play Guidance

Hello, thanks for the great work.
I had some questions about plug-and-play guidance.

(1) In Section 3.4, the latent code is updated with Langevin dynamics, which includes a term for the score of the latent code z.
I am curious how to obtain the score for z.

(2) Also, it seems to me that plug-and-play guidance is exactly the same as classifier guidance if the energy term (in the CLIP or face recognition case) takes the target image plus some level of noise as input. Could you explain the difference between the two?

Thank you.

About outputs issue

Hi, I trained "translate cat-to-dog", which generated a folder called "translate_afhqcat256_to_afhqdog256_ddim_eta0142" in the output folder, but after training for a long time, no images were saved in it. When will the images be saved? Another question: what does "self.output_interval" in "trainer.py" control?

TypeError: cannot unpack non-iterable NoneType object

[Customized use for zero-shot image-to-image translation] I followed this step to reproduce.
An error occurred as follows:
Traceback (most recent call last):
File "main.py", line 150, in
main()
File "main.py", ine 124, in main
metrics = trainer.evaluate(
File "/home/zhangzhang/zz/cycle-diffusion-main/trainer/trainer.py", line 1048, in evaluate
metrics, num_samples = eval_loop(
File "/home/zhangzhang/zz/cycle-diffusion-main/trainer/trainer.py", line 867,in evaluation_loop
images, weighted_loss, losses = all_prediction_outputs
TypeError: cannot unpack non-iterable NoneType object

The printed return value of all_prediction_outputs is None.

Can you tell me how to solve it? Thank you very much!

Question about LDM support

Hello and thanks for sharing your code.

I want to use your implementation with LDM models. I have already trained two LDMs on two separate (but related) domains. How should I proceed to perform image translation between these two domains using my pretrained LDM models?

Thanks in advance.

torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 613558) of binary:

During the unpaired image-to-image translation (with diffusion models trained on two domains) task, I get the error below. How can I fix it?

torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 613558) of binary: /home/*****/anaconda3/envs/joohoon_cd/bin/python
Traceback (most recent call last):
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
main.py FAILED

Thanks
