
cycle-diffusion's Introduction

CycleDiffusion



Official PyTorch implementation of our paper:
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
Chen Henry Wu, Fernando De la Torre
Carnegie Mellon University
Preprint, Oct 2022

A modified version of this paper was accepted to ICCV 2023:
A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance
Chen Henry Wu, Fernando De la Torre
Carnegie Mellon University
ICCV 2023

[Paper link] | [ICCV version] | [Diffusers 🧨 implementation] | [HuggingFace 🤗 demo]

Updates

[Oct 13 2022] Code released. Section 4.3 of the earliest ArXiv version is open-sourced at Unified Generative Zoo.

[Nov 9 2022] CycleDiffusion is now available as a pipeline on HuggingFace 🤗 Diffusers 🧨. Please check the pipeline doc.

[Nov 10 2022] A demo built with HuggingFace 🤗 Spaces is available at Stable CycleDiffusion.

Overview

We think the randomness in diffusion models is like magic! Accumulated evidence has shown that fixing the "random seed" lets diffusion models generate images from two related image distributions with minimal differences. Our paper is about how to formalize this "random seed" and how to infer it from a given real image.
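As a quick, hedged illustration of this observation (the model ID, prompts, and seed below are just examples, not the paper's setup): reusing the same generator seed for two related prompts in a text-to-image model tends to preserve the overall layout of the result.

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
# Fix the "random seed" (the initial latent noise) and sample two related prompts;
# the two outputs tend to share composition and layout.
for prompt in ["a photo of a cat sitting on a bench", "a photo of a dog sitting on a bench"]:
    generator = torch.Generator(device="cuda").manual_seed(42)
    pipe(prompt, generator=generator).images[0].save(prompt.replace(" ", "_") + ".png")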

Our formalization and derivation are purely by definition, yet some amazing consequences follow! This repository contains code for CycleDiffusion, an embarrassingly simple method capable of:

  1. Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
  2. Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
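Conceptually, CycleDiffusion first infers the "random seed" (the initial noise plus the per-step Gaussian noises) that would make a source diffusion model reproduce the given image, and then replays that seed through a target model. The snippet below is a schematic Python sketch of this encode-then-decode idea, not the repository's implementation; every helper it calls (the forward samplers, reverse-process means, and noise schedule) is hypothetical and must be supplied by the two pretrained DPMs.

import torch

@torch.no_grad()
def cycle_diffusion_sketch(x0, q_sample_T, q_posterior_sample, mu_source, mu_target, sigma, T):
    # Encode: recover the latent code z = (x_T, eps_T, ..., eps_1) w.r.t. the source model.
    x_t = q_sample_T(x0)                          # x_T ~ q(x_T | x_0)
    x_T, eps = x_t, {}
    for t in range(T, 0, -1):
        x_prev = q_posterior_sample(x0, x_t, t)   # x_{t-1} ~ q(x_{t-1} | x_t, x_0)
        eps[t] = (x_prev - mu_source(x_t, t)) / sigma(t)
        x_t = x_prev
    # Decode: replay the same "random seed" through the target model.
    x_t = x_T
    for t in range(T, 0, -1):
        x_t = mu_target(x_t, t) + sigma(t) * eps[t]
    return x_t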

Check our results on zero-shot image-to-image translation below! We formulate the task input as a triplet $(\boldsymbol{x}, \boldsymbol{t}, \hat{\boldsymbol{t}})$:

  1. $\boldsymbol{x}$ is the source image, displayed with a purple margin.
  2. $\boldsymbol{t}$ is the source text, with text spans marked in purple.
  3. $\hat{\boldsymbol{t}}$ is the target text, with text spans abbreviated as $[\ldots]$ when they overlap with the source text.

We used Stable Diffusion in our experiments. Notably, all source images $\boldsymbol{x}$ are real images! You may find that some of them were generated by DALL·E 2, but these images can be treated as real for Stable Diffusion :)
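For zero-shot editing with Stable Diffusion, the Diffusers pipeline mentioned above maps this triplet onto its arguments: the source image $\boldsymbol{x}$ is image, the source text $\boldsymbol{t}$ is source_prompt, and the target text $\hat{\boldsymbol{t}}$ is prompt. Below is a minimal sketch; the image URL, prompts, and hyperparameters are placeholders, and you should check the pipeline doc for the exact arguments of your installed Diffusers version.

import torch, requests
from io import BytesIO
from PIL import Image
from diffusers import CycleDiffusionPipeline, DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")  # stochastic DDIM (eta > 0)
pipe = CycleDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16).to("cuda")

url = "https://example.com/source.png"  # placeholder source image x
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((512, 512))
image = pipe(
    prompt="an aerial view of autumn scene",          # target text
    source_prompt="an aerial view of summer scene",   # source text
    image=init_image,
    num_inference_steps=100, eta=0.1, strength=0.8,
    guidance_scale=2.0, source_guidance_scale=1.0,
).images[0]
image.save("translated.png")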


Here are some comparisons with baselines.


Dependencies

  1. Create an environment by running
conda env create -f environment.yml
conda activate generative_prompt
pip install git+https://github.com/openai/CLIP.git
  2. Install torch and torchvision based on your CUDA version.
  3. Install taming-transformers by running
cd ../
git clone [email protected]:CompVis/taming-transformers.git
cd taming-transformers/
pip install -e .
cd ../
  4. Set up wandb for logging (registration is required). You should modify the setup_wandb function in main.py to accommodate your wandb credentials. You may want to run something like
wandb login
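As a hedged sketch only (the actual setup_wandb in main.py may look different), the function essentially just needs to point W&B at your own entity and project:

import os

def setup_wandb():
    # Hypothetical sketch only; adapt it to the actual function in main.py.
    os.environ["WANDB_ENTITY"] = "your-username"     # assumption: your W&B entity
    os.environ["WANDB_PROJECT"] = "cycle-diffusion"  # assumption: your project name
    # You can also set WANDB_API_KEY here instead of running `wandb login`,
    # but keep the key out of version control.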

Evaluation data

  1. Most data for zero-shot image-to-image translation are already included in data/. Some images are from the AFHQ validation set, detailed below.
  2. Prepare the AFHQ validation set for unpaired image-to-image translation (also for some images used by zero-shot image-to-image translation) by running
git clone [email protected]:clovaai/stargan-v2.git
cd stargan-v2/
bash download.sh afhq-v2-dataset

Pre-trained diffusion models

  1. Stable Diffusion
cd ckpts/
mkdir stable_diffusion
cd stable_diffusion/
# Download pre-trained checkpoints for Stable Diffusion here.
# You should download this version: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original
# Due to license issues, we cannot share the pre-trained checkpoints directly.
  2. Latent Diffusion Model
cd ckpts/
wget https://www.dropbox.com/s/9lpdgs83l7tjk6c/ldm_models.zip
unzip ldm_models.zip
cd ldm_models/
mkdir text2img-large
cd text2img-large/
wget https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
wget https://www.dropbox.com/s/7pdttimz78ll0km/txt2img-1p4B-eval.yaml
  3. DDPM (AFHQ-Dog and FFHQ are from ILVR; CelebA-HQ is from SDEdit; AFHQ-Cat and -Wild are trained by ourselves)
cd ckpts/
mkdir ddpm
cd ddpm/
# Update Aug 4, 2023: it seems that the link below, originally from SDEdit, is broken. Please find other sources for CelebA-HQ (cf. issue #24)
wget https://image-editing-test-12345.s3-us-west-2.amazonaws.com/checkpoints/celeba_hq.ckpt
wget https://www.dropbox.com/s/g4h8sv07i3hj83d/ffhq_10m.pt
wget https://www.dropbox.com/s/u74w8vaw1f8lc4k/afhq_dog_4m.pt
wget https://www.dropbox.com/s/8i5aznjwdl3b5iq/cat_ema_0.9999_050000.pt
wget https://www.dropbox.com/s/tplximipy8zxaub/wild_ema_0.9999_050000.pt
wget https://www.dropbox.com/s/vqm6bxj0zslrjxv/configs.zip
unzip configs.zip

Usage

Zero-shot image-to-image translation with text-to-image diffusion models

  1. Zero-shot image-to-image translation with Stable Diffusion v1-4. We divided the 128 test samples into 8 groups (16 samples in each group), and the reported metrics are averaged over the groups.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1405 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_2
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1424 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=2
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_3
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1423 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=3
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_4
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1422 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=4
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_5
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1429 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_6
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1428 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=6
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_7
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1427 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=7
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_8
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1426 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
  2. Zero-shot image-to-image translation with the LDM text-to-image checkpoint. We divided the 128 test samples into 8 groups, and the reported metrics are averaged over the groups.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_latentdiff_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1465 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_text2img256_latentdiff_stochastic_2
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1485 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=2
export RUN_NAME=translate_text2img256_latentdiff_stochastic_3
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1486 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=3
export RUN_NAME=translate_text2img256_latentdiff_stochastic_4
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1487 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=4
export RUN_NAME=translate_text2img256_latentdiff_stochastic_5
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1488 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_text2img256_latentdiff_stochastic_6
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1489 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=6
export RUN_NAME=translate_text2img256_latentdiff_stochastic_7
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1411 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

export CUDA_VISIBLE_DEVICES=7
export RUN_NAME=translate_text2img256_latentdiff_stochastic_8
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1412 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 16 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Customized use for zero-shot image-to-image translation

  1. Add your own image path and source-target text pairs at the end of this JSON file. You can add as many as you want.
  2. Suggested hyperparameters to tune in this config file:
  • decoder_unconditional_guidance_scales: a larger value puts more weight on the target text.
  • skip_steps: a larger value keeps the output more similar to the original image.
  • random seed: different random seeds generate different results.
  3. Note that every combination of decoder_unconditional_guidance_scales $\times$ skip_steps is enumerated, and the best one is returned (see the sketch after the command below).
  4. Run the following command to generate the images. Outputs are saved in the output folder.
export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_custom
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1426 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
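To make the enumeration in step 3 concrete, here is a schematic sketch of the grid search; the values and the energy function are placeholders, the real grid lives in the experiment .cfg file, and the real metric is the CLIPEnergy computed by the evaluator.

from itertools import product

scales = [1.0, 2.0, 5.0]  # placeholder decoder_unconditional_guidance_scales
skips = [10, 25, 40]      # placeholder skip_steps

def clip_energy(scale, skip):
    # Stand-in for the CLIPEnergy of the output generated with (scale, skip); lower is better.
    return abs(scale - 2.0) + abs(skip - 25) / 10.0

best_scale, best_skip = min(product(scales, skips), key=lambda cfg: clip_energy(*cfg))
print("best combination:", best_scale, best_skip)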

Unpaired image-to-image translation with diffusion models trained on two domains

  1. AFHQ-Cat to AFHQ-Dog with DDIM $\eta=0.1$
export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_afhqcat256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1446 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &
  2. AFHQ-Wild to AFHQ-Dog with DDIM $\eta=0.1$
export CUDA_VISIBLE_DEVICES=5
export RUN_NAME=translate_afhqwild256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1498 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Citation

If you find this repository helpful, please cite it as

@article{cyclediffusion,
  title={Unifying Diffusion Models' Latent Space, with Applications to {CycleDiffusion} and Guidance},
  author={Chen Henry Wu and Fernando De la Torre},
  journal={arXiv preprint},
  year={2022},
}

or

@inproceedings{cyclediffusion,
  title={A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance},
  author={Chen Henry Wu and Fernando De la Torre},
  booktitle={ICCV},
  year={2023},
}

License

We use the X11 License. This license is identical to the MIT License, but with an extra sentence that prohibits using the copyright holders' names (Carnegie Mellon University in our case) for advertising or promotional purposes without written permission.

Contact

Issues are welcome if you have any questions about the code. If you would like to discuss the method, please contact Chen Henry Wu.

cycle-diffusion's Issues

ldm_models.zip: corrupt file

An error occurred while decompressing:

(ldm) home/InST-main/cycle-diffusion-main/ckpts$ unzip ldm_models.zip
Archive: ldm_models.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of ldm_models.zip or
ldm_models.zip.zip, and cannot find ldm_models.zip.ZIP, period.

Install environment issue

C:\Github_Code\GAN\cycle-diffusion>conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • pathlib2==2.3.6=py39h06a4308_2
  • cairo==1.16.0=hf32fb01_1
  • spyder-kernels==2.1.3=py39h06a4308_0
  • _ipyw_jlab_nb_ext_conf==0.1.0=py39h06a4308_0
  • yaml==0.2.5=h7b6447c_0
  • intel-openmp==2021.4.0=h06a4308_3561
  • locket==0.2.1=py39h06a4308_1
  • pywavelets==1.1.1=py39h6323ea4_4
  • openssl==1.1.1l=h7f8727e_0
  • conda-package-handling==1.7.3=py39h27cfd23_1
  • libev==4.33=h7f8727e_1
  • pluggy==0.13.1=py39h06a4308_0
  • ca-certificates==2021.10.26=h06a4308_2
  • importlib-metadata==4.8.1=py39h06a4308_0
  • libuv==1.40.0=h7b6447c_0
  • markupsafe==1.1.1=py39h27cfd23_0
  • statsmodels==0.12.2=py39h27cfd23_0
  • ld_impl_linux-64==2.35.1=h7274673_9
  • zope==1.0=py39h06a4308_1
  • xz==5.2.5=h7b6447c_0
  • pep8==1.7.1=py39h06a4308_0
  • mpfr==4.0.2=hb69a4c5_1
  • pyjwt==2.1.0=py39h06a4308_0
  • unixodbc==2.3.9=h7b6447c_0
  • fontconfig==2.13.1=h6c09931_0
  • widgetsnbextension==3.5.1=py39h06a4308_0
  • pytest==6.2.4=py39h06a4308_2
  • gstreamer==1.14.0=h28cd5cc_2
  • mistune==0.8.4=py39h27cfd23_1000
  • conda==4.10.3=py39h06a4308_0
  • scipy==1.7.1=py39h292c36d_2
  • sqlite==3.36.0=hc218d9a_0
  • navigator-updater==0.2.1=py39h06a4308_0
  • astroid==2.6.6=py39h06a4308_0
  • libstdcxx-ng==9.3.0=hd4cf53a_17
  • wurlitzer==2.1.1=py39h06a4308_0
  • pytables==3.6.1=py39h77479fe_1
  • py-lief==0.10.1=py39h2531618_1
  • libcurl==7.78.0=h0b77cf5_0
  • argon2-cffi==20.1.0=py39h27cfd23_1
  • cryptography==3.4.8=py39hd23ed53_0
  • typed-ast==1.4.3=py39h7f8727e_1
  • libxml2==2.9.12=h03d6c58_0
  • secretstorage==3.3.1=py39h06a4308_0
  • certifi==2021.10.8=py39h06a4308_0
  • zlib==1.2.11=h7b6447c_3
  • lerc==3.0=h295c915_0
  • chardet==4.0.0=py39h06a4308_1003
  • terminado==0.9.4=py39h06a4308_0
  • tbb4py==2021.4.0=py39hd09550d_0
  • setuptools==58.0.4=py39h06a4308_0
  • python==3.9.7=h12debd9_1
  • expat==2.4.1=h2531618_2
  • debugpy==1.4.1=py39h295c915_0
  • graphite2==1.3.14=h23475e2_0
  • kiwisolver==1.3.1=py39h2531618_0
  • boto==2.49.0=py39h06a4308_0
  • mpich==3.3.2=hc856adb_0
  • zfp==0.5.5=h2531618_6
  • krb5==1.19.2=hac12032_0
  • libtool==2.4.6=h7b6447c_1005
  • regex==2021.8.3=py39h7f8727e_0
  • jedi==0.18.0=py39h06a4308_1
  • sphinxcontrib==1.0=py39h06a4308_1
  • scikit-image==0.18.3=py39h51133e4_0
  • mkl_fft==1.3.1=py39hd3c417c_0
  • bzip2==1.0.8=h7b6447c_0
  • cycler==0.10.0=py39h06a4308_0
  • webencodings==0.5.1=py39h06a4308_1
  • arrow==0.13.1=py39h06a4308_0
  • ply==3.11=py39h06a4308_0
  • libarchive==3.4.2=h62408e4_0
  • bottleneck==1.3.2=py39hdd57654_1
  • mkl_random==1.2.2=py39h51133e4_0
  • msgpack-python==1.0.2=py39hff7bd54_1
  • wrapt==1.12.1=py39he8ac12f_1
  • giflib==5.2.1=h7b6447c_0
  • pyqt==5.9.2=py39h2531618_6
  • libpng==1.6.37=hbc83047_0
  • pyyaml==6.0=py39h7f8727e_1
  • zstd==1.4.9=haebb681_0
  • cffi==1.14.6=py39h400218f_0
  • libgfortran-ng==7.5.0=ha8ba4b0_17
  • pandas==1.3.4=py39h8c16a72_0
  • pyzmq==22.2.1=py39h295c915_1
  • charls==2.2.0=h2531618_0
  • libaec==1.0.4=he6710b0_1
  • pandocfilters==1.4.3=py39h06a4308_1
  • brotli==1.0.9=he6710b0_2
  • xlwt==1.3.0=py39h06a4308_0
  • zeromq==4.3.4=h2531618_0
  • anyio==2.2.0=py39h06a4308_1
  • libwebp==1.2.0=h89dd481_0
  • docutils==0.17.1=py39h06a4308_1
  • jxrlib==1.1=h7b6447c_2
  • libnghttp2==1.41.0=hf8bcb03_2
  • daal4py==2021.3.0=py39hae6d005_0
  • mypy_extensions==0.4.3=py39h06a4308_0
  • greenlet==1.1.1=py39h295c915_0
  • sniffio==1.2.0=py39h06a4308_1
  • matplotlib==3.4.3=py39h06a4308_0
  • pylint==2.9.6=py39h06a4308_1
  • ujson==4.0.2=py39h2531618_0
  • pyrsistent==0.18.0=py39heee7806_0
  • clyent==1.2.2=py39h06a4308_1
  • libssh2==1.9.0=h1ba5d50_1
  • tbb==2021.4.0=hd09550d_0
  • openjpeg==2.4.0=h3ad879b_0
  • mpmath==1.2.1=py39h06a4308_0
  • tornado==6.1=py39h27cfd23_0
  • gmpy2==2.0.8=py39h8083e48_3
  • watchdog==2.1.3=py39h06a4308_0
  • patchelf==0.13=h295c915_0
  • pycosat==0.6.3=py39h27cfd23_0
  • keyring==23.1.0=py39h06a4308_0
  • ncurses==6.3=heee7806_1
  • dbus==1.13.18=hb2f20db_0
  • jupyter==1.0.0=py39h06a4308_7
  • readline==8.1=h27cfd23_0
  • pip==21.2.4=py39h06a4308_0
  • spyder==5.1.5=py39h06a4308_1
  • numpy-base==1.20.3=py39h74d4b33_0
  • icu==58.2=he6710b0_3
  • ipykernel==6.4.1=py39h06a4308_1
  • psutil==5.8.0=py39h27cfd23_1
  • jupyter_server==1.4.1=py39h06a4308_0
  • sip==4.19.13=py39h2531618_0
  • libzopfli==1.0.3=he6710b0_0
  • qt==5.9.7=h5867ecd_1
  • libgfortran4==7.5.0=ha8ba4b0_17
  • blosc==1.21.0=h8c45485_0
  • pyerfa==2.0.0=py39h27cfd23_0
  • matplotlib-base==3.4.3=py39hbbc1b5f_0
  • cfitsio==3.470=hf0d0db6_6
  • llvmlite==0.37.0=py39h295c915_1
  • libffi==3.3=he6710b0_2
  • fastcache==1.1.0=py39he8ac12f_0
  • bitarray==2.3.0=py39h7f8727e_1
  • snappy==1.1.8=he6710b0_0
  • freetype==2.10.4=h5ab3b9f_0
  • jbig==2.1=hdba287a_0
  • libxslt==1.1.34=hc22bd24_0
  • libtiff==4.2.0=h85742a9_0
  • libxcb==1.14=h7b6447c_0
  • multipledispatch==0.6.0=py39h06a4308_0
  • brotlipy==0.7.0=py39h27cfd23_1003
  • glib==2.69.1=h5202010_0
  • unicodecsv==0.14.1=py39h06a4308_0
  • brunsli==0.1=h2531618_0
  • ipython==7.29.0=py39hb070fc8_0
  • tk==8.6.11=h1ccaba5_0
  • nbconvert==6.1.0=py39h06a4308_0
  • pycurl==7.44.1=py39h8f2d780_1
  • future==0.18.2=py39h06a4308_1
  • mkl==2021.4.0=h06a4308_640
  • lcms2==2.12=h3be6417_0
  • h5py==3.3.0=py39h930cdd6_0
  • inflection==0.5.1=py39h06a4308_0
  • get_terminal_size==1.0.0=haa9412d_0
  • imagecodecs==2021.8.26=py39h4cda21f_0
  • zope.interface==5.4.0=py39h7f8727e_0
  • numpy==1.20.3=py39hf144106_0
  • ruamel_yaml==0.15.100=py39h27cfd23_0
  • numba==0.54.1=py39h51133e4_0
  • curl==7.78.0=h1ccaba5_0
  • lxml==4.6.3=py39h9120a33_0
  • libdeflate==1.8=h7f8727e_5
  • libedit==3.1.20210910=h7f8727e_0
  • anaconda-client==1.9.0=py39h06a4308_0
  • harfbuzz==2.8.1=h6f93f22_0
  • path==16.0.0=py39h06a4308_0
  • cython==0.29.24=py39hdbfa776_0
  • lz4-c==1.9.3=h295c915_1
  • sympy==1.9=py39h06a4308_0
  • bkcharts==0.2=py39h06a4308_0
  • argh==0.26.2=py39h06a4308_0
  • mpi==1.0=mpich
  • numexpr==2.7.3=py39h22e1b3c_1
  • hdf5==1.10.6=hb1b8bf9_0
  • pillow==8.4.0=py39h5aabda8_0
  • scikit-learn==0.24.2=py39ha9443f7_0
  • cytoolz==0.11.0=py39h27cfd23_0
  • mkl-service==2.4.0=py39h7f8727e_0
  • pysocks==1.7.1=py39h06a4308_0
  • libgcc-ng==9.3.0=h5101ec6_17
  • distributed==2021.10.0=py39h06a4308_0
  • et_xmlfile==1.1.0=py39h06a4308_0
  • gmp==6.2.1=h2531618_2
  • gst-plugins-base==1.14.0=h8213a91_2
  • liblief==0.10.1=h2531618_1
  • notebook==6.4.5=py39h06a4308_0
  • libsodium==1.0.18=h7b6447c_0
  • dal==2021.3.0=h06a4308_557
  • libspatialindex==1.9.3=h2531618_0
  • libuuid==1.0.3=h7f8727e_2
  • zope.event==4.5.0=py39h06a4308_0
  • sqlalchemy==1.4.22=py39h7f8727e_0
  • fribidi==1.0.10=h7b6447c_0
  • pixman==0.40.0=h7f8727e_1
  • mpc==1.1.0=h10f8cd9_1
  • astropy==4.3.1=py39h09021b7_0
  • mccabe==0.6.1=py39h06a4308_1
  • entrypoints==0.3=py39h06a4308_0
  • lazy-object-proxy==1.6.0=py39h27cfd23_0
  • pyodbc==4.0.31=py39h295c915_0
  • bokeh==2.4.1=py39h06a4308_0
  • conda-build==3.21.5=py39h06a4308_0
  • pcre==8.45=h295c915_0
  • libgomp==9.3.0=h5101ec6_17
  • libllvm11==11.1.0=h3826bc1_0
  • gevent==21.8.0=py39h7f8727e_1
  • simplegeneric==0.8.1=py39h06a4308_2
  • scikit-learn-intelex==2021.3.0=py39h06a4308_0
  • jupyter_core==4.8.1=py39h06a4308_0
  • jpeg==9d=h7f8727e_0
  • pkginfo==1.7.1=py39h06a4308_0
  • patsy==0.5.2=py39h06a4308_0
  • rtree==0.9.7=py39h06a4308_1
  • libwebp-base==1.2.0=h27cfd23_0
  • _openmp_mutex==4.5=1_gnu
  • pango==1.45.3=hd140c19_0
  • c-ares==1.17.1=h27cfd23_0
  • lzo==2.10=h7b6447c_2
    C:\Github_Code\GAN\cycle-diffusion>

My Anaconda version is conda 23.7.4.
OS: Windows 11
GPU: RTX 2070

What version of PyTorch to use?

run

python -m torch.distributed.launch --nproc_per_node 1 --master_port 1498 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true

An error occurred:

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

This seems to be caused by the PyTorch version; could you tell me which version of PyTorch you are using?

complete logs:

833 256823.1875
834 262594.46875
835 269648.78125
836 278420.9375
837 289375.3125
838 301356.34375
839 319525.6875
840 336861.6875
841 366365.5
842 400465.03125
843 446937.90625
844 522279.5
845 637578.5
846 847171.375
847 1310254.0
848 2956609.0
at tensor([[[[0.7883]]]], device='cuda:0')
1it [00:50, 50.43s/it]
1it [00:00,  4.20it/s]
100%|██████████| 1/1 [00:00<00:00,  1.02it/s]
Traceback (most recent call last):
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/main.py", line 160, in <module>
    main()
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/main.py", line 128, in main
    metrics = trainer.evaluate(
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/trainer/trainer.py", line 1047, in evaluate
    metrics, num_samples = eval_loop(
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/trainer/trainer.py", line 872, in evaluation_loop
    metrics = self.compute_metrics(images,
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/evaluation/multi_task.py", line 65, in evaluate
    summary_tmp = evaluator.evaluate(**eval_kwargs, split=split)
  File "/ssd/xiedong/home/InST-main/cycle-diffusion-main/evaluation/translate_to_dog.py", line 81, in evaluate
    kid_score = fid.compute_kid(
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/fid.py", line 356, in compute_kid
    feat_model = build_feature_extractor(mode, device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/features.py", line 42, in build_feature_extractor
    feat_model = feature_extractor(name="torchscript_inception", resize_inside=False, device=device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/features.py", line 21, in feature_extractor
    model = InceptionV3W(path, download=True, resize_inside=resize_inside).to(device)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/cleanfid/inception_torchscript.py", line 35, in __init__
    self.base = torch.jit.load(path).eval()
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/jit/_serialization.py", line 162, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb:
wandb: Synced translate_afhqwild256_to_afhqdog256_ddim_eta0142: https://wandb.ai/xxcc/graphql/runs/13afq6e4
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230807_152834-13afq6e4/logs
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2859870) of binary: /ssd/xiedong/miniconda3/envs/generative_prompt/bin/python
Traceback (most recent call last):
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/ssd/xiedong/miniconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-08-07_15:29:47
  host      : gpu20
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2859870)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
  

train the unpaired image-to-image translation on one GPU

Thanks for sharing the great work!

How to train the unpaired image-to-image translation on one GPU?

export CUDA_VISIBLE_DEVICES=1
export RUN_NAME=translate_afhqcat256_to_afhqdog256_ddim_eta01
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1446 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Is there any training in Cycle Diffusion

As we are training two diffusion models independently on two related domains, I've been wondering whether there are any specific training techniques involved in the CycleDiffusion process.
Is there any loss function or weight update in the CycleDiffusion process?

If such techniques exist, could you please direct me to the section of your paper where they are explained?
Thanks!

Which part of Algorithm 1 stands for "cycle"? Why do you call it CycleDiffusion?

Hi, thanks for your answer. I understand you use the pretrained diffusion model, but I'm still confused about why you call Algorithm 1 CycleDiffusion. Is it similar to the idea of CycleGAN? I can't find the cycle consistency inside. Do we need extra training to hold the cycle besides the pretrained model?
Thanks again!

Can I use unpaired images of different sizes for image-to-image translation?

I am interested in implementing super resolution using cycle diffusion on images.

My low-resolution images are 64x64, while the high-resolution ones are 512x512.

One solution could be to resize the low-resolution images to match the high-resolution ones, but doing so would increase the model parameters for the low-resolution images.

Therefore, I am considering using the original image size and wondering if that is possible.

Thanks!

Several researchers and practitioners have found that sampling with the same "random seed" leads to similar images (Nichol et al., 2022)

Dear @ChenWu98 ,

Given two stochastic DPMs G1 and G2 that model two distributions D1 and D2, several researchers and practitioners have found that sampling with the same "random seed" leads to similar images (Nichol et al., 2022).

For the above claim, would it be possible to point out the corresponding results in this paper? https://arxiv.org/pdf/2112.10741.pdf

It seems that all the compared models are trained on the same domain in the GLIDE paper.

How to write config.yaml for custom datasets

Thanks for sharing your work.

I want to train my own unpaired image-to-image tasks, and I found this tutorial. But I have a question about how to write a custom YAML like afhq.yaml:

data:
    dataset: "AFHQ"
    category: "dog"
    image_size: 256
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 0

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

Is it automatically generated or can I customize it?

Looking forward to your reply. Thanks.

Custom dataset load Issue

Hello. I am running CycleDiffusion on a custom dataset, but there seems to be a problem with specifying the data.
I modified "/data/translate-text.json" as you said; translate_text512.cfg in the "config" folder was also modified.

The code below is the modified "/config/tasks/translate_text512.cfg"


[raw_data]
data_program = ./raw_data/empty.py
data_cache_dir = ./data/dataset/
use_cache = True

[preprocess]
preprocess_program = translate_text512
expansion = 1

[evaluation]
evaluator_program = translate_text


Q1. It looks like I need to modify the data_program variable to suit me, is that correct?

However, the following error occurs due to incorrect modification.
Below is the text of the error that occurred

Q2. Please see the error below and let me know if you have a solution.


Rank 0 Trainer build successfully.
INFO:main:*** Evaluate ***
INFO:trainer.trainer:***** Running eval *****
INFO:trainer.trainer: Num examples = 0
INFO:trainer.trainer: Batch size = 1
0it [00:00, ?it/s] 0it [00:00, ?it/s]
Traceback (most recent call last):
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/main.py", line 153, in
main()
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/main.py", line 127, in main
metrics = trainer.evaluate(
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/trainer/trainer.py", line 1047, in evaluate
metrics, num_samples = eval_loop(
File "/mnt/hdd0-4tb/home/cyclediffusion/cycle-diffusion-main/trainer/trainer.py", line 867, in evaluation_loop
images, weighted_loss, losses = all_prediction_outputs
TypeError: cannot unpack non-iterable NoneType object

Please explain how to train image-to-image translation with a custom dataset

I find that the training code seems incomplete. If we want to train on a custom dataset with your code, what should be done? I also can't find your cycle-consistency code; is it not complete yet? Thanks for sharing your code, but please explain further.

Thanks in advance

len(dataset_splits['dev'])=0

When I checked the data loading, I noticed len(dataset_splits['dev'])=0. It doesn't look like the data is loaded. Why is that?
Where else should I set the path to load the data?

Colab to reproduce Fig. 2

Hi @ChenWu98 ,

Thanks for sharing the awesome work.

Would it be possible to share a Jupyter notebook to reproduce the results in Fig. 2?

Time consumption is too long

Hi Chen, thanks for your great work.
I tried to run zero-shot image-to-image translation with Stable Diffusion v1-4 on a single A100 GPU, and it took 14.5 hours to finish the task. Is this expected? The running time seems too long.

export CUDA_VISIBLE_DEVICES=0
export RUN_NAME=translate_text2img256_stable_diffusion_stochastic_1
export SEED=42
nohup python -m torch.distributed.launch --nproc_per_node 1 --master_port 1405 main.py --seed $SEED --cfg experiments/$RUN_NAME.cfg --run_name $RUN_NAME$SEED --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 50 --metric_for_best_model CLIPEnergy --greater_is_better false --save_strategy steps --save_steps 50 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 4 --num_train_epochs 0 --adafactor false --learning_rate 1e-3 --do_eval --output_dir output/$RUN_NAME$SEED --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --eval_accumulation_steps 4 --ddp_find_unused_parameters true --verbose true > $RUN_NAME$SEED.log 2>&1 &

Unpaired image-to-image translation

Hi~
I use translate_afhqwild256_to_afhqdog256_ddim_eta01.cfg to execute unpaired image-to-image translation with diffusion models trained on two domains. I have some problems:
1. I use sample_type = ddim, but I cannot understand what "custom_steps = 1000, refine_steps = 100, es_steps = 850" stand for.
2. Are the denoising steps used when training the two models in the source and target domains equal to custom_steps here?
3. What should I change if I want to use fewer diffusion steps in this code, as in DDIM? I can't find parameters similar to "--timestep_respacing ddim250" in this code.

Thanks a lot!

Questions with Plug-and-play Guidance

Hello, thanks for the great work.
I had some questions about plug-and-play guidance.

(1) In Section 3.4, the latent code is updated with Langevin dynamics, which includes a term for the score of the latent code z.
I am curious how to obtain the score for z.

(2) Also, it seems to me that plug-and-play guidance is exactly the same as classifier guidance if the energy term (in the CLIP or face recognition case) takes the target image plus some level of noise as input. Could you explain the difference between the two?

Thank you.

About outputs issue

Hi, I trained "translate cat-to-dog", which generated a folder called "translate_afhqcat256_to_afhqdog256_ddim_eta0142" in the output folder, but after training for a long time, no images were saved in it. When will the images be saved? Another question: what does "self.output_interval" in "trainer.py" control?

TypeError: cannot unpack non-iterable NoneType object

[Customized use for zero-shot image-to-image translation] I followed this step to reproduce.
An error occurred as follows:
Traceback (most recent call last):
File "main.py", line 150, in
main()
File "main.py", ine 124, in main
metrics = trainer.evaluate(
File "/home/zhangzhang/zz/cycle-diffusion-main/trainer/trainer.py", line 1048, in evaluate
metrics, num_samples = eval_loop(
File "/home/zhangzhang/zz/cycle-diffusion-main/trainer/trainer.py", line 867,in evaluation_loop
images, weighted_loss, losses = all_prediction_outputs
TypeError: cannot unpack non-iterable NoneType object

The printed return value of all_prediction_outputs is None.

Can you tell me how to solve it? Thank you very much!

Question about LDM support

Hello and thanks for sharing your code.

I want to use your implementation with LDM models. I have already trained two LDMs on two separate (but related) domains. How should I proceed to perform image translation between these two domains using my pretrained LDM models?

Thanks in advance.

torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 613558) of binary:

During the unpaired image-to-image translation (with diffusion models trained on two domains) task, I get the error below. How can I fix it?

torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 613558) of binary: /home/*****/anaconda3/envs/joohoon_cd/bin/python
Traceback (most recent call last):
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/*****/anaconda3/envs/generative_prompt/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
main.py FAILED

Thanks
