
PITI: Pretraining is All You Need for Image-to-Image Translation

Official PyTorch implementation

Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen
2022

paper | project website | video | online demo

Introduction

We present a simple and universal framework that brings the power of pretraining to various image-to-image translation tasks. You may try our online demo if interested.

Diverse samples synthesized by our approach.

Set up

Installation

git clone https://github.com/PITI-Synthesis/PITI.git
cd PITI

Environment

conda env create -f environment.yml

Quick Start

Pretrained Models

Please download our pre-trained models for both the Base and Upsample models, and put them in ./ckpt.

Model            Task             Dataset
Base-64x64       Mask-to-Image    Trained on COCO
Upsample-64-256  Mask-to-Image    Trained on COCO
Base-64x64       Sketch-to-Image  Trained on COCO
Upsample-64-256  Sketch-to-Image  Trained on COCO

If you cannot access these links, you may alternatively find our pretrained models here.
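
For reference, the checkpoint folder could look like the sketch below once the four models are downloaded. The filenames here are hypothetical; keep whatever names the released checkpoint files use, and point the corresponding paths in the scripts at them.

./ckpt/
  base_mask.pt          # Base-64x64, mask-to-image (hypothetical filename)
  upsample_mask.pt      # Upsample-64-256, mask-to-image (hypothetical filename)
  base_sketch.pt        # Base-64x64, sketch-to-image (hypothetical filename)
  upsample_sketch.pt    # Upsample-64-256, sketch-to-image (hypothetical filename)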

Prepare Images

We provide some example images in ./test_imgs so that you can try them quickly.

COCO

For COCO dataset, download the images and annotations from the COCO webpage.

For mask-to-image synthesis, we use semantic maps in RGB format as inputs. To obtain such semantic maps, run ./preprocess/preprocess_mask.py (an example of a raw mask and the corresponding processed mask is given in preprocess/example). Note that we do not need instance masks as previous works do.

For sketch-to-image synthesis, we use sketch maps extracted by HED as inputs. To obtain such sketch maps, run ./preprocess/preprocess_sketch.py.
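
For batch inference (see below), --val_data_dir expects a txt file that lists image paths. A minimal way to build such a list from a folder of preprocessed masks or sketches is sketched here; the folder, the output filename, and the one-path-per-line format are assumptions, so adjust them to match how sample.sh actually reads the file.

# Hypothetical example: collect preprocessed inputs, one path per line, for --val_data_dir.
ls ./test_imgs/*.png > val_list.txt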

Inference

Interactive Inference

Run the following script; it will launch an interactive GUI built with Gradio. You can upload input masks or sketches and generate images.

pip install gradio
python inference.py

Batch Inference

Modify sample.sh according to the following instructions, and run:

bash sample.sh

Args             Description
--model_path     Path of the checkpoint for the base model.
--sr_model_path  Path of the checkpoint for the upsample model.
--val_data_dir   Path of a txt file that contains the paths of the input images.
--num_samples    Number of images you want to sample.
--sample_c       Strength of classifier-free guidance.
--mode           The input type.
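
As a rough illustration of the flags above, an edited sample.sh might look like the sketch below. The script invoked inside sample.sh, the checkpoint filenames, and the flag values (including the --mode value) are assumptions for illustration; keep whatever the shipped sample.sh already invokes and only adjust the paths and numbers.

# Hypothetical sample.sh contents; the script name, checkpoint filenames, and
# flag values are assumptions, not the repository's defaults.
python sample.py \
  --model_path ./ckpt/base_mask.pt \
  --sr_model_path ./ckpt/upsample_mask.pt \
  --val_data_dir ./val_list.txt \
  --num_samples 4 \
  --sample_c 1.5 \
  --mode mask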

Training

Preparation

  1. Download and preprocess the datasets. For the COCO dataset, download the images and annotations from the COCO webpage, then run ./preprocess/preprocess_mask.py or ./preprocess/preprocess_sketch.py.
  2. Download the pretrained models with python preprocess/download.py.

Start Training

Take mask-to-image synthesis as an example (sketch-to-image follows the same procedure).

Finetune the Base Model

Modify mask_finetune_base.sh and run:

bash mask_finetune_base.sh

Finetune the Upsample Model

Modify mask_finetune_upsample.sh and run:

bash mask_finetune_upsample.sh

Citation

If you find this work useful for your research, please cite:

@article{wang2022pretraining,
  title   = {Pretraining is All You Need for Image-to-Image Translation},
  author  = {Wang, Tengfei and Zhang, Ting and Zhang, Bo and Ouyang, Hao and Chen, Dong and Chen, Qifeng and Wen, Fang},
  journal = {arXiv preprint arXiv:2205.12952},
  year    = {2022},
}

Acknowledgement

Thanks to the authors of GLIDE for sharing their code.
