Website | ArXiv | Get Started

Global-Flow-Local-Attention

The source code for our paper "Deep Image Spatial Transformation for Person Image Generation" (CVPR 2020).

We propose a Global-Flow Local-Attention Model for deep image spatial transformation. Our model can be flexibly applied to tasks such as:

  • Pose-Guided Person Image Generation:

Left: generated results of our model; Right: Input source images.

  • Pose-Guided Person Image Animation

Leftmost: Skeleton sequences. The others: Animation results.

  • Face Image Animation

Left: Input image; Right: Output results.

  • View Synthesis

From left to right: Input image, results of Appearance Flow, results of ours, ground-truth images.

News

  • 2020.4.30 Several demos are provided for quick exploration.

  • 2020.4.29 Code for Pose-Guided Person Image Animation is available now!

  • 2020.3.15 We uploaded the code and trained models for Face Animation and View Synthesis!

  • 2020.3.3 Project Website and Paper are available!

  • 2020.2.29 Code for PyTorch is available now!

Colab Demo

For a quick exploration of our model, try the online Colab demo.

Get Started

1) Installation

Requirements

  • Python 3
  • PyTorch (1.0.0)
  • CUDA
  • visdom
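
A quick way to see which of the Python packages listed above are already present is sketched below. This is only a convenience check, not part of the project: the function name `check_requirements` is hypothetical, and it assumes the packages are importable under the module names `torch` and `visdom`. It does not verify the PyTorch version or the CUDA toolkit.

```python
import importlib.util
import sys


def check_requirements(modules=("torch", "visdom")):
    """Return a dict mapping each module name to True if it is importable."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    print("Python 3:", sys.version_info.major == 3)
    for name, found in check_requirements().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Once PyTorch is installed, `torch.cuda.is_available()` reports whether a CUDA device is visible to it.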

Conda installation

# 1. Create a conda virtual environment.
conda create -n gfla python=3.6 -y
source activate gfla

# 2. Install dependencies
pip install -r requirement.txt

# 3. Build the PyTorch custom CUDA extensions
./setup.sh

Note: The current code is tested with a Tesla V100. If you use a different GPU, you may need to select the correct nvcc_args for your GPU when you build the custom CUDA extensions. Comment or uncomment the --gencode lines in block_extractor/setup.py, local_attn_reshape/setup.py, and resample2d_package/setup.py. Please check here for details.
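
To find out which architecture your GPU needs, the sketch below formats an nvcc `-gencode` argument from a compute capability. The helper `gencode_args` is hypothetical and is not part of this repository; the exact form of the entries in the three setup.py files may differ, so treat the output as a guide for which line to enable rather than a drop-in replacement. The script queries the local GPU through `torch.cuda.get_device_capability` when PyTorch and a GPU are available, and otherwise falls back to the Tesla V100 value (7.0).

```python
def gencode_args(capability):
    """Build nvcc -gencode arguments for a (major, minor) compute capability."""
    major, minor = capability
    arch = f"{major}{minor}"
    return ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]


if __name__ == "__main__":
    try:
        import torch  # only needed to query the local GPU

        cap = torch.cuda.get_device_capability(0)
    except Exception:
        # No PyTorch or no visible GPU: fall back to Tesla V100 (7.0).
        cap = (7, 0)
    print(" ".join(gencode_args(cap)))
```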

2) Download Resources

We provide the pre-trained weights of our model. The resources are listed as follows:

Download the pre-trained models and the demo images by running the following script:

./download.sh
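
After the download finishes, you may want to confirm that the demo data is in place. The sketch below assumes, without confirmation from the script itself, that download.sh places the demo images under the `--dataroot` directories used by the demo commands in this README. The names `DEMO_DATAROOTS` and `missing_dataroots` are hypothetical.

```python
import os

# Dataroot directories taken from the demo commands in this README.
DEMO_DATAROOTS = (
    "./dataset/fashion",
    "./dataset/danceFashion",
    "./dataset/FaceForensics",
)


def missing_dataroots(base_dir=".", dataroots=DEMO_DATAROOTS):
    """Return the dataroot paths that do not exist under base_dir."""
    missing = []
    for rel in dataroots:
        if not os.path.isdir(os.path.join(base_dir, rel)):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for path in missing_dataroots():
        print("missing:", path)
```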

3) Pose-Guided Person Image Generation

The Pose-Guided Person Image Generation task is to transfer a source person image to a target pose.

Run the demo of this task:

python demo.py \
--name=pose_fashion_checkpoints \
--model=pose \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=fashion \
--dataroot=./dataset/fashion \
--results_dir=./demo_results/fashion

For more training and testing details, see PERSON_IMAGE_GENERATION.md.
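
The `--attn_layer` and `--kernel_size` values in the command above are compact strings. One plausible reading, which is an assumption and is not stated in this README, is that `--attn_layer=2,3` lists the layers that use local attention and `--kernel_size=2=5,3=3` pairs each of those layers with a kernel size (layer 2 with 5, layer 3 with 3). The sketch below parses the strings under that reading; both function names are hypothetical and this is not the project's own option parser.

```python
def parse_attn_layer(value):
    """Parse a string such as '2,3' into [2, 3]."""
    return [int(layer) for layer in value.split(",")]


def parse_kernel_size(value):
    """Parse a string such as '2=5,3=3' into {2: 5, 3: 3}."""
    sizes = {}
    for pair in value.split(","):
        layer, size = pair.split("=")
        sizes[int(layer)] = int(size)
    return sizes


if __name__ == "__main__":
    layers = parse_attn_layer("2,3")
    sizes = parse_kernel_size("2=5,3=3")
    # Under the assumed reading, every listed layer has a kernel size.
    print(all(layer in sizes for layer in layers))
```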

4) Pose-Guided Person Image Animation

The Pose-Guided Person Image Animation task generates a video clip from a still source image according to a driving target sequence. We further model the temporal consistency for this task.

Run the demo of this task:

python demo.py \
--name=dance_fashion_checkpoints \
--model=dance \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=dance \
--sub_dataset=fashion \
--dataroot=./dataset/danceFashion \
--results_dir=./demo_results/dance_fashion \
--test_list=val_list.csv

For more training and testing details, see PERSON_IMAGE_ANIMATION.md.

5) Face Image Animation

Given an input source image and a guidance video sequence depicting the structure movements, our model generates a video containing the specified movements.

Run the demo of this task:

python demo.py \
--name=face_checkpoints \
--model=face \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=face \
--dataroot=./dataset/FaceForensics \
--results_dir=./demo_results/face 

We use the real video of the FaceForensics dataset. See FACE_IMAGE_ANIMATION.md for more details.
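
The three demo commands in this README share most of their flags. As a convenience, the sketch below assembles each command from a shared part and a task-specific part; all flag values are copied from the commands above. The names `COMMON`, `TASKS`, and `build_demo_command` are hypothetical, and the flag order differs from the original commands, which should not matter if demo.py uses a standard argument parser (an assumption).

```python
# Flags shared by all three demo commands in this README.
COMMON = ["--attn_layer=2,3", "--kernel_size=2=5,3=3", "--gpu_id=0"]

# Task-specific flags, copied from the demo commands.
TASKS = {
    "pose": [
        "--name=pose_fashion_checkpoints",
        "--model=pose",
        "--dataset_mode=fashion",
        "--dataroot=./dataset/fashion",
        "--results_dir=./demo_results/fashion",
    ],
    "dance": [
        "--name=dance_fashion_checkpoints",
        "--model=dance",
        "--dataset_mode=dance",
        "--sub_dataset=fashion",
        "--dataroot=./dataset/danceFashion",
        "--results_dir=./demo_results/dance_fashion",
        "--test_list=val_list.csv",
    ],
    "face": [
        "--name=face_checkpoints",
        "--model=face",
        "--dataset_mode=face",
        "--dataroot=./dataset/FaceForensics",
        "--results_dir=./demo_results/face",
    ],
}


def build_demo_command(task):
    """Return the demo.py argument list for 'pose', 'dance', or 'face'."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return ["python", "demo.py"] + TASKS[task] + COMMON


if __name__ == "__main__":
    for task in TASKS:
        print(" ".join(build_demo_command(task)))
```

To launch a demo from the repository root, pass the returned list to `subprocess.run(build_demo_command("pose"), check=True)`.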

6) Novel View Synthesis

View synthesis requires generating novel views of objects or scenes based on arbitrary input views.

In this task, we use the car and chair categories of the ShapeNet dataset. See VIEW_SYNTHESIS.md for more details.

Citation

@article{ren2020deep,
  title={Deep Image Spatial Transformation for Person Image Generation},
  author={Ren, Yurui and Yu, Xiaoming and Chen, Junming and Li, Thomas H and Li, Ge},
  journal={arXiv preprint arXiv:2003.00696},
  year={2020}
}

@article{ren2020deepspatial,
  title={Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation},
  author={Ren, Yurui and Li, Ge and Liu, Shan and Li, Thomas H},
  journal={IEEE Transactions on Image Processing},
  year={2020},
  publisher={IEEE}
}

Acknowledgement

We build our project based on Vid2Vid. Some dataset preprocessing methods are derived from Pose-Transfer.
