
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

1Nanjing University 2Fudan University 3Alibaba Group
*Equal Contribution +Corresponding Author
[Demo video: head.mp4]

🏆 Framework

[Framework overview figure]

🔥 News

  • 2024/04/02: ✨✨✨SMPL & Rendering scripts released! Champ your dance videos now.💃🤸‍♂️🕺

  • 2024/03/30: 🚀🚀🚀Watch this amazing video tutorial from Toy. It's based on Kijai's unofficial, easy-to-use Champ ComfyUI (without SMPL)🥳.

  • 2024/03/27: Cool demo on Replicate🌟. Thanks, camenduru!👏

🐟 Installation

  • System requirements: Ubuntu 20.04 / Windows 11, CUDA 12.1
  • Tested GPUs: A100, RTX3090

Clone Champ with the following command:

  git clone --recurse-submodules https://github.com/fudan-generative-vision/champ

Create conda environment:

  conda create -n champ python=3.10
  conda activate champ

Install packages with pip

  pip install -r requirements.txt

Install packages with poetry

If you want to run this project on a Windows device, we strongly recommend using Poetry.

  poetry install --no-root

Install 4D-Humans

Champ uses the great 4D-Humans to fit SMPL onto the inputs. Please follow their Installation instructions to set it up, and their Run demo on images section to download the checkpoints. Note that we maintain a fork in Champ/4D-Humans, so you don't need to clone the original repository.
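
As a rough sketch of that setup (assuming the submodule checkout at ./4D-Humans is pip-installable; their README is the authoritative reference for the exact steps), you might do something like:

  cd 4D-Humans
  pip install -e .    # assumption: the fork installs in editable mode; check their README for extras
  cd ..

Running their Run demo on images example once afterwards downloads the required checkpoints.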

💾 Download pretrained models

  1. Download the pretrained weights of the base models:

  2. Download our checkpoints:

Our checkpoints consist of denoising UNet, guidance encoders, Reference UNet, and motion module.

Finally, these pretrained models should be organized as follows:

./pretrained_models/
|-- champ
|   |-- denoising_unet.pth
|   |-- guidance_encoder_depth.pth
|   |-- guidance_encoder_dwpose.pth
|   |-- guidance_encoder_normal.pth
|   |-- guidance_encoder_semantic_map.pth
|   |-- reference_unet.pth
|   `-- motion_module.pth
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- DWPose
|   |-- dw-ll_ucoco_384.onnx
|   `-- yolox_l.onnx
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
`-- stable-diffusion-v1-5
    |-- feature_extractor
    |   `-- preprocessor_config.json
    |-- model_index.json
    |-- unet
    |   |-- config.json
    |   `-- diffusion_pytorch_model.bin
    `-- v1-inference.yaml
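
If you want to double-check the layout, an optional shell snippet like the one below (file paths taken from the tree above) reports anything that is missing:

  # optional sanity check: report any expected checkpoint file that is missing
  for f in champ/denoising_unet.pth champ/guidance_encoder_depth.pth \
           champ/guidance_encoder_dwpose.pth champ/guidance_encoder_normal.pth \
           champ/guidance_encoder_semantic_map.pth champ/reference_unet.pth \
           champ/motion_module.pth image_encoder/pytorch_model.bin \
           DWPose/dw-ll_ucoco_384.onnx DWPose/yolox_l.onnx \
           sd-vae-ft-mse/diffusion_pytorch_model.safetensors \
           stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin; do
    [ -f "./pretrained_models/$f" ] || echo "missing: ./pretrained_models/$f"
  done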

🐳 Inference

We have provided several sets of example data for inference. Please download them first and place them in the example_data folder.

Here is the command for inference:

  python inference.py --config configs/inference.yaml

If using Poetry, the command is:

  poetry run python inference.py --config configs/inference.yaml

Animation results will be saved in the results folder. You can change the reference image or the guidance motion by modifying inference.yaml.

You can also extract the driving motion from any videos and then render with Blender. We will later provide the instructions and scripts for this.

Note: The default motion-01 in inference.yaml has more than 500 frames and takes about 36 GB of VRAM. If you encounter VRAM issues, consider switching to other example data with fewer frames.
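
If you are unsure which example fits your GPU, you can count the frames per motion with a small shell loop; the example_data layout used here is an assumption, so adjust the path to wherever the downloaded motions actually live:

  # assumption: each example motion is a folder of per-frame images under example_data/
  for d in example_data/*/; do
    echo "$d: $(find "$d" -type f -name '*.png' | wc -l) png frames"
  done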

💃 SMPL & Rendering

Try Champ with your dance videos! It may take some time to set up the environment; follow the instructions step by step🐢 and report an issue when necessary.

Preprocess

Use ffmpeg to extract frames from your video. For example:

ffmpeg -i driving_videos/Video_1/Your_Video.mp4 -c:v png driving_videos/Video_1/images/%04d.png 

Please organize your driving videos and reference images like this:

|-- driving_videos
    |-- Video_1
        |-- images
            |-- 0000.png
            ...
            |-- 0020.png
            ...
    |-- Video_2
        |-- images
            |-- 0000.png
            ...
    ...
    |-- Video_n

|-- reference_imgs
    |-- images
        |-- your_ref_img_A.png
        |-- your_ref_img_B.png
        ...
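
If you have several driving videos, a small batch loop (a sketch assuming one .mp4 per Video_N folder) can produce the images/ layout above in one go:

for dir in driving_videos/Video_*/; do
  mkdir -p "${dir}images"                              # create the images/ subfolder
  for v in "${dir}"*.mp4; do
    [ -e "$v" ] || continue                            # skip folders without an .mp4
    ffmpeg -i "$v" -c:v png "${dir}images/%04d.png"    # same extraction command as above
  done
done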

SMPL

Fit SMPL

Make sure you have organized the directories as above. Substitute your own absolute paths in the following command:

python inference_smpl.py  --reference_imgs_folder test_smpl/reference_imgs --driving_videos_folder test_smpl/driving_videos --device YOUR_GPU_ID

Once finished, you can check reference_imgs/visualized_imgs to see the overlay results. To better fit some extreme body shapes, you may also append --figure_scale to manually adjust the figure (or shape) of the predicted SMPL, from -10 (very heavy) to 10 (very slim).
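
For example, if the fitted SMPL looks too slim against your reference, you could rerun the fit with a negative figure scale (the value -4 below is only an illustration):

python inference_smpl.py --reference_imgs_folder test_smpl/reference_imgs --driving_videos_folder test_smpl/driving_videos --device YOUR_GPU_ID --figure_scale -4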

Smooth SMPL (optional)

TODO: Coming Soon.

Transfer SMPL

Replace the paths with absolute paths in the following command:

python transfer_smpl.py --reference_path test_smpl/reference_imgs/smpl_results/ref.npy --driving_path test_smpl/driving_videos/Video_1 --output_folder test_smpl/transfer_result --figure_transfer --view_transfer

Append --figure_transfer when you want the result to match the reference SMPL's figure, and --view_transfer to transform the driving SMPL into the reference image's camera space.

Rendering

First of all, install Blender on your server or PC.

Replace the paths with absolute paths in the following command:

blender smpl_rendering.blend --background --python rendering.py --driving_path test_smpl/transfer_result/smpl_results --reference_path test_smpl/reference_imgs/images/ref.png

This renders on the CPU by default. Append --device YOUR_GPU_ID to select a GPU for rendering.

Rendering DWPose

Make sure you have finished the SMPL rendering. Replace the paths with absolute paths in the following command:

python inference_dwpose.py --imgs_path test_smpl/transfer_result --device YOUR_GPU_ID

👏 Acknowledgements

We thank the authors of MagicAnimate, Animate Anyone, and AnimateDiff for their excellent work. Our project is built upon Moore-AnimateAnyone, 4D-Humans, and DWPose, and we are grateful for their open-source contributions.

🕒 Roadmap

Visit our roadmap to preview the future of Champ.

🌟 Citation

If you find our work useful for your research, please consider citing the paper:

@misc{zhu2024champ,
      title={Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance},
      author={Shenhao Zhu and Junming Leo Chen and Zuozhuo Dai and Yinghui Xu and Xun Cao and Yao Yao and Hao Zhu and Siyu Zhu},
      year={2024},
      eprint={2403.14781},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

👋 Opportunities available

Multiple research positions are open at the Generative Vision Lab, Fudan University! These include:

  • Research assistant
  • Postdoctoral researcher
  • PhD candidate
  • Master's students

Interested individuals are encouraged to contact us at [email protected] for further information.
