This repository is the official implementation of LAMP
LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
Ruiqi Wu, Liangyu Chen, Tong Yang, Chunle Guo, Chongyi Li, Xiangyu Zhang
( * indicates corresponding author)
[Arxiv Paper] [Website Page] [Google Drive] [Baidu Disk (pwd: ffsp)] [Colab Notebook]
🚀 LAMP is a few-shot method for text-to-video generation. You only need 8~16 videos and 1 GPU (>15 GB VRAM) for training! Then you can generate videos with the learned motion pattern.
- [2023/11/02] The Colab demo is released! Thanks to @ShashwatNigam99 for the PR.
- [2023/10/21] We add Google Drive links for our checkpoints and training data.
- [2023/10/17] We release our checkpoints and Arxiv paper.
- [2023/10/16] Our code is publicly available.
- Ubuntu >= 18.04
- CUDA 11.3
- Other dependencies: install as follows
# clone the repo
git clone https://github.com/RQ-Wu/LAMP.git
cd LAMP
# create virtual environment
conda create -n LAMP python=3.8
conda activate LAMP
# install packages
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
pip install xformers==0.0.13
-
You can download pre-trained T2I diffusion models from Hugging Face. In our work, we use Stable Diffusion v1.4 as our backbone network. Clone the pretrained weights with git-lfs and put them in ./checkpoints
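A minimal sketch of that step, assuming git-lfs is already installed (the v1.4 weights are several GB, so the clone takes a while):

```shell
# Enable Git LFS once per machine so the large weight files are actually fetched.
git lfs install
# Clone Stable Diffusion v1.4 from Hugging Face into ./checkpoints,
# matching the --pretrain_weight path used by the inference script below.
git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 ./checkpoints/stable-diffusion-v1-4
```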
-
Our checkpoints and training data are listed below. You can also collect video data on your own (suggested websites: pexels, frozen-in-time) and put the .mp4 files in ./training_videos/[motion_name]/
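For example, the expected layout can be prepared like this (the motion name "waterfall" and the clip names are placeholders; in practice you would copy 8~16 real .mp4 clips into the folder):

```shell
# Hypothetical motion name "waterfall"; LAMP trains on 8~16 short clips per motion.
mkdir -p training_videos/waterfall
# Stand-ins for real clips; replace with e.g.:
#   cp ~/downloads/waterfall_*.mp4 training_videos/waterfall/
touch training_videos/waterfall/clip_01.mp4 training_videos/waterfall/clip_02.mp4
ls training_videos/waterfall
```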
| Motion Name | Checkpoint Link | Training Data |
| :--- | :--- | :--- |
| Birds fly | Baidu Disk (pwd: jj0o) | Baidu Disk (pwd: w96b) |
| Firework | Baidu Disk (pwd: wj1p) | Baidu Disk (pwd: oamp) |
| Helicopter | Baidu Disk (pwd: egpe) | Baidu Disk (pwd: t4ba) |
| Horse run | Baidu Disk (pwd: 19ld) | Baidu Disk (pwd: mte7) |
| Play the guitar | Baidu Disk (pwd: l4dw) | Baidu Disk (pwd: js26) |
| Rain | Baidu Disk (pwd: jomu) | Baidu Disk (pwd: 31ug) |
| Turn to smile | Baidu Disk (pwd: 2bkl) | Baidu Disk (pwd: l984) |
| Waterfall | Baidu Disk (pwd: vpkk) | Baidu Disk (pwd: 2edp) |
| All | Baidu Disk (pwd: ifsm) | Baidu Disk (pwd: 2i2k) |
CUDA_VISIBLE_DEVICES=X accelerate launch train_lamp.py config="configs/XXX.yaml"

where X is the id of the GPU to use and configs/XXX.yaml is the config file for the motion you want to learn.
Here is an example command for inference:
python inference_script.py \
--weight ./my_weight/turn_to_smile/unet \
--pretrain_weight ./checkpoints/stable-diffusion-v1-4 \
--first_frame_path ./benchmark/head_photo_of_a_cute_girl,_comic_style.png \
--prompt "head photo of a cute girl, comic style, turns to smile"
# By default, the prompt is the same as the first frame image's filename.
# Other optional arguments:
# --output results/
# --height 320
# --width 512
# --length 16
# --cfg 12.5
| Origin Videos | Editing Result-1 | Editing Result-2 |
| :--- | :--- | :--- |
| A girl in black runs on the road. | A man runs on the road. | |
| A man is dancing. | A girl in white is dancing. | |
If you find our repo useful for your research, please cite us:
@article{wu2023lamp,
title={LAMP: Learn a Motion Pattern by Few-Shot Tuning a Text-to-Image Diffusion Model},
author={Wu, Ruiqi and Chen, Liangyu and Yang, Tong and Guo, Chunle and Li, Chongyi and Zhang, Xiangyu},
journal={arXiv preprint arXiv:2310.10769},
year={2023}
}
Licensed under a Creative Commons Attribution-NonCommercial 4.0 International License for non-commercial use only. Any commercial use requires formal permission in advance.
This repository is maintained by Ruiqi Wu. The code is built on Tune-A-Video. Thanks for the excellent open-source code!