Monocular, One-stage, Regression of Multiple 3D People

ROMP is a one-stage network for multi-person 3D mesh recovery from a single image.

Monocular, One-stage, Regression of Multiple 3D People,
Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei,
arXiv paper (arXiv 2008.12272)

Contact: [email protected]. Feel free to contact me for related questions or discussions!

Simple: ROMP simultaneously predicts the body center locations and the corresponding 3D body mesh parameters for all people at each pixel.
Fast: The ROMP ResNet-50 model runs at over 30 FPS on a 1070Ti GPU.
Strong: ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks, including 3DPW, CMU Panoptic, and 3DOH50K.
Easy to use: We provide a user-friendly testing API and webcam demos.

News

2021/4/19: Adding support for textured SMPL mesh using vedo. See visualization.md for the details.
2021/3/30: 1.0 version. Rebuilding the code. Release the ResNet-50 version and evaluation on 3DPW.
2020/11/26: Optimization for person-person occlusion. Small changes for video support.
2020/9/11: Real-time webcam demo using local/remote server. Please refer to config_guide.md for details.
2020/9/4: Google Colab demo. Saving an npy file per image. Please refer to config_guide.md for details.

Try on Google Colab

Before installation, you can take a few minutes to give the prepared Google Colab demo a try.
It allows you to run the project in the cloud, free of charge.

Please refer to bug.md for known bugs. You are welcome to submit issues for related bugs.

Installation

Please refer to install.md for installation.

Demo

Currently, the released code reproduces the demo results. Only 1-2 GB of GPU memory is needed.

To do this, run:

cd ROMP/src
sh run.sh
# If the shell script fails, run the following command instead:
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/single_image.yml

Results will be saved in ROMP/demo/images_results.
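
When scripting several runs, the fallback command above can also be assembled from Python. The following is a minimal sketch, not part of the ROMP codebase: build_test_command and run_romp are hypothetical helper names, and the sketch assumes the repository is checked out at ROMP/ relative to the working directory.

```python
import os
import subprocess
import sys


def build_test_command(config_yml, gpu=0, visible_devices="0"):
    """Return (argv, env) mirroring the command line shown above.

    Hypothetical helper: only the script path and the two flags
    (--gpu, --configs_yml) come from the README command.
    """
    argv = [
        sys.executable,
        "core/test.py",
        f"--gpu={gpu}",
        f"--configs_yml={config_yml}",
    ]
    # Copy the current environment and pin the visible CUDA device.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(visible_devices)
    return argv, env


def run_romp(config_yml, src_dir="ROMP/src", **kwargs):
    """Run the test script from src_dir (assumed checkout location)."""
    argv, env = build_test_command(config_yml, **kwargs)
    return subprocess.run(argv, cwd=src_dir, env=env, check=True)
```

Calling run_romp("configs/single_image.yml") would then be equivalent to the single-image command, and the same helper can be pointed at the other config files.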

Internet images

You can also run the code on random internet images by putting the images under ROMP/demo/images.
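
If the images live in another folder, a small script can copy them into place. This is only a sketch: stage_images is a hypothetical helper, and the list of file extensions is an assumption, since the formats ROMP accepts are not stated here.

```python
import shutil
from pathlib import Path

# Assumption: common image extensions; adjust to what ROMP actually reads.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def stage_images(src_dir, dst_dir="ROMP/demo/images"):
    """Copy image files from src_dir into dst_dir and return their names."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).iterdir()):
        # Skip sub-directories and non-image files.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

For example, stage_images("/path/to/downloads") copies the images found there into ROMP/demo/images, after which the demo command can be run as before.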

Please refer to config_guide.md for how to save the estimated meshes, center maps, and parameter dicts.

Internet videos

You can also run the code on random internet videos.

To do this, first change the input_video_path in src/configs/video.yml to /path/to/your/video. For example, set

 video_or_frame: True
 input_video_path: '../demo/videos/sample_video.mp4' # None
 output_dir: '../demo/videos/sample_video_results/'

then run

cd ROMP/src
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/video.yml

Results will be saved to ../demo/videos/sample_video_results.
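
When processing many videos, the config edit described above can be automated. The sketch below rewrites a single "key: value" line in the config text using only the standard library. It assumes each key sits on its own line, as in the snippet above. set_config_value is a hypothetical helper, and any trailing comment on the rewritten line is dropped.

```python
import re


def set_config_value(text, key, yaml_value):
    """Replace the value of a `key: value` line, keeping its indentation.

    yaml_value is inserted verbatim, so string values should include
    their own quotes (for example "'/path/to/your/video.mp4'").
    """
    pattern = re.compile(rf"^([ \t]*){re.escape(key)}\s*:.*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{key}: {yaml_value}", text
    )
    if count == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


def update_config_file(path, **values):
    """Apply set_config_value for each keyword argument and save the file."""
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    for key, yaml_value in values.items():
        text = set_config_value(text, key, yaml_value)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

A call such as update_config_file("configs/video.yml", input_video_path="'/path/to/your/video.mp4'", output_dir="'/path/to/results/'") would then prepare the config before running the command above. A YAML library would be more robust for nested or multi-line values. The line-based approach is used here only to avoid an extra dependency.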

Export to Blender FBX

Please refer to expert.md to export the results to fbx files for use in Blender. Currently, this function supports only single-person videos. Therefore, please test it with ../demo/videos/sample_video2_results/sample_video2.mp4, whose results will be saved to ../demo/videos/sample_video2_results.

Webcam

We also provide the webcam demo code, which can run in real time on a 1070Ti GPU / remote server.
Currently, limited by the visualization pipeline, the webcam visualization code supports only a single-person mesh.

To do this, run:

cd ROMP/src
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/webcam.yml
# or try to use the model with ResNet-50 as backbone.
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/webcam_resnet.yml

Press Up/Down to end the demo. Please refer to config_guide.md for running the webcam demo on a remote server, setting the mesh color, or selecting the camera id.

Evaluation

Please refer to evaluation.md for evaluation on benchmarks.

TODO LIST

The code will be gradually open sourced according to:

Citation

Please consider citing

@inproceedings{ROMP,
  title = {Monocular, One-stage, Regression of Multiple 3D People},
  author = {Yu, Sun and Qian, Bao and Wu, Liu and Yili, Fu and Black, Michael J. and Tao, Mei},
  booktitle = {arxiv:2008.12272},
  month = {August},
  year = {2020}
}

Acknowledgement

We thank Peng Cheng for his constructive comments on Center map training.

Thanks to Marco Musy for his help in the textured SMPL visualization.

Here are some great resources we benefit from:

SMPL models and layer are borrowed from the MPII SMPL-X model.
Webcam pipeline is borrowed from minimal-hand.
Some functions are borrowed from HMR-pytorch.
Some functions for data augmentation are borrowed from SPIN.
Synthetic occlusion is borrowed from synthetic-occlusion.
The evaluation code for the 3DPW dataset is taken from 3dpw-eval.
For fair comparison, the GT annotations of the 3DPW dataset are taken from VIBE.
3D mesh visualization is supported by vedo and Open3D.
