
DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation

Overview

NeurIPS, 2023
Rong Wang · Wei Mao · Hongdong Li

Paper PDF ArXiv PDF Youtube Video


Installation

Environment

  • Create a conda environment and install PyTorch:
conda env create -f environment.yml
conda activate deepsimHO
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html
  • Install base dependencies:
pip install -r requirements.txt
  • Install additional packages to resolve compatibility issues:
pip install kaolin==0.13.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.10.1_cu111.html
pip install jaxlib==0.1.65 -f https://storage.googleapis.com/jax-releases/jax_releases.html
  • Install dex-ycb-toolkit by following its instructions, but clone it into main/thirdparty.

  • [Optional] If you are running on a headless server, create a virtual display for MuJoCo:

Xvfb :1 &
export DISPLAY=:1
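
Before moving on, it can help to confirm that the pinned packages above were actually installed. The following is a minimal sketch and not part of the repository: the helper name check_versions is hypothetical, the pins are copied from the install commands above, and the version strings reported by your installation may be formatted slightly differently.

```python
import os
from importlib import metadata

# Version pins copied from the install commands above.
EXPECTED = {
    "torch": "1.10.1+cu111",
    "torchvision": "0.11.2+cu111",
    "kaolin": "0.13.0",
    "jaxlib": "0.1.65",
}


def check_versions(expected=EXPECTED, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package."""
    problems = {}
    for name, want in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != want:
            problems[name] = (want, found)
    return problems


if __name__ == "__main__":
    for name, (want, found) in check_versions().items():
        print(f"{name}: expected {want}, found {found}")
    # MuJoCo needs a display; on a headless server start Xvfb first.
    if not os.environ.get("DISPLAY"):
        print("DISPLAY is not set")
```

An empty report means all four pins match. A mismatch is not necessarily fatal, but it is the first thing to rule out if kaolin or jaxlib fails to import.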

Data Preparation

  • Follow the docs to prepare the HO3D and DexYCB datasets and download the MANO assets. Extract the datasets into main/data and the assets into main/assets.

  • We provide cached DexYCB and HO3D indices for the train and test sets. They list the samples whose ground-truth annotations are stable under our MuJoCo settings. Download the folder and extract it into main/common.

  • We provide MeshLab-resampled YCB object models used to train object metrics. Download them and extract them into main/data. MuJoCo-related data can be found in MuJoCo_data.

  • Your main folder should have the following structure:

    ├── data
    │   ├── DexYCB
    │   ├── HO3D
    │   ├── models_resample_mashlab
    │   ├── YCB_models_process
    │   └── YCB_models_supp
    ├── assets
    │   ├── postprocess
    │   ├── mano_v1_2
    │   └── ... (download from ArtiBoost)
    ├── checkpoints
    │   └── pretrained
    │       ├── dexycb.pth.tar
    │       └── ho3d.pth.tar
    ├── common
    │   └── cache
    │       ├── DexYCB
    │       └── HO3D
    ├── thirdparty
    │   ├── dex-ycb-toolkit
    │   ...
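
A missing folder usually only surfaces once a run is already underway, so a quick layout check can save time. The sketch below is not part of the repository: missing_paths is a hypothetical helper, and the path list is transcribed from the tree above (entries elided there with "..." are not checked).

```python
from pathlib import Path

# Paths transcribed from the folder tree above, relative to the main folder.
EXPECTED_PATHS = [
    "data/DexYCB",
    "data/HO3D",
    "data/models_resample_mashlab",
    "data/YCB_models_process",
    "data/YCB_models_supp",
    "assets/postprocess",
    "assets/mano_v1_2",
    "checkpoints/pretrained/dexycb.pth.tar",
    "checkpoints/pretrained/ho3d.pth.tar",
    "common/cache/DexYCB",
    "common/cache/HO3D",
    "thirdparty/dex-ycb-toolkit",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths("main")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Folder layout looks complete.")
```

Run it from the repository root, or pass a different root if your main folder lives elsewhere.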

Evaluation

HO3Dv2-

Download the pretrained model into main/checkpoints/pretrained and run the command below.

cd main
CUDA_VISIBLE_DEVICES=0 python scripts/main.py --cfg config/test/test_ho3dv2-_artiboost_pretrain.yaml --gpu_id 0 --evaluate

DexYCB

Download the pretrained model into main/checkpoints/pretrained and run the command below.

cd main
CUDA_VISIBLE_DEVICES=0 python scripts/main.py --cfg config/test/test_dexycb_artiboost_pretrain.yaml --gpu_id 0 --evaluate

If the run completes properly, you should see results like the ones below. Note that the physics metrics may vary slightly due to simulation precision.

HO3D:
test Epoch 0 | final_loss: 3.3620e-03 | joints_3d_abs_mepe: 97.3570 | corners_3d_abs_mepe: 52.7965 | CP: 0.9659 | PD: 0.0113 | SD: 0.0243 | SR: 0.1917: 100%|███| 48/48
DexYCB:
test Epoch 0 | final_loss: 1.7286e+00, sym_corners_3d_loss: 1.7285e+01 | joints_3d_abs_mepe: 11.2349 | CP: 0.9590 | PD: 0.0142 | SD: 0.0242 | SR: 0.3283: 100%|███| 50/50
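
To compare your own run against these reference numbers, the progress-bar line can be parsed into a dictionary. This is a minimal sketch that assumes the "name: value" layout shown above. parse_metrics is a hypothetical helper, not a function from the repository.

```python
import re

# Matches "name: number" pairs, including scientific notation such as 3.3620e-03.
METRIC_RE = re.compile(r"(\w+):\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")

# The HO3D reference line from above, copied verbatim.
HO3D_LINE = (
    "test Epoch 0 | final_loss: 3.3620e-03 | joints_3d_abs_mepe: 97.3570 | "
    "corners_3d_abs_mepe: 52.7965 | CP: 0.9659 | PD: 0.0113 | SD: 0.0243 | "
    "SR: 0.1917: 100%|███| 48/48"
)


def parse_metrics(line):
    """Extract every 'name: value' pair from a log line as floats."""
    return {name: float(value) for name, value in METRIC_RE.findall(line)}


if __name__ == "__main__":
    for name, value in parse_metrics(HO3D_LINE).items():
        print(f"{name:>22s} = {value}")
```

Because the physics metrics vary slightly between runs, compare them with a tolerance rather than for exact equality.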

Train

We take training on DexYCB as an example. First, train the DeepSim MLP independently:

python scripts/main.py --cfg config/train/train_dexycb_artiboost_mlp.yaml --gpu_id 0

Then jointly train it with a base network (we use the pretrained heatmap-based, object-symmetry model from ArtiBoost):

python scripts/main.py --cfg config/train/train_dexycb_artiboost_pretrain.yaml --gpu_id 0

Note that training with MuJoCo runs mostly on the CPU, where we use multi-process simulation to speed it up.
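
The general idea of spreading CPU-bound simulations across processes can be illustrated as follows. This is a toy sketch and not the repository's implementation: simulate_displacement is a hypothetical stand-in that integrates free fall instead of stepping MuJoCo, and the worker count is an arbitrary choice.

```python
from concurrent.futures import ProcessPoolExecutor


def simulate_displacement(sample):
    """Hypothetical stand-in for one CPU-bound physics rollout.

    A real worker would step a MuJoCo model here. This toy version
    integrates free fall with explicit Euler steps and returns the
    final height.
    """
    height, steps, dt = sample
    velocity = 0.0
    for _ in range(steps):
        velocity -= 9.81 * dt
        height += velocity * dt
    return height


def simulate_batch(samples, workers=4):
    """Run one rollout per sample in separate processes, preserving order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_displacement, samples))


if __name__ == "__main__":
    # Eight identical samples: (initial height, number of steps, time step).
    batch = [(1.0, 100, 0.001)] * 8
    print(simulate_batch(batch))
```

Each rollout is independent of the others, which is why a process pool fits: the batch is split across cores and the results come back in input order.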

Citation

If you use the code for your research, please cite the following:

@inproceedings{
    wang2023deepsimho,
    title={DeepSim{HO}: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation},
    author={Rong Wang and Wei Mao and Hongdong Li},
    booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
    year={2023},
    url={https://openreview.net/forum?id=SxVHyYavHy}
}

Acknowledgement

We use ArtiBoost as the code base and thank the authors for releasing the code.
