

Ev2Hands: 3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera [3DV'24]

Official PyTorch implementation

Project page | Paper | 🤗 Demo

Ev2Hands

Abstract

3D hand tracking from a monocular video is a very challenging problem due to hand interactions, occlusions, left-right hand ambiguity, and fast motion. Most existing methods rely on RGB inputs, which have severe limitations under low-light conditions and suffer from motion blur. In contrast, event cameras capture local brightness changes instead of full image frames and do not suffer from the described effects, but, unfortunately, existing image-based techniques cannot be directly applied to events due to significant differences in the data modalities. In response to these challenges, this paper introduces the first framework for 3D tracking of two fast-moving and interacting hands from a single monocular event camera. Our approach tackles the left-right hand ambiguity with a novel semi-supervised feature-wise attention mechanism and integrates an intersection loss to fix hand collisions. To facilitate advances in this research domain, we release a new synthetic large-scale dataset of two interacting hands, Ev2Hands-S, and a new real benchmark with real event streams and ground-truth 3D annotations, Ev2Hands-R. Our approach outperforms existing methods in terms of the 3D reconstruction accuracy and generalizes to real data under severe light conditions.

Advantages of Event-Based Vision

High-Speed Motion | Low-Light Performance

Usage



Installation

Clone the repository

git clone https://github.com/Chris10M/Ev2Hands.git
cd Ev2Hands

Data Prerequisites

Data Folder
  • Please register at the MANO website and download the MANO models.
  • Put the MANO models in a folder with the following structure.
  • Download the data from here and ensure that the files and folders have the following structure.
src
└── data
    ├── models
    │   └── mano
    │       ├── MANO_RIGHT.pkl
    │       └── MANO_LEFT.pkl
    ├── background_images
    ├── mano_textures
    ├── J_regressor_mano_ih26m.npy
    └── TextureBasis
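To catch path mistakes early, the following is a minimal sanity check, assuming the repository root as the working directory and the file layout shown above; it only reports which of the expected files and folders are missing.

import os

# Expected data files/folders, taken from the structure above.
# Run this from the repository root.
required = [
    'src/data/models/mano/MANO_RIGHT.pkl',
    'src/data/models/mano/MANO_LEFT.pkl',
    'src/data/background_images',
    'src/data/mano_textures',
    'src/data/J_regressor_mano_ih26m.npy',
    'src/data/TextureBasis',
]

missing = [path for path in required if not os.path.exists(path)]
if missing:
    print('Missing:', *missing, sep='\n  ')
else:
    print('All expected data files and folders are present.')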
Pretrained Model

The pretrained model best_model_state_dict.pth can be found here. Please place the model in the following folder structure.

src
└── Ev2Hands
    └── savedmodels
        └── best_model_state_dict.pth
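To verify that the checkpoint downloaded correctly and can be deserialised, a quick inspection with PyTorch is sketched below; it assumes the file is a standard state dictionary saved with torch.save and only prints what it contains (the model class itself is defined in the repository).

import torch

# Load the checkpoint on CPU and list its contents without building the model.
checkpoint = torch.load('src/Ev2Hands/savedmodels/best_model_state_dict.pth',
                        map_location='cpu')

print(type(checkpoint))
if isinstance(checkpoint, dict):
    for key in list(checkpoint)[:10]:
        print(key)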

Hand Simulator

Create a conda environment from the file

conda env create -f hand_simulator.yml

Install VID2E inside src/HandSimulator following the instructions from here. This is needed for generating events from the synthetic data.

Ev2Hands

Create a conda environment from the file

conda env create -f ev2hands.yml

Install torch-mesh-isect following the instructions from here. This is needed for the Intersection Aware Loss.

Note: to compile torch-mesh-isect, we found that pytorch==1.4.0 with cudatoolkit=10.1 builds without issues.

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
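Before building the extension, it is worth confirming that the installed PyTorch and CUDA toolkit match the versions above; a quick check:

import torch

# Environment check before compiling torch-mesh-isect.
print('PyTorch version      :', torch.__version__)        # expected: 1.4.0
print('CUDA (torch build)   :', torch.version.cuda)       # expected: 10.1
print('CUDA device available:', torch.cuda.is_available())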

Datasets

Ev2Hands-S

Download

Ev2Hands-S can be downloaded from here. Unzip it and set ROOT_TRAIN_DATA_PATH in src/settings.py to the path of the extracted dataset.
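For reference, the corresponding entry in src/settings.py would look like the following; the path itself is illustrative and should point to wherever the dataset was extracted.

# src/settings.py -- illustrative value
ROOT_TRAIN_DATA_PATH = '/path/to/Ev2Hands-S'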

Generation
  • Download InterHand2.6M (5 fps) and set INTERHAND_ROOT_PATH in src/settings.py to the directory of InterHand2.6M_5fps_batch1.

  • Set ROOT_TRAIN_DATA_PATH in src/settings.py to the path where Ev2Hands-S should be generated. Now, run

    cd src/HandSimulator
    GENERATION_MODE={'train', 'val', 'test'} python main.py
    

    to generate the dataset in parts. Note: set GENERATION_MODE to either train, val, or test, e.g., GENERATION_MODE=train python main.py (a sketch of how this environment variable is typically read is given at the end of this list).

  • After the dataset parts are generated, stitch them into an H5 event dataset and an annotation pickle using

    cd src/HandSimulator
    GENERATION_MODE={'train', 'val', 'test'} python stich_mp.py 
    

    Note: Set GENERATION_MODE to either train, val, or test.

  • For generating the dataset with SLURM, please check src/HandSimulator/slurm_main.sh and src/HandSimulator/slurm_stich_mp.sh.
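  • For reference, reading GENERATION_MODE inside a script typically follows the pattern below; this is a sketch of the assumed convention, not a copy of main.py or stich_mp.py.

    import os

    # Assumed convention: GENERATION_MODE is an environment variable that
    # selects which dataset split to generate.
    generation_mode = os.environ.get('GENERATION_MODE', 'train')
    assert generation_mode in ('train', 'val', 'test'), \
        f'Unsupported GENERATION_MODE: {generation_mode}'
    print(f'Generating the {generation_mode} split')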

Ev2Hands-R

Download

Ev2Hands-R can be downloaded from here. Unzip it and set the train and test folders of Ev2Hands-R in src/settings.py:

REAL_TRAIN_DATA_PATH = 'path/to/Ev2Hands-R/train'
REAL_TEST_DATA_PATH = 'path/to/Ev2Hands-R/test'

Training

Pretraining

For pretraining, ensure Ev2Hands-S is present. Please check src/Ev2Hands/arg_parser.py for default batch size and checkpoint path.

cd src/Ev2Hands
python train.py 

Finetuning

For finetuning, ensure Ev2Hands-R is present. Please check src/Ev2Hands/arg_parser.py for default batch size and checkpoint path.

cd src/Ev2Hands
python finetune.py 

Evaluation

Ev2Hands-S

For evaluation, ensure Ev2Hands-S is present. Note: the provided model will not perform well on Ev2Hands-S, as it is fine-tuned on Ev2Hands-R.

cd src/Ev2Hands
python evaluate.py 

Ev2Hands-R

For evaluation, ensure Ev2Hands-R is present.

cd src/Ev2Hands
python evaluate_ev2hands_r.py 

Demo

The demo performs inference on real event streams. A trained model is required for the demo to work; you can also use the pretrained model.

demo

To run the demo,

cd src/Ev2Hands
python3 demo.py

Please check src/Ev2Hands/arg_parser.py for default batch size and checkpoint path.
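The exact flags are defined in src/Ev2Hands/arg_parser.py. As an illustration only, a typical argparse setup for such defaults looks like the sketch below; the flag names and default values here are hypothetical, not the repository's actual options.

import argparse

# Hypothetical flags for illustration; the real defaults live in
# src/Ev2Hands/arg_parser.py.
parser = argparse.ArgumentParser(description='Ev2Hands options (sketch)')
parser.add_argument('--batch_size', type=int, default=16,
                    help='Illustrative batch size.')
parser.add_argument('--checkpoint_path', type=str,
                    default='savedmodels/best_model_state_dict.pth',
                    help='Illustrative checkpoint path.')
args = parser.parse_args()
print(args)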

Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{Millerdurai_3DV2024, 
    title={3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera}, 
    author={Christen Millerdurai and Diogo Luvizon and Viktor Rudnev and André Jonas and Jiayi Wang and Christian Theobalt and Vladislav Golyanik}, 
    booktitle = {International Conference on 3D Vision (3DV)}, 
    year={2024} 
} 

License

Ev2Hands is released under the CC BY-NC 4.0 license. The license also applies to the pretrained models.


ev2hands's Issues

Inaccuracy in Joint Annotations of Ev2Hands-R Dataset

Hi @Chris10M,

Thank you for the great work and for making the code available.

To get an insight into the Ev2Hands-R data, I plotted the ground-truth joint annotations on the event frames (plot_data.py). I observed that for some frames, the joint annotations used to train the Ev2Hands model are not accurate; the projected 2D skeleton does not coincide accurately with the event cloud, as seen in the attached images (column 1: input RGB image, column 2: input event frame, column 3: input event frame + ground-truth 2D joint skeleton). The fingers are straight in the input RGB and event frames, but the annotated skeleton appears to have bent fingers. This would have adversely affected the network training, leading to suboptimal accuracy.

Is this inaccuracy in the ground-truth joint annotations due to the motion tracking system (Captury)? Or could this be due to some misalignment in the synchronization of the event and RGB streams?

[Attached example frames: 0000001, 0000011, 0000061]

Query regarding the correspondence of samples across datasets

Hi @Chris10M,

Thank you for providing the synchronized RGB streams and the MANO parameters. I had a query regarding the number of samples in this data. Below are the sample counts for each subject.

| Subject | Ev2Hands-R (subject_{id}_event.pickle) | MANO parameters (event_mano_params.pkl) | MANO parameters (rgb_mano_params.pkl) | RGB data (subject_{id}_rgb.mp4) |
|---------|----------------------------------------|------------------------------------------|----------------------------------------|---------------------------------|
| 1 | 11384 | 11384 | 11383 | 11385 |
| 2 | 11777 | 11777 | 11777 | 11778 |
| 3 | 11921 | 11921 | 11921 | 11922 |
| 4 | 13275 | 13275 | 13275 | 13276 |
| 5 | 12386 | 12386 | 12385 | 12387 |

Could you tell me the correspondence across these datasets? Specifically, given a sample index in rgb_mano_params.pkl, what is the corresponding frame in subject_{id}_rgb.mp4? Also, for subjects 1 and 5, given a sample index in event_mano_params.pkl what is the corresponding sample index in rgb_mano_params.pkl?
