
u-conddgcn's Introduction

U-CondDGCN: Conditional Directed Graph Convolution for 3D Human Pose Estimation

This repo is an unofficial PyTorch implementation of "Conditional Directed Graph Convolution for 3D Human Pose Estimation" by Wenbo Hu, Changgong Zhang, Fangneng Zhan, Lei Zhang, and Tien-Tsin Wong. The paper omits many implementation details, so this implementation may differ from the authors' original. We welcome feedback on implementation errors.

Abstract

Graph convolutional networks have significantly improved 3D human pose estimation by representing the human skeleton as an undirected graph. However, this representation fails to reflect the articulated characteristic of human skeletons as the hierarchical orders among the joints are not explicitly presented. In this paper, we propose to represent the human skeleton as a directed graph with the joints as nodes and bones as edges that are directed from parent joints to child joints. By so doing, the directions of edges can explicitly reflect the hierarchical relationships among the nodes. Based on this representation, we further propose a spatial-temporal conditional directed graph convolution to leverage varying non-local dependence for different poses by conditioning the graph topology on input poses. Altogether, we form a U-shaped network, named U-shaped Conditional Directed Graph Convolutional Network, for 3D human pose estimation from monocular videos. To evaluate the effectiveness of our method, we conducted extensive experiments on two challenging large-scale benchmarks: Human3.6M and MPI-INF-3DHP. Both quantitative and qualitative results show that our method achieves top performance. Also, ablation studies show that directed graphs can better exploit the hierarchy of articulated human skeletons than undirected graphs, and the conditional connections can yield adaptive graph topologies for different poses.
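The core idea in the abstract is to replace the usual symmetric skeleton adjacency with a directed one whose edges run from parent joints to child joints. A minimal sketch of that representation, assuming the common 17-joint Human3.6M joint order used by VideoPose3D (the `parents` array below is that convention, not something taken from this repo):

```python
import numpy as np

# Hypothetical sketch of the directed skeleton graph described above.
# Joint order assumed to follow the common 17-joint Human3.6M convention
# (as in VideoPose3D); parents[i] is the parent joint of joint i (-1 = root hip).
parents = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]

num_joints = len(parents)
A = np.zeros((num_joints, num_joints), dtype=np.float32)
for child, parent in enumerate(parents):
    if parent >= 0:
        A[parent, child] = 1.0  # directed edge: parent joint -> child joint

# Each of the 16 bones becomes one directed edge, so A is asymmetric,
# unlike the symmetric adjacency of an undirected skeleton graph.
print(A.sum())           # 16.0 (one directed edge per bone)
print((A != A.T).any())  # True: hierarchy is explicit in edge direction
```

This asymmetry is what lets the network distinguish a joint's parent from its children, which an undirected adjacency cannot express.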

Visualization

Dependencies

  • CUDA 11.1
  • Python 3.8.11
  • PyTorch 1.9.0+cu111

Dataset setup

Please download the dataset from Human3.6M website and refer to VideoPose3D to set up the Human3.6M dataset ('./dataset' directory).

${POSE_ROOT}/
|-- dataset
|   |-- data_3d_h36m.npz
|   |-- data_2d_h36m_cpn_ft_h36m_dbb.npz

Test the model

To test the pretrained model on Human3.6M:

python main.py --reload --previous_dir 'checkpoint/pretrained'

Here, we compare our U-CondDGCN with recent state-of-the-art methods on the Human3.6M dataset. The evaluation metric is Mean Per Joint Position Error (MPJPE) in mm.

Type  Model        MPJPE
TCN   VideoPose3D  46.8
ViT   PoseFormer   44.3
ViT   MHFormer     43.0
GCN   UGCN         45.6
GCN   U-CondDGCN   43.4
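The MPJPE numbers above are the mean Euclidean distance between predicted and ground-truth joint positions. A minimal sketch of the metric (the function name and dummy data are illustrative, not this repo's code):

```python
import numpy as np

# Minimal sketch of MPJPE: mean per-joint Euclidean distance, in millimetres.
def mpjpe(pred, gt):
    """pred, gt: arrays of shape (frames, joints, 3) in mm."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

gt = np.zeros((2, 17, 3))
pred = gt + np.array([3.0, 0.0, 4.0])  # every joint offset by a 3-4-5 triangle
print(mpjpe(pred, gt))  # 5.0
```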

Train the model

To train on Human3.6M:

python main.py --train

Citation

If you find our work useful in your research, please consider citing:

@misc{hu2021conditional,
      title={Conditional Directed Graph Convolution for 3D Human Pose Estimation}, 
      author={Wenbo Hu and Changgong Zhang and Fangneng Zhan and Lei Zhang and Tien-Tsin Wong},
      year={2021},
      eprint={2107.07797},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Our code extends the following repositories. We thank the authors for releasing their code.

License

This project is licensed under the terms of the MIT license.

u-conddgcn's People

Contributors

tamasino52


u-conddgcn's Issues

Node updating

x = torch.stack([torch.matmul(e[:, 0], self.A[0, 1:]), x[:, 1], torch.matmul(e[:, 2], self.A[2, 1:])], dim=1)

Hi,
Is there a bug in the incoming-edge term torch.matmul(e[:, 0], self.A[0, 1:])?
Edges (0, 1), (0, 7), and (0, 4) all originate from the hip joint; using self.A[0, 1:] omits the first row of A.
From the incoming-edge point of view, vertices 1, 4, and 7 should receive the edge from vertex 0.
The suggestion is to use torch.matmul(e[:, 0], B) with
B = [0 I],
where B is 16 x 17, I is the 16 x 16 identity, and 0 is a 16 x 1 zero column.
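The fix proposed in this issue can be sketched as follows. This is an assumption about the intended shapes, not the repo's actual code; the dummy edge-feature tensor only stands in for e[:, 0]:

```python
import torch

# Sketch of the proposed fix: build B = [0 I] so the 16 incoming edges map
# onto vertices 1..16, while the root vertex 0 receives no incoming edge.
num_edges, num_nodes = 16, 17
B = torch.cat([torch.zeros(num_edges, 1), torch.eye(num_edges)], dim=1)  # 16 x 17

# Dummy stand-in for the incoming-edge features e[:, 0]: (batch, channels, 16).
e_in = torch.randn(2, 4, num_edges)
x_in = torch.matmul(e_in, B)  # (batch, channels, 17)

print(x_in.shape)                          # torch.Size([2, 4, 17])
print(torch.all(x_in[..., 0] == 0).item())  # True: root gets no incoming edge
```

Unlike self.A[0, 1:], this keeps all 17 node slots while still routing each edge feature to exactly one child vertex.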

Reproduced result doesn't match

I'm getting the result below after training and testing.

===Action=== ==MPJPE===
Directions 116.01
Discussion 120.08
Eating 151.63
Greeting 120.19
Phoning 143.64
Photo 149.71
Posing 113.48
Purchases 136.89
Sitting 191.45
SittingDown 212.96
Smoking 135.34
Waiting 128.29
WalkDog 133.10
Walking 106.61
WalkTogether 110.99
Average 138.02
mpjpe: 138.02

Could you please specify the right training procedure? Do I need to pass any additional arguments?
