Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction

Official implementation of Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction (CVPR 2022 paper)

[PDF] [Supp] [Demo]

Authors

  1. Tiezheng Ma, School of Computer Science and Engineering, South China University of Technology, China
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China
  3. Chengjiang Long, Meta Reality Labs, USA
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China

Abstract

    This paper presents a high-quality human motion prediction method that accurately predicts future human poses given observed ones. Our method is mainly based on the observation that a good initial guess of the future pose sequence, such as the mean of future poses, is very helpful to improve the forecasting accuracy. This motivates us to design a novel two-stage prediction strategy, including an init-prediction network that just computes a good initial guess and a formal-prediction network that takes both the historical and initial poses to predict the target pose sequence. We extend this idea further and design a multi-stage prediction framework with each stage predicting initial guess for the next stage, which rewards us with significant performance gain. To fulfill the prediction task at each stage, we propose a network comprising Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). Sequentially executing the two networks can extract spatiotemporal features over the global receptive field of the whole pose sequence effectively. All the above design choices cooperating together make our method outperform previous approaches by a large margin (6%-7% on Human3.6M, 5%-10% on CMU-MoCap, 13%-16% on 3DPW).
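The abstract names the mean of future poses as one example of a good initial guess. The sketch below is illustrative only and is not taken from the PGBIG code: it assumes each frame is a flat list of coordinates, and the function name `mean_pose_guess` is hypothetical. Because the future is known only during training, a guess like this could plausibly serve as a training target for the init-prediction network; that reading is an assumption.

```python
# Illustrative sketch, not part of the official implementation.
# Assumption: a pose sequence is a list of T frames, each a flat list of D coordinates.

def mean_pose_guess(future_poses):
    """Return a T-frame sequence in which every frame is the mean future pose."""
    num_frames = len(future_poses)
    dim = len(future_poses[0])
    # Average each coordinate over all future frames.
    mean_pose = [
        sum(frame[d] for frame in future_poses) / num_frames
        for d in range(dim)
    ]
    # Repeat the mean pose so the guess has the same length as the target.
    return [list(mean_pose) for _ in range(num_frames)]


# Toy example with 3 frames and 2 coordinates per frame.
future = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
guess = mean_pose_guess(future)
```

In the multi-stage framework described above, each stage would refine such a guess further before handing it to the next stage.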

Overview

PGBIG

Dependencies

  • PyTorch 1.8.0+cu11
  • Python 3.7
  • NVIDIA RTX 2060

Datasets

Human3.6M in exponential-map format can be downloaded from here.

CMU-MoCap was obtained from the repo of the ConvSeq2Seq paper.

3DPW can be downloaded from its official website.

Train

  • Train on Human3.6M:

python main_h36m.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 66 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

  • Train on CMU-MoCap:

python main_cmu_3d.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 75 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

  • Train on 3DPW:

--data_dir [dataset path] --kernel_size 10 --dct_n 40 --input_n 10 --output_n 30 --skip_rate 1 --batch_size 32 --test_batch_size 32 --in_features 69 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

Note:

  • kernel_size: the length of the input sequence that is used.

  • d_model: the latent code dimension of a joint.

  • test_sample_num: the number of samples drawn for the test set; it can be set to 8, 256, or -1 (all). For example, a value of 8 means that 8 samples are drawn for each action to form the test set.
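The three training commands above share a pattern: in each, dct_n equals input_n + output_n, and in_features is divisible by 3. The sketch below only checks that arithmetic on the values copied from the commands. Reading in_features as 3-D coordinates per joint is an assumption, and the helper name `summarize` is hypothetical.

```python
# Values copied from the three training commands above.
CONFIGS = {
    "Human3.6M": {"input_n": 10, "output_n": 25, "dct_n": 35, "in_features": 66},
    "CMU-MoCap": {"input_n": 10, "output_n": 25, "dct_n": 35, "in_features": 75},
    "3DPW": {"input_n": 10, "output_n": 30, "dct_n": 40, "in_features": 69},
}


def summarize(cfg):
    """Derive the total sequence length and an assumed joint count."""
    seq_len = cfg["input_n"] + cfg["output_n"]
    # Assumption: each joint contributes 3 coordinates (x, y, z).
    joints, remainder = divmod(cfg["in_features"], 3)
    return {
        "seq_len": seq_len,
        "dct_matches_seq_len": cfg["dct_n"] == seq_len,
        "joints": joints,
        "remainder": remainder,
    }


for name, cfg in CONFIGS.items():
    print(name, summarize(cfg))
```

If you change input_n or output_n, keeping dct_n in step with their sum would mirror the commands shown here; whether the code requires this is not stated in this README.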

After training, the checkpoint is saved in ./checkpoint/.

Test

Add --is_eval after the above training commands.

The test result will be saved in ./checkpoint/.
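As a minimal sketch of the step above, the snippet below turns the Human3.6M training command into an evaluation command by appending --is_eval. The helper name `to_eval_command` and the data path `/path/to/h36m` are hypothetical placeholders; every flag is copied from the training command shown earlier.

```python
import shlex


def to_eval_command(train_command):
    """Split a training command into arguments and append --is_eval once."""
    args = shlex.split(train_command)
    if "--is_eval" not in args:
        args.append("--is_eval")
    return args


# Training command from the Train section; the data path is a placeholder.
train = (
    "python main_h36m.py --data_dir /path/to/h36m --kernel_size 10 --dct_n 35 "
    "--input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 "
    "--test_batch_size 32 --in_features 66 --cuda_idx cuda:0 --d_model 16 "
    "--lr_now 0.005 --epoch 50 --test_sample_num -1"
)
eval_args = to_eval_command(train)

# Print a shell-ready command line; shlex.quote keeps each argument safe.
print(" ".join(shlex.quote(arg) for arg in eval_args))
```

The resulting argument list can be passed to `subprocess.run(eval_args, check=True)` from the repository root to launch the evaluation.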

Citation

If you think our work is helpful to you, please cite our paper.

Ma T, Nie Y, Long C, et al. Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 6437-6446.

Acknowledgments

Our code is based on HisRep and LearnTrajDep.

Licence

MIT
