
A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

Introduction

This is the official implementation for the paper, "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image", ICCV 2019.

In this paper, we propose a simple and effective approach, termed A2J, for 3D hand and human body pose estimation from a single depth image. Wide-ranging evaluations on 5 datasets demonstrate A2J's superiority.
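To make the anchor-to-joint idea concrete, here is a minimal, illustrative sketch of how dense anchor predictions can be aggregated into joint estimates. This is a simplified assumption of the mechanism described in the paper, not the official implementation; the array names and shapes are our own.

import numpy as np

def aggregate_joints(anchors, responses, offsets, depths):
    """Illustrative anchor-to-joint aggregation (simplified).

    anchors:   (A, 2)    in-plane positions of the A anchor points
    responses: (A, J)    informativeness of each anchor for each of the J joints
    offsets:   (A, J, 2) in-plane offset predicted from each anchor to each joint
    depths:    (A, J)    depth predicted from each anchor for each joint
    Returns a (J, 3) array of estimated (u, v, d) per joint.
    """
    # Softmax over the anchor dimension turns responses into aggregation weights.
    w = np.exp(responses - responses.max(axis=0, keepdims=True))
    w = w / w.sum(axis=0, keepdims=True)                                # (A, J)
    # In-plane estimate: weighted sum of (anchor position + predicted offset).
    uv = (w[..., None] * (anchors[:, None, :] + offsets)).sum(axis=0)   # (J, 2)
    # Depth estimate: weighted sum of per-anchor depth predictions.
    d = (w * depths).sum(axis=0)[:, None]                               # (J, 1)
    return np.concatenate([uv, d], axis=1)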

Please refer to our paper for more details: https://arxiv.org/abs/1908.09999.

[Figure: A2J pipeline overview]

Update (2021-9-28)

More details of A2J can be found in our slides (https://github.com/zhangboshen/A2J/blob/master/fig/A2J_Boshen_Zhang_public.pptx).

Update (2020-6-16)

We have uploaded A2J's prediction results in pixel coordinates (i.e., UVD format) for the NYU and ICVL datasets: https://github.com/zhangboshen/A2J/tree/master/result_nyu_icvl. The evaluation code at https://github.com/xinghaochen/awesome-hand-pose-estimation/tree/master/evaluation can be used for performance comparison among SoTA methods.

Update (2020-3-23)

We released our training code here.

If you find our work useful in your research or publication, please cite our work:

@inproceedings{A2J,
author = {Xiong, Fu and Zhang, Boshen and Xiao, Yang and Cao, Zhiguo and Yu, Taidong and Zhou, Joey Tianyi and Yuan, Junsong},
title = {A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2019}
}

Comparison with state-of-the-art methods

[Figures: comparison with state-of-the-art methods on hand and body pose benchmarks]

A2J achieves 2nd place in the HANDS2019 3D hand pose estimation challenge

Task 1: Depth-Based 3D Hand Pose Estimation

[Figure: HANDS2019 Task 1 result]

Task 2: Depth-Based 3D Hand Pose Estimation while Interacting with Objects

[Figure: HANDS2019 Task 2 result]

About our code

Dependencies

Our code is tested on Ubuntu 16.04 with an NVIDIA 1080Ti GPU. Both PyTorch 0.4.1 and PyTorch 1.2 work (PyTorch 1.0/1.1 should also work).

Code

First clone this repository:

git clone https://github.com/zhangboshen/A2J
  • The src folder contains the model definition, anchor, and test files for the NYU, ICVL, HANDS2017, ITOP, and K2HPD datasets.
  • The data folder contains the center point, bounding box, mean/std, and GT keypoint files for the 5 datasets.

Next you may download our pre-trained model files from:

The directory structure of this code should look like:

A2J
│   README.md
│   LICENSE.md  
│
└───src
│   │   ....py
└───data
│   │   hands2017
│   │   icvl
│   │   itop_side
│   │   itop_top
│   │   k2hpd
│   │   nyu
└───model
│   │   HANDS2017.pth
│   │   ICVL.pth
│   │   ITOP_side.pth
│   │   ITOP_top.pth
│   │   K2HPD.pth
│   │   NYU.pth

You may also have to download these datasets manually:

  • NYU Hand Pose Dataset [link]
  • ICVL Hand Pose Dataset [link]
  • HANDS2017 Hand Pose Dataset [link]
  • ITOP Body Pose Dataset [link]
  • K2HPD Body Pose Dataset [link]

After downloading these datasets, you can follow the code in the data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files.
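As a rough sketch of what such a conversion looks like (the directory layout and file naming below are placeholders; only the 'img' key is taken from what icvl.py loads, so treat the rest as assumptions):

import os
import numpy as np
import scipy.io as scio
from PIL import Image

def png_dir_to_mat(png_dir, mat_dir):
    """Save each depth PNG in png_dir as an individual .mat file in mat_dir."""
    os.makedirs(mat_dir, exist_ok=True)
    for idx, name in enumerate(sorted(os.listdir(png_dir)), start=1):
        depth = np.asarray(Image.open(os.path.join(png_dir, name)), dtype=np.float32)
        # The test scripts read the depth map back with scio.loadmat(...)['img'],
        # so the depth array is stored under the 'img' key here.
        scio.savemat(os.path.join(mat_dir, '%d.mat' % idx), {'img': depth})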

Finally, simply run DATASET_NAME.py in the src folder to test our model. For example, you can reproduce our HANDS2017 results by running:

python hands2017.py

There are some optional configurations you can adjust in the DATASET_NAME.py files.
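For instance, the data paths are plain module-level variables near the top of each script; pointing a script at your own data looks roughly like this (the path value is a placeholder; testingImageDir is the variable name reported by users in the issues below):

# In hands2017.py (illustrative; adjust to your own setup):
testingImageDir = '/path/to/your/hands2017/test/images/'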

Thanks to Gyeongsik Moon et al. for their nice work providing precomputed center files (https://github.com/mks0601/V2V-PoseNet_RELEASE) for the NYU, ICVL, HANDS2017, and ITOP datasets. This is really helpful to our work!

Qualitative Results

NYU hand pose dataset:

[Figure: qualitative results on the NYU hand pose dataset]

ITOP body pose dataset:

[Figure: qualitative results on the ITOP body pose dataset]


Issues

How to compute mean/std?

Hello, thanks for your interesting work! I wonder how the mean/std files in the data folder are computed, i.e., are they statistics over the training set? Looking forward to your reply!

data_preprosess.py is missing

@zhangboshen, great and impressive work! Thanks for sharing!

In your readme.md, you mentioned the following,

"After downloaded these datasets, you can follow the code from data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files."

But I could not find it. Do you really have that file and need it?

Where does mean/std deviation come from?

For each dataset, there are files called <dataset>_mean.npy and <dataset>_std.npy. For example, icvl_mean.npy and icvl_std.npy.

These appear to be the mean and standard deviation of something, and they are used in the preprocessing step for input images. Can anyone explain where these numbers come from and how they would affect using one of the pretrained models in a real-time setting?
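No official answer is recorded here, but one plausible reading (an assumption, not confirmed by the authors) is that they are scalar statistics computed once over the cropped training depth maps and then reused to normalize every input crop, at training and test time alike:

import numpy as np

def compute_depth_stats(train_crops):
    """Hypothetical origin of <dataset>_mean.npy / <dataset>_std.npy.

    train_crops: (N, H, W) array of cropped training depth maps.
    The returned scalars would be applied as (crop - mean) / std to every
    input crop, including live frames in a real-time setting.
    """
    return float(np.mean(train_crops)), float(np.std(train_crops))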

The predicted key points are poorly positioned

Hi, I used itop_side.py to predict the positions of the keypoints and found that the keypoint positions in the variable pred_keypoints differ considerably from those in the labels, yet the accuracy reported for the final model is very high. Why?

icvl.py

Hey, good work.
1. depth = scio.loadmat(self.trainingImageDir + str(self.validIndex[index]+1) + '.mat')['img'] (line 141 in icvl.py)
Where is the .mat file that should be supplied on this line? I couldn't find it in your code.

Your work seems good. The output below is what I get when I run itop_side.py and itop_top.py. Is it correct?

Output of python itop_side.py:
406it [00:27, 15.02it/s]
('Accuracy:', 0)
('joint_', 0, 'Head', ', accuracy: ', 0)
('joint_', 1, 'Neck', ', accuracy: ', 0)
('joint_', 2, 'RShoulder', ', accuracy: ', 0)
('joint_', 3, 'LShoulder', ', accuracy: ', 0)
('joint_', 4, 'RElbow', ', accuracy: ', 0)
('joint_', 5, 'LElbow', ', accuracy: ', 0)
('joint_', 6, 'RHand', ', accuracy: ', 0)
('joint_', 7, 'LHand', ', accuracy: ', 0)
('joint_', 8, 'Torso', ', accuracy: ', 0)
('joint_', 9, 'RHip', ', accuracy: ', 0)
('joint_', 10, 'LHip', ', accuracy: ', 0)
('joint_', 11, 'RKnee', ', accuracy: ', 0)
('joint_', 12, 'LKnee', ', accuracy: ', 0)
('joint_', 13, 'RFoot', ', accuracy: ', 0)
('joint_', 14, 'LFoot', ', accuracy: ', 0)
I couldn't get the output images displayed in a matplotlib plot. Has anyone else run into these issues?

HANDS2019 challenge code

Hello, thanks for your code. Can you provide the code for this challenge, or describe the method in detail?

Best
Weiguo Zhou

About bndbox

May I ask: can this run in real time? In a practical deployment, would we also need an extra detection algorithm to compute the bounding box? If we directly use the K2HPD bounding boxes, would the results still be good?

Will you release the training code?

Hi, thank you for your great work.
I'm intending to train this model on my own dataset. Will you release the training code?
Thank you very much!

Data augmentation problem

Hi, @zhangboshen ! Thank you for this work!
When I go through src_train/nyu.py, I notice there is a parameter called RandomOffsetDepth.

RandomOffsetDepth[np.where(RandomOffsetDepth < RandshiftDepth)] = 0

It seems that all values in it become 0, and it is never used after this line. Is it necessary to apply this augmentation?

for training

I really want to reproduce your excellent work completely, so I have the following question: if I want to train the K2HPD model myself, could you describe the procedure in detail? I found the training code in the src_train folder, but I don't know whether it can be used to train on K2HPD. Thank you!

training time

Hello, I am interested in your work and wonder about the training time of your network. Can you tell me about it? Thank you very much!

icvl code

Hello, thanks for your code. When I run icvl.py, I wonder what
validIndex_train = np.load('/data/zhangboshen/CODE/Anchor_Pose_fpn/data/ICVL/validIndex.npy')
TrainImgFrames = len(validIndex_train)
stands for; I didn't find validIndex.npy in the data files you provided. What should I take care of if I write the training code? I think validIndex.npy has something to do with it. Thank you.

How to generate center and bounding box for each depth image?

I appreciate your great work.

For my own human pose dataset, in order to train my own model:

  1. How to generate the center and bounding box for each depth image?
  2. How to generate the mean/standard deviation for each depth image?

For the above questions, would you please share the method or algorithm?
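For illustration only (this is not the authors' method; the repository's precomputed centers come from the V2V-PoseNet release credited above), a common heuristic is to take the centroid of the foreground depth pixels as the center and place a fixed-size box around it:

import numpy as np

def rough_center_and_bbox(depth, near=100.0, far=3000.0, box_size=288):
    """Heuristic center point and bounding box from a raw depth map (mm).

    Foreground pixels are those whose depth lies within (near, far);
    all thresholds here are placeholder values.
    """
    ys, xs = np.where((depth > near) & (depth < far))
    u, v = xs.mean(), ys.mean()          # in-plane center in pixels
    d = depth[ys, xs].mean()             # center depth in mm
    half = box_size // 2
    bbox = (int(u - half), int(v - half), int(u + half), int(v + half))
    return (u, v, d), bbox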

Demo code

Thanks for sharing your project. Is there a demo script you can share to test it live with a depth camera?

offset & code

Thank you very much for your excellent code, but I would like to ask a few questions:

  1. What is the purpose of introducing the offset?
  2. In the data folder, there are center point, bounding box, mean/standard deviation, and GT keypoint files. What are the specific functions of these files?

About the generalization capability of the model

I also found that the results of using this model on my own pictures were very bad. I was wondering whether it was because my own depth images had not been normalized with the mean and std. Has anyone solved this problem? In theory, a model trained on the ITOP dataset should generalize to one's own depth maps.

What does Tji mean?

Hi Zhang, thanks for your awesome work.

What does Tji mean?

For joint j, the in-plane target Tji denotes the 2D ground truth in pixel coordinates, transformed according to the cropped region.

Is Tji a Gaussian heatmap or a circle-plate heatmap as described in GRMI?

More details are welcome.

Many thanks.

Cannot test with my own pictures with hands2017.py

Hello! Your work is nice and I tried to use your code to test my own pictures, but I ran into some problems.

I downloaded the pretrained model and tried to run hands2017.py on my own pictures (15172 images, named "image_D00000001.png, image_D00000002.png ... image_D00015172.png").

All the pictures I supply are grayscale.

I only changed line 37 in hands2017.py, which is testingImageDir = "my own data path".

But I got the error below:

(venv) D:\Hand\A2J\src>python hands2017.py
1264it [03:38, 9.41it/s]Traceback (most recent call last):
  File "hands2017.py", line 220, in <module>
    main()
  File "hands2017.py", line 161, in main
    for i, (img, label) in tqdm(enumerate(test_dataloaders)):
  File "D:\Hand\venv\lib\site-packages\tqdm\std.py", line 1087, in __iter__
    for obj in iterable:
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 568, in __next__
    return self._process_next_batch(batch)
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\A2J\src\hands2017.py", line 133, in __getitem__
    depth = Image.open(self.ImgDir + 'image_D%08d' % (index+1) + '.png')
  File "D:\Hand\venv\lib\site-packages\PIL\Image.py", line 2766, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'D:/Hand/data/gray2/image_D00015173.png'

1264it [03:38, 5.79it/s]

Did I do anything wrong?

centre_pixel

Hello! Thanks for your code!
I have a question about how to compute centre_pixel (icvl_center_test.mat) for the ICVL dataset. I tried averaging the 16 joints per frame but failed to reproduce the same data as yours. Could you provide some details in this regard?
Best wishes!

the result of ICVL

Hello, thanks for your code. I ran icvl.py, but the result shows an error of 105.21951; I wonder which step could be wrong to produce this result.

Data augmentation

Thanks for sharing the testing code. I'm trying to reproduce the training code and would like to know the implementation details for data augmentation (see the sketch after this list):

  • Random in-plane rotation: what are the parameters used?
  • Random scaling for both the in-plane and depth dimensions: is each dimension scaled independently, and what are the parameters used?
  • Random Gaussian noise is also added with probability 0.5: which dimensions is the noise added to, and what are the parameters used?
  • For each image in the training set, is data augmentation performed 3 times (rotation, scaling, adding noise)?
  • Does data augmentation increase the number of samples in the training set (i.e., are both original and augmented images used)?
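For reference, here is a minimal sketch of the kind of augmentation these questions refer to; every parameter range below is a placeholder, not a value from the paper:

import numpy as np
import cv2

def augment(depth_crop, keypoints_uv):
    """Toy in-plane rotation + scaling + Gaussian noise for a depth crop.

    depth_crop:   (H, W) float32 cropped depth map
    keypoints_uv: (J, 2) keypoint pixel coordinates within the crop
    """
    h, w = depth_crop.shape
    angle = np.random.uniform(-30, 30)        # placeholder rotation range (degrees)
    scale = np.random.uniform(0.9, 1.1)       # placeholder in-plane scale range
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    depth_aug = cv2.warpAffine(depth_crop, M, (w, h), flags=cv2.INTER_NEAREST)
    # Apply the same affine transform to the keypoints.
    kps_aug = np.hstack([keypoints_uv, np.ones((len(keypoints_uv), 1))]) @ M.T
    if np.random.rand() < 0.5:                # noise applied with probability 0.5
        depth_aug = depth_aug + np.random.normal(0, 5.0, depth_aug.shape)  # placeholder sigma
    return depth_aug, kps_aug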

Result visualization

Hello,
Is there any way to visualize the results like the images shown in your paper?
Thanks

the meaning of the shape of tensor

Hi @zhangboshen, thanks for the good work!

May I ask what is the meaning of the function 'shift' https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L26?

What is the meaning of N, A, and P in N*(w*h*A)*P
in https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L67?

Why is the softmax applied to the classification head output instead of to the anchors? (https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L121)

What are the meaning and use of the function 'post_process'? https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L44

Thanks!

Center file computation of your paper

May I know how you calculate the center files of the datasets you are using? I would like to apply the same process to the BigHand2.2M dataset since its center file is not available. If you do have it, please let me know.
You report that you use precomputed centers from Gyeongsik et al., but when checking their values against yours, the x and y components are different (e.g., 58.026760101318 -32.749855041504 404.03219604492 is the first center according to them, while yours is 384.1734, 206.7794, and 404.0322).

Can you provide me .h5 files?

The .h5 files needed to convert the images into .mat format are missing! Can I have them?
Line 13 in data_preprosess.py: (/dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_depth_map.h5)
Line 14 in data_preprosess.py: ("/dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_labels.h5")
These files are used for converting the PNG or JPG images to .mat files.

Problem while retraining A2J on NYU

Hello,
My appreciation for your great work! While trying to reproduce A2J by retraining on NYU, I am getting "inf" for the regression loss. My environment is:
Windows 10, CUDA 10.1, cuDNN 7.6.1, PyTorch 1.5.1.
I am using the same hyperparameters as in nyu.py. It seems that you use a pretrained ResNet-50 in fine-tuning mode with none of its weights frozen, am I right? Your help is extremely appreciated!

Can't execute the training code

Thank you for sharing your project code. It is really helpful to understand your paper.
BTW, I succeeded in running the inference code src/nyu.py properly
but I got some errors as follows when I ran the training code src_train/nyu.py in the commit 6ab7346.

Traceback (most recent call last):
  File "nyu.py", line 10, in <module>
    import model as model
  File "D:\A2J\src_train\model.py", line 3, in <module>
    import resnet
ModuleNotFoundError: No module named 'resnet'

I fixed this error by importing torchvision.models instead of resnet in model.py.

# import resnet
import torchvision.models as models

I also changed line 150 to use torchvision.models, in line with the modification above.

        # modelPreTrain50 = resnet.resnet50(pretrained=True)
        modelPreTrain50 = models.resnet50(pretrained=True)

With all the modifications above, however, I still got another error in anchor.py, as follows.

Traceback (most recent call last):
  File "nyu.py", line 449, in <module>
    train()
  File "nyu.py", line 279, in train
    Cls_loss, Reg_loss = criterion(heads, label)
  File "D:\A2J\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\A2J\src_train\anchor.py", line 134, in forward
    reg = torch.unsqueeze(anchor,1) + regression #(w*h*A)*P*2
RuntimeError: The size of tensor a (1936) must match the size of tensor b (576) at non-singleton dimension 0

Could you please investigate the root cause of these errors? Did I miss something?

My environment is:
Operating system

  • Microsoft Windows 10

Programming language

  • Python 3.7.6
  • PyTorch 1.4.0

Hello, a question about the K2HPD data

In src/K2hpd.py, line 19:
keypointsNumber = 19
but K2HPD's ground-truth keypoints only number 15, don't they?

How to obtain the center point coordinates and depth values during NYU inference

I appreciate your great work.

When I test my own depth data, I cannot obtain the center coordinates and depth values of the hand bbox in advance. Therefore, when I try to remove "- center[index][0][2]" during training, the loss cannot converge. Can you provide some help with adjusting the parameters so that the network can converge? Thanks!

spell mistake "data_preprosess.mat"

I think the right spelling is "data_preprocess.mat".

Files with the misspelled name:
/data/icvl/data_preprosess.mat
/data/itop_side/data_preprosess.py
/data/k2hpd/data_preprosess.py
/data/nyu/data_preprosess.mat

self.thres is unused in anchors.py

The self.thres variable (present in both A2J_loss and post_process nn.Modules) is given a default value, yet never gets used. What is its purpose?
