
A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

Introduction

This is the official implementation for the paper, "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image", ICCV 2019.

In this paper, we propose a simple and effective approach, termed A2J, for 3D hand and human body pose estimation from a single depth image. Wide-ranging evaluations on 5 datasets demonstrate A2J's superiority.
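To make the anchor-to-joint idea concrete, here is a minimal, illustrative sketch of how dense anchor predictions can be aggregated into joint estimates. This is a simplified assumption of the mechanism described in the paper, not the official implementation; the array names and shapes are our own.

import numpy as np

def aggregate_joints(anchors, responses, offsets, depths):
    """Illustrative anchor-to-joint aggregation (simplified).

    anchors:   (A, 2)    in-plane positions of the A anchor points
    responses: (A, J)    informativeness of each anchor for each of the J joints
    offsets:   (A, J, 2) in-plane offset predicted from each anchor to each joint
    depths:    (A, J)    depth predicted from each anchor for each joint
    Returns a (J, 3) array of estimated (u, v, d) per joint.
    """
    # Softmax over the anchor dimension turns responses into aggregation weights.
    w = np.exp(responses - responses.max(axis=0, keepdims=True))
    w = w / w.sum(axis=0, keepdims=True)                                # (A, J)
    # In-plane estimate: weighted sum of (anchor position + predicted offset).
    uv = (w[..., None] * (anchors[:, None, :] + offsets)).sum(axis=0)   # (J, 2)
    # Depth estimate: weighted sum of per-anchor depth predictions.
    d = (w * depths).sum(axis=0)[:, None]                               # (J, 1)
    return np.concatenate([uv, d], axis=1)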

Please refer to our paper for more details: https://arxiv.org/abs/1908.09999.

[Figure: A2J pipeline overview]

Update (2021-9-28)

More details of A2J can be found in our slides (https://github.com/zhangboshen/A2J/blob/master/fig/A2J_Boshen_Zhang_public.pptx).

Update (2020-6-16)

We have uploaded A2J's prediction results in pixel coordinates (i.e., UVD format) for the NYU and ICVL datasets: https://github.com/zhangboshen/A2J/tree/master/result_nyu_icvl. The evaluation code at https://github.com/xinghaochen/awesome-hand-pose-estimation/tree/master/evaluation can be used for performance comparison among SoTA methods.

Update (2020-3-23)

We released our training code here.

If you find our work useful in your research or publication, please cite our work:

@inproceedings{A2J,
author = {Xiong, Fu and Zhang, Boshen and Xiao, Yang and Cao, Zhiguo and Yu, Taidong and Zhou, Joey Tianyi and Yuan, Junsong},
title = {A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2019}
}

Comparison with state-of-the-art methods

[Figures: comparison with state-of-the-art methods on hand and body pose benchmarks]

A2J achieves 2nd place in the HANDS2019 3D hand pose estimation challenge

Task 1: Depth-Based 3D Hand Pose Estimation

[Figure: HANDS2019 Task 1 result]

Task 2: Depth-Based 3D Hand Pose Estimation while Interacting with Objects

[Figure: HANDS2019 Task 2 result]

About our code

Dependencies

Our code is tested on Ubuntu 16.04 with an NVIDIA 1080Ti GPU. Both PyTorch 0.4.1 and PyTorch 1.2 work (PyTorch 1.0/1.1 should also work).

Code

First clone this repository:

git clone https://github.com/zhangboshen/A2J
  • The src folder contains the model definition, anchor, and test files for the NYU, ICVL, HANDS2017, ITOP, and K2HPD datasets.
  • The data folder contains the center point, bounding box, mean/std, and GT keypoint files for the 5 datasets.

Next you may download our pre-trained model files from:

The directory structure of this code should look like:

A2J
│   README.md
│   LICENSE.md  
│
└───src
│   │   ....py
└───data
│   │   hands2017
│   │   icvl
│   │   itop_side
│   │   itop_top
│   │   k2hpd
│   │   nyu
└───model
│   │   HANDS2017.pth
│   │   ICVL.pth
│   │   ITOP_side.pth
│   │   ITOP_top.pth
│   │   K2HPD.pth
│   │   NYU.pth

You may also have to download these datasets manually:

  • NYU Hand Pose Dataset [link]
  • ICVL Hand Pose Dataset [link]
  • HANDS2017 Hand Pose Dataset [link]
  • ITOP Body Pose Dataset [link]
  • K2HPD Body Pose Dataset [link]

After downloading these datasets, you can follow the code in the data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files.
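As a rough sketch of what such a conversion looks like (the directory layout and file naming below are placeholders; only the 'img' key is taken from what icvl.py loads, so treat the rest as assumptions):

import os
import numpy as np
import scipy.io as scio
from PIL import Image

def png_dir_to_mat(png_dir, mat_dir):
    """Save each depth PNG in png_dir as an individual .mat file in mat_dir."""
    os.makedirs(mat_dir, exist_ok=True)
    for idx, name in enumerate(sorted(os.listdir(png_dir)), start=1):
        depth = np.asarray(Image.open(os.path.join(png_dir, name)), dtype=np.float32)
        # The test scripts read the depth map back with scio.loadmat(...)['img'],
        # so the depth array is stored under the 'img' key here.
        scio.savemat(os.path.join(mat_dir, '%d.mat' % idx), {'img': depth})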

Finally, simply run DATASET_NAME.py in the src folder to test our model. For example, you can reproduce our HANDS2017 results by running:

python hands2017.py

There are some optional configurations you can adjust in the DATASET_NAME.py files.
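For instance, the data paths are plain module-level variables near the top of each script; pointing a script at your own data looks roughly like this (the path value is a placeholder; testingImageDir is the variable name reported by users in the issues below):

# In hands2017.py (illustrative; adjust to your own setup):
testingImageDir = '/path/to/your/hands2017/test/images/'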

Thanks to Gyeongsik Moon et al. for their nice work providing precomputed center files (https://github.com/mks0601/V2V-PoseNet_RELEASE) for the NYU, ICVL, HANDS2017, and ITOP datasets. This is really helpful to our work!

Qualitative Results

NYU hand pose dataset:

[Figure: qualitative results on the NYU hand pose dataset]

ITOP body pose dataset:

[Figure: qualitative results on the ITOP body pose dataset]


Issues

How to compute mean/std?

Hello, thanks for your interesting work! I wonder how the mean/std files in the data folder are computed, i.e., are they statistics over the training set? Looking forward to your reply!

data_preprosess.py is missing

@zhangboshen, great and impressive work! Thanks for sharing!

In your readme.md, you mentioned the following,

"After downloaded these datasets, you can follow the code from data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files."

But I could not find it. Do you really have that file and need it?

Where does mean/std deviation come from?

For each dataset, there are files called <dataset>_mean.npy and <dataset>_std.npy. For example, icvl_mean.npy and icvl_std.npy.

These appear to be the mean and standard deviation of something, and they are used in the preprocessing step for input images. Can anyone explain where these numbers come from and how they would affect using one of the pretrained models in a real-time setting?
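No official answer is recorded here, but one plausible reading (an assumption, not confirmed by the authors) is that they are scalar statistics computed once over the cropped training depth maps and then reused to normalize every input crop, at training and test time alike:

import numpy as np

def compute_depth_stats(train_crops):
    """Hypothetical origin of <dataset>_mean.npy / <dataset>_std.npy.

    train_crops: (N, H, W) array of cropped training depth maps.
    The returned scalars would be applied as (crop - mean) / std to every
    input crop, including live frames in a real-time setting.
    """
    return float(np.mean(train_crops)), float(np.std(train_crops))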

The predicted key points are poorly positioned

Hi, I used itop_side.py to predict the positions of the keypoints and found that the keypoint positions in the variable pred_keypoints differ considerably from those in the labels, yet the accuracy reported for the final model is very high. Why?

icvl.py

Hey, good work.
1. depth = scio.loadmat(self.trainingImageDir + str(self.validIndex[index]+1) + '.mat')['img'] (line 141 in icvl.py)
Where is the .mat file that should be supplied on this line? I couldn't find it in your code.

Your work seems good. The output below is what I get when I run itop_side.py and itop_top.py. Is it correct?

Output of python itop_side.py:
406it [00:27, 15.02it/s]
('Accuracy:', 0)
('joint_', 0, 'Head', ', accuracy: ', 0)
('joint_', 1, 'Neck', ', accuracy: ', 0)
('joint_', 2, 'RShoulder', ', accuracy: ', 0)
('joint_', 3, 'LShoulder', ', accuracy: ', 0)
('joint_', 4, 'RElbow', ', accuracy: ', 0)
('joint_', 5, 'LElbow', ', accuracy: ', 0)
('joint_', 6, 'RHand', ', accuracy: ', 0)
('joint_', 7, 'LHand', ', accuracy: ', 0)
('joint_', 8, 'Torso', ', accuracy: ', 0)
('joint_', 9, 'RHip', ', accuracy: ', 0)
('joint_', 10, 'LHip', ', accuracy: ', 0)
('joint_', 11, 'RKnee', ', accuracy: ', 0)
('joint_', 12, 'LKnee', ', accuracy: ', 0)
('joint_', 13, 'RFoot', ', accuracy: ', 0)
('joint_', 14, 'LFoot', ', accuracy: ', 0)
I couldn't get the output images displayed in a matplotlib plot. Has anyone else run into these issues?

HANDS2019 challenge code

Hello, thanks for your code. Can you provide the code for this challenge, or describe the method in detail?

Best
Weiguo Zhou

About bndbox

May I ask: can this run in real time? In a practical deployment, would we also need an extra detection algorithm to compute the bounding box? If we directly use the K2HPD bounding boxes, would the results still be good?

Will you release the training code?

Hi, thank you for your great work.
I'm intending to train this model on my own dataset. Will you release the training code?
Thank you very much!

Data augmentation problem

Hi, @zhangboshen ! Thank you for this work!
When I go through src_train/nyu.py, I notice there is a parameter called RandomOffsetDepth.

RandomOffsetDepth[np.where(RandomOffsetDepth < RandshiftDepth)] = 0

It seems that all values in it become 0, and it is never used after this line. Is it necessary to apply this augmentation?

for training

I really want to reproduce your excellent work completely, so I have the following question: if I want to train the K2HPD model myself, could you describe the procedure in detail? I found the training code in the src_train folder, but I don't know whether it can be used to train on K2HPD. Thank you!

training time

Hello, I am interested in your work and wonder about the training time of your network. Can you tell me about it? Thank you very much!

icvl code

Hello, thanks for your code. When I run icvl.py, I wonder what
validIndex_train = np.load('/data/zhangboshen/CODE/Anchor_Pose_fpn/data/ICVL/validIndex.npy')
TrainImgFrames = len(validIndex_train)
stands for; I didn't find validIndex.npy in the data files you provided. What should I take care of if I write the training code? I think validIndex.npy has something to do with it. Thank you.

How to generate center and bounding box for each depth image?

I appreciate your great work.

For my own human pose dataset, in order to train my own model:

  1. How to generate the center and bounding box for each depth image?
  2. How to generate the mean/standard deviation for each depth image?

For the above questions, would you please share the method or algorithm?
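For illustration only (this is not the authors' method; the repository's precomputed centers come from the V2V-PoseNet release credited above), a common heuristic is to take the centroid of the foreground depth pixels as the center and place a fixed-size box around it:

import numpy as np

def rough_center_and_bbox(depth, near=100.0, far=3000.0, box_size=288):
    """Heuristic center point and bounding box from a raw depth map (mm).

    Foreground pixels are those whose depth lies within (near, far);
    all thresholds here are placeholder values.
    """
    ys, xs = np.where((depth > near) & (depth < far))
    u, v = xs.mean(), ys.mean()          # in-plane center in pixels
    d = depth[ys, xs].mean()             # center depth in mm
    half = box_size // 2
    bbox = (int(u - half), int(v - half), int(u + half), int(v + half))
    return (u, v, d), bbox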

Demo code

Thanks for sharing your project. Is there a demo script you can share to test it live with a depth camera?

offset & code

Thank you very much for your excellent code, but I would like to ask a few questions:

  1. What is the purpose of introducing the offset?
  2. In the data folder, there are center point, bounding box, mean/standard deviation, and GT keypoint files. What are the specific functions of these files?

About the generalization capability of the model

I also found that the results of using this model on my own pictures were very bad. I was wondering whether it was because my own depth images had not been normalized with the mean and std. Has anyone solved this problem? In theory, a model trained on the ITOP dataset should generalize to one's own depth maps.

What does Tji mean?

Hi Zhang, thanks for your awesome work.

What does Tji mean?

For joint j, the in-plane target Tji denotes the 2D ground truth in pixel coordinates, transformed according to the cropped region.

Is Tji a Gaussian heatmap or a circle-plate heatmap as described in GRMI?

More details are welcome.

Many thanks.

Cannot test with my own pictures with hands2017.py

Hello! Your work is nice and I tried to use your code to test my own pictures, but I ran into some problems.

I downloaded the pretrained model and tried to run hands2017.py on my own pictures (15172 images, named "image_D00000001.png, image_D00000002.png ... image_D00015172.png").

All the pictures I supply are grayscale.

I only changed line 37 in hands2017.py, which is testingImageDir = "my own data path".

But I got the error below:

(venv) D:\Hand\A2J\src>python hands2017.py
1264it [03:38, 9.41it/s]Traceback (most recent call last):
  File "hands2017.py", line 220, in <module>
    main()
  File "hands2017.py", line 161, in main
    for i, (img, label) in tqdm(enumerate(test_dataloaders)):
  File "D:\Hand\venv\lib\site-packages\tqdm\std.py", line 1087, in __iter__
    for obj in iterable:
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 568, in __next__
    return self._process_next_batch(batch)
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\A2J\src\hands2017.py", line 133, in __getitem__
    depth = Image.open(self.ImgDir + 'image_D%08d' % (index+1) + '.png')
  File "D:\Hand\venv\lib\site-packages\PIL\Image.py", line 2766, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'D:/Hand/data/gray2/image_D00015173.png'

1264it [03:38, 5.79it/s]

Did I do anything wrong?

centre_pixel

Hello! Thanks for your code!
I have a question about how to compute centre_pixel (icvl_center_test.mat) for the ICVL dataset. I tried averaging the 16 joints per frame but failed to reproduce the same data as yours. Could you provide some details in this regard?
Best wishes!

the result of ICVL

Hello, thanks for your code. I ran icvl.py, but the result shows an error of 105.21951; I wonder which step could be wrong to produce this result.

Data augmentation

Thanks for sharing the testing code. I'm trying to reproduce the training code and would like to know the implementation details for data augmentation (see the sketch after this list):

  • Random in-plane rotation: what are the parameters used?
  • Random scaling for both the in-plane and depth dimensions: is each dimension scaled independently, and what are the parameters used?
  • Random Gaussian noise is also added with probability 0.5: which dimensions is the noise added to, and what are the parameters used?
  • For each image in the training set, is data augmentation performed 3 times (rotation, scaling, adding noise)?
  • Does data augmentation increase the number of samples in the training set (i.e., are both original and augmented images used)?
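For reference, here is a minimal sketch of the kind of augmentation these questions refer to; every parameter range below is a placeholder, not a value from the paper:

import numpy as np
import cv2

def augment(depth_crop, keypoints_uv):
    """Toy in-plane rotation + scaling + Gaussian noise for a depth crop.

    depth_crop:   (H, W) float32 cropped depth map
    keypoints_uv: (J, 2) keypoint pixel coordinates within the crop
    """
    h, w = depth_crop.shape
    angle = np.random.uniform(-30, 30)        # placeholder rotation range (degrees)
    scale = np.random.uniform(0.9, 1.1)       # placeholder in-plane scale range
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    depth_aug = cv2.warpAffine(depth_crop, M, (w, h), flags=cv2.INTER_NEAREST)
    # Apply the same affine transform to the keypoints.
    kps_aug = np.hstack([keypoints_uv, np.ones((len(keypoints_uv), 1))]) @ M.T
    if np.random.rand() < 0.5:                # noise applied with probability 0.5
        depth_aug = depth_aug + np.random.normal(0, 5.0, depth_aug.shape)  # placeholder sigma
    return depth_aug, kps_aug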

Result visualization

Hello,
Is there any way to visualize the results like the images shown in your paper?
Thanks

the meaning of the shape of tensor

Hi @zhangboshen, thanks for the good work!

May I ask what is the meaning of the function 'shift' https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L26?

What is the meaning of N, A, and P in N*(w*h*A)*P
in https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L67?

Why is the softmax applied to the classification head output instead of to the anchors? (https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L121)

What are the meaning and use of the function 'post_process'? https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L44

Thanks!

Center file computation of your paper

May I know how you calculate the center files of the datasets you are using? I would like to apply the same process to the BigHand2.2M dataset since its center file is not available. If you do have it, please let me know.
You report that you use precomputed centers from Gyeongsik et al., but when checking their values against yours, the x and y components are different (e.g., 58.026760101318 -32.749855041504 404.03219604492 is the first center according to them, while yours is 384.1734, 206.7794, and 404.0322).

Can you provide me .h5 files?

The .h5 files needed to convert the images into .mat format are missing! Can I have them?
Line 13 in data_preprosess.py: (/dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_depth_map.h5)
Line 14 in data_preprosess.py: ("/dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_labels.h5")
These files are used for converting the PNG or JPG images to .mat files.

Problem while retraining A2J on NYU

Hello,
My appreciation for your great work! While trying to reproduce A2J by retraining on NYU, I am getting "inf" for the regression loss. My environment is:
Windows 10, CUDA 10.1, cuDNN 7.6.1, PyTorch 1.5.1.
I am using the same hyperparameters as in nyu.py. It seems that you use a pretrained ResNet-50 in fine-tuning mode with none of its weights frozen, am I right? Your help is extremely appreciated!

Can't execute the training code

Thank you for sharing your project code. It is really helpful to understand your paper.
BTW, I succeeded in running the inference code src/nyu.py properly
but I got some errors as follows when I ran the training code src_train/nyu.py in the commit 6ab7346.

Traceback (most recent call last):
  File "nyu.py", line 10, in <module>
    import model as model
  File "D:\A2J\src_train\model.py", line 3, in <module>
    import resnet
ModuleNotFoundError: No module named 'resnet'

I fixed this error by importing torchvision.models instead of resnet in model.py.

# import resnet
import torchvision.models as models

I also changed line 150 to use torchvision.models, in line with the modification above.

        # modelPreTrain50 = resnet.resnet50(pretrained=True)
        modelPreTrain50 = models.resnet50(pretrained=True)

With all the modifications above, however, I still got another error in anchor.py, as follows.

Traceback (most recent call last):
  File "nyu.py", line 449, in <module>
    train()
  File "nyu.py", line 279, in train
    Cls_loss, Reg_loss = criterion(heads, label)
  File "D:\A2J\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\A2J\src_train\anchor.py", line 134, in forward
    reg = torch.unsqueeze(anchor,1) + regression #(w*h*A)*P*2
RuntimeError: The size of tensor a (1936) must match the size of tensor b (576) at non-singleton dimension 0

Could you please investigate the root cause of these errors? Did I miss something?

My environment is:
Operating system

  • Microsoft Windows 10

Programming language

  • Python 3.7.6
  • PyTorch 1.4.0

Hello, a question about the K2HPD data

In src/K2hpd.py, line 19:
keypointsNumber = 19
but K2HPD's ground-truth keypoints only number 15, don't they?

How to obtain the center point coordinates and depth values during NYU inference

I appreciate your great work.

When I test my own depth data, I cannot obtain the center coordinates and depth values of the hand bbox in advance. Therefore, when I try to remove "- center[index][0][2]" during training, the loss cannot converge. Can you provide some help with adjusting the parameters so that the network can converge? Thanks!

spell mistake "data_preprosess.mat"

I think the right spelling is "data_preprocess.mat".

Files with the misspelled name:
/data/icvl/data_preprosess.mat
/data/itop_side/data_preprosess.py
/data/k2hpd/data_preprosess.py
/data/nyu/data_preprosess.mat

self.thres is unused in anchors.py

The self.thres variable (present in both A2J_loss and post_process nn.Modules) is given a default value, yet never gets used. What is its purpose?
