zhangboshen / A2J
Code for the paper "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image", ICCV 2019.
License: MIT License
The output of running python itop_side.py is:
406it [00:27, 15.02it/s]
('Accuracy:', 0)
('joint_', 0, 'Head', ', accuracy: ', 0)
('joint_', 1, 'Neck', ', accuracy: ', 0)
('joint_', 2, 'RShoulder', ', accuracy: ', 0)
('joint_', 3, 'LShoulder', ', accuracy: ', 0)
('joint_', 4, 'RElbow', ', accuracy: ', 0)
('joint_', 5, 'LElbow', ', accuracy: ', 0)
('joint_', 6, 'RHand', ', accuracy: ', 0)
('joint_', 7, 'LHand', ', accuracy: ', 0)
('joint_', 8, 'Torso', ', accuracy: ', 0)
('joint_', 9, 'RHip', ', accuracy: ', 0)
('joint_', 10, 'LHip', ', accuracy: ', 0)
('joint_', 11, 'RKnee', ', accuracy: ', 0)
('joint_', 12, 'LKnee', ', accuracy: ', 0)
('joint_', 13, 'RFoot', ', accuracy: ', 0)
('joint_', 14, 'LFoot', ', accuracy: ', 0)
I couldn't get the images to show in a matplotlib plot.
Does anyone else have this issue?
May I ask whether this can run in real time? For use in a real scenario, is an additional detection algorithm needed to compute the bounding box? If the K2HPD bounding boxes are loaded directly, would the results be acceptable?
There are no .h5 files for converting the images into .mat format! Can I have them?
Line 13 in data_preprosess.py: /dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_depth_map.h5
Line 14 in data_preprosess.py: "/dataspace/zhangboshen/ITOP_LSTM/ITOP_NewBaseline/data/top_train/ITOP_top_train_labels.h5"
These files are used for converting the PNG or JPG images to .mat files.
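For anyone else stuck here, below is a minimal sketch of such a conversion, assuming the publicly documented ITOP .h5 layout (a 'data' dataset of depth maps plus an 'is_valid' flag in the labels file); the paths and output naming are placeholders for illustration, not the repo's actual script.

# Hypothetical sketch: convert ITOP .h5 depth maps to per-frame .mat files.
# Assumes the public ITOP layout ('data', 'is_valid'); paths are placeholders.
import h5py
import numpy as np
import scipy.io as scio

depth_h5 = h5py.File('ITOP_top_train_depth_map.h5', 'r')  # placeholder path
label_h5 = h5py.File('ITOP_top_train_labels.h5', 'r')     # placeholder path

depth_maps = depth_h5['data']        # (N, 240, 320) depth maps
is_valid = label_h5['is_valid'][:]   # 1 where the frame has usable labels

for i, depth in enumerate(depth_maps):
    if not is_valid[i]:
        continue
    # One .mat per frame, keyed 'img' to match how the loaders read them.
    scio.savemat('%d.mat' % (i + 1), {'img': np.asarray(depth, np.float32)})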
Thanks for sharing your project. Is there a demo script you can share to test it live with a depth camera?
Hi, thank you for your great work.
I'm intending to train this model on my own dataset. Will you release the training code?
Thank you very much!
Hello!
Your work is very interesting, but I have a question: how do you make the ground truth for the informative anchor proposal branch and the depth estimation branch?
@zhangboshen, great, impressive work! Thanks for sharing!
In your readme.md, you mentioned the following,
"After downloaded these datasets, you can follow the code from data folder (data_preprosess.py) to convert ICVL, NYU, ITOP, and K2HPD images to .mat files."
But I could not find it. Does that file really exist, and is it still needed?
Hello, thanks for your interesting work! I wonder how the mean/std files in the data folder are computed, i.e., are they statistics computed on the training set? Looking forward to your reply!
Thanks for sharing the testing code. I'm trying to reproduce the training code and would like to know the implementation details for data augmentation (see the sketch after this list):
- Random in-plane rotation: what are the parameters used?
- Random scaling for both the in-plane and depth dimensions: is each dimension scaled independently, and what are the parameters used?
- Random Gaussian noise is added with probability 0.5: which dimensions is the noise added to, and what are the parameters used?
- For each image in the training set, is data augmentation performed 3 times (rotation, scaling, adding noise)?
- Does data augmentation increase the number of samples in the training set (i.e., are both original and augmented images used)?
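To make the questions concrete, here is a minimal sketch of the kind of augmentation being asked about. Every parameter below (rotation range, scale range, noise sigma) is a placeholder for illustration, not the authors' values.

# Hypothetical augmentation sketch: in-plane rotation, independent scaling,
# and Gaussian noise with probability 0.5. All ranges are placeholders.
import cv2
import numpy as np

def augment(depth, joints_uv, rot_range=30.0, scale_range=(0.9, 1.1), noise_sigma=5.0):
    h, w = depth.shape
    angle = np.random.uniform(-rot_range, rot_range)
    scale = np.random.uniform(*scale_range)

    # In-plane rotation (and scaling) around the image center.
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, scale)
    depth = cv2.warpAffine(depth, M, (w, h), flags=cv2.INTER_NEAREST)

    # Apply the same 2D transform to the in-plane joint coordinates (K, 2).
    ones = np.ones((joints_uv.shape[0], 1), joints_uv.dtype)
    joints_uv = np.hstack([joints_uv, ones]) @ M.T

    # Depth values scaled independently of the in-plane scale
    # (joint depths would need the same factor; omitted in this sketch).
    depth = depth * np.random.uniform(*scale_range)

    # Gaussian noise added with probability 0.5.
    if np.random.rand() < 0.5:
        depth = depth + np.random.normal(0.0, noise_sigma, depth.shape)
    return depth, joints_uv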
Hi, @zhangboshen! Thank you for this work!
When I go through src_train/nyu.py, I notice there is a parameter called RandomOffsetDepth (line 153 in commit 5848571). What is it for?
Hello! Thanks for your code!
I have a question: how is centre_pixel (icvl_center_test.mat) computed for the ICVL dataset? I tried averaging all 16 joints but failed to generate the same data as yours. Could you provide some details?
Best wishes!
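For reference, here is a sketch of the averaging the question describes, in case someone can spot what differs from the released centers; it reproduces the questioner's attempt, not the repo's actual procedure.

# Sketch of the attempted center computation: mean of the 16 ICVL joints
# per frame. This mirrors the question's approach, not the repo's method.
import numpy as np

def naive_center(joints_xyz):
    """joints_xyz: (16, 3) ICVL joints for one frame."""
    return joints_xyz.mean(axis=0)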
(venv) D:\Hand\A2J\src>python hands2017.py
1264it [03:38, 9.41it/s]Traceback (most recent call last):
  File "hands2017.py", line 220, in <module>
    main()
  File "hands2017.py", line 161, in main
    for i, (img, label) in tqdm(enumerate(test_dataloaders)):
  File "D:\Hand\venv\lib\site-packages\tqdm\std.py", line 1087, in __iter__
    for obj in iterable:
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 568, in __next__
    return self._process_next_batch(batch)
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "D:\Hand\A2J\src\hands2017.py", line 133, in __getitem__
    depth = Image.open(self.ImgDir + 'image_D%08d' % (index+1) + '.png')
  File "D:\Hand\venv\lib\site-packages\PIL\Image.py", line 2766, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'D:/Hand/data/gray2/image_D00015173.png'
1264it [03:38, 5.79it/s]
Hello,
My appreciation for your great work! While trying to reproduce A2J by retraining on NYU, I am getting "inf" for the regression loss. My environment is:
Windows 10, CUDA 10.1, cuDNN 7.6.1, PyTorch 1.5.1.
I am using the same hyperparameters as in nyu.py. It seems that you fine-tune a pretrained ResNet-50 with none of its weights frozen, am I right? Your help is extremely appreciated!
Hello, thanks for your code. Can you provide the code for this challenge, or describe the method in detail?
Best
Weiguo Zhou
Thank you very much for your excellent code, but I would like to ask a few questions:
I really want to reproduce your excellent work completely. If I want to train the K2HPD model myself, can you describe the procedure in detail? I found the training code in the src_train folder, but I don't know whether it can be used to train on K2HPD. Thank you!
Can you provide visualization code for the MSRA/ICVL/NYU datasets? How do you draw the 3D hand joints?
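While waiting for an official script, a minimal matplotlib sketch like the one below can draw 3D joints; the BONES list is a placeholder, since each dataset has its own skeleton topology.

# Minimal 3D joint visualization sketch with matplotlib.
# BONES is illustrative only; substitute the real dataset skeleton.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: registers the 3d projection

BONES = [(0, 1), (1, 2), (2, 3)]  # placeholder parent-child pairs

def plot_joints_3d(joints_xyz, bones=BONES):
    """joints_xyz: (K, 3) array of joint positions."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(joints_xyz[:, 0], joints_xyz[:, 1], joints_xyz[:, 2], c='r')
    for a, b in bones:
        # zip turns the two endpoints into (xs, ys, zs) for ax.plot.
        ax.plot(*zip(joints_xyz[a], joints_xyz[b]), c='b')
    plt.show()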
Hello, thanks for your code. When I run icvl.py, I wonder what
''validIndex_train = np.load('/data/zhangboshen/CODE/Anchor_Pose_fpn/data/ICVL/validIndex.npy')
TrainImgFrames = len(validIndex_train)''
stands for; I didn't find validIndex.npy in the data files you provided. What should I take care of if I write the training code myself? I think validIndex.npy has something to do with it. Thank you.
Hello!
Your work is very interesting. When will the code be available online?
Thank you!!
I want to test your model on my videos. Do you have inference code so that I can try your work out? If not, how can I do it myself? Please guide me; I am a newbie, so I am having a hard time understanding how specific things work.
Hi, I used itop_side.py to predict keypoint positions and found that the positions in the variable pred_keypoints differ considerably from those in the labels, yet the final reported accuracy of the model is very high. Why?
I think the right spelling is "data_preprocess"; the following file names are misspelled:
/data/icvl/data_preprosess.mat
/data/itop_side/data_preprosess.py
/data/k2hpd/data_preprosess.py
/data/nyu/data_preprosess.mat
Hello, I am interested in your work. I wonder about the training time of your network; can you tell me about it? Thank you very much!
For each dataset, there are files called <dataset>_mean.npy and <dataset>_std.npy, for example icvl_mean.npy and icvl_std.npy. These appear to be the mean and standard deviation of something, and they are used in the preprocessing step for input images. Can anyone explain where these numbers come from and how they would affect using one of the pretrained models in a realtime situation?
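A plausible answer (unconfirmed by the authors) is that they are per-dataset statistics of the preprocessed training depth values. A sketch of how such numbers could be computed follows, with the per-frame 'img' .mat layout borrowed from the loaders in this repo.

# Hedged sketch: compute dataset-wide mean/std over training depth images,
# mirroring how icvl_mean.npy / icvl_std.npy are used at preprocessing time.
import numpy as np
import scipy.io as scio

def compute_mean_std(mat_paths):
    vals = []
    for path in mat_paths:
        depth = scio.loadmat(path)['img'].astype(np.float32)
        vals.append(depth.ravel())
    vals = np.concatenate(vals)
    return vals.mean(), vals.std()

# At test time the same numbers would normalize inputs: (img - mean) / std.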
Hello, can you provide the .mat file of the detection bounding boxes for the ITOP side and top training sets? Looking forward to your reply, thanks!
Thank you for sharing your project code. It is really helpful to understand your paper.
BTW, I succeeded in running the inference code src/nyu.py properly, but I got some errors as follows when I ran the training code src_train/nyu.py at commit 6ab7346.
Traceback (most recent call last):
  File "nyu.py", line 10, in <module>
    import model as model
  File "D:\A2J\src_train\model.py", line 3, in <module>
    import resnet
ModuleNotFoundError: No module named 'resnet'
I fixed this error by importing torchvision.models instead of resnet in model.py:
# import resnet
import torchvision.models as models
I also fixed line 150 to use torchvision.models according to the modification above:
# modelPreTrain50 = resnet.resnet50(pretrained=True)
modelPreTrain50 = models.resnet50(pretrained=True)
With all the modifications above, however, I still got another error in anchor.py, as follows.
Traceback (most recent call last):
  File "nyu.py", line 449, in <module>
    train()
  File "nyu.py", line 279, in train
    Cls_loss, Reg_loss = criterion(heads, label)
  File "D:\A2J\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\A2J\src_train\anchor.py", line 134, in forward
    reg = torch.unsqueeze(anchor,1) + regression #(w*h*A)*P*2
RuntimeError: The size of tensor a (1936) must match the size of tensor b (576) at non-singleton dimension 0
Could you please investigate the root cause of these errors? Did I miss something?
My environment: Windows (as in the paths above), Python with PyTorch.
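A hedged guess at the root cause: 1936 and 576 are both consistent with a (feature cells) x (anchors per cell) count under a stride-16 backbone with 16 anchors per cell; if so, the anchor grid and the regression head are being built from different crop sizes. The arithmetic, under those two assumptions:

# Sketch: anchor counts for two crop sizes, assuming a stride-16 backbone
# and 16 anchors per feature cell (both are assumptions, not verified facts).
STRIDE = 16
ANCHORS_PER_CELL = 16

for crop in (176, 96):
    cells = crop // STRIDE
    print(crop, cells * cells * ANCHORS_PER_CELL)
# 176 -> 11 * 11 * 16 = 1936
# 96  ->  6 *  6 * 16 =  576
# If these match the mismatched tensor sizes, check that the crop size used
# to generate the anchors agrees with the one fed to the network in nyu.py.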
I also found that this model performs very badly on my own pictures. I wonder whether it is because my own depth images were not standardized with the mean and std. Has anyone solved this problem? In theory, a model trained on ITOP should generalize to one's own depth maps.
Hey, good work!
depth = scio.loadmat(self.trainingImageDir + str(self.validIndex[index]+1) + '.mat')['img'] (line 141 in icvl.py)
Where are the .mat files referenced in this line? I couldn't find them in the code.
I am trying to use this model to identify my own pictures; is the bounding box (bndbox) needed?
Hi Zhang, thanks for your awesome work.
What does T_ji mean?
"For joint j, in-plane target T_ji denotes the 2D ground truth in pixel coordinates, transformed according to the cropped region."
Is T_ji a Gaussian heatmap or a circle-plate heatmap as described in G-RMI?
More details are welcome.
Many thanks.
Thanks for sharing your inference code. Can I get the training code?
I got the corrected hand center points from V2V (https://github.com/mks0601/V2V-PoseNet_RELEASE), but how does that data correspond to the files here? Which file does each line of 3D center points correspond to?
Hello, thanks a lot for your code. However, I can't find the file that generates the dataset's .mat files.
May I know how to calculate the center files of the datasets you are using? I would like to do the same to compute centers for the BigHand2.2M dataset, since they are not available; if you do have them, please let me know.
You report that you use precomputed centers from Gyeongsik et al., but when checking their values against yours, the x and y components differ (e.g., 58.026760101318 -32.749855041504 404.03219604492 is the first center according to them, while yours is 384.1734, 206.7794, 404.0322).
Hi @zhangboshen , I noticed that you precomputed the depth normal for human pose estimation tasks, while you use original depth maps for hand pose estimation. What's the motivation for this? Does original depth not work for human pose estimation? Thanks!
Hi, I would like to understand how the mean and std for the images are obtained. Would you be able to provide the code? Thanks.
I tried to use itop_side.pth to identify my own pictures, but the results are a little poor.
So I want to train this model on my own pictures; how can I do this?
Hello,
Is there any way to visualize the results like the images shown in your paper?
Thanks
In itop_top.py there is a reference to ITOP_top.pth, but that file is not in the repository. Can you help by providing it?
Hi @zhangboshen,
I'm trying to train the A2J model on the ITOP dataset. Since the bounding box file for the training set is not provided, could you please tell me how to obtain the bounding boxes? Many thanks!
Hi @zhangboshen, thanks for the good work!
- What is the meaning of the function 'shift'? (https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L26)
- What is the meaning of N, A, and P in N*(w*h*A)*P in https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L67?
- Why is the softmax applied to the classification head output instead of the anchors? (https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L121)
- What are the meaning and use of the function 'post_process'? (https://github.com/zhangboshen/A2J/blob/master/src_train/anchor.py#L44)
Thanks!
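For what it's worth, my reading of that code (a sketch under my own assumptions, not an authoritative answer) is that the softmax turns the per-joint classification responses into weights over anchors, and the predicted joint is the weighted sum of anchor positions plus regressed offsets:

# Sketch of anchor-weighted joint estimation as I understand anchor.py:
# N = batch size, w*h*A = number of anchors, P = number of joints.
import torch
import torch.nn.functional as F

def predict_joints(cls_logits, offsets, anchors):
    """cls_logits: (N, whA, P) anchor responses per joint.
    offsets:    (N, whA, P, 2) regressed in-plane offsets.
    anchors:    (whA, 2) anchor positions."""
    weights = F.softmax(cls_logits, dim=1)        # weights over the anchors
    pos = anchors[None, :, None, :] + offsets     # (N, whA, P, 2)
    return (weights[..., None] * pos).sum(dim=1)  # (N, P, 2)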
I appreciate your great work.
When I test my own depth data, I can't obtain the center coordinates and depth values of the hand bbox in advance. So when I try to remove "- center [index] [0] [2]" during training, the loss does not converge. Can you provide some help with adjusting the parameters so that the network converges? Thanks!
Hello, thanks for your code. I ran icvl.py, but the result shows an error of 105.21951; I wonder which step could be wrong to produce this result.
Thanks for sharing your inference code. I was wondering whether the previously mentioned issues are resolved now, so that you can share your training code as well. Thanks!
The self.thres variable (present in both the A2J_loss and post_process nn.Modules) is given a default value, yet never gets used. What is its purpose?
I appreciate your great work.
For my own human pose dataset, in order to train my own model:
For the above questions, would you please share the method or algorithm?
In src/K2hpd.py, line 19, there is
keypointsNumber = 19
but K2HPD's ground truth has only 15 keypoints?