Comments (12)
During training, the current best epoch is selected by [email protected] * BBOX on the validation set.
The inference code and the PCK-computation code are identical for the validation and test sets, so your log looks a little strange; in my experiments the validation PCK and test PCK are quite close.
I am not sure how you cropped the Panoptic dataset or how you split your training/dev/test sets, so I have now released the preprocessed data used in my experiments; you can download it from here to check whether it solves your problem. But please DO NOT duplicate it for any commercial purpose, and the copyright still belongs to Panoptic.
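For reference, here is a minimal sketch of how PCK with a bbox-normalized threshold is typically computed; the function name, array shapes, and normalization choice are illustrative assumptions, not taken from this repository's code:

```python
import numpy as np

def pck(pred, gt, bbox_size, thr=0.2):
    """Fraction of keypoints predicted within thr * bbox_size of the GT.

    pred, gt:   (N, 21, 2) arrays of keypoint (x, y) coordinates
    bbox_size:  (N,) array, e.g. the larger side of each hand's bbox
    """
    dist = np.linalg.norm(pred - gt, axis=-1)      # (N, 21) pixel distances
    correct = dist <= thr * bbox_size[:, None]     # threshold scaled per sample
    return correct.mean()
```

Because the same function runs on both the validation and test splits, any gap between the two numbers should come from the data, not from the metric.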
from nsrmhand.
Thank you for the reply. I tested your project on OneHand10K and got the same accuracy as the paper. However, on actual 1280*720 video the keypoints jitter severely and the confidence is low. I have already modified your model with an Hourglass backbone, added center-circle detection and other tricks, and obtained a more stable model. Regarding the limb mask trick, my thoughts are similar to the paper Multi-Scale Structure-Aware Network for Human Pose Estimation.
from nsrmhand.
Since our model is trained only on images of size 368 * 368, it may not work very well on high-resolution images. Thank you for adapting our model with Hourglass, and I hope you can share your results.
Besides, thank you for sharing this paper; I will read it soon. Actually, our limb mask idea originally comes from the Part Affinity Fields (PAF) of OpenPose. The idea of limb representation is very common in pose estimation, and there are many papers on it.
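The shared intuition behind limb masks and PAFs is to rasterize each bone as a region between its two keypoints rather than as isolated points. A minimal sketch of that rasterization (my own illustrative function, not the repo's LDM code):

```python
import numpy as np

def limb_mask(p1, p2, size, width):
    """Binary (size x size) mask of pixels within `width` of segment p1-p2."""
    ys, xs = np.mgrid[0:size, 0:size]
    pts = np.stack([xs, ys], axis=-1).astype(float)   # (H, W, 2) pixel coords
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    seg = p2 - p1
    seg_len2 = max(seg @ seg, 1e-8)                   # guard zero-length limbs
    # project each pixel onto the segment, clamping to the endpoints
    t = np.clip(((pts - p1) @ seg) / seg_len2, 0.0, 1.0)
    nearest = p1 + t[..., None] * seg
    dist = np.linalg.norm(pts - nearest, axis=-1)
    return (dist <= width).astype(np.float32)
```

PAF additionally stores the limb's unit direction vector at each of these pixels instead of a plain 0/1 value.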
from nsrmhand.
I am glad to share my code. Currently we are annotating data to train our model and modifying part of the finger detection. After the paper is published, I will discuss open-sourcing it with my mentor. Thank you again for sharing the project.
from nsrmhand.
I look forward to your paper and code, and I would really appreciate it if you could cite my paper. Thank you!
from nsrmhand.
Of course, this project helped me a lot.
from nsrmhand.
The hand is padded to a 2.2B size (2.2 times the tightest bounding box) in the Panoptic training data, but the model only gets good results on the Panoptic dataset; results are worse when testing on other image sets, such as OneHand10K or self-taken pictures. I think the padding size has a large impact.
Another question: if we train hand keypoints on a merged dataset of Panoptic and OneHand10K (or other hands with different padding sizes), can we get better results when testing hands with different padding sizes?
from nsrmhand.
Yes, you are correct: the padding size can have a large impact. The model I released was trained only on the preprocessed Panoptic dataset, so it may only work well with the fixed 2.2B bounding box. Besides, the Panoptic (P) dataset has a totally different distribution from OneHand10K (O): the background of P is just a lab, while the background of O is in the wild. So I think it is unfair to test a model trained on P against O.
For the second question, it may work well, as hands in O can occupy any fraction of the image area. For this, you may need to adjust the hyperparameters (the sigma of the LPM and the width of the LDM) to keep them consistent with the hand size.
By the way, the goal of this paper is just to improve performance algorithmically, not to build a general hand pose estimation system that works in all scenes :)
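If it helps, here is a minimal sketch of a fixed-ratio crop like the 2.2B convention discussed here: a square region centered on the hand, sized at 2.2 times the larger side of the tightest keypoint bounding box. The function name and return convention are mine, not from this repository:

```python
import numpy as np

def crop_box(keypoints, scale=2.2):
    """Square crop centered on the hand, sized `scale` times the
    tightest keypoint bounding box (its larger side).

    keypoints: (K, 2) array of (x, y); returns (x0, y0, side).
    """
    kps = np.asarray(keypoints, float)
    x_min, y_min = kps.min(axis=0)
    x_max, y_max = kps.max(axis=0)
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    side = scale * max(x_max - x_min, y_max - y_min)
    return cx - side / 2, cy - side / 2, side
```

A model trained only on crops produced this way sees the hand at a nearly constant relative scale, which is exactly why it degrades when tested on images cropped with a different padding ratio.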
from nsrmhand.
We can see that invisible keypoints are unannotated in dataset O (their values are -1 in the label). If we want to train on O, should we modify the HandDataset_LPM class in hand_lpm.py? And are keypoint ground-truth values still written as -1 in labels.json?
from nsrmhand.
For this issue, you can make the heatmaps of invisible keypoints all zeros during training. When evaluating PCK, if a label is -1, just ignore it.
So you may need to modify the data loader function.
from nsrmhand.
Can you share the data loader function, or the other functions you modified, for training on the O dataset?
There are no tricks in this function, and I already said very clearly that you just need to set the heatmaps of invisible keypoints to zero. That is really simple to code yourself, just two lines ... I believe you can do it within a few seconds :)
For the code in the data loader, just:

```python
def gen_label_heatmap(self, label):
    label = torch.Tensor(label)                                 # (21, 2)
    grid = torch.zeros((self.label_size, self.label_size, 2))   # (46, 46, 2)
    grid[..., 0] = torch.Tensor(range(self.label_size)).unsqueeze(0)
    grid[..., 1] = torch.Tensor(range(self.label_size)).unsqueeze(1)
    grid = grid.unsqueeze(0)
    labels = label.unsqueeze(-2).unsqueeze(-2)
    exponent = torch.sum((grid - labels) ** 2, dim=-1)          # (21, 46, 46)
    heatmaps = torch.exp(-exponent / 2.0 / self.sigma / self.sigma)  # (21, 46, 46)
    # Here is the only difference ******************************
    invisible = (label[:, 0] == -1)  # set invisible heatmaps to zero
    heatmaps[invisible, ...] = 0
    # **********************************************************
    return heatmaps
```
For the sigma of the LPM: since hand sizes vary in OneHand10K, I set it to 0.03 of the bounding box size at the input image scale (368 * 368). You can adjust it yourself to get better results; I just set it casually ... :)
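Putting those numbers together, a per-sample sigma can be derived from the bounding box side. Note one assumption on my part: the 0.03 ratio is stated at the 368-pixel input scale, while the heatmaps in gen_label_heatmap above live on a 46 x 46 grid, so I map the value down by 46/368 before using it as the Gaussian width:

```python
def lpm_sigma(bbox_side_368, label_size=46, input_size=368, ratio=0.03):
    """Gaussian sigma for the LPM heatmap: `ratio` of the bbox side
    measured at the input scale, converted to heatmap-grid units."""
    return ratio * bbox_side_368 * label_size / input_size
```

For example, a hand whose bounding box fills the whole 368-pixel input gets a sigma of about 1.38 heatmap pixels; smaller hands get proportionally tighter Gaussians.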
from nsrmhand.
Can you share us the data loader function or other modified function when training O dataset?
By the way, it's better to open a new issue if you still have questions, rather than carrying out a long discussion in someone else's issue that is not relevant to your question.
I see that the owner of this issue just closed it. I hope our discussion does not bother him or her :)
from nsrmhand.
Related Issues (20)
- hand tightest bounding box?
- CMU Panoptic file
- What do mask1, mask2 and mask3 mean?
- predict two hands from single picture
- The results of other test images is error
- Encounter segfault when running inference.py
- ./CPM used in training?
- src.augmentation missing
- model structure in code incompatible with in paper
- Required dataset
- Question about LM loss
- Can the model predict hand boundary boxes on its own?
- Can you provide the OneHand 10k dataset?
- How to run the OneHand10K dataset
- demo output
- mediapipe hand landmark detect
- loss function
- How to use NSRM in HR-Net
- How to get 21 keypoints annotations after cropping the original image?
- missing limb structure problems