Comments (9)
I tried for a long time to predict the keypoints on a custom dataset: I cleaned up the code and modified my depth images to be as close to the ITOP_side ones as possible, but the results are pretty much garbage. My guess is that the model overfitted to the ITOP data. Did anyone train a more general model on multiple datasets? As far as I can see, none of the models were tested on datasets other than their training set. Different data, yes, but not different datasets.
Additionally, I cannot reach the speeds reported in the original paper. I get roughly 10 iterations per second, and that is on a better GPU with my fully optimized, torch-only code, with better data loading and no output. With e.g. YOLOv3 predicting human bounding boxes, I get not much more than 8 iterations per second.
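For anyone comparing throughput numbers like these, a minimal timing harness helps rule out measurement artifacts. The sketch below is generic (not the repo's own benchmarking code); `run_inference` is a hypothetical stand-in for one forward pass including pre/post-processing, and the warmup loop matters on GPU to exclude one-time CUDA kernel compilation:

```python
import time

def measure_throughput(run_inference, n_iters=100, warmup=10):
    """Rough iterations/sec for a single-frame inference callable.

    `run_inference` is a placeholder: substitute your own pipeline
    (model forward pass plus any pre/post-processing you count).
    """
    for _ in range(warmup):              # warm up caches / lazy kernel builds
        run_inference()
    start = time.perf_counter()
    for _ in range(n_iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed             # iterations per second

# Example with a dummy CPU workload in place of a real model:
fps = measure_throughput(lambda: sum(range(10000)), n_iters=50)
```

Note that for CUDA models you would also call `torch.cuda.synchronize()` before reading the clock, since GPU launches are asynchronous.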
Your reply has saved me a lot of time. I originally intended to use a depth camera for real-time inference, but now it seems unnecessary.
It's a novel idea to use anchors for HPE, but I feel there are several questionable design choices. First, the computation "anchor coordinates plus offset, multiplied by weight" can almost be replaced by "anchor coordinates multiplied by weight". Second, for the depth (z-coordinate) the network predicts a depth value for each anchor point (11*11*16 anchors × 14 joints), which are then weighted and summed to obtain the final keypoint depth. Why not directly take the depth at each anchor's position and compute the weighted sum?
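The first point is an algebraic observation: since the anchor-weighted estimate is linear, the weighted sum of (anchor + offset) splits exactly into a weighted sum of anchors plus a weighted sum of offsets. A small NumPy sketch (shapes and values are illustrative, not A2J's actual grid or tensors) makes the identity concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
A = 16 * 16   # number of anchor positions (illustrative, not A2J's exact grid)
K = 14        # number of joints, matching the figure in the comment above

anchors = rng.uniform(0, 288, size=(A, 2))        # fixed anchor (x, y) coordinates
offsets = rng.normal(0, 5, size=(A, K, 2))        # predicted per-anchor offsets to each joint
logits  = rng.normal(size=(A, K))
weights = np.exp(logits) / np.exp(logits).sum(0)  # softmax over anchors, per joint

# Anchor-based estimate: weighted sum of (anchor + offset) over all anchors
kp = np.einsum('ak,akc->kc', weights, anchors[:, None, :] + offsets)

# The same estimate split into the two terms the comment refers to
kp_split = (np.einsum('ak,ac->kc', weights, anchors)
            + np.einsum('ak,akc->kc', weights, offsets))
assert np.allclose(kp, kp_split)
```

So the anchor term alone could only replace the full expression if the offset term were negligible; whether it is in practice is an empirical question about the trained network, not something the algebra settles.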
from a2j.
Can you provide the training code for the ITOP dataset? Just like it is provided for the NYU dataset.
@logic03, sorry, the ITOP training code is kind of a mess and would require some effort to reorganize, but most of the ITOP training details are similar to the NYU code, except that we use bounding boxes instead of center points.
As for the poor performance on your data, my guess is also that the MEAN and STD of your images are very different from the ITOP dataset's; maybe you should train your own model.
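One way to act on this advice is to shift and scale a custom depth frame so its statistics resemble the training data's before feeding it to the model. The sketch below is a hypothetical illustration: the `ITOP_MEAN`/`ITOP_STD` values are placeholders, not the repo's actual constants, so check the ITOP preprocessing code for the real numbers:

```python
import numpy as np

# Placeholder statistics -- look up the actual MEAN/STD constants used in
# the repo's ITOP preprocessing before relying on these numbers.
ITOP_MEAN, ITOP_STD = 3.0, 0.5   # assumed to be in metres

def match_depth_stats(depth, target_mean=ITOP_MEAN, target_std=ITOP_STD):
    """Shift/scale a depth frame so its statistics resemble the training set's."""
    out = depth.astype(np.float32)
    valid = out > 0                          # ignore invalid (zero-depth) pixels
    mean, std = out[valid].mean(), out[valid].std()
    out[valid] = (out[valid] - mean) / (std + 1e-6) * target_std + target_mean
    return out
```

This only matches first- and second-order statistics; differences in sensor noise, resolution, and subject scale can still make cross-dataset results poor, which is consistent with the experiences reported in this thread.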
Hi @logic03
Were you able to use this model to predict the joints for a custom dataset?
I'm also trying to pass in a depth frame and adjust its mean value so that the input matches the ITOP_side dataset. Unfortunately, the results are very bad.
Could you tell me whether you got any further with this?
Hi, have you managed to use other models for real-time inference?
Currently there are few depth-based algorithms, but there are many RGB-based 3D HPE algorithms, such as RLE, PoseFormer, MotionBERT, and so on.
Thanks for your reply, it helped me a lot. One more question: do you know of an RGB-D-based algorithm that can be used for real-time inference?
There is little research in this area using depth maps, and I don't know of an RGB-D-based algorithm that achieves both fast and accurate results. From my paper search, A2J appears to be the algorithm closest to your requirements.
Related Issues (20)
- How to obtain the center point coordinates and depth values during NYU inference HOT 4
- Can you provide the mat file of the detection bounding boxes of the itop side and top training set HOT 2
- Problem while retraining A2J on NYU HOT 3
- Please add a requirements.txt file HOT 1
- Curious whether the downloadable network has been fine-tuned HOT 2
- Swapping Width and Height dimensions HOT 1
- Selecting anchor points with P = 0.02 HOT 2
- what's the mean of "depthFactor"? HOT 4
- Input Files for the ITOP dataset HOT 1
- The Hands2017 data uses absolute 3D coordinates while your training is UVD-based; how was Hands2017 trained? HOT 6
- Training on the ITOP_side dataset HOT 1
- Unable to reproduce the results for the ITOP side view human body dataset. HOT 1
- reason for num_channel expansion? HOT 3
- training with missing keypoints? HOT 3
- How can I convert pre-trained model to coreml to use it on ios application?
- Why are the Anchor-Points generated on a grid with spaces between them? HOT 1
- Thank you so much!
- Hi, mat files are generated using this script: https://github.com/zhangboshen/A2J/blob/master/data/icvl/data_preprosess.m HOT 1
- Enquiry about drawing human 3D pose on our one depth image HOT 1