Comments (57)
To be honest, the segmentation accuracy is bad and it limits our performance.
The pose estimation accuracy mainly benefits from the vector representation for object keypoints.
However, why is the vector representation for object keypoints used only during testing and not during training?
from pvnet.
Sorry for the late reply.
Our method is a deep learning method, which is good at exploiting global context for detecting keypoints, while feature-based matching methods typically only use the local features.
from pvnet.
The Mask R-CNN architecture is an alternative.
It is more compact to predict semantic labels and vectors simultaneously.
from pvnet.
“The Mask R-CNN architecture is an alternative.
It is more compact to predict semantic labels and vectors simultaneously.”
What is the accuracy of the segmentation? Is it good or bad?
from pvnet.
To be honest, the segmentation accuracy is bad and it limits our performance.
The pose estimation accuracy mainly benefits from the vector representation for object keypoints.
from pvnet.
Hough voting is a procedure without parameters to be learned, so we only need to train the network to output the vector field.
from pvnet.
I found that your LINEMOD-phone dataset (number: 1225) is smaller than the LINEMOD_ORIG phone dataset (number: 1243). Why?
from pvnet.
I did not notice this problem.
The LINEMOD dataset we use is provided by https://github.com/Microsoft/singleshotpose
from pvnet.
“val epoch 200 step 200 seg 0.00217727 ver 0.00254667 precision 0.95366901 recall 0.96332896”
How do I get the 2D projection and ADD(-S) metrics?
from pvnet.
python tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls cat --test_model
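For readers unfamiliar with the two metrics asked about above, here is a minimal sketch of how the 2D projection error and ADD(-S) are commonly defined in the 6D pose literature. It is not the repository's evaluation code: the function names, the toy model points, and the camera matrix are all hypothetical, and the projection assumes zero skew. A pose is commonly counted as correct when the 2D projection error is below 5 pixels, or when ADD(-S) is below 10% of the model diameter.

```python
import math

def transform(points, R, t):
    """Apply the rigid transform x -> R x + t to a list of 3D points."""
    return [[sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            for p in points]

def project(points_cam, K):
    """Project camera-frame points to pixels (pinhole model, zero skew assumed)."""
    return [[K[0][0] * x / z + K[0][2], K[1][1] * y / z + K[1][2]]
            for x, y, z in points_cam]

def mean_dist(a, b):
    """Mean Euclidean distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def add_metric(points, R_pred, t_pred, R_gt, t_gt):
    """ADD: mean distance between model points under the two poses."""
    return mean_dist(transform(points, R_pred, t_pred),
                     transform(points, R_gt, t_gt))

def adds_metric(points, R_pred, t_pred, R_gt, t_gt):
    """ADD-S: for symmetric objects, mean distance to the closest point."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    return sum(min(math.dist(p, q) for q in pred) for p in gt) / len(gt)

def projection_error(points, K, R_pred, t_pred, R_gt, t_gt):
    """Mean pixel distance between model points projected under the two poses."""
    return mean_dist(project(transform(points, R_pred, t_pred), K),
                     project(transform(points, R_gt, t_gt), K))

# Hypothetical toy data: three model points, identity rotation,
# and a predicted translation that is off by 1 cm along x.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pts = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
t_gt, t_pred = [0.0, 0.0, 1.0], [0.01, 0.0, 1.0]

print(add_metric(pts, R, t_pred, R, t_gt))
print(adds_metric(pts, R, t_pred, R, t_gt))
print(projection_error(pts, K, R, t_pred, R, t_gt))
```

In this toy case the 1 cm offset at 1 m depth with a 500-pixel focal length shifts every projected point by about 5 pixels.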
from pvnet.
About draw_utils.py: I found that the inputs to some of its functions require special handling. Could you add calls to them in demo.py? I believe this would make the paper easier for others to understand.
from pvnet.
“mask_pred = torch.argmax(seg_pred, 1)
visualize_mask(mask_pred.detach().cpu().numpy(), mask.detach().cpu().numpy(), save=False, save_fn=None)”
I can only implement the display of the mask.
from pvnet.
OK, I will add visualize_mask, visualize_vertex, visualize_hypothesis, and visualize_voting_ellipse soon.
from pvnet.
I uploaded a Jupyter notebook to visualize the keypoint detection pipeline: https://github.com/zju3dv/pvnet#visualization-of-the-voting-procedure
from pvnet.
Is this key point detection method based on Hough voting affected by texture information?
from pvnet.
No, not when compared with other deep learning methods.
from pvnet.
About ransac_voting_layer_v3 in ransac_voting_gpu.py: I don't fully understand the principle. Why can it get hypothesis points from existing coordinates? In particular, do ransac_voting and ransac_voting.generate_hypothesis have a Python version? The C++ version is not easy to debug.
from pvnet.
We generate hypotheses as described in our paper.
ransac_voting.generate_hypothesis does not have a Python version.
from pvnet.
I am interested in the real-time demo on the project page. How is the interaction between the cat and the hat achieved? Is the hat's pose the same as the cat's? And is the hat model's pose the same as the cat model's pose?
from pvnet.
The hat's pose is the same as the cat's.
- I imported the cat model and the hat model into Blender.
- At first, the hat model was not on the cat's head. To solve this problem, I manually moved it to the head position and saved it as a new model.
- Now the hat is on the cat's head in the 3D space, resulting in the interaction effect given the same 6D pose.
from pvnet.
About the real-time demo on the project page: I used 3DXchange to build the hat and cat models, but I could not move the hat to the cat's head position and save them as a new model. What software did you use? I would really appreciate your help.
from pvnet.
I used Blender.
from pvnet.
Well, how are the model's spatial coordinates set? Are they relative to the camera coordinate system or the world coordinate system? How is the world coordinate system determined?
from pvnet.
They are relative to Blender's world coordinate system.
I did not manually adjust the world coordinate system.
from pvnet.
If the projection of a keypoint on the object's surface is occluded, can the projection of that keypoint at the occluded position still be obtained by voting? If not, how do you perform the PnP calculation without the keypoint projections?
My understanding of voting is to find the voting results in all known foreground coordinates, rather than finding the projected coordinates of the key points in the unknown space.
from pvnet.
The motivation of our paper is to handle invisible keypoints.
from pvnet.
“My understanding of voting is to find the voting results in all known foreground coordinates, rather than finding the projected coordinates of the key points in the unknown space.” So is my understanding correct?
from pvnet.
We find both known and unknown coordinates.
from pvnet.
What does the indicator function 𝕀 in Equation 2 specifically mean? I can't find it in the paper.
from pvnet.
In this paper?
from pvnet.
Of course! As follows:
where 𝕀 represents the indicator function, θ is a threshold (0.99 in all experiments), and p ∈ O means that the pixel p belongs to the object O. Intuitively, a higher voting score means that a hypothesis is more confident as it coincides with more predicted directions.
from pvnet.
The indicator function equals 1 if the condition is satisfied and 0 otherwise.
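As an illustration of how the indicator function enters the voting score, the sketch below counts the foreground pixels whose predicted unit direction agrees with the direction from the pixel to a hypothesis, using a cosine threshold θ. The function name and the toy data are hypothetical, and this is plain Python rather than the repository's CUDA kernel.

```python
import math

def voting_score(hypothesis, pixels, directions, theta=0.99):
    """Count pixels whose predicted unit vector points at the hypothesis.

    Each pixel contributes 1 (indicator is true) when the cosine between
    its predicted direction and the pixel-to-hypothesis direction is at
    least theta, and 0 otherwise.
    """
    score = 0
    for (px, py), (vx, vy) in zip(pixels, directions):
        dx, dy = hypothesis[0] - px, hypothesis[1] - py
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # hypothesis lies on the pixel: direction is undefined
        cosine = (dx * vx + dy * vy) / norm
        score += 1 if cosine >= theta else 0
    return score

# Hypothetical toy data: the first two pixels point at (5, 5), the third does not.
s = math.sqrt(0.5)
pixels = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
directions = [(s, s), (-s, s), (1.0, 0.0)]
print(voting_score((5.0, 5.0), pixels, directions))  # 2
```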
from pvnet.
In generate_hypothesis_kernel() in ransac_voting_kernel.cu, why is 'hvi' initialized to threadIdx.x + blockIdx.x*blockDim.x? I can't understand it!
from pvnet.
It's CUDA programming.
I first define the GPU layout with getGPULayout(hn*vn,1,1,&bdim0,&bdim1,&bdim2,&tdim0,&tdim1,&tdim2);
Then I can get the hvi-th thread with threadIdx.x + blockIdx.x*blockDim.x.
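To make the index expression concrete, the following sketch emulates a 1D CUDA launch in plain Python and shows that threadIdx.x + blockIdx.x*blockDim.x gives every thread a distinct global index. The block size, the values of hn and vn, and the reading of hn as hypotheses per keypoint and vn as the number of keypoints are assumptions made for illustration; the layout actually returned by getGPULayout may differ.

```python
def global_indices(grid_dim, block_dim, total):
    """Emulate a 1D CUDA launch and collect each thread's global index."""
    indices = []
    for block_idx in range(grid_dim):        # plays the role of blockIdx.x
        for thread_idx in range(block_dim):  # plays the role of threadIdx.x
            hvi = thread_idx + block_idx * block_dim
            if hvi >= total:                 # surplus threads in the last block do nothing
                continue
            indices.append(hvi)
    return indices

hn, vn = 5, 2                                      # hypothetical sizes
total = hn * vn                                    # one thread per (hypothesis, keypoint)
block_dim = 4                                      # hypothetical threads per block
grid_dim = (total + block_dim - 1) // block_dim    # ceil(total / block_dim)
print(global_indices(grid_dim, block_dim, total))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```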
from pvnet.
if(fabs(nx1*ny0-nx0*ny1)<1e-6) return;
if(fabs(ny1*nx0-ny0*nx1)<1e-6) return;
Why do you return if either of these two conditions is met?
float y=(nx1*(nx0*cx0+ny0*cy0)-nx0*(nx1*cx1+ny1*cy1))/(nx1*ny0-nx0*ny1);
float x=(ny1*(nx0*cx0+ny0*cy0)-ny0*(nx1*cx1+ny1*cy1))/(ny1*nx0-ny0*nx1);
What is this formula derived from?
from pvnet.
We represent a line in the Hessian Normal Form.
Then compute the intersection of two lines.
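A sketch of that derivation in Python, assuming each line is given by a point c on it and a normal n, so that the line is the set of x with n·x = n·c. Solving the two linear equations by elimination reproduces the x and y expressions quoted in the question, and a near-zero denominator means the two lines are nearly parallel and have no usable intersection. The function name and the example values are hypothetical.

```python
def intersect(c0, n0, c1, n1, eps=1e-6):
    """Intersect two 2D lines given as (point on the line, normal vector).

    Line i satisfies nx_i * x + ny_i * y = d_i with d_i = n_i . c_i
    (Hessian normal form). Returns None when the lines are nearly parallel.
    """
    (cx0, cy0), (nx0, ny0) = c0, n0
    (cx1, cy1), (nx1, ny1) = c1, n1
    det = nx1 * ny0 - nx0 * ny1
    if abs(det) < eps:
        return None                      # parallel lines: no single intersection
    d0 = nx0 * cx0 + ny0 * cy0
    d1 = nx1 * cx1 + ny1 * cy1
    # Eliminate x: nx1*(line 0) - nx0*(line 1) leaves det * y.
    y = (nx1 * d0 - nx0 * d1) / det
    # Eliminate y: ny1*(line 0) - ny0*(line 1) leaves -det * x.
    x = (ny1 * d0 - ny0 * d1) / (-det)
    return x, y

# Line through (0, 0) along (1, 1) has normal (1, -1);
# line through (10, 0) along (-1, 1) has normal (1, 1).
print(intersect((0.0, 0.0), (1.0, -1.0), (10.0, 0.0), (1.0, 1.0)))  # (5.0, 5.0)
```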
from pvnet.
In the paper:
> This step is repeated N times to generate a set of hypotheses
So I think that if there are n pixels on the target object, N = n!/(2!(n-2)!), which means that every pair of the n points is combined. But I found hvi < hn*vn, which differs from my understanding. Does it mean that you randomly sample hn*vn hypothesis points rather than all n!/(2!(n-2)!)?
from pvnet.
Yes
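The confirmed behavior, sampling a fixed number of pixel pairs at random instead of enumerating all n(n-1)/2 pairs, can be sketched as follows. This is an illustrative Python sketch, not the repository's CUDA implementation: the function names, the seed, the retry limit, and the toy vector field are hypothetical, and the sketch retries when a sampled pair is nearly parallel, which is a simplification of my own.

```python
import math
import random

def line_intersection(p0, d0, p1, d1, eps=1e-6):
    """Intersect the lines p0 + s*d0 and p1 + t*d1; None if nearly parallel."""
    cross = d0[0] * d1[1] - d0[1] * d1[0]
    if abs(cross) < eps:
        return None
    s = ((p1[0] - p0[0]) * d1[1] - (p1[1] - p0[1]) * d1[0]) / cross
    return p0[0] + s * d0[0], p0[1] + s * d0[1]

def generate_hypotheses(pixels, directions, num_hypotheses, rng, max_tries=1000):
    """Draw random pixel pairs and intersect their predicted directions."""
    hypotheses = []
    for _ in range(max_tries):
        if len(hypotheses) == num_hypotheses:
            break
        i, j = rng.sample(range(len(pixels)), 2)   # two distinct pixels
        h = line_intersection(pixels[i], directions[i], pixels[j], directions[j])
        if h is not None:
            hypotheses.append(h)
    return hypotheses

def unit_towards(pixel, target):
    """Unit vector from a pixel to a target point."""
    dx, dy = target[0] - pixel[0], target[1] - pixel[1]
    norm = math.hypot(dx, dy)
    return dx / norm, dy / norm

# Toy noise-free vector field: every pixel points exactly at the keypoint (5, 5).
keypoint = (5.0, 5.0)
pixels = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (2.0, 7.0)]
directions = [unit_towards(p, keypoint) for p in pixels]

# 5 pixels give 10 possible pairs, but only 4 random pairs are used here.
hyps = generate_hypotheses(pixels, directions, 4, random.Random(0))
print(hyps)
```

With a noisy vector field the hypotheses would scatter around the keypoint, which is why each one is then scored by voting.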
from pvnet.
nx0 = direct[t0 * vn * 2 + vi * 2 + 1]
ny0 = -direct[t0 * vn * 2 + vi * 2]
n should be the unit vector from pixel p to the 2D keypoint x_k rather than a normal vector, but in the Hessian normal form n should be a normal vector. I'm so confused!
from pvnet.
nx0 = direct[t0 * vn * 2 + vi * 2 + 1]
ny0 = -direct[t0 * vn * 2 + vi * 2]
This gives a normal vector.
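The two lines quoted above take the predicted direction (dx, dy) and form (dy, -dx), which is the same vector rotated by 90 degrees and is therefore perpendicular to it. A quick check in Python, assuming index 0 of the stored vector is the x component and index 1 is the y component, with a made-up direction:

```python
def normal_from_direction(dx, dy):
    """Rotate a 2D direction by 90 degrees: (dx, dy) -> (dy, -dx)."""
    return dy, -dx

dx, dy = 0.6, 0.8                       # hypothetical predicted unit vector
nx, ny = normal_from_direction(dx, dy)
print(nx, ny)                           # 0.8 -0.6
print(nx * dx + ny * dy)                # dot product with the direction: 0.0
```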
from pvnet.
How is the dense_pts.txt in your dataset sampled? I compared it to object.xyz in the LINEMOD_ORIG and found that the two are not the same.
from pvnet.
Use CloudCompare.
from pvnet.
What file formats can we get from CloudCompare for our use?
from pvnet.
What do you mean by 'what format files'?
from pvnet.
For example: *.txt, *.ply, or other coordinate files?
from pvnet.
CloudCompare supports all of them.
from pvnet.
I have been thinking about this recently: if the object lacks texture, will the keypoint-based method be affected? How does it differ from feature-based matching methods?
from pvnet.
Hi, I wonder how I can train on my own object. Do I need a 3D object, its texture, mask images, and so on? If I can't, is there another way to reuse the pretrained model for quick testing?
from pvnet.
You need a 3D object and images with ground truth poses.
I am not sure if the pretrained model can generalize to new objects.
from pvnet.
In the ransac_voting_layer_v3 function:
coords = torch.nonzero(cur_mask).float()
coords = coords[:, [1, 0]]
Why do you reverse the x, y coordinates?
from pvnet.
coords is [row, col].
Reverse [row, col] to get [x, y].
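A small illustration of the swap, written in plain Python so it runs without PyTorch. The helper below is hypothetical and only mimics the [row, col] output that torch.nonzero gives for a 2D mask, in row-major order.

```python
def nonzero_coords(mask):
    """Return [row, col] indices of nonzero entries in row-major order."""
    return [[r, c] for r, row in enumerate(mask) for c, v in enumerate(row) if v]

mask = [
    [0, 0, 1],
    [0, 1, 0],
]
coords = nonzero_coords(mask)        # [[0, 2], [1, 1]] as [row, col]
xy = [[c, r] for r, c in coords]     # [[2, 0], [1, 1]] as [x, y]
print(coords, xy)
```

The column index is the horizontal image coordinate x and the row index is the vertical coordinate y, so swapping the two columns puts the points in the [x, y] order used by the rest of the geometry.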
from pvnet.
I want to implement the visualization shown in this figure but have not succeeded. Can you upload this part of the code to demo.py?
from pvnet.
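import numpy as np
import matplotlib.pyplot as plt

# Assumes img (H x W x 3, uint8), mask (H x W), and corners (2D keypoints as [x, y]) are already loaded.
h, w = mask.shape[:2]
for ci, corner in enumerate(corners):
    # Lighten the foreground so the arrows stand out.
    dir_img = img.copy()
    dir_img[mask > 0] //= 2
    dir_img[mask > 0] += np.asarray([255, 255, 255], np.uint8) // 2
    plt.imshow(dir_img)
    for hi in range(h):
        for wi in range(w):
            if mask[hi, wi] == 0:
                continue
            # Draw an arrow on every 5th foreground pixel only.
            if hi % 5 == 0 and wi % 5 == 0:
                # Unit vector from the pixel to the keypoint, scaled to 7 px.
                # Cast to float so the in-place division also works for integer inputs.
                diff = np.asarray(corner, np.float64) - np.asarray([wi, hi], np.float64)
                diff /= np.linalg.norm(diff)
                diff *= 7
                # plt.arrow(wi,hi,diff[0]*5,diff[1]*5,width=0.5,linewidth=0.5)
                plt.annotate("", xy=(wi + diff[0], hi + diff[1]), xytext=(wi, hi),
                             arrowprops={'arrowstyle': '->,head_length=0.3,head_width=0.3', 'color': 'red'})
    plt.show()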
from pvnet.
This idea is really amazing! Thank you~
from pvnet.
When testing on the Occlusion-Linemod dataset, do you use the same model that was trained on the Linemod dataset?
from pvnet.
Yes.
from pvnet.
“In the paper:
> This step is repeated N times to generate a set of hypotheses
So I think that if there are n pixels on the target object, N = n!/(2!(n-2)!), which means that every pair of the n points is combined. But I found hvi < hn*vn, which differs from my understanding. Does it mean that you randomly sample hn*vn hypothesis points rather than all n!/(2!(n-2)!)?”
Hello,
1. Do you know whether the two vectors chosen here to compute the intersection are selected arbitrarily?
2. if(fabs(ny1*nx0-ny0*nx1)<1e-6) return; Does this mean that if the slopes of the two lines are nearly equal, these two lines are not used to compute an intersection?
from pvnet.