
votenet's Introduction

Deep Hough Voting for 3D Object Detection in Point Clouds

Created by Charles R. Qi, Or Litany, Kaiming He and Leonidas Guibas from Facebook AI Research and Stanford University.


Introduction

This repository is code release for our ICCV 2019 paper (arXiv report here).

Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird's eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles to construct a 3D detection pipeline for point cloud data that is as generic as possible. However, due to the sparse nature of the data – samples from 2D manifolds in 3D space – we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid can be far from any surface point and thus hard to regress accurately in one step. To address the challenge, we propose VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting. Our model achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size and high efficiency. Remarkably, VoteNet outperforms previous methods using purely geometric information, without relying on color images.

In this repository, we provide the VoteNet model implementation (in PyTorch) as well as data preparation, training and evaluation scripts for SUN RGB-D and ScanNet.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{qi2019deep,
    author = {Qi, Charles R and Litany, Or and He, Kaiming and Guibas, Leonidas J},
    title = {Deep Hough Voting for 3D Object Detection in Point Clouds},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
    year = {2019}
}

Installation

Install PyTorch and TensorFlow (for TensorBoard). Access to GPUs is required. MATLAB is required to prepare data for SUN RGB-D. The code is tested with Ubuntu 18.04, PyTorch v1.1, TensorFlow v1.14, CUDA 10.0 and cuDNN v7.4. Note: after a code update on 2/6/2020, the code is also compatible with PyTorch v1.2+.

Compile the CUDA layers for PointNet++, which we used in the backbone network:

cd pointnet2
python setup.py install

To verify that the compilation succeeded, run python models/votenet.py and check that a forward pass works.

Install the following Python dependencies (with pip install):

matplotlib
opencv-python
plyfile
'trimesh>=2.35.39,<2.35.40'
'networkx>=2.2,<2.3'
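
All of the above can be installed with a single command:

pip install matplotlib opencv-python plyfile 'trimesh>=2.35.39,<2.35.40' 'networkx>=2.2,<2.3'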

Run demo

You can download pre-trained models and sample point clouds HERE. Unzip the file under the project root path (/path/to/project/demo_files) and then run:

python demo.py

The demo uses a model pre-trained on SUN RGB-D to detect objects in a point cloud of an indoor room with a table and a few chairs (from the SUN RGB-D val set). You can use 3D visualization software such as MeshLab to open the dumped files under demo_files/sunrgbd_results to see the 3D detection output. Specifically, open ***_pc.ply and ***_pred_confident_nms_bbox.ply to see the input point cloud and the predicted 3D bounding boxes.

You can also run the following command to use a model pretrained on ScanNet:

python demo.py --dataset scannet --num_point 40000

Detection results will be dumped to demo_files/scannet_results.

Training and evaluating

Data preparation

For SUN RGB-D, follow the README under the sunrgbd folder.

For ScanNet, follow the README under the scannet folder.

Train and test on SUN RGB-D

To train a new VoteNet model on SUN RGB-D data (depth images):

CUDA_VISIBLE_DEVICES=0 python train.py --dataset sunrgbd --log_dir log_sunrgbd

You can use CUDA_VISIBLE_DEVICES=0,1,2 to specify which GPU(s) to use. Without specifying CUDA devices, the training will use all available GPUs and train with data parallelism (note that due to I/O load, the training speedup is not linear in the number of GPUs used). Run python train.py -h to see more training options (e.g. you can also set --model boxnet to train with the baseline BoxNet model). While training, you can check the log_sunrgbd/log_train.txt file for progress, or use TensorBoard to see loss curves, as shown below.
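
For example, assuming the TensorBoard event files are written under the log directory (verify against where train.py writes its summaries), you can monitor training with:

tensorboard --logdir log_sunrgbd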

To test the trained model with its checkpoint:

python eval.py --dataset sunrgbd --checkpoint_path log_sunrgbd/checkpoint.tar --dump_dir eval_sunrgbd --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal

Example results will be dumped in the eval_sunrgbd folder (or any other folder you specify). You can run python eval.py -h to see the full options for evaluation. After the evaluation, you can use MeshLab to visualize the predicted votes and 3D bounding boxes (select wireframe mode to view the boxes). Final evaluation results will be printed on screen and also written to the log_eval.txt file under the dump directory. By default, we evaluate with both mAP@0.25 and mAP@0.5, using 3D IoU on oriented boxes. A properly trained VoteNet should reach around 57 mAP@0.25 and 32 mAP@0.5.

Train and test on ScanNet

To train a VoteNet model on Scannet data (fused scan):

CUDA_VISIBLE_DEVICES=0 python train.py --dataset scannet --log_dir log_scannet --num_point 40000

To test the trained model with its checkpoint:

python eval.py --dataset scannet --checkpoint_path log_scannet/checkpoint.tar --dump_dir eval_scannet --num_point 40000 --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal

Example results will be dumped in the eval_scannet folder (or any other folder you specify). By default, we evaluate with both mAP@0.25 and mAP@0.5, using 3D IoU on axis-aligned boxes. A properly trained VoteNet should reach around 58 mAP@0.25 and 35 mAP@0.5.

Train on your own data

[For Pro Users] If you have your own dataset with point clouds and annotated 3D bounding boxes, you can create a new dataset class and train VoteNet on your own data. To ease the process, some tips are provided in this doc; a skeleton sketch follows below.
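
As a starting point, here is a minimal, hypothetical skeleton of such a dataset class. The file layout and dict keys below are illustrative assumptions; the exact keys the training loop expects (GT centers, votes, size/heading class labels, etc.) should be copied from the existing SUN RGB-D and ScanNet dataset classes in this repo.

import numpy as np
from torch.utils.data import Dataset

class MyDetectionDataset(Dataset):
    def __init__(self, scan_names, num_points=20000):
        self.scan_names = scan_names   # per-scan file prefixes (assumed layout)
        self.num_points = num_points

    def __len__(self):
        return len(self.scan_names)

    def __getitem__(self, idx):
        # Assumed file format: (N, 3) points and (K, 8) boxes as cx,cy,cz,l,w,h,heading,class.
        pc = np.load(self.scan_names[idx] + '_pc.npy')
        boxes = np.load(self.scan_names[idx] + '_bbox.npy')
        # Subsample to a fixed number of points, as the provided dataset classes do.
        choice = np.random.choice(pc.shape[0], self.num_points, replace=True)
        pc = pc[choice, :]
        ret = {'point_clouds': pc.astype(np.float32)}
        # Derive the GT label arrays (centers, votes, size/heading classes) from `boxes` here,
        # using the same dict keys as sunrgbd_detection_dataset.py / scannet_detection_dataset.py.
        return ret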

Acknowledgements

We want to thank Erik Wijmans for his PointNet++ implementation in Pytorch (original codebase).

License

votenet is released under the MIT License. See the LICENSE file for more details.

Change log

10/20/2019: Fixed a bug of the 3D interpolation customized ops (corrected gradient computation). Re-training the model after the fix slightly improves mAP (less than 1 point).


votenet's Issues

Size class label and semantic class label

Hi,

According to the code below, it seems that size regression is coupled with semantic classification: the size class label is the same as the semantic class label. I guess this is under the assumption that per-class templates (average sizes) can provide a strong prior.

I just want to check whether I have misunderstood something. Please correct me if I'm wrong.

size_class, size_residual = DC.size2class(box3d_size, DC.class2type[semantic_class])

def size2class(self, size, type_name):
    ''' Convert 3D box size (l,w,h) to size class and size residual '''
    size_class = self.type2class[type_name]
    size_residual = size - self.type_mean_size[type_name]
    return size_class, size_residual
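
For a concrete (made-up) illustration of this coupling: suppose the mean size of the 'chair' class is (0.6, 0.6, 0.9) and a ground-truth chair box has size (0.7, 0.5, 1.0). Then:

import numpy as np

size_class = type2class['chair']   # the same index as the semantic class
size_residual = np.array([0.7, 0.5, 1.0]) - np.array([0.6, 0.6, 0.9])   # -> [0.1, -0.1, 0.1]

so the network only regresses a small residual around the per-class template size, which is exactly the strong prior described above.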

Best,
Jianyuan

BoxNet getting better results than VoteNet

I'm sorry that this is not a general issue, but I raised it so that it may help people training on their custom datasets.

I'm testing on my custom dataset and am seeing BoxNet get better results than VoteNet. To simplify training and dataset creation, the dataset has only a single class, and the heading is a single class as well.

BoxNet:
eval mean box_loss: 0.143415
eval mean center_loss: 0.033522
eval mean heading_cls_loss: 0.408684
eval mean heading_reg_loss: 0.021197
eval mean loss: 1.641386
eval mean neg_ratio: 0.340674
eval mean obj_acc: 0.952075
eval mean objectness_loss: 0.041447
eval mean pos_ratio: 0.659326
eval mean sem_cls_loss: 0.000000
eval mean size_cls_loss: 0.000000
eval mean size_reg_loss: 0.047828
0 0.8460753105326817
eval person Average Precision: 0.846075
eval mAP: 0.846075
eval person Recall: 0.984816
eval AR: 0.984816

VoteNet:
eval mean box_loss: 0.131423
eval mean center_loss: 0.038023
eval mean heading_cls_loss: 0.539432
eval mean heading_reg_loss: 0.001078
eval mean loss: 5.734286
eval mean neg_ratio: 0.920239
eval mean obj_acc: 0.984032
eval mean objectness_loss: 0.015435
eval mean pos_ratio: 0.018091
eval mean sem_cls_loss: 0.000000
eval mean size_cls_loss: 0.000000
eval mean size_reg_loss: 0.038379
eval mean vote_loss: 0.434288
eval person Average Precision: 0.382230
eval mAP: 0.382230
eval person Recall: 0.558568
eval AR: 0.558568

Has anyone experienced similar issues, or does anyone have tips?

Some things I noticed:

  • the voting loss doesn't converge so well on the custom dataset.
  • the pos_ratio is quite low for votenet.

Some questions about the final figure

Thanks for your excellent work! I got much higher precision than before, but I have some questions about the figure. When I finish running eval.py, I get some files like this:

[screenshot of the dumped files]

Could you please tell me what I should do to get the final figure shown in the paper? Thank you.

Feature normalization before the Proposal Module

Hi,

I notice that you normalize the output feature from the Voting Module and then feed it into the Proposal Module, so the feature vector for each vote (or point) is a unit vector.

In my trials, this normalization does not seem to have a significant effect on performance (mAP). I was wondering what the objective of this operation is, or the philosophy behind it?

votenet/models/votenet.py

Lines 95 to 105 in 257b8d8

xyz, features = self.vgen(xyz, features)             # voting module
features_norm = torch.norm(features, p=2, dim=1)     # L2 norm over the feature (channel) dimension
features = features.div(features_norm.unsqueeze(1))  # normalize each vote feature to unit length
end_points['vote_xyz'] = xyz
end_points['vote_features'] = features
end_points = self.pnet(xyz, features, end_points)    # proposal module
return end_points

Best,
Jianyuan

Detach gradients during evaluation to reduce memory cost

Hi Charles,

There is an evaluation process after every 10 epochs.

votenet/train.py

Lines 319 to 320 in 237f6ff

if EPOCH_CNT == 0 or EPOCH_CNT % 10 == 9: # Eval every 10 epochs
    loss = evaluate_one_epoch()

I guess the gradients during evaluation are unnecessary, yet they take a lot of GPU memory. It may be better to detach gradients during evaluation, like this:

if EPOCH_CNT == 0 or EPOCH_CNT % 10 == 9: # Eval every 10 epochs
    with torch.no_grad():
        loss = evaluate_one_epoch()

or, more specifically, detach gradients inside evaluate_one_epoch(), changing

end_points = net(inputs)

to

with torch.no_grad():
    end_points = net(inputs)

In my experiments, this reduces memory cost by around 2300 MiB at batch size 8 (the original cost is 8877 MiB).

Best,
Jianyuan Wang

How to use votenet to detect our own .ply file?

[screenshot of the error message]

Excuse me, I am very sorry to trouble you again. When I use VoteNet to detect my own .ply file, it always raises the error shown above. How can I fix this mistake? What should I do? Thank you for your reply.

Related work

Thanks for your great job! This net is wonderful. I want to know: if VoteNet has been well trained, can we use it to detect other point clouds (such as point clouds we get from reconstruction work), and where should I modify the code? I look forward to your reply. Thank you.

Questions on some constants

I have a question about why you chose:
(1) 0.3 and 0.6 in paper Section 4.2 for the positive/negative judgement,
(2) a radius of 0.3 for the proposal module,
(3) radii of 0.2/0.4/0.8/1.2 for set abstraction in pointnet2backbone.
And how should I set them if the object sizes in my own data are very different from those in SUN RGB-D?

Final images as in the paper

Hi @charlesq34 & @orlitany

Thank you for your amazing work on this project. I am having some trouble visualizing results the way they are depicted in the paper.

After turning the output into a quad-dominant mesh, I get something like this:

[screenshot of the current visualization]

How should I proceed to get visualizations as depicted in the paper, i.e., hollow boxes as well as different colors for the point clouds?

Also, how do I generate a text file that identifies which objects are in the scene (i.e. chair, table, etc.) and not just the bounding boxes?

Thank you in advance.

Results on ScanNet (pretrained checkpoint)

Hi authors,

Thanks a lot for your work and code on VoteNet. After installing the code, preparing the dataset according to the instructions, and evaluating with the provided pretrained checkpoint on ScanNet (running python eval.py with the flags given in README.md), I got the following output:

(PS: Some duplicate numbers are removed.)

---------- iou_thresh: 0.250000 ----------
eval cabinet Average Precision: 0.368814
eval bed Average Precision: 0.887401
eval chair Average Precision: 0.890175
eval sofa Average Precision: 0.893592
eval table Average Precision: 0.589219
eval door Average Precision: 0.457754
eval window Average Precision: 0.376980
eval bookshelf Average Precision: 0.456502
eval picture Average Precision: 0.059759
eval counter Average Precision: 0.562570
eval desk Average Precision: 0.665162
eval curtain Average Precision: 0.465986
eval refrigerator Average Precision: 0.502340
eval showercurtrain Average Precision: 0.594321
eval toilet Average Precision: 0.975494
eval sink Average Precision: 0.444091
eval bathtub Average Precision: 0.889807
eval garbagebin Average Precision: 0.381119
eval mAP: 0.581172
eval cabinet Recall: 0.758065
eval bed Recall: 0.950617
eval chair Recall: 0.928363
eval sofa Recall: 0.989691
eval table Recall: 0.837143
eval door Recall: 0.710921
eval window Recall: 0.624113
eval bookshelf Recall: 0.844156
eval picture Recall: 0.234234
eval counter Recall: 0.865385
eval desk Recall: 0.929134
eval curtain Recall: 0.701493
eval refrigerator Recall: 0.964912
eval showercurtrain Recall: 0.857143
eval toilet Recall: 1.000000
eval sink Recall: 0.622449
eval bathtub Recall: 0.935484
eval garbagebin Recall: 0.688679
eval AR: 0.802332

---------- iou_thresh: 0.500000 ----------
eval cabinet Average Precision: 0.077516
eval bed Average Precision: 0.824878
eval chair Average Precision: 0.676398
eval sofa Average Precision: 0.705267
eval table Average Precision: 0.423883
eval door Average Precision: 0.158062
eval window Average Precision: 0.074264
eval bookshelf Average Precision: 0.307388
eval picture Average Precision: 0.008684
eval counter Average Precision: 0.117078
eval desk Average Precision: 0.341620
eval curtain Average Precision: 0.217135
eval refrigerator Average Precision: 0.309121
eval showercurtrain Average Precision: 0.044384
eval toilet Average Precision: 0.874484
eval sink Average Precision: 0.233770
eval bathtub Average Precision: 0.820859
eval garbagebin Average Precision: 0.137844
eval mAP: 0.352924
eval cabinet Recall: 0.368280
eval bed Recall: 0.864198
eval chair Recall: 0.760965
eval sofa Recall: 0.824742
eval table Recall: 0.605714
eval door Recall: 0.364026
eval window Recall: 0.166667
eval bookshelf Recall: 0.610390
eval picture Recall: 0.036036
eval counter Recall: 0.307692
eval desk Recall: 0.700787
eval curtain Recall: 0.283582
eval refrigerator Recall: 0.684211
eval showercurtrain Recall: 0.214286
eval toilet Recall: 0.913793
eval sink Recall: 0.316327
eval bathtub Recall: 0.870968
eval garbagebin Recall: 0.347170
eval AR: 0.513324

If my understanding is correct, the performance of VoteNet seems to be mAP=58.12% at IoU=0.25 and mAP=35.29% at IoU=0.5, which is significantly higher than what is reported in the paper. Is there anything wrong with my understanding of the results? Thanks a lot for your answer.

Best,
Ken

Question about the voting module architecture

Hi,
According to the paper description of the voting module:
The voting module MLP has output sizes of 256, 256, 259 for its fully connected layers. The last fully connected layer does not have ReLU or BatchNorm.

However, it seems that the FC layers in voting_module.py are implemented as convolutional layers:
self.conv1 = torch.nn.Conv1d(self.in_dim, self.in_dim, 1)
self.conv2 = torch.nn.Conv1d(self.in_dim, self.in_dim, 1)
self.conv3 = torch.nn.Conv1d(self.in_dim, (3+self.out_dim) * self.vote_factor, 1)

Is either known to perform better than the other? And how does parameter sharing work in the case of the Conv1d layers?
Thanks
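
For intuition (not from the repo itself): a Conv1d with kernel size 1 applies the same weight matrix independently to every point, which is exactly a fully connected layer shared across points; the two are numerically equivalent. A minimal sketch demonstrating this:

import torch
import torch.nn as nn

B, C_in, N, C_out = 2, 256, 1024, 259
x = torch.randn(B, C_in, N)           # per-vote features, channels-first

conv = nn.Conv1d(C_in, C_out, 1)      # kernel size 1: one weight matrix shared over all N points
fc = nn.Linear(C_in, C_out)
fc.weight.data = conv.weight.data.squeeze(-1)   # (C_out, C_in, 1) conv kernel -> (C_out, C_in)
fc.bias.data = conv.bias.data

y_conv = conv(x)                                  # (B, C_out, N)
y_fc = fc(x.transpose(1, 2)).transpose(1, 2)      # same computation via a shared FC layer
print(torch.allclose(y_conv, y_fc, atol=1e-5))    # True

So the Conv1d form is just a convenient way to apply a shared FC layer to channels-first tensors; neither should perform differently.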

process still killed when running extract_rgbd_data_v2.m

I ran this process on the server; free memory is 16100 MiB. After about 20 minutes I got a segmentation violation and the process was killed. Before running it, I had already commented out the SUNRGBD2Dseg.mat part so the script would work, but the process was still killed. The other two .m scripts run fine.
Is there still any way to skip this step when preparing the SUN RGB-D data?

Color bounding boxes

Dear authors,

I was wondering how, in MeshLab, I can assign different colors to bounding boxes of different semantic classes, like you did in your paper? The following picture is what I get now, and it's quite annoying that all the bounding boxes are the same color.

Thanks in advance!

[screenshot of single-color bounding boxes]

Why train on SUN RGB-D v1 rather than v2?

Hi, I visualized the 3D label boxes of SUN RGB-D v1 and found some unreasonable label boxes that contain the whole scene with a "table" label, while there is no such problem in v2.

Could you please tell me why you used v1 rather than v2 as the training and testing set?

Some questions about training network

Hello!
Thanks for your great job. I trained the network on SUN RGB-D data and got great results.
Now I want to train the network on my own dataset (I have point clouds as XYZ coordinates, 3D bboxes as the XYZ coordinates of their 8 vertices, and the class of each 3D bbox).
I think I need to generate GT votes. These votes should be the coordinates of the object centers, right?
What puzzles me is how many votes should be generated for each object.
In addition, what do I need to add or modify to train this network?
Could you give me some advice? I look forward to your reply. Thank you!
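
In case a sketch helps while waiting for an answer: in the provided datasets, a ground-truth vote for a surface point is the offset from that point to the center of the object it belongs to, and background points get a zero vote with a mask of 0. A minimal, hypothetical sketch (assuming you have per-point instance ids and per-instance box centers; the exact array shapes expected by the loss code should be checked against the repo's dataset classes):

import numpy as np

def compute_gt_votes(points, instance_ids, instance_centers):
    # points: (N, 3); instance_ids: (N,), -1 for background;
    # instance_centers: dict mapping instance id -> (3,) box center.
    votes = np.zeros((points.shape[0], 3), dtype=np.float32)
    vote_mask = np.zeros(points.shape[0], dtype=np.float32)
    for i, inst in enumerate(instance_ids):
        if inst >= 0:
            votes[i] = instance_centers[inst] - points[i]  # offset to the object center
            vote_mask[i] = 1.0
    return votes, vote_mask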

The number of provided proposals using 'per_class_proposal'

Hi Charles,

I am a bit confused about 'per_class_proposal'. According to Figure 8 in the VoteNet paper, it seems that we should assign each proposal to all 10 classes, pick the top-K ranked (e.g., 256) proposals, and then apply NMS.

Meanwhile, the released code duplicates proposals across the 10 classes after NMS, which looks somewhat different from the paper's description. Please correct me if I misunderstand something.

votenet/models/ap_helper.py

Lines 169 to 179 in 257b8d8

for i in range(bsize):
    if config_dict['per_class_proposal']:
        cur_list = []
        for ii in range(config_dict['dataset_config'].num_class):
            cur_list += [(ii, pred_corners_3d_upright_camera[i,j], sem_cls_probs[i,j,ii]*obj_prob[i,j])
                         for j in range(pred_center.shape[1])
                         if pred_mask[i,j]==1 and obj_prob[i,j]>config_dict['conf_thresh']]
        batch_pred_map_cls.append(cur_list)
    else:
        batch_pred_map_cls.append([(pred_sem_cls[i,j].item(), pred_corners_3d_upright_camera[i,j], obj_prob[i,j])
                                   for j in range(pred_center.shape[1])
                                   if pred_mask[i,j]==1 and obj_prob[i,j]>config_dict['conf_thresh']])
end_points['batch_pred_map_cls'] = batch_pred_map_cls

weird kernel dispatch for gradient of 3-interpolate

It appears that the gradient kernel is never actually used, because this code dispatches the same kernel as in the forward pass:

at::Tensor three_interpolate(at::Tensor points, at::Tensor idx,
                             at::Tensor weight) {
  CHECK_CONTIGUOUS(points);
  CHECK_CONTIGUOUS(idx);
  CHECK_CONTIGUOUS(weight);
  CHECK_IS_FLOAT(points);
  CHECK_IS_INT(idx);
  CHECK_IS_FLOAT(weight);

  if (points.type().is_cuda()) {
    CHECK_CUDA(idx);
    CHECK_CUDA(weight);
  }

  at::Tensor output =
      torch::zeros({points.size(0), points.size(1), idx.size(1)},
                   at::device(points.device()).dtype(at::ScalarType::Float));

  if (points.type().is_cuda()) {
    three_interpolate_kernel_wrapper(
        points.size(0), points.size(1), points.size(2), idx.size(1),
        points.data<float>(), idx.data<int>(), weight.data<float>(),
        output.data<float>());
  } else {
    AT_CHECK(false, "CPU not supported");
  }

  return output;
}

at::Tensor three_interpolate_grad(at::Tensor grad_out, at::Tensor idx,
                                  at::Tensor weight, const int m) {
  CHECK_CONTIGUOUS(grad_out);
  CHECK_CONTIGUOUS(idx);
  CHECK_CONTIGUOUS(weight);
  CHECK_IS_FLOAT(grad_out);
  CHECK_IS_INT(idx);
  CHECK_IS_FLOAT(weight);

  if (grad_out.type().is_cuda()) {
    CHECK_CUDA(idx);
    CHECK_CUDA(weight);
  }

  at::Tensor output =
      torch::zeros({grad_out.size(0), grad_out.size(1), m},
                   at::device(grad_out.device()).dtype(at::ScalarType::Float));

  if (grad_out.type().is_cuda()) {
    three_interpolate_kernel_wrapper(
        grad_out.size(0), grad_out.size(1), grad_out.size(2), m,
        grad_out.data<float>(), idx.data<int>(), weight.data<float>(),
        output.data<float>());
  } else {
    AT_CHECK(false, "CPU not supported");
  }

  return output;
}

Specifically, I think this line in three_interpolate_grad:

three_interpolate_kernel_wrapper(

should have been three_interpolate_grad_kernel_wrapper instead. Is this a mistake, or was it done intentionally? (The change log above, dated 10/20/2019, notes that the gradient computation of this op has since been corrected.)

OSError: CUDA_HOME environment variable is not set

Hi, when I use "python setup.py install" to compile the CUDA layers for PointNet++, I meet a problem:

Traceback (most recent call last):
  File "setup.py", line 24, in <module>
    "nvcc": ["-O2", "-I{}".format("{}/include".format(_ext_src_root))],
  File "/home/dai/.local/lib/python3.5/site-packages/torch/utils/cpp_extension.py", line 328, in CUDAExtension
    library_dirs += library_paths(cuda=True)
  File "/home/dai/.local/lib/python3.5/site-packages/torch/utils/cpp_extension.py", line 393, in library_paths
    paths.append(_join_cuda_home(lib_dir))
  File "/home/dai/.local/lib/python3.5/site-packages/torch/utils/cpp_extension.py", line 724, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

Could you tell me how to solve this problem? Please~
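
A common fix is to point the CUDA_HOME environment variable at your CUDA install root before compiling; the path below assumes a default CUDA installation and may differ on your machine:

export CUDA_HOME=/usr/local/cuda
cd pointnet2
python setup.py install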

Generating the meta data for more ScanNet objects

Hi there,

I'm wondering if there's any way to generate a new "scannet_means.npz" file for more ScanNet objects? Currently I'm trying to perform object detection via VoteNet on more objects, but "scannet_means.npz" only supports objects from the 18 NYU40 classes.

I'd be very appreciative if someone could give me a hint. :)

Cheers,
Dave
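
In case a sketch helps: scannet_means.npz holds the mean box size per class (it is loaded as the mean size array in the dataset config), so one way to support more classes is to recompute the means from your own annotations. A hypothetical sketch, assuming you can collect the (l, w, h) sizes of all annotated training boxes per class (the npz key the repo expects should be checked against model_util_scannet.py):

import numpy as np

def save_mean_sizes(sizes_per_class, num_classes, out_path='my_means.npz'):
    # sizes_per_class: dict mapping class index -> list of (3,) arrays of (l, w, h)
    mean_sizes = np.zeros((num_classes, 3))
    for cls_idx, sizes in sizes_per_class.items():
        mean_sizes[cls_idx] = np.stack(sizes, axis=0).mean(axis=0)
    np.savez(out_path, mean_sizes)   # stored under numpy's default key 'arr_0'
    return mean_sizes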

How to use Lidar Point clouds for VoteNet?

I am really new to the field and don't have much understanding of point clouds.
I want to use LiDAR point clouds (LPC), like the nuScenes dataset, to train VoteNet.
I read the part of the readme file where instructions are given for 'Pro Users' to convert LPC into the desired format, but I didn't really understand how to follow them.
Can anyone guide me?

Problems with trimesh

Does anyone encounter problems when trying to dump the scene?
I installed trimesh 2.35.39 and I get the following error:

Finished detection. 37 object detected.
Traceback (most recent call last):
  File "demo.py", line 101, in <module>
    MODEL.dump_results(end_points, dump_dir, DC, True)
  File "D:\Projects\Depth\votenet\models\dump_helper.py", line 86, in dump_results
    pc_util.write_oriented_bbox(obbs[objectness_prob>DUMP_CONF_THRESH,:], os.path.join(dump_dir, '%06d_pred_confident_bbox.ply'%(idx_beg+i)))
  File "D:\Projects\Depth\votenet\utils\pc_util.py", line 421, in write_oriented_bbox
    mesh_list = trimesh.util.concatenate(scene.dump())
  File "d:\ProgramData\Anaconda3\envs\pytorch_1_1_python36\lib\site-packages\trimesh\scene\scene.py", line 499, in dump
    for node_name in self.graph.nodes_geometry:
  File "d:\ProgramData\Anaconda3\envs\pytorch_1_1_python36\lib\site-packages\trimesh\caching.py", line 88, in get_cached
    value = function(*args, **kwargs)
  File "d:\ProgramData\Anaconda3\envs\pytorch_1_1_python36\lib\site-packages\trimesh\scene\transforms.py", line 224, in nodes_geometry
    'geometry' in self.transforms.node[n]):
AttributeError: 'EnforcedForest' object has no attribute 'node'

Any ideas?

Some questions about SUN RGB-D

Hi, when I run extract_rgbd_data_v2.m, load('SUNRGBD2Dseg.mat') always fails. Could you help me? Thank you very much.

BoxNet

Hi! Is there a way to train the BoxNet model instead of VoteNet?

How is the value of K (the number of clusters) determined ?

Hi,

Thank you for sharing this repo. I'm wondering how you selected the value of K, the number of clusters, which I think is the num_proposal parameter in the VoteNet class (correct me if I'm wrong).

Is it just an arbitrary fixed number? Have you observed any performance changes when modifying this parameter?

Thanks!

Loss going down with training but precision and recall stay at zero

I tried training VoteNet on some outdoor scene data we have. The weird thing I am noticing is that although the loss has gone down considerably over nearly a day of training, the precision and recall numbers consistently stay at zero.

Any ideas/insights?

eval mean box_loss: 1.709604
eval mean center_loss: 1.709604
eval mean heading_cls_loss: 0.000000
eval mean heading_reg_loss: 0.000000
eval mean loss: 104.791534
eval mean neg_ratio: 0.990234
eval mean obj_acc: 1.000000
eval mean objectness_loss: 0.000164
eval mean pos_ratio: 0.000000
eval mean sem_cls_loss: 0.000000
eval mean size_cls_loss: 0.000000
eval mean size_reg_loss: 0.000000
eval mean vote_loss: 8.769468
0 0
2 0
4 0
1 0
5 0
3 0
eval Car Average Precision: 0.000000
eval Tree Average Precision: 0.000000
eval Building Average Precision: 0.000000
eval mAP: 0.000000
eval Car Recall: 0.000000
eval Tree Recall: 0.000000
eval Building Recall: 0.000000

Weirdly enough, I got these zero precision and recall values even when I used a couple of scenes from the training data itself for evaluation. At least for those, I would have expected recall and precision to be good. Of course, the goal is eventually to evaluate on data that hasn't been used for training; this was just a test to see whether the model gives non-zero precision/recall at least for scenes used in training.

Precision of Single Point Cloud

Hello, I'd like to know the precision for a single point cloud. Do you know how to modify eval.py so that it calculates the precision of just a single point cloud?

process killed when running extract_rgbd_data_v2.m

I'm using Ubuntu 16.04 and MATLAB 2019a; the extract_rgbd_data_v2.m process just shows "Killed" after a long wait.

It seems I don't have enough memory to load SUNRGBD2Dseg.mat; my machine only has:
MemTotal: 32646712 kB
MemFree: 31509404 kB
MemAvailable: 31353204 kB

Is there any way to skip this step when preparing the SUN RGB-D data?

Getting lower results using the pretrained model

Dear authors,

I tried to evaluate the ScanNet-pretrained model you provided, as illustrated in the README, and got 57.3156 mAP@0.25 and 34.0198 mAP@0.5, but according to issue #11 this result is obviously lower than expected, even lower than my self-reproduced model on ScanNet, which reaches 57.6734 mAP@0.25 and 34.3470 mAP@0.5.

It is quite weird that the same pretrained model on the same dataset gives different results, and I can't tell what has happened here. My guess is that this may relate to data corruption or something similar, because I experienced a network cut-off when downloading ScanNet, but I'm not sure. Could you please share some ideas about this?

---- log_eval of pretrained model ----

Namespace(DUMP_DIR='demo_files/eval_pretrained_scannet', ap_iou_thresholds='0.25,0.5', batch_size=8, checkpoint_path='demo_files/pretrained_votenet_on_scannet.tar', cluster_sampling='seed_fps', conf_thresh=0.05, dataset='scannet', dump_dir='demo_files/eval_pretrained_scannet', faster_eval=False, model='votenet', nms_iou=0.25, no_height=False, num_point=40000, num_target=256, per_class_proposal=True, shuffle_dataset=False, use_3d_nms=True, use_cls_nms=True, use_color=False, use_old_type_nms=False, use_sunrgbd_v2=False, vote_factor=1)
Loaded checkpoint demo_files/pretrained_votenet_on_scannet.tar (epoch: 120)
2019-09-25 07:57:06.985165
eval mean box_loss: 0.132122
eval mean center_loss: 0.041772
eval mean heading_cls_loss: 0.000000
eval mean heading_reg_loss: 0.000000
eval mean loss: 6.451989
eval mean neg_ratio: 0.421337
eval mean obj_acc: 0.847017
eval mean objectness_loss: 0.116719
eval mean pos_ratio: 0.335211
eval mean sem_cls_loss: 0.498210
eval mean size_cls_loss: 0.498671
eval mean size_reg_loss: 0.040483
eval mean vote_loss: 0.404896
eval cabinet Average Precision: 0.360204
eval bed Average Precision: 0.880230
eval chair Average Precision: 0.872860
eval sofa Average Precision: 0.908269
eval table Average Precision: 0.586539
eval door Average Precision: 0.464863
eval window Average Precision: 0.356241
eval bookshelf Average Precision: 0.414334
eval picture Average Precision: 0.067318
eval counter Average Precision: 0.506180
eval desk Average Precision: 0.654182
eval curtain Average Precision: 0.439923
eval refrigerator Average Precision: 0.472297
eval showercurtrain Average Precision: 0.526015
eval toilet Average Precision: 0.956763
eval sink Average Precision: 0.549155
eval bathtub Average Precision: 0.910629
eval garbagebin Average Precision: 0.390812
eval mAP: 0.573156
eval cabinet Recall: 0.750000
eval bed Recall: 0.950617
eval chair Recall: 0.912281
eval sofa Recall: 0.989691
eval table Recall: 0.822857
eval door Recall: 0.704497
eval window Recall: 0.609929
eval bookshelf Recall: 0.844156
eval picture Recall: 0.238739
eval counter Recall: 0.846154
eval desk Recall: 0.929134
eval curtain Recall: 0.731343
eval refrigerator Recall: 0.964912
eval showercurtrain Recall: 0.821429
eval toilet Recall: 1.000000
eval sink Recall: 0.724490
eval bathtub Recall: 0.967742
eval garbagebin Recall: 0.694340
eval AR: 0.805684
eval cabinet Average Precision: 0.064891
eval bed Average Precision: 0.799895
eval chair Average Precision: 0.674196
eval sofa Average Precision: 0.681848
eval table Average Precision: 0.446467
eval door Average Precision: 0.152873
eval window Average Precision: 0.088181
eval bookshelf Average Precision: 0.286832
eval picture Average Precision: 0.015800
eval counter Average Precision: 0.140754
eval desk Average Precision: 0.310386
eval curtain Average Precision: 0.138233
eval refrigerator Average Precision: 0.217701
eval showercurtrain Average Precision: 0.105123
eval toilet Average Precision: 0.751489
eval sink Average Precision: 0.256540
eval bathtub Average Precision: 0.848165
eval garbagebin Average Precision: 0.144188
eval mAP: 0.340198
eval cabinet Recall: 0.341398
eval bed Recall: 0.876543
eval chair Recall: 0.751462
eval sofa Recall: 0.804124
eval table Recall: 0.628571
eval door Recall: 0.376874
eval window Recall: 0.212766
eval bookshelf Recall: 0.662338
eval picture Recall: 0.049550
eval counter Recall: 0.307692
eval desk Recall: 0.637795
eval curtain Recall: 0.223881
eval refrigerator Recall: 0.596491
eval showercurtrain Recall: 0.285714
eval toilet Recall: 0.827586
eval sink Recall: 0.387755
eval bathtub Recall: 0.870968
eval garbagebin Recall: 0.375472
eval AR: 0.512054

Some questions about the figure

Thank you very much for your good work. I want to know which software you used to produce the figure in the paper: Mayavi, MeshLab, or another one?

Get the class label for the predicted bounding boxes

Hi,

Can someone give some suggestions on how to get the object class label for the predicted bounding boxes? I assume it should be in the *_pred_map_cls.txt files, but I can't figure out how to interpret the values inside those text files.

Thanks!
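
One suggestion, grounded in the ap_helper.py snippet quoted earlier on this page: each entry appended to batch_pred_map_cls is a tuple of (semantic class index, 8x3 box corners, confidence score), so the class name can be read off in code before (or instead of) parsing the dumped text files. A minimal sketch, assuming DC is the dataset config with the class2type mapping used elsewhere in the repo:

# after parse_predictions(), for the first scan in the batch:
for sem_cls_idx, box_corners, score in batch_pred_map_cls[0]:
    class_name = DC.class2type[sem_cls_idx]           # e.g. 'chair'
    print(class_name, float(score), box_corners.shape)  # (8, 3) corner array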

why use "sa1_inds"?

In backbone_module.py, why use 'sa1_inds'? I think it would be correct to use 'sa2_inds' instead.

Results on the val set

Hi, as reported in the paper, the results are tested on the val set for both the SUN RGB-D and ScanNet datasets. The evaluation is run every 10 epochs, but the result varies over a small range once learning has converged.
My question is how to choose the final result to report: the highest one, an average over some epochs, or something else?

Why do some categories score so low?

Hi, I am reading this paper and have a question. Do you know why the mAP scores for the dresser, bookshelf and desk categories are so low? Is it due to dataset problems, or are these objects simply more difficult to detect?

Some questions about VoteNet

If the camera has an arbitrary orientation, can VoteNet predict 3-DoF orientation directly in the camera coordinate system?

Why set the flag 'per_class_proposal' to true?

Hi, why is per_class_proposal set to true at evaluation time? When it is true, every box is added to each class with a different score.

Why is this helpful for achieving better results?

Upgrading to PyTorch 1.3

Your README file says "Note: there is some incompatibility with newer version of Pytorch (e.g. v1.3), which is to be fixed".

Are you planning to do that in the near future?

My motivation to upgrade to the latest version of PyTorch (1.3) is that, as of now, the following error message pops up sporadically while training VoteNet. Based on internet searches, this error seems to have been fixed in the latest version of PyTorch.

File "...l\Continuum\anaconda3\envs\votenet\lib\site-packages\torch\serialization.py", line 580, in _load
deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 8 more bytes. The file might be corrupted.

syntax error

Hi, when I run python models/votenet.py, I hit the following problem:

File "/home/magic/tanglv/assignment6/votenet/utils/pc_util.py", line 484
trimesh.io.export.export_mesh(mesh_list, f'{filename}.ply', file_type='ply')
^
SyntaxError: invalid syntax

Can you tell me what causes this syntax error?

Training time on the SUN RGB-D dataset

Hi all,

I was wondering what the training time of VoteNet is on the SUN RGB-D dataset. In my experience, using one 1080 Ti GPU on an Ubuntu 16.04 server, it takes around a day. In the VoteNet paper, it's about 10 hours on a Volta Quadro GP100 GPU.

Is this difference caused by the GPU version, or did I set something wrong? I disabled the evaluation process during training. Any comment is appreciated.

Best
