autonomousvision / convolutional_occupancy_networks
[ECCV'20] Convolutional Occupancy Networks
Home Page: https://pengsongyou.github.io/conv_onet
License: MIT License
Hi,
Is there a way to apply this approach to point clouds of urban scenes and get a 3D reconstruction?
Hi,
I noticed that during training (on ShapeNet), the input points for feature extraction are loaded from "points.npz" while the query points are loaded from "pointcloud.npz". I am wondering what the difference is between the two files?
I found that the points data (i.e. the value of the key 'points') in these two files do not seem to be consistent, which would not make sense for training.
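To be concrete, this is how I'm looking at the two files side by side (the key names are simply what I see when loading the preprocessed data):

```python
import numpy as np

# Inspect which arrays each file actually contains for one model.
pc = np.load('pointcloud.npz')
pts = np.load('points.npz')
print(pc.files)    # e.g. ['points', 'normals', ...]      -- dense surface samples
print(pts.files)   # e.g. ['points', 'occupancies', ...]  -- query points with occupancy labels
```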
Thanks in advance!
I think it should be if walls[3] == 1. Is it a typo?
Are you planning to release the config files for training the models from scratch on the different datasets? I am trying to reproduce the ShapeNet results on a 32**3 grid, but the training config file is not present in the repo, and the generation config files do not contain all the fields of a training config.
Hi, I just tried training on the synthetic indoor scene dataset downloaded from the link, and it worked well. I want to compare the generated mesh with the ground-truth mesh; however, all I can find are some point clouds that seem to be the ground truth. Could you please help me with the ground-truth meshes of the synthetic indoor scenes? Thanks a lot.
Hi there,
It seems like when calling the add_key function, usually nothing happens. add_key is defined as:
```python
def add_key(base, new, base_name, new_name, device=None):
    ''' Add new keys to the given input

    Args:
        base (tensor): inputs
        new (tensor): new info for the inputs
        base_name (str): name for the input
        new_name (str): name for the new info
        device (device): pytorch device
    '''
    if (new is not None) and (isinstance(new, dict)):
        if device is not None:
            for key in new.keys():
                new[key] = new[key].to(device)
        base = {base_name: base,
                new_name: new}
    return base
```
Which means that something only happens when new is a dictionary. But when it's called in the code, for example here, nothing seems to happen.
(PS: the documentation of add_key says new should be a tensor, but the code only does something when new is a dict.)
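For concreteness, a tiny check of what I mean (the tensors are made up, and I'm assuming add_key is importable from src.common):

```python
import torch
from src.common import add_key  # assumption: this is where add_key lives

points = torch.rand(1, 2048, 3)

# When `new` is a dict, the input gets wrapped into a dict holding both entries:
out = add_key(points, {'normals': torch.rand(1, 2048, 3)}, 'points', 'points.normals')
print(type(out))   # <class 'dict'>

# When `new` is a plain tensor (as the docstring suggests), nothing happens:
out = add_key(points, torch.rand(1, 2048, 3), 'points', 'points.normals')
print(type(out))   # <class 'torch.Tensor'> -- the input is returned unchanged
```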
Thanks.
Because the counter is incremented before the first line is written to the string, the split lists never contain scene 0; they start at 1.
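A tiny illustration of the off-by-one I mean (the variable names are my own, not the actual script's):

```python
# If the counter is bumped before the first entry is written,
# scene 0 never appears in the split list.
scene_names = []
counter = 0
for _ in range(4):
    counter += 1                           # incremented first ...
    scene_names.append(f'{counter:08d}')   # ... so the list starts at '00000001'
print(scene_names)   # ['00000001', '00000002', '00000003', '00000004'] -- no scene 0
# Appending first and incrementing afterwards would keep scene 0 in the splits.
```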
In the line mentioned below, which is part of the documentation, it is mentioned that you are using conditional batch norm. However, in the code you are using residual blocks without any batch norm. It seems to be legacy documentation. Please let me know if you actually intend to use batch norm (conditional or otherwise).
Edit: I would really appreciate it if you could also give some intuition about why you did not use batch normalization at all.
Hi~
If I want to train a model on outdoor data like KITTI, which provides 64-beam Velodyne LiDAR point clouds, how can I generate ground-truth occupancy values for a point cloud?
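For context, what I have in mind is the ShapeNet-style labeling pipeline, which assumes a watertight mesh (something raw LiDAR sweeps do not give you directly); a rough sketch of that pipeline, with made-up filenames:

```python
import numpy as np
import trimesh

mesh = trimesh.load('scene_watertight.off')            # assumes a watertight mesh already exists
points = np.random.uniform(mesh.bounds[0], mesh.bounds[1], size=(100000, 3))
occupancies = mesh.contains(points)                     # True = point lies inside the surface

np.savez('points.npz',
         points=points.astype(np.float16),
         occupancies=np.packbits(occupancies))          # packed, like the preprocessed points.npz appears to be
```

For KITTI the hard part would be obtaining such a watertight surface in the first place.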
gcc: error: /home/deploy/tmp/pycharm_project_794/convolutional_occupancy_networks/build/temp.linux-x86_64-3.7/src/utils/libkdtree/pykdtree/kdtree.o: No such file or directory
gcc: error: /home/deploy/tmp/pycharm_project_794/convolutional_occupancy_networks/build/temp.linux-x86_64-3.7/src/utils/libkdtree/pykdtree/_kdtree_core.o: No such file or directory
Dear authors,
Thank you for this nice work and sharing the code with the community! I wonder if there are any post-processing steps done (e.g. smoothing the meshes) after the 3D reconstruction, which might not be mentioned in the paper.
Best regards
I would like to know the format needed to train a convolutional occupancy network on a custom dataset. I have the sparse point cloud for my custom dataset.
I would appreciate any help with this.
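In case it helps to be concrete, this is the layout I'm assuming based on the preprocessed ShapeNet data (folder and file names are just how they appear to me there):

```
data/MyDataset/
  myclass/
    train.lst          # one model folder name per line
    val.lst
    test.lst
    model_0000/
      pointcloud.npz   # surface samples ('points', 'normals') -> network input
      points.npz       # uniform query points ('points') + packed 'occupancies' -> supervision
    model_0001/
      ...
```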
Hello Songyou, thank you for such a good paper and the code to go along with it.
The paper mentions the canonical plane is discretized as H*W during feature aggregation. What do H and W represent for point cloud inputs?
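For what it's worth, my current understanding is that H and W are simply the plane resolution (a config value), independent of the number of input points; a sketch with made-up sizes, mimicking the scatter-mean aggregation pattern:

```python
import torch
from torch_scatter import scatter_mean

B, N, C, reso = 2, 3000, 32, 64                    # reso -> H = W = 64 plane cells
feat = torch.rand(B, N, C)                         # per-point features from the point encoder
index = torch.randint(0, reso * reso, (B, 1, N))   # flat plane-cell index of each projected point

# Average the features of all points that fall into the same plane cell:
plane = scatter_mean(feat.permute(0, 2, 1), index, dim_size=reso * reso)
plane = plane.reshape(B, C, reso, reso)            # (B, C, H, W) feature plane fed to the 2D U-Net
```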
Hi @pengsongyou,
thank you for open-sourcing the code. I have two questions regarding the evaluation:
refine_mesh performs an optimization of the mesh after Marching Cubes. Is this function used for the evaluation or only for visualizations? In all config files, refine is set to False.
Thanks in advance,
Zan
torch_scatter/__init__.py", line 14, in <module>
f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
AttributeError: 'NoneType' object has no attribute 'origin'
Hi, thanks for sharing your code. I'm training 3planes and grid64 on the synthetic room dataset. How many epochs do you train to get the final numbers? Thanks!
Hello @pengsongyou, thank you so much for the code. Could you provide some explanation of coordinate2index? It is my first time working with point clouds and I am a little confused by that function.
Thanks.
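For anyone else wondering, this is my mental model of the function (a sketch, not the exact repo code): coordinates that are already normalized to [0, 1] are mapped to the cell they fall into, and that cell is flattened into a single index so per-point features can be scattered onto the plane (or grid):

```python
import torch

def coordinate2index_sketch(p, reso, coord_type='2d'):
    # p: (B, N, 2) or (B, N, 3) coordinates normalized to [0, 1]
    x = (p * reso).long()                  # which cell each point falls into
    index = x[..., 0] + reso * x[..., 1]   # row-major flattening of the 2D cell
    if coord_type == '3d':
        index += reso ** 2 * x[..., 2]     # extend the flattening to a 3D grid
    return index                           # (B, N) flat indices, usable with scatter_mean
```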
Hi, I'm wondering why normalize_3d_coordinate is needed if the point coordinates are already in the range [-0.5, 0.5]?
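My guess at the answer, in case it helps others: the object coordinates live roughly in [-0.5, 0.5] (plus the padding from the config), while coordinate2index and the grid-sampling lookups expect [0, 1], so the function shifts and slightly shrinks the range; something like:

```python
def normalize_3d_coordinate_sketch(p, padding=0.1):
    # Map roughly [-0.5, 0.5] (with a padding margin) into [0, 1).
    p_nor = p / (1 + padding + 1e-4)   # shrink so padded points stay inside the unit box
    return p_nor + 0.5                 # shift from being centered at 0 to [0, 1)
```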
Hi,
I'm following your installation instructions and I'm having trouble with the build of pykdtree (the rest of the extensions build OK). Reproduction is below; I'm on Ubuntu 20.04 with miniconda (not sure what else is relevant here). Is there a particular build tool I need to install to get this to work?
With the conda env activated (built from the environment yaml), I run this from the project's root folder:
python setup.py build_ext --inplace
The output is this:
running build_ext
building 'src.utils.libkdtree.pykdtree.kdtree' extension
Emitting ninja build file /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/src/utils/libkdtree/pykdtree/_kdtree_core.o.d -pthread -B /home/yishai/miniconda3/envs/conv_onet/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yishai/miniconda3/envs/conv_onet/lib/python3.8/site-packages/numpy/core/include -I/home/yishai/miniconda3/envs/conv_onet/include/python3.8 -c -c /home/yishai/repos/convolutional_occupancy_networks/src/utils/libkdtree/pykdtree/_kdtree_core.c -o /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/src/utils/libkdtree/pykdtree/_kdtree_core.o -std=c99 -O3 -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=kdtree -D_GLIBCXX_USE_CXX11_ABI=1
FAILED: /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/src/utils/libkdtree/pykdtree/_kdtree_core.o
c++ -MMD -MF /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/src/utils/libkdtree/pykdtree/_kdtree_core.o.d -pthread -B /home/yishai/miniconda3/envs/conv_onet/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yishai/miniconda3/envs/conv_onet/lib/python3.8/site-packages/numpy/core/include -I/home/yishai/miniconda3/envs/conv_onet/include/python3.8 -c -c /home/yishai/repos/convolutional_occupancy_networks/src/utils/libkdtree/pykdtree/_kdtree_core.c -o /home/yishai/repos/convolutional_occupancy_networks/build/temp.linux-x86_64-3.8/src/utils/libkdtree/pykdtree/_kdtree_core.o -std=c99 -O3 -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=kdtree -D_GLIBCXX_USE_CXX11_ABI=1
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-std=c99’ is valid for C/ObjC but not for C++
/home/yishai/repos/convolutional_occupancy_networks/src/utils/libkdtree/pykdtree/_kdtree_core.c:49:3: error: conflicting declaration ‘typedef struct Node_float Node_float’
49 | } Node_float;
| ^~~~~~~~~~
... etc. — more of these compiler errors follow.
Hi @pengsongyou.
Can you comment on how your model makes consistent predictions around crop boundaries?
Hi there, thank you for making the code public!
I have a question about data sampling strategy in synthetic indoor scene dataset.
After placing the objects in the room, do you randomly sample points in the whole scene and query their ground-truth occupancy? If so, wouldn't the occupancy data be severely biased toward empty (since most of the space inside the room should be empty)?
Another question: how many point-occupancy pairs do you sample for one batch?
I'm not able to clearly locate these details in the code, so I'm raising this issue to ask. Thank you for your time!
Hello, your work is great and I am very interested in your article, and I have some questions to ask you.
I have five GPUs; how can I use multiple GPUs to run the code during training? @pengsongyou
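In case it is useful while waiting for an answer: a generic PyTorch way to spread batches over several GPUs is nn.DataParallel; I am not sure the repo's trainer supports this out of the box, so this is only a sketch with a placeholder network:

```python
import torch
import torch.nn as nn

# Placeholder network; in practice this would be the model built by config.get_model(cfg).
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # replicates the model and splits each batch across GPUs
model = model.to('cuda' if torch.cuda.is_available() else 'cpu')
```

The batch size in the config would then be the total across all GPUs.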
Thank you very much for this repo and your paper.
This is not so much an issue as it is a set of shared tips for others who, like me, may be struggling to run this code on the latest Nvidia graphics cards. My specs are:
Ubuntu 18.04
2 x Nvidia RTX 3090
g++ 9.4/gcc 8.4/ninja 1.8
Specifically, I had two main issues. My solution was:
1. Make sure the $PATH and $LD_LIBRARY_PATH environment variables are configured correctly (see instruction 9.1.1 here - but pointing to cuda 11.1 instead of 11.4)
2. conda create -n my_conv_onet python=3.6 and conda activate my_conv_onet
3. pip install numpy cython
4. Modify the setup.py file by removing the torch.utils.cpp_extension import and the cmd_class argument from the setup function
5. python setup.py build_ext --inplace
6. pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
7. pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
8. pip install matplotlib tensorboardx trimesh pyyaml tqdm
This should then provide you with a fully installed environment from which to run the code.
I will print my conda list in a comment below.
Hi,
Thank you for making the code for this nice project publicly available.
If I have a point cloud stored in a generic .ply file, how can I process it in such a way that I can run inference with the model that was trained on matterport? So is there some "build_dataset.py" file that is not for a specific dataset, but can just be used for a normal .ply file?
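What I have been trying in the meantime is just my own sketch (not an official build_dataset.py; the centering/scaling convention is a guess on my part):

```python
import numpy as np
import trimesh

pc = trimesh.load('scene.ply')                        # generic point cloud
points = np.asarray(pc.vertices, dtype=np.float32)

# Center and scale into a unit-ish cube, since the pretrained models expect
# normalized coordinates (the exact convention may differ per dataset).
points -= points.mean(axis=0)
points /= 2.0 * np.abs(points).max()

np.savez('pointcloud.npz', points=points)             # key name taken from the demo data
```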
Best,
Duncan
Hi:
thanks for making this great work available here!
I would like to know the total training time for 3000 epochs for the model in this paper (and whether you used a single GPU?). I find it takes very long to train the model on a single 2080Ti.
When I evaluated the generated ShapeNet meshes myself, I found that the chamfer-L1 I got was an order of magnitude smaller than the value reported in the paper (I got 0.0048 instead of 0.048, and I'm using the official evaluation script in this repo). So I'm wondering whether the chamfer-L1 metric reported in the paper is the real value multiplied by 10?
Hi @pengsongyou, thanks a lot for sharing this excellent work. I wonder, would you mind also sharing the code for Screened Poisson Surface Reconstruction? Or did you use an open-source implementation?
Best,
Xuyang
Hi,
I am interested in running the code on Matterport data, but I can't find a points.npy file similar to the one in the ShapeNet dataset, which has points sampled randomly inside the bounds. Can you please help me with this?
Thanks
Shamit
Hi, I'm using generate.py to reconstruct a mesh from my own 3D pointcloud.npz file. Since the result isn't ideal, I wanted to inspect the generated point cloud but failed. I'm wondering whether it is because I used the wrong config file, or for some other reason, that hasattr(generator, 'generate_pointcloud') is always False? Thank you!
Hi! Thanks for your great work! Recently I have wanted to apply your method to 3D reconstruction from very sparse input point clouds (often fewer than 1000 points). I see that you use 3000 points in "configs/pointcloud/shapenet_3plane.yaml". I have tried ONet with 300 input points before, and its performance on my task is okay. So I wonder whether you have tried ConvONet with fewer input points, and what its IoU is.
I understand that because you need to project point features to grids and then perform convolution, more points would be better. I just wonder whether you have tried other input point counts in your experiments and whether performance is sensitive to this. Thank you!
Hi, I'm wondering how you visualize the point cloud as a set of small spheres as in the paper's figures? I find it too hard to distinguish the points if I render them directly. Could you please share the visualization tool or source code? Thanks a lot!
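For reference, one way to get a similar look (not necessarily the authors' tool) would be to replace each point with a small sphere mesh, e.g. via trimesh:

```python
import numpy as np
import trimesh

points = np.load('pointcloud.npz')['points'][:3000]            # subsample for speed
spheres = [trimesh.primitives.Sphere(radius=0.005, center=p) for p in points]
trimesh.util.concatenate(spheres).export('points_as_spheres.ply')
```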
Hi,
would it be possible to share the watertight models of the ShapeNet subset you used?
Kind regards!
Hi Pengsong,
Could you please send me the evaluation settings for ScanNet (scannet.yaml)?
Many thanks in advance!
Best,
Bing
Hi,
When I run the following script:
python generate.py configs/pointcloud_crop/demo_matterport.yaml
I get: RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB
I tried reducing the batch size from 2 to 1, but it still does not work.
Any solution?
Hi! It's nice to see your work on implicit representations of 3D shapes. May I have your thoughts on how the (convolutional) occupancy network can be used in real applications?
For example, if a robot wants to grab a chair in front of it with its arm, does it need to query all the points between itself and the target item (the chair) so as to precisely determine the coordinates of the legs? In other words, how can the occupancy network be used to calculate distances?
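To make my question concrete, this is the kind of usage I imagine (a sketch; I am assuming model.decode(p, c) returns a Bernoulli distribution over occupancy, as the model class seems to do):

```python
import numpy as np
import torch

def distance_along_ray(model, c, origin, direction, max_dist=1.0, n_samples=256, thresh=0.5):
    # March from the robot towards the target and report the first occupied sample.
    t = np.linspace(0.0, max_dist, n_samples)
    pts = origin[None, :] + t[:, None] * direction[None, :]
    p = torch.from_numpy(pts).float().unsqueeze(0)             # (1, n_samples, 3) query points
    with torch.no_grad():
        occ = torch.sigmoid(model.decode(p, c).logits)[0]      # occupancy probability per sample
    hits = torch.nonzero(occ > thresh).flatten()
    return float(t[int(hits[0])]) if len(hits) > 0 else None   # distance to the first occupied point
```

Is this roughly how you would expect the network to be used for such a task?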
Thank you.
Hi,
Thanks for your excellent work!
Since it seems that noise is still added to the point cloud during inference, I am wondering whether this operation is only used for evaluation.
Also, would the reconstructed mesh be better if I input a completely clean point cloud?
Hi,
thanks for making this great work available here!
I would like to test one of the pretrained models on my own dataset.
However, I am a bit lost in writing a corresponding config file.
For testing, I simply copied a single points.npz file from ShapeNet to a new folder.
I wrote the following myConfig.yaml file:
```yaml
data:
  classes: ['']
  path: /home/raphael/data
  pointcloud_n: 10000
  pointcloud_file: points.npz
  voxels_file: null
  points_file: null
  points_iou_file: null
training:
  out_dir: out/mine
test:
  model_file: https://s3.eu-central-1.amazonaws.com/avg-projects/convolutional_occupancy_networks/models/pointcloud/shapenet_3plane.pt
generation:
  generation_dir: generation
```
When running python generate.py config/myConfig.yaml I get the following error:
cfg_special = yaml.load(f)
/home/raphael/remote_python/convolutional_occupancy_networks/src/config.py:33: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
cfg = yaml.load(f)
Traceback (most recent call last):
File "generate.py", line 38, in
dataset = config.get_dataset('test', cfg, return_idx=True)
File "/home/raphael/remote_python/convolutional_occupancy_networks/src/config.py", line 134, in get_dataset
inputs_field = get_inputs_field(mode, cfg)
File "/home/raphael/remote_python/convolutional_occupancy_networks/src/config.py", line 202, in get_inputs_field
'Invalid input type (%s)' % input_type)
ValueError: Invalid input type (img)
Could you give me a hint on how to achieve what I want to do?
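While debugging I checked what the merged config actually contains (a sketch of what I ran; I believe the defaults come from configs/default.yaml, where input_type seems to be img):

```python
from src import config

# Load my config merged with the repo defaults, the same way generate.py does.
cfg = config.load_config('config/myConfig.yaml', 'configs/default.yaml')
print(cfg['data']['input_type'])   # prints 'img' unless my config overrides it, which is
                                   # presumably why get_inputs_field raises "Invalid input type (img)"
```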
Kind regards!
Hi,
thank you very much for making this great work available here.
I was experimenting with retraining your network for which I downloaded the preprocessed ShapeNet and synthetic room dataset.
You state in the paper that you "sample 3000 points from the mesh and apply Gaussian noise with zero mean and standard deviation 0.05" to generate noisy input point clouds. (I understand the reasoning in this issue, and can see in the config files that it is actually 0.005.)
However, it seems to me that there is no added noise in either the pointcloud.ply or pointcloud.npz files, in both the ShapeNet and the synthetic room dataset. If I follow your approach correctly, these files are generated with the sample_mesh.py script from the occupancy networks repository. In the export_points function used for generating occupancy samples, noise is added, but not in the export_pointcloud function, which produces the reconstruction input. Shouldn't this be the case, since you postulate reconstructing meshes from noisy point clouds?
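In case others hit the same confusion, my current understanding is that the stored point clouds are indeed clean and the noise is added on the fly when a sample is loaded (driven by the pointcloud_noise value in the configs); a rough, simplified equivalent:

```python
import numpy as np

class PointcloudNoiseSketch:
    ''' Rough equivalent of the on-the-fly noise transform: the .npz on disk stays
    clean, and Gaussian noise with the configured stddev is added at load time. '''
    def __init__(self, stddev=0.005):
        self.stddev = stddev

    def __call__(self, points):
        return points + self.stddev * np.random.randn(*points.shape)
```

Is that the intended behaviour, or should the stored files themselves be noisy?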
Kind regards
Hi, Thanks for sharing this great work!
I would like to ask whether we could get access to the ground truth meshes of the synthetic indoor scene dataset. I might need to resample the field labels from your GT meshes. Thanks!
Hi,
I have successfully trained ConvONet on my own object datasets before with nice results!
I have used the sample_mesh.py script from ONet to generate the pointcloud.npz and points.npz (occupancy points) files.
I would now like to train the network to reconstruct very large scenes, for some of which I have ground-truth meshes. How do I go about this?
Can I simply provide one points.npz and one pointcloud.npz file per scene? How do I make sure I have enough occupancy samples per crop? Should I simply make sure to have 100k occupancy points per crop, where a crop is defined by voxel_size * resolution? Or do I need to do the cropping myself?
Kind regards
Hi, I'm wondering what these two lines do? The input points to the decoder have a shape of B x num_points x 3, and line 60 changes it to B x num_points x 1 x 1 x 3? Also, why do you need to normalize it from [0, 1] to [-1, 1] for the grid sampling? Thanks
convolutional_occupancy_networks/src/conv_onet/models/decoder.py
Lines 60 to 61 in f44d413
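My current reading of those two lines, with a self-contained sketch (all sizes made up): F.grid_sample expects sampling coordinates in [-1, 1] and, for a 3D feature volume, a grid shaped (B, D, H, W, 3), so the N query points are treated as an N x 1 x 1 pseudo-volume:

```python
import torch
import torch.nn.functional as F

B, N, C, reso = 2, 1024, 32, 32
c_grid = torch.rand(B, C, reso, reso, reso)               # 3D feature volume from the encoder
p = torch.rand(B, N, 3)                                   # query points in [0, 1]

vgrid = 2.0 * p - 1.0                                     # [0, 1] -> [-1, 1], grid_sample's convention
vgrid = vgrid[:, :, None, None]                           # (B, N, 1, 1, 3): points as a pseudo-volume
feat = F.grid_sample(c_grid, vgrid, align_corners=True)   # (B, C, N, 1, 1)
feat = feat.squeeze(-1).squeeze(-1)                       # (B, C, N): one feature vector per query point
```

Please correct me if that reading is wrong.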
Dear authors,
Thank you for this nice work and for sharing the code with the community! I wonder which '.yaml' file should follow the 'python train.py' command if I want to retrain the model myself?
Best regards
Thanks for open sourcing the code of ConvONet. This has been very helpful.
Regarding a custom dataset, I was wondering: if I want to run the ConvONet pre-trained models on a custom dataset, which format should it have?
Currently, I have point clouds (.ply files) of multiple buildings/indoor scenes. Each point cloud is about 500 MB. Should I slice them into smaller point clouds, or is running on a large one fine too? In the demo, I can see that the input point clouds are much smaller (around 2 MB).
Please correct me if I'm wrong, but the output of the network will be the corresponding meshes of the point clouds, as well as a reconstruction of the whole point cloud (without noise)?
Thank you.
I have some problems running the code. I hope you can provide the versions of gcc and CUDA you used.
Hi,
To set up the conda environment, I ran the following command:
conda env create -f environment.yaml
It gives the following warning:
"Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you"
I did not pay attention to the warning earlier, but then I had to manually install a lot of packages (probably the ones that should have been installed using pip). By searching, I learned that conda environments used to come with pip installed, but now it has to be listed explicitly.
I've edited the environment.yaml file in the following way to remove the warning and install all the libraries, as well as pip, correctly. I'm opening this issue here to save time, in case someone else is facing the same version conflicts as me.
dependencies:
- pip
- cython=0.29.2
- imageio=2.4.1
.........
Let me know in case I'm doing something wrong or if there is a better way to do this. Thank you.
Hi Songyou,
Thanks for sharing your brilliant work!
I would like to ask two quick questions here. Hopefully this will not take you too much time :)
Hi, thanks for sharing your great work!
I have a few questions:
You mentioned that "our method is independent of the input representation" in the paper. I am curious whether it is possible to use multi-view images as input in your framework. If the answer is yes, could you provide a hint on how to implement that?
If you train your method on one real-world dataset and test on another real-world dataset, do you think it will still work?
As mentioned in Sec. 6 of the supplementary material, you "randomly sample one point within the scene as the center of the crop". May I know how you balance the positive and negative samples (i.e., gt occupancy value 1 and 0) with random sampling?
Looking forward to your reply. Thanks.
Hi Songyou,
thank you for making your excellent work publicly available.
My question is rather general: When using your architecture, is there any particular reason that keeps you from computing SDF values instead of binary occupancy?