
contrastivescenecontexts's Introduction

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) are notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step towards this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; and remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.

[CVPR 2021 Paper] [Video] [Project Page] [ScanNet Data-Efficient Benchmark]

Environment

This codebase was tested with the following environment configurations.

  • Ubuntu 20.04
  • CUDA 10.2
  • GCC 7.3.0
  • Python 3.7.7
  • PyTorch 1.5.1
  • MinkowskiEngine v0.4.3

Installation

We use conda for the installation process:

# Install virtual env and PyTorch
conda create -n sparseconv043 python=3.7
conda activate sparseconv043
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch

# Compile and install MinkowskiEngine 0.4.3.
conda install mkl mkl-include -c intel
wget https://github.com/NVIDIA/MinkowskiEngine/archive/refs/tags/v0.4.3.zip
unzip v0.4.3.zip
cd MinkowskiEngine-0.4.3
python setup.py install
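
After the build finishes, a quick import check helps confirm that PyTorch sees the GPU and that MinkowskiEngine was compiled correctly. This is a minimal sketch; the printed versions should match the environment listed above.

# sanity check for the freshly built environment (sketch)
import torch
import MinkowskiEngine as ME

print("PyTorch:", torch.__version__)            # expect 1.5.1
print("CUDA available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)       # expect 0.4.3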

Next, clone the Contrastive Scene Contexts repository and install the requirements from the root directory.

git clone https://github.com/facebookresearch/ContrastiveSceneContexts.git
cd ContrastiveSceneContexts
pip install -r requirements.txt

Our code also depends on PointGroup and PointNet++.

# Install OPs in PointGroup by:
conda install -c bioconda google-sparsehash
cd downstream/insseg/lib/bfs/ops
python setup.py build_ext --include-dirs=YOUR_ENV_PATH/include
python setup.py install

# Install PointNet++
cd downstream/votenet/models/backbone/pointnet2
python setup.py install

Pre-training on ScanNet

Data Pre-processing

For pre-training, one can generate the ScanNet pair data with the following code (change TARGET and SCANNET_DIR accordingly in the script).

cd pretrain/scannet_pair
./preprocess.sh

This script first extracts point clouds from partial frames and then computes, for each scene, a filelist of overlapping partial frames. Next, generate a combined txt file called overlap30.txt that merges the per-scene filelists by running:

cd pretrain/scannet_pair
python generate_list.py --target_dir TARGET

The resulting overlap30.txt should be placed in the folder TARGET/splits.
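
As a quick sanity check, the sketch below counts the entries in overlap30.txt and moves the file into TARGET/splits. It assumes generate_list.py writes overlap30.txt directly under TARGET; adjust the paths if your setup differs.

# move overlap30.txt into TARGET/splits (sketch; set `target` to your TARGET path)
import os, shutil

target = "/path/to/TARGET"
src = os.path.join(target, "overlap30.txt")
dst_dir = os.path.join(target, "splits")
os.makedirs(dst_dir, exist_ok=True)

with open(src) as f:
    print(sum(1 for line in f if line.strip()), "entries in overlap30.txt")

shutil.move(src, os.path.join(dst_dir, "overlap30.txt"))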

Pre-training

Our codebase supports multi-GPU training via PyTorch's DistributedDataParallel (DDP) module. To train Contrastive Scene Contexts with 8 GPUs (batch_size=32, i.e. 4 per GPU) on a single server:

cd pretrain/contrastive_scene_contexts
# Pretrain with SparseConv backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_sparseconv.sh
# Pretrain with PointNet++ backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_pointnet2.sh
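
Before launching, it can help to confirm that all 8 GPUs are visible to PyTorch, since batch_size=32 is split into 4 samples per GPU. A minimal check (sketch):

# verify GPU visibility before starting DDP training (sketch)
import torch

n_gpus = torch.cuda.device_count()
print("visible GPUs:", n_gpus)
for i in range(n_gpus):
    print(" ", i, torch.cuda.get_device_name(i))
if n_gpus > 0:
    print("per-GPU batch size at batch_size=32:", 32 // n_gpus)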

ScanNet Downstream Tasks

Data Pre-Processing

We provide code for pre-processing the data for the ScanNet downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation. We use SCANNET_DATA to refer to where the raw ScanNet data lives and SCANNET_OUT_PATH to denote the output path of the processed ScanNet data.

# Edit path variables: SCANNET_DATA and SCANNET_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing/scannet
python collect_indoor3d_data.py --input SCANNET_DATA --output SCANNET_OUT_PATH
# copy the filelists
cp -r split SCANNET_OUT_PATH
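
Once preprocessing finishes, a quick file count helps verify the output. The sketch below assumes the script writes one .ply file per scene under SCANNET_OUT_PATH; adjust the extension if your output format differs.

# count processed scenes (sketch)
import glob, os

out_path = "/path/to/SCANNET_OUT_PATH"
plys = glob.glob(os.path.join(out_path, "**", "*.ply"), recursive=True)
print(len(plys), "processed .ply files found under", out_path)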

For ScanNet detection data generation, please refer to VoteNet ScanNet Data. Run the following command to soft-link the generated detection data (located at PATH_DET_DATA) to the expected location:

# soft link detection data
cd downstream/det/
ln -s PATH_DET_DATA datasets/scannet/scannet_train_detection_data

For Data-Efficient Learning, download the scene_list, points_list, and bbox_list from the ScanNet Data-Efficient Benchmark. To perform Active Selection for the points_list, run the following code:

# Get features per point
cd downstream/semseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/inference_features.sh
# run k-means on feature space
cd lib
python sampling_points.py --point_data SCANNET_OUT_PATH --feat_data PATH_CHECKPOINT
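
Conceptually, the active selection step clusters the per-point features and keeps one representative point per cluster. The sketch below only illustrates that idea with scikit-learn; it is not the repository's sampling_points.py, and the feats array is assumed to hold the (N, C) per-point features produced by inference_features.sh.

# illustrative k-means sampling in feature space (sketch, not the repo's sampling_points.py)
import numpy as np
from sklearn.cluster import KMeans

def sample_points_by_kmeans(feats, num_points=20):
    # cluster the per-point features into `num_points` groups
    kmeans = KMeans(n_clusters=num_points, n_init=10).fit(feats)
    sampled = []
    for c in range(num_points):
        # keep the point whose feature vector is closest to the cluster centroid
        dists = np.linalg.norm(feats - kmeans.cluster_centers_[c], axis=1)
        sampled.append(int(np.argmin(dists)))
    return sampled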

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_POINTS_LIST ./scripts/data_efficient/by_points.sh

Model Zoo

We also provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to downloaded pre-trained model path:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data | mIoU (val) | Pre-trained Model Used (for initialization) | Logs | Curves | Model
1% scenes | 29.3 | download | link | link | link
5% scenes | 45.4 | download | link | link | link
10% scenes | 59.5 | download | link | link | link
20% scenes | 64.1 | download | link | link | link
100% scenes | 73.8 | download | link | link | link
20 points | 53.8 | download | link | link | link
50 points | 62.9 | download | link | link | link
100 points | 66.9 | download | link | link | link
200 points | 69.0 | download | link | link | link
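
To inspect a downloaded checkpoint before passing it to PRETRAIN, a small sketch (assuming the file is a standard torch.save dictionary; the exact key names may differ between models):

# peek inside a downloaded checkpoint (sketch)
import torch

ckpt = torch.load("/path/to/PATH_CHECKPOINT", map_location="cpu")
if isinstance(ckpt, dict):
    print("top-level keys:", list(ckpt.keys()))
else:
    print("checkpoint object type:", type(ckpt))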

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_POINTS_LIST ./scripts/data_efficient/by_points.sh

For the ScanNet Benchmark, run the following code (train on train+val and evaluate on val):

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_benchmark.sh

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

To submit to the ScanNet Benchmark with our pre-trained model, run the following command (the submission files are written to output/benchmark_instance):

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet_benchmark.sh

Training Data | mAP@0.5 (val) | Pre-trained Model Used (for initialization) | Logs | Curves | Model
1% scenes | 12.3 | download | link | link | link
5% scenes | 33.9 | download | link | link | link
10% scenes | 45.3 | download | link | link | link
20% scenes | 49.8 | download | link | link | link
100% scenes | 59.4 | download | link | link | link
20 points | 27.2 | download | link | link | link
50 points | 35.7 | download | link | link | link
100 points | 43.6 | download | link | link | link
200 points | 50.4 | download | link | link | link
train + val | 76.5 (64.8 on test) | download | link | link | link
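
Since the benchmark submission files are written to output/benchmark_instance (see the command above), a quick listing confirms that the test script produced them. This is a sketch; it assumes nothing about the submission layout beyond its location.

# list generated benchmark submission files (sketch)
import glob, os

sub_dir = "output/benchmark_instance"
files = sorted(glob.glob(os.path.join(sub_dir, "**", "*"), recursive=True))
print(len(files), "files under", sub_dir)
for f in files[:10]:
    print(" ", f)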

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. Additionally, we provide two backbones, namely PointNet++ and SparseConv. To fine-tune on the downstream task, run the following command:

cd downstream/votenet/
# train sparseconv backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh
# train pointnet++ backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_pointnet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Bounding Box Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_BBOX=PATH_BBOX_LIST ./scripts/data_efficient/by_bboxes.sh

To submit to the ScanNet Data-Efficient Benchmark, set "test.write_to_bencmark=True" in "downstream/votenet/scripts/test_scannet.sh" or "downstream/votenet/scripts/test_scannet_pointnet.sh".

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running the following code:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data | mAP@0.5 (val) | mAP@0.25 (val) | Pre-trained Model Used (for initialization) | Logs | Curves | Model
10% scenes | 9.9 | 24.7 | download | link | link | link
20% scenes | 21.4 | 41.4 | download | link | link | link
40% scenes | 29.5 | 52.0 | download | link | link | link
80% scenes | 36.3 | 56.3 | download | link | link | link
100% scenes | 39.3 | 59.1 | download | link | link | link
100% scenes (PointNet++) | 39.2 | 62.5 | download | link | link | link
1 bbox | 10.9 | 24.5 | download | link | link | link
2 bboxes | 18.5 | 36.5 | download | link | link | link
4 bboxes | 26.1 | 45.9 | download | link | link | link
7 bboxes | 30.4 | 52.5 | download | link | link | link

Stanford 3D (S3DIS) Fine-tuning

Data Pre-Processing

We provide code for pre-processing the data for the Stanford3D (S3DIS) downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation.

# Edit path variables, STANFORD_3D_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing/stanford
python stanford.py
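
Before running the script, it may help to confirm the dataset layout. The sketch below assumes the standard Stanford3dDataset_v1.2_Aligned_Version layout with Area_1 through Area_6 folders; the path is a placeholder for wherever you unzipped the data.

# check the unzipped S3DIS layout (sketch)
import os

stanford_root = "/path/to/Stanford3dDataset_v1.2_Aligned_Version"
areas = sorted(d for d in os.listdir(stanford_root) if d.startswith("Area_"))
print(areas)   # expect ['Area_1', ..., 'Area_6']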

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data | mIoU (val) | Pre-trained Model Used (for initialization) | Logs | Curves | Model
100% scenes | 72.2 | download | link | link | link

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data | mAP@0.5 (val) | Pre-trained Model Used (for initialization) | Logs | Curves | Model
100% scenes | 63.4 | download | link | link | link

SUN-RGBD Fine-tuning

Data Pre-Processing

For SUN-RGBD detection data generation, please refer to VoteNet SUN-RGBD Data. To soft-link the generated SUN-RGBD detection data (located at SUN_RGBD_DATA_PATH) to the expected location, run:

cd downstream/det/datasets/sunrgbd
# soft link 
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_train sunrgbd_pc_bbox_votes_50k_v1_train
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_val sunrgbd_pc_bbox_votes_50k_v1_val

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. To fine-tune on the downstream task, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_sunrgbd.sh

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained model by setting PATH_CHECKPOINT to the downloaded model path:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_sunrgbd.sh

Training Data | mAP@0.5 (val) | mAP@0.25 (val) | Pre-trained Model (initialization) | Logs | Curves | Model
100% scenes | 36.4 | 58.9 | download | link | link | link

Citing our paper

@inproceedings{hou2021exploring,
  title={Exploring data-efficient 3d scene understanding with contrastive scene contexts},
  author={Hou, Ji and Graham, Benjamin and Nie{\ss}ner, Matthias and Xie, Saining},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={15587--15597},
  year={2021}
}

License

Contrastive Scene Contexts is released under the MIT License. See the LICENSE file for more details.

contrastivescenecontexts's People

Contributors

s9xie, sekunde


contrastivescenecontexts's Issues

Available Dockerfile

Hi,

First of all, thank you very much for sharing this project. I am glad to learn all the details of your paper from this repo.

To speed up reproduction, I would like to have a Docker image so that this work can be reproduced quickly without the burden of installing all the packages.

I have made some progress:

FROM nvidia/cuda:10.2-devel-ubuntu18.04 AS build

RUN apt-get update && apt-get install -y --no-install-recommends \
        lsof wget ca-certificates \
        g++-7 && \
    rm -rf /var/lib/apt/lists/*

RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py38_4.9.2-Linux-x86_64.sh -O ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda clean -tipsy && \
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
    echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc

ENV PATH=/opt/conda/bin:$PATH \
    LANG=C.UTF-8 \
    CXX=g++-7

RUN conda install -c conda-forge conda-pack && \
    conda create -n mink -c pytorch-lts -c conda-forge -c anaconda \
        python=3.8 \
        openblas-devel \
        pytorch torchvision cudatoolkit=10.2 && \
    conda install -c bioconda google-sparsehash && \
    conda clean -ya

RUN apt-get update && apt-get install -y --no-install-recommends \
        git && \
    rm -rf /var/lib/apt/lists/*

ENV TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX"
RUN /opt/conda/envs/mink/bin/pip install --no-deps --no-cache -U git+https://github.com/NVIDIA/MinkowskiEngine -v \
                --install-option="--blas_include_dirs=/opt/conda/envs/mink/include" \
                --install-option="--blas=openblas" \
                --install-option="--force_cuda" \
                --install-option="--cuda_home=/usr/local/cuda"

RUN /opt/conda/envs/mink/bin/pip install hydra-core==1.0.0 \
                                         tensorboardX==2.0 \
                                         scipy==1.5.4 \
                                         scikit-learn==0.23.1 \
                                         plyfile==0.4 \
                                         pandas==1.0.5 \
                                         trimesh==3.7.5 \
                                         imageio==2.8.0 \
                                         hydra-colorlog==1.0.0 \
                                         hydra-submitit-launcher==1.1.0 \
                                         matplotlib==3.2.2 \
                                         opencv-python==4.5.1.48 
WORKDIR /mink
RUN conda-pack -n mink -o /tmp/mink.tar && \
    tar xf /tmp/mink.tar && rm /tmp/mink.tar

RUN /mink/bin/conda-unpack

FROM nvidia/cuda:10.2-devel-ubuntu18.04

ENV CONDA_PREFIX=/mink
ENV PATH=$CONDA_PREFIX/bin:$PATH \
    LANG=C.UTF-8

COPY --from=build $CONDA_PREFIX $CONDA_PREFIX

SHELL ["/bin/bash", "-c"]
RUN source /$CONDA_PREFIX/bin/activate

I am able to import MinkowskiEngine and PyTorch, but I could not find a way to install PointGroup and PointNet++ inside the Docker image. It would be very nice if you could release a Docker image to reproduce your project.

Alternatively, if someone would like to develop this Docker image together, feel free to contact me and we could build the image together, because I really want to learn from this state-of-the-art work.

Best regards,
zshyang

multiprocessing and spawn

Hi,

Thank you for open-sourcing your work! It is really neat!

However, I have trouble launching your jobs.

I have to set the start method to "spawn" (torch.multiprocessing.set_start_method('spawn')) in order to run launch.sh. Otherwise I get this error:

RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1591914855613/work/aten/src/THC/THCGeneral.cpp:47

or

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

However, if I do this, I get an error in multiprocess_utils about the pickle function:

_pickle.PicklingError: Can't pickle <function single_proc_run at 0x7fcd513d9170>: attribute lookup single_proc_run on __main__ failed

I checked, and this suggests I should use 'fork' instead of 'spawn'.

I am using pytorch 1.5.1 (py3.7_cuda10.2.89_cudnn7.6.5_0), and hydra version:

hydra-colorlog            1.0.0                    pypi_0    pypi
hydra-core                1.0.0                    pypi_0    pypi
hydra-submitit-launcher   1.1.0                    pypi_0    pypi

I wonder if you have any idea on how to correctly launch your job?

Thank you!

cuda memory

Hi,
thanks for this great work,
Thanks for this great work.
I have a question, since I want to run the training code on my desktop with a single 3090 GPU. I saw that you use 8 GPUs and set the batch size to 32. If I want to run the pre-training code on my GPU, what batch size should I set? I set it to 2 but still ran out of CUDA memory.

Thanks,
zihui

Installation problem

During installation, there is no directory downstream/semseg/lib/bfs/ops, though there is one in the insseg folder. Can you please help me with this?

Semseg on S3DIS with single GPU

I tried to run './scripts/train_stanford3d.sh' and modified the config to a single GPU, without changing any other code. I do not know why I get the following error:

Traceback (most recent call last):
File "ddp_main.py", line 232, in cli_main
main(config)
File "ddp_main.py", line 187, in main
train(model, train_data_loader, val_data_loader, config)
File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/train.py", line 107, in train
coords, input, target = data_iter.next()
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in next
data = self._next_data()
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
return self._process_data(data)
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
data.reraise()
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/dataset.py", line 277, in getitem
coords, feats, labels, center = self.load_ply(index)
File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/dataset.py", line 71, in wrapper
results = func(self, *args, **kwargs)
File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/datasets/stanford.py", line 158, in load_ply
plydata = PlyData.read(filepath)
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/plyfile.py", line 287, in read
data = PlyData._parse_header(stream)
File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/plyfile.py", line 229, in _parse_header
line = stream.readline().decode('ascii').strip()
AttributeError: 'PosixPath' object has no attribute 'readline'

Train split for limited scene reconstructions

Do you use the first 1%, 5%, 10%, 20% of the data when you fine-tune the network on ScanNet semantic segmentation for the limited reconstruction case, or do you randomly shuffle the data first? If so, do you avoid splitting sub-scenes generated from the same scene between train and validation?

Preprocess S3DIS data

Hi, I downloaded the S3DIS data and unzipped the data, but it seems like the file structure of the dataset does not match your preprocessing code. The structure is as follows:

├── ReadMe.txt
├── Stanford3dDataset_v1.2_Aligned_Version.mat
├── Stanford3dDataset_v1.2_Aligned_Version.zip
├── Stanford3dDataset_v1.2.mat
└── Stanford3dDataset_v1.2.zip

Can you share your file structure and give me some suggestions?
Thank you very much!

About pre-trained model for S3DIS

Hi, thanks again for your excellent work; I have a question about the pre-trained model for S3DIS. Since S3DIS adopts a different voxel size than ScanNet, does it share the same pre-trained weights used for ScanNet fine-tuning? If not, it would be really helpful if you could share the pre-training details for S3DIS.

Runtime error in the data pre-processing (open3d)

Hi, I am trying to run the data pre-processing on the scanNetv2.
However, when the script (preprocess.sh) runs the line pcd.points = o3d.utility.Vector3dVector(xyz) in compute_full_overlapping.py, there is a runtime error without any message. I found that the size of the numpy array in the pcd file generated by point_cloud_extractor.py is (276497, 7).

How should I fix the shape to (n, 3)?

Thank you.

About the downstream tasks in ShapeNet?

Hi
Thanks for sharing your nice work.
Have you done the downstream tasks on ShapeNet, like 3D classification and part segmentation? And can you share your evaluation tools for ShapeNet?
Best

Package import problem

When opening the ddp_trainer.py file, there are red wavy lines under the imported packages.

Training data split

Thank you for the interesting work. Would it be possible for you to share the training data split filenames, or the script for generating them, for ScanNet and ShapeNet?

Thanks in advance,

I want to ask about running my own pointcloud in S3DIS Fine-tuning

Hello, excuse me. I am a novice at 3D semantic segmentation, and it's nice to see your work. I'd like to ask you a few questions.
① I would like to run my own point cloud through the S3DIS fine-tuning semantic segmentation task. Could I convert my point cloud into the same file structure and format as the Stanford3dDataset_v1.2_Aligned_Version dataset and then use it as input? In addition, is the network weight (PRETRAIN=PATH_CHECKPOINT) required in ./scripts/train_stanford3d.sh?
② When I ran the code following the README, I encountered problems in ./scripts/train_stanford3d.sh; it reported the following error:
[error screenshot]
I wonder if you know how to solve it; looking forward to your reply!

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

While running DATAPATH=/media/ys321/ys/dataset/scannet/test LOG_DIR=./output PRETRAIN=/media/ys321/ys/dataset/scannet/Res16UNet34C.pth ./scripts/test_scannet.sh

The following error occurred
/home/ys321/anaconda3/envs/sparseconv043/lib/python3.7/site-packages/hydra/core/utils.py:143: UserWarning: register_resolver() is deprecated.
See omry/omegaconf#426 for migration instructions.

OmegaConf.register_resolver(name, f)
[2023-02-26 20:14:39,998][root][INFO] - ===> Configurations
Traceback (most recent call last):
File "ddp_main.py", line 31, in main
single_proc_run(config)
File "ddp_main.py", line 19, in single_proc_run
trainer = SegmentationTrainer(config)
File "/home/ys321/ContrastiveSceneContexts/downstream/semseg/lib/ddp_trainer.py", line 43, in init
logging.info(config.pretty())
omegaconf.errors.ConfigAttributeError: Key 'pretty' is not in struct
full_key: pretty
object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Issue on preparing scannet downstream data

Sorry, but I cannot find the path 'downstream/semseg/lib/datasets/preprocessing/scannet' in this repo. I found scannet.py to preprocess the data in the PointContrast repo, but I get the error "'PosixPath' object has no attribute 'write'". I do not know what is wrong.

Docker Image and domain specific information

Hi guys,

Thank you for making the codebase available and explaining all the minute details both in the paper and in the GitHub repo. Your work inspires me to pre-train with my own data (around 700,000 scans of the assemblies of different automobile structures) and fine-tune it for downstream tasks (semantic segmentation and object detection). This work seems like the best match for the kind of information I want to capture from a cluttered LiDAR-scanned scene.

I need some guidance on whether it's worth pre-training with my own dataset instead of using the ScanNet pre-trained weights, as I sense a clear domain mismatch between the indoor scans and the type of scans I am describing. Any input in this regard will be highly appreciated.

I know it requires some additional effort on your side, but could you please make the Docker image or Dockerfile available for setting up the project environment, to reduce the cycle time of the project I'm working on?

I really appreciate your valuable time and wish you a happy Christmas in advance.

@likethesky @Celebio @colesbury @pdollar @minqi

Pre-trained models

Hi, thank you very much for your great codes and detailed explanation! I have some questions about pre-trained models.
I want to use the pre-trained model obtained from unsupervised learning and fine-tune it on my own dataset for semantic segmentation. You provide 'Initialization' and 'Pre-trained Model' for all your experiments. From my understanding, your limited-annotation training starts from a pre-trained network, which means the 'Initialization' should be the pre-trained model. If so, what is the 'Pre-trained Model' column? Which model should I use for my fine-tuning?
I'm looking forward to your reply.

downstream task semseg

Hello,

I have a question on semseg downstream task on the Stanford dataset.

Thanks for providing all the log files and pretrained models.
Although the dir and norm losses seem to be used for the semseg downstream task on the Stanford dataset, as shown in your log file, there is no code that produces the dir or norm losses in downstream/semseg/lib/ddp_trainer.py, lines 270-272.
To reproduce your work on the Stanford dataset, should we modify dataset.py and ddp_trainer.py to include those loss terms?
(As I checked the ScanNet semseg log file, I found that the dir and norm loss terms are not used there, unlike in the Stanford semseg task.)

Thanks in advance.

loss curve?

Hi, I just wonder whether you could share the training loss curve of the project, as I can now run the code. Thank you very much.

Intuition for Contrastive Scene Contexts

Hi! Thank you for your amazing work.

I wanted to check whether I understood why the Contrastive Scene Contexts approach works much better than PointContrast, especially with more sampled points. Would it be correct to think that using lots of points for PointContrast does not bring significant benefits because many of those points are "easy negatives"? Then, separating the scene into partitions and weighting each partition equally more explicitly forces the model to also learn to differentiate against negatives of different types (close/far, relative angles), some of which are harder negatives.
Is this understanding reasonable? Thank you for your time.

UnavailableInvalidChannel: The channel is not accessible or is invalid.

Hi thanks for this repo.
I was following the installation; however, when I run conda install -c bioconda google-sparsehash I get an error.
Here's the log:
Collecting package metadata (current_repodata.json): failed

UnavailableInvalidChannel: The channel is not accessible or is invalid.
channel name: bioconda
channel url: https://conda.anaconda.org/bioconda
error code: 404

You will need to adjust your conda configuration to proceed.
Use conda config --show channels to view your configuration's current state,
and use conda config --show-sources to view config file locations.

question about NMS in instance segmentation

Hi, thanks for your code release. I have a question about the NMS code here. It seems that it only removes some proposals predicted for classes [10, 12, 16], unlike a regular NMS where the proposal score is used for ranking and the IoU overlap between proposals is computed to remove redundant proposals?

S3DIS Semantic Segmentation Training from Scratch

In both the PointContrast and ContrastiveSceneContexts papers, semantic segmentation results on S3DIS are stated as 68.2 mIoU. But in MinkowskiNet's GitHub repository (https://github.com/chrischoy/SpatioTemporalSegmentation), they achieve 66.3 mIoU using Mink16UNet34. You are also using Res16UNet34C with a 5cm voxel size. When I train the model using my own repository with Res16UNet34C, I also get around 66.4 mIoU. Is there anything I am missing? Can you explain how you get +2 mIoU compared to the original Minkowski model? Is it about data augmentation, the optimizer, etc.?

Questions about metric and visulization

First, thank you for your wonderful work! I have two questions about the implementation.

  1. I notice that you compute the segmentation IoU based on the voxel-wise predictions (the input is voxelized). Are all the metrics reported in the paper computed in this way?
  2. Also, for ScanNet, I can find the function that maps the predictions back to points, but for Stanford3D there seems to be no such function. So how do you visualize the segmentation results on Stanford3D? Maybe based on voxels?

Thanks again for your great work; looking forward to your reply.

About instance loss for PointGroup

Hi, thanks for your research; it's really inspired me a lot. I have a question about your reproduced version of PointGroup, since I am trying to add instance segmentation support to my codebase.

The original PointGroup adds an instance loss after a preparation epoch, but the reproduced version in CSC only contains a seg loss and two offset losses, and I am curious about the reason. Is it because the instance loss does not improve the performance?

Visual color selection

Thanks for your excellent work!
I have observed that the visualization colors in your work are very attractive. Could you please describe your visualization method and color selection?
Thank you!!
