Sparse Convolutions for Semantic 3D Instance Segmentation

Abstract

3D object detection and segmentation are crucial for many domains and applications. However, transferring 2D image techniques to 3D data remains challenging because of the massive amount of data contained in 3D voxel grids. We present an architecture that combines the object detection and segmentation principle used by Mask R-CNN for 2D images with the computational efficiency of Sparse Submanifold Convolutions on sparse 3D voxel grids. The network consists of a Region Proposal Network that predicts bounding boxes, and a Class Network and a Mask Network that both rely on the region proposals. We show how parts of the feature extractor, the Class Network, and the Mask Network can be made sparse. A sparse feature extractor reduces the required computation while keeping similar detection performance. A sparse Mask Network makes it possible to process masks of different shapes batch-wise without resizing and without losing spatial correspondence information. Furthermore, we propose a way to find the best anchor density by using anchor-wise anisotropic anchor densities that depend on each anchor’s shape. Our model shows that a Mask R-CNN based 3D model can achieve both state-of-the-art object detection and instance segmentation performance.
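To make the storage argument concrete, here is a minimal sketch of a sparse voxel grid in plain Python. It is not the repository's code and does not use SparseConvNet; the function names, the toy points, and the voxel size are all hypothetical. It only illustrates why keeping occupied voxels alone is much cheaper than a dense grid over the same extent.

```python
from collections import defaultdict


def voxelize_sparse(points, voxel_size):
    """Map 3D points to integer voxel coordinates, keeping only occupied voxels.

    Returns a dict: (ix, iy, iz) -> number of points that fell into that voxel.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    occupied = defaultdict(int)
    for x, y, z in points:
        # Floor division assigns each point to the voxel that contains it.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        occupied[key] += 1
    return dict(occupied)


def dense_voxel_count(occupied):
    """Number of cells a dense grid spanning all occupied voxels would need."""
    if not occupied:
        return 0
    count = 1
    for axis in range(3):
        coords = [key[axis] for key in occupied]
        count *= max(coords) - min(coords) + 1
    return count


# Hypothetical toy input: three points near the origin and one far away.
points = [(0.01, 0.02, 0.0), (0.03, 0.01, 0.0), (0.26, 0.02, 0.0), (1.02, 1.02, 1.02)]
occupied = voxelize_sparse(points, voxel_size=0.05)
print(len(occupied), "occupied voxels vs", dense_voxel_count(occupied), "dense cells")
```

A sparse convolution library operates on such lists of occupied coordinates with attached features instead of on the full dense volume.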

Results

The method has been evaluated on the ScanNet benchmark.

Getting Started

Prerequisites

This setup has been tested on Ubuntu 18.04 with CUDA 10.1. It also requires Anaconda to be installed.

Installation

  1. Download this git repository
git clone git@github.com:LeonhardFeiner/sparse_rcnn.git
  2. Create an Anaconda environment using the environment file of this repo
cd sparse_rcnn/
conda env create -f environment.yml
conda activate py38_pt14_scn
  3. Download the SparseConvNet repository
cd ..
git clone git@github.com:facebookresearch/SparseConvNet.git
  4. Install SparseConvNet
cd SparseConvNet/
bash develop.sh
  5. Install MeshLab for data preparation
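After the installation steps above, a quick sanity check can save a failed training run. The following is a minimal sketch, not part of the repository. It assumes the environment provides the `torch` package and that SparseConvNet installs under the module name `sparseconvnet`. The helper name `check_environment` is hypothetical. The MeshLab executable name depends on the installed version, so it is not checked by default and can be passed in through `tools`.

```python
import importlib.util
import shutil


def check_environment(modules=("torch", "sparseconvnet"), tools=("conda",)):
    """Return (missing_modules, missing_tools) without importing the modules."""
    # find_spec only looks a top-level module up; it does not execute it.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when an executable is not on PATH.
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


missing_modules, missing_tools = check_environment()
print("missing Python modules:", missing_modules or "none")
print("missing command-line tools:", missing_tools or "none")
```

Run it inside the activated conda environment; both lists should be empty before continuing with the dataset preparation.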

Dataset

  1. Download the ScanNet dataset

Required Files:

  • _vh_clean.aggregation.json
  • _vh_clean_2.ply
  • _vh_clean_2.0.010000.segs.json

Optional high-resolution data (not recommended):

  • _vh_clean.ply
  • _vh_clean.segs.json
  2. Adapt the paths in scannet_config/pathes.py
  3. Correct the error in "scene0217_00" by running
python preparation/0_raw_data_error_correction.py
  4. Create the label mapper by running
python preparation/1_sparse_label_map.py
  5. Create the instance association tensors by running
python preparation/2_sparse_instance_association.py
  6. Calculate vertex normals using MeshLab by running
python preparation/3_sparse_normals.py
  7. Create the dataset by running
python preparation/4_sparse_create_data.py
  8. Precalculate augmented data (not required and not recommended) by running
python preparation/5_precalculate_dataset.py
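The preparation scripts above have to run in order, so a small driver can be convenient. This is a minimal sketch and not part of the repository. The script names and their order come from the steps above, while `build_commands`, `run_preparation`, and the `dry_run` flag are hypothetical helpers. It assumes the scripts are launched from the repository root, as in the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Script names and their order are taken from the preparation steps.
PREPARATION_SCRIPTS = [
    "0_raw_data_error_correction.py",
    "1_sparse_label_map.py",
    "2_sparse_instance_association.py",
    "3_sparse_normals.py",
    "4_sparse_create_data.py",
]
# Precalculating augmented data is neither required nor recommended.
OPTIONAL_SCRIPTS = ["5_precalculate_dataset.py"]


def build_commands(include_optional=False):
    """Return one command per script, with paths relative to the repository root."""
    scripts = PREPARATION_SCRIPTS + (OPTIONAL_SCRIPTS if include_optional else [])
    return [[sys.executable, str(Path("preparation") / name)] for name in scripts]


def run_preparation(repo_root=".", include_optional=False, dry_run=True):
    """Print each step and, unless dry_run is set, execute it inside repo_root."""
    for command in build_commands(include_optional):
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True, cwd=repo_root)


# Dry run: only prints the commands that would be executed.
run_preparation(dry_run=True)
```

Calling `run_preparation(dry_run=False)` from the repository root would execute the steps one after another and stop at the first failure.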

Start Training

The network parameters can be adapted in:

scannet_config/run.py

Training can be started with:

python main.py
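On a machine with several GPUs it can help to pin the training process to specific devices. The sketch below is not part of the repository and makes no claim about how `main.py` chooses its device. It only relies on the standard `CUDA_VISIBLE_DEVICES` environment variable, which limits the GPUs that CUDA exposes to a process. The helper names and the GPU indices are hypothetical.

```python
import os
import subprocess
import sys


def training_invocation(gpu_ids=(0,), base_env=None):
    """Return (command, env) for launching main.py on the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_VISIBLE_DEVICES restricts which GPUs CUDA exposes to the process.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return [sys.executable, "main.py"], env


def start_training(gpu_ids=(0,), repo_root="."):
    """Launch the training from the repository root and wait for it to finish."""
    command, env = training_invocation(gpu_ids)
    subprocess.run(command, env=env, cwd=repo_root, check=True)


# Only show what would be launched; call start_training() to actually run it.
command, env = training_invocation(gpu_ids=(0,))
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(command))
```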

Acknowledgments

I'd like to thank Prof. Dr. Matthias Nießner for sharing his ideas and supervising my work.
