S2Net: Accurate Panorama Depth Estimation on Spherical Surface

This repository contains:

  • A PyTorch implementation of the paper "S2Net: Accurate Panorama Depth Estimation on Spherical Surface" (available on arXiv).
  • A Dockerfile and scripts for evaluation on the Matterport3D dataset.

This repository is licensed under the MIT license.

We use some code from the Pano3D project, which is also licensed under the MIT license.

Bibtex

If you find this code useful in your research, please cite:

@article{li2023mathcal,
  title={S2Net: Accurate Panorama Depth Estimation on Spherical Surface},
  author={Li, Meng and Wang, Senbo and Yuan, Weihao and Shen, Weichao and Sheng, Zhe and Dong, Zilong},
  journal={IEEE Robotics and Automation Letters},
  volume={8},
  number={2},
  pages={1053--1060},
  year={2023},
  publisher={IEEE}
}

Requirements

We recommend using the Dockerfile provided in the docker folder, which uses CUDA 10.2 and PyTorch 1.8.2 and was tested on a Tesla V100 32G. If you need a newer CUDA version or use a newer NVIDIA GPU such as the RTX 3090, we recommend the Dockerfile in the docker/cuda11 folder, which uses CUDA 11.1. An example Docker launch script is provided in run_docker.sh.

If you prefer to configure the environment step by step, we recommend the following steps on Debian-like distributions:

  • Install conda and set up a conda environment. The code is currently tested on Python 3.7:
conda create --name s2net python=3.7 
conda activate s2net
  • Install PyTorch. We use PyTorch 1.8.2 with CUDA 10.2 as an example:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
  • Install some extra dependencies using apt-get; most of them are probably already installed:
sudo apt-get install build-essential curl git \
vim wget zip \
bzip2 cmake \
libopenexr-dev \
libeigen3-dev \
apt-utils \
zlib1g-dev \
libncurses5-dev \
libgdbm-dev \
libnss3-dev \
libssl-dev \
libreadline-dev \
libffi-dev \
graphviz
  • Install NVIDIA apex; we recommend following the official installation steps.
  • Install the extra Python dependencies, including MappedConvolutions, which is built from source:
pip3 install scipy Pillow tensorboardX tqdm matplotlib pyquaternion pyrender opencv-python timm einops visdom plyfile openexr pytest
pip3 install attrdict tensorboard open3d numba pyyaml termcolor scikit-image healpy yacs h5py
git clone https://github.com/meder411/MappedConvolutions.git 
cd MappedConvolutions 
cd package 
python setup.py install
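
After the steps above, a quick check of which packages Python can actually locate may save a failed training launch. The following is a minimal sketch, not part of the repository: the helper `find_missing_modules` and the list `ASSUMED_IMPORT_NAMES` are hypothetical, and the import names are assumptions derived from the pip packages listed above (for example, `opencv-python` is imported as `cv2` and `Pillow` as `PIL`). It only tests that a module can be found, not that it works with your CUDA setup.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed import names for a subset of the packages installed above.
ASSUMED_IMPORT_NAMES = [
    "torch", "scipy", "PIL", "cv2", "timm",
    "einops", "healpy", "yacs", "h5py",
]

if __name__ == "__main__":
    missing = find_missing_modules(ASSUMED_IMPORT_NAMES)
    print("missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated s2net conda environment; any name it reports is a candidate for reinstallation.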

FAQ

  • Key update failure for nvidia-docker (errors such as "NO_PUBKEY A4B469963BF863CC"): please refer to this blog.

Usage

Before training or testing, first set OMP_NUM_THREADS; we recommend 8:

export OMP_NUM_THREADS=8

Training

To train on the Matterport3D dataset using multiple Tesla V100 32G cards (2 cards in this example) with PyTorch 1.8.2:

python3 -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank 0 run_sp_monodepth_train.py \
-d /path/to/Matterport3D \
-i /path/to/Matterport3D/matterport3d_train_file_list.txt \
-v /path/to/Matterport3D/eval_file_list.txt \
-o /path/to/output_folder \
-c /path/to/config/train_cfg_2cards.yaml \
-p /path/to/swin_backbone/swin_base_patch4_window7_224_22k.pth

We have tested the code on up to 4 Tesla V100 32G cards. An example config file is provided in the config folder. Note that the learning rate and batch size in the config file need to be adjusted for your specific hardware environment.
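
One common starting point when changing the number of cards is the linear scaling heuristic: scale the learning rate in proportion to the global batch size. The README does not say which rule the authors used, so the sketch below is only an assumption, and the function names and all numeric values in it are hypothetical rather than taken from the provided config files.

```python
def global_batch_size(per_card_batch, num_cards):
    """Total number of samples processed per optimizer step across all cards."""
    return per_card_batch * num_cards


def scale_learning_rate(base_lr, base_global_batch, new_global_batch):
    """Linear scaling heuristic: learning rate grows with the global batch size."""
    if base_global_batch <= 0 or new_global_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_global_batch / base_global_batch


if __name__ == "__main__":
    # Hypothetical reference setup: 2 cards, 4 samples per card.
    base = global_batch_size(per_card_batch=4, num_cards=2)
    # Hypothetical target setup: 4 cards, same per-card batch.
    target = global_batch_size(per_card_batch=4, num_cards=4)
    # Doubling the global batch doubles the suggested learning rate.
    print(scale_learning_rate(1e-4, base, target))
```

Treat the result as an initial guess to validate on your hardware, not as a tuned value.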

The Swin backbone can be pre-downloaded here.

Testing

For testing, use:

CUDA_VISIBLE_DEVICES=0 python3 run_sp_monodepth_infer_eval.py --eval_depth_map \
-d /path/to/Matterport3D \
-i /path/to/Matterport3D/matterport3d_test_file_list.txt \
-o /path/to/where_your_want_to_place_test_results/ \
-c /path/to/config/test.yaml \
-m /path/to/models/trained_checkpoint.pth

Note that testing can only be run on a single card (we have tested the code on a single Tesla V100 32G and a single Tesla P100 16G), and batch_size must be set to 1.

FAQ

  • PyTorch versions newer than 1.8.2: we use torch.distributed.launch for multi-card training, but newer versions of PyTorch replace it with torch.distributed.run (torchrun). We found that training may fail to start with torch.distributed.launch on these newer versions, even though the launcher is still kept for backward compatibility. If you must use a newer version of PyTorch, we suggest launching with torchrun or torch.distributed.run instead. You will then need to modify the code that handles local_rank in run_sp_monodepth_train.py, and unexpected behaviour may occur because we have not fully tested newer versions of PyTorch. See the PyTorch docs for more information.
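
The local_rank change mentioned above comes from a difference between the two launchers: torch.distributed.launch passes a `--local_rank` command-line argument to each process (newer releases spell it `--local-rank`), while torchrun sets the `LOCAL_RANK` environment variable instead. The sketch below shows one way to accept both conventions. The helper `resolve_local_rank` is hypothetical and is not taken from run_sp_monodepth_train.py, whose actual argument handling may differ.

```python
import argparse
import os


def resolve_local_rank(argv=None, environ=None):
    """Return the local rank from the launcher argument or the environment."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    # torch.distributed.launch style: --local_rank=<n> (or --local-rank=<n>).
    parser.add_argument("--local_rank", "--local-rank",
                        dest="local_rank", type=int, default=None)
    # Ignore every other argument so the script's own parser can handle them.
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    # torchrun style: LOCAL_RANK environment variable; default to 0 if unset.
    return int(environ.get("LOCAL_RANK", 0))


if __name__ == "__main__":
    print("local rank:", resolve_local_rank())
```

The command-line value takes precedence here so that behaviour under torch.distributed.launch stays unchanged; falling back to 0 keeps single-process runs working.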

Results

Results on Matterport3D and Stanford2D3D.

[Images: M3D results, S2D3D results]

Results on Pano3D. We evaluate at $1024 \times 512$ resolution and compare against the baseline methods ${\mathrm{UNet}}^{vnl}$ and ${\mathrm{ResNet}}^{comp}_{skip}$ provided by Pano3D.

[Image: Pano3D results]
