Giter Club home page Giter Club logo

magnet's Introduction

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Official implementation of the paper

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

CVPR 2022 [oral]

Gwangbin Bae, Ignas Budvytis, and Roberto Cipolla

[arXiv] [openaccess] [oral presentation]

We present MaGNet (Monocular and Geometric Network), a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects.

Datasets

We evaluated MaGNet on ScanNet, 7-Scenes and KITTI

ScanNet

  • In order to download ScanNet, you should submit an agreement to the Terms of Use. Please follow the instructions in this link.
  • The folder should be organized as

/path/to/ScanNet
/path/to/ScanNet/scans
/path/to/ScanNet/scans/scene0000_00 ...
/path/to/ScanNet/scans_test
/path/to/ScanNet/scans_test/scene0707_00 ...

7-Scenes

  • Download all seven scenes (Chess, Fire, Heads, Office, Pumpkin, RedKitchen, Stairs) from this link.
  • The folder should be organized as:

/path/to/SevenScenes
/path/to/SevenScenes/chess ...

KITTI

  • Download raw data from this link.
  • Download depth maps from this link
  • The folder should be organized as:

/path/to/KITTI
/path/to/KITTI/rawdata
/path/to/KITTI/rawdata/2011_09_26 ...
/path/to/KITTI/train
/path/to/KITTI/train/2011_09_26_drive_0001_sync ...
/path/to/KITTI/val
/path/to/KITTI/val/2011_09_26_drive_0002_sync ...

Download model weights

Download model weights by

python ckpts/download.py

If some files are not downloaded properly, download them manually from this link and place the files under ./ckpts.

Install dependencies

We recommend using a virtual environment.

python3.6 -m venv --system-site-packages ./venv
source ./venv/bin/activate

Install the necessary dependencies by

python3.6 -m pip install -r requirements.txt

Test scripts

If you wish to evaluate the accuracy of our D-Net (single-view), run

python test_DNet.py ./test_scripts/dnet/scannet.txt
python test_DNet.py ./test_scripts/dnet/7scenes.txt
python test_DNet.py ./test_scripts/dnet/kitti_eigen.txt
python test_DNet.py ./test_scripts/dnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.1186 0.2070 0.0493 0.2708 0.1461 0.1086 0.0515 10.0098 0.8546 0.9703 0.9928 2.2352
7-Scenes 0.1339 0.2209 0.0549 0.2932 0.1677 0.1165 0.0566 12.8807 0.8308 0.9716 0.9948 2.7941
KITTI (eigen) 0.0605 1.1331 0.2086 2.4215 0.0921 0.0075 0.0261 8.4312 0.9602 0.9946 0.9989 2.6443
KITTI (official) 0.0629 1.1682 0.2541 2.4708 0.1021 0.0080 0.0270 9.5752 0.9581 0.9905 0.9971 1.7810

In order to evaluate the accuracy of the full pipeline (multi-view), run

python test_MaGNet.py ./test_scripts/magnet/scannet.txt
python test_MaGNet.py ./test_scripts/magnet/7scenes.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_eigen.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.0810 0.1466 0.0302 0.2098 0.1101 0.1055 0.0351 8.7686 0.9298 0.9835 0.9946 0.1454
7-Scenes 0.1257 0.2133 0.0552 0.2957 0.1639 0.1782 0.0527 13.6210 0.8552 0.9715 0.9935 1.5605
KITTI (eigen) 0.0535 0.9995 0.1623 2.1584 0.0826 0.0566 0.0235 7.4645 0.9714 0.9958 0.9990 1.8053
KITTI (official) 0.0503 0.9135 0.1667 1.9707 0.0848 0.2423 0.0219 7.9451 0.9769 0.9941 0.9979 1.4750

Training scripts

If you wish to train the models, run

python train_DNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt
python train_FNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt
python train_MaGNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt

Note that the dataset_path argument in the script .txt files should be modified

Citation

If you find our work useful in your research please consider citing our paper:

@InProceedings{Bae2022,
  title = {Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry}
  author = {Gwangbin Bae and Ignas Budvytis and Roberto Cipolla},
  booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}                         
}

magnet's People

Contributors

baegwangbin avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.