MVSTER

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo arXiv

This repository contains the official implementation of the paper: "MVSTER: Epipolar Transformer for Efficient Multi-View Stereo".

Introduction

MVSTER is a learning-based MVS method that achieves competitive reconstruction performance with significantly higher efficiency. MVSTER leverages the proposed epipolar Transformer to learn both 2D semantics and 3D spatial associations efficiently. Specifically, the epipolar Transformer utilizes a detachable monocular depth estimator to enhance 2D semantics and uses cross-attention to construct data-dependent 3D associations along the epipolar line. Additionally, MVSTER is built in a cascade structure, where entropy-regularized optimal transport is leveraged to propagate finer depth estimations in each stage.
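To make the cross-attention idea above concrete, here is a minimal pure-Python sketch for a single reference pixel. It is not the code in this repository: the function name `epipolar_attention_weights`, the `temperature` parameter, and the toy feature vectors are all assumptions made for illustration. The query stands for the reference-view feature and the keys stand for source-view features sampled along the epipolar line, one per depth hypothesis.

```python
import math
from typing import List, Sequence


def epipolar_attention_weights(query: Sequence[float],
                               keys: Sequence[Sequence[float]],
                               temperature: float = 1.0) -> List[float]:
    """Softmax attention of one reference feature over D epipolar samples.

    Hypothetical helper for illustration only, not the repository's API.
    """
    # Scaled dot-product score between the query and every key.
    scale = math.sqrt(len(query)) * temperature
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Numerically stable softmax over the D samples.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: the second epipolar sample matches the reference feature best.
query = [1.0, 0.0, 1.0, 0.0]
keys = [
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
weights = epipolar_attention_weights(query, keys)
print([round(w, 3) for w in weights])
```

The weights sum to one and peak at the sample whose feature agrees most with the reference pixel, which is the sense in which the association is data-dependent.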

Installation

MVSTER is tested on:

  • python 3.7
  • CUDA 11.1

Requirements

pip install -r requirements.txt

Training

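The DTU training data folder ($DTU_TRAINING) should have the following structure:

├── Cameras
├── Depths
├── Depths_raw
├── Rectified
└── Rectified_raw (Optional)

In scripts/train_dtu.sh, set DTU_TRAINING to the path of your $DTU_TRAINING folder.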
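Before launching training, it can help to check that the folder layout matches the listing above. The following is a small sketch, not part of the repository: the helper name `missing_dtu_training_dirs` is hypothetical, and it only checks that the listed sub-folders exist, not their contents.

```python
from pathlib import Path
from typing import List

# Sub-folders taken from the listing above; Rectified_raw is marked optional there.
REQUIRED_DIRS = ("Cameras", "Depths", "Depths_raw", "Rectified")
OPTIONAL_DIRS = ("Rectified_raw",)


def missing_dtu_training_dirs(root: str) -> List[str]:
    """Return the required sub-folders that are absent under `root`.

    Hypothetical helper for illustration; an empty list means the layout looks complete.
    """
    root_path = Path(root)
    return [name for name in REQUIRED_DIRS if not (root_path / name).is_dir()]
```

Calling `missing_dtu_training_dirs("/path/to/dtu_training")` with your own path returns an empty list when all four required folders are present.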

Train MVSTER (Multi-GPU training):

  • Train with middle size (512x640):
bash ./scripts/train_dtu.sh mid exp_name
  • Train with raw size (1200x1600):
bash ./scripts/train_dtu.sh raw exp_name

After training, you will get model checkpoints in ./checkpoints/dtu/exp_name.

Testing

  • Download the preprocessed DTU testing data (from Original MVSNet) and unzip it as the $DTU_TESTPATH folder, which should contain one cams folder, one images folder and one pair.txt file.
  • In scripts/test_dtu.sh, set DTU_TESTPATH as $DTU_TESTPATH.
  • The DTU_CKPT_FILE is automatically set as your pretrained checkpoint file. You can also download my pretrained model.
  • Test with middle size:
bash ./scripts/test_dtu.sh mid exp_name
  • Test with raw size:
bash ./scripts/test_dtu.sh raw exp_name
  • Test with provided pretrained model:
bash scripts/test_dtu.sh mid benchmark --loadckpt PATH_TO_CKPT_FILE

After testing, you will get reconstructed point clouds of DTU test set in ./outputs/dtu/exp_name.
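The pair.txt file mentioned in the testing steps lists, for each reference view, the source views to match against. The sketch below parses it under the assumption that the file follows the layout used by the original MVSNet data: a first line with the number of views, then for each view one line with the reference view id and one line of the form `N id score id score ...`. The function name `parse_pair_text` and the sample scores are hypothetical; check your own pair.txt before relying on it.

```python
from typing import List, Tuple


def parse_pair_text(text: str) -> List[Tuple[int, List[int]]]:
    """Parse pair.txt content into (reference view, source views) tuples.

    Assumes the MVSNet-style layout described above; not the repository's own loader.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    num_views = int(lines[0][0])
    pairs = []
    for i in range(num_views):
        ref_view = int(lines[1 + 2 * i][0])
        src_line = lines[2 + 2 * i]
        num_src = int(src_line[0])
        # Source ids sit at odd positions; the matching scores are skipped.
        src_views = [int(src_line[1 + 2 * j]) for j in range(num_src)]
        pairs.append((ref_view, src_views))
    return pairs


# Made-up two-view example, with made-up scores.
sample = "2\n0\n1 1 100.0\n1\n1 0 100.0\n"
print(parse_pair_text(sample))
```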

Metric

  • For quantitative evaluation, download SampleSet and Points from DTU's website. Unzip them and place the Points folder in SampleSet/MVS Data/. The structure looks like:
SampleSet
├──MVS Data
      └──Points
  • For convenient evaluation, install matlab (tested on Ubuntu 18.04) and uncomment the mrun_rst function at the end of ./test_mvs4.py. You also need to change the path of the matlab executable file (for me, it is /mnt/cfs/algorithm/xiaofeng.wang/jeff/code/MVS/misc/matlab/bin/matlab). Then you can evaluate point cloud reconstruction results when testing is finished.

  • You can also evaluate the metrics with the traditional steps: In evaluations/dtu/BaseEvalMain_web.m, set dataPath as the path to SampleSet/MVS Data/, plyPath as the directory that stores the reconstructed point clouds and resultsPath as the directory to store the evaluation results. Then run evaluations/dtu/BaseEvalMain_web.m in matlab.

Results on DTU (single RTX 3090)

Method             Acc.   Comp.  Overall  Inf. Time
MVSTER (mid size)  0.350  0.276  0.313    0.09s
MVSTER (raw size)  0.340  0.266  0.303    0.17s
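The Overall column is consistent with the usual DTU convention of averaging accuracy and completeness. A short check, assuming that convention (the function name `overall_score` is hypothetical):

```python
def overall_score(acc: float, comp: float) -> float:
    """Mean of accuracy and completeness, assumed to be how Overall is defined."""
    return (acc + comp) / 2


# Acc. and Comp. values copied from the table above.
results = {
    "MVSTER (mid size)": (0.350, 0.276),
    "MVSTER (raw size)": (0.340, 0.266),
}
for name, (acc, comp) in results.items():
    print(f"{name}: overall = {overall_score(acc, comp):.3f}")
```

This prints 0.313 for the mid size and 0.303 for the raw size, matching the reported values.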

Point cloud results on DTU, Tanks and Temples, ETH3D

If you find this project useful for your research, please cite:

@misc{wang2022mvster,
      title={MVSTER: Epipolar Transformer for Efficient Multi-View Stereo}, 
      author={Xiaofeng Wang and Zheng Zhu and Fangbo Qin and Yun Ye and Guan Huang and Xu Chi and Yijia He and Xingang Wang},
      journal={arXiv preprint arXiv:2204.07346},
      year={2022}
}

Acknowledgements

Our work is partially based on these open-source works: MVSNet, MVSNet-pytorch, cascade-stereo, PatchmatchNet.

We appreciate their contributions to the MVS community.
