
unsup_mvs's Introduction

Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency

This repository contains the official implementation of Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency presented as an oral at the 3D Scene Understanding for Vision, Graphics, and Robotics workshop, CVPR 2019.

Project Page | Slides | Poster

[teaser figure] Our model consumes a collection of calibrated images of a scene from multiple views and produces a depth map for every such view. We show that this depth prediction model can be trained in an unsupervised manner using our robust photo consistency loss. The predicted depth maps are then fused into a consistent 3D reconstruction which closely resembles, and often improves upon, the sensor-scanned model. Left to right: input images, predicted depth maps, our fused 3D reconstruction, ground-truth 3D scan.

Abstract

We present a learning-based approach for multi-view stereopsis (MVS). While current deep MVS methods achieve impressive results, they crucially rely on ground-truth 3D training data, and acquisition of such precise 3D geometry for supervision is a major hurdle. Our framework instead leverages photometric consistency between multiple views as the supervisory signal for learning depth prediction in a wide baseline MVS setup. However, naively applying photo consistency constraints is undesirable due to occlusion and lighting changes across views. To overcome this, we propose a robust loss formulation that: a) enforces first order consistency and b) for each point, selectively enforces consistency with some views, thus implicitly handling occlusions. We demonstrate our ability to learn MVS without 3D supervision using a real dataset, and show that each component of our proposed robust loss results in a significant improvement. We qualitatively observe that our reconstructions are often more complete than the acquired ground truth, further showing the merits of this approach. Lastly, our learned model generalizes to novel settings, and our approach allows adaptation of existing CNNs to datasets without ground-truth 3D by unsupervised finetuning.

Summary

Intuition

For a set of images of a scene, a given point in a source image may not be visible across all other views.

[loss figure]

Implementation

The predicted depth map from the network, together with the reference image, is used to warp each of the M non-reference neighboring views and compute a per-pixel loss map for it. These M loss maps are concatenated into a volume of dimension H × W × M, where H and W are the image dimensions. A pixel-wise selection along the 3rd dimension of this volume (i.e., over the M loss maps) picks the K “best” (lowest-loss) values, and their mean gives our robust photometric loss. A minimal sketch of this selection step appears after the figure below.

[loss figure]
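
The following is a minimal NumPy sketch of the pixel-wise top-K selection described above (illustrative only: the repository implements this with TensorFlow ops, and the function name and shapes here are assumptions):

import numpy as np

def robust_photo_loss(loss_maps, k):
    # loss_maps: volume of shape (H, W, M), one photometric loss map
    # per non-reference neighboring view.
    # Sort along the view axis and keep the K lowest-loss values per pixel.
    lowest_k = np.sort(loss_maps, axis=2)[:, :, :k]
    # The mean over the selected views and all pixels is the robust loss.
    return lowest_k.mean()

For example, with M = 6 neighboring views and K = 3, each pixel is supervised only by the three views that agree with it best photometrically, which implicitly discards occluded views.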

Results

Pre-trained model weights and outputs (depth maps, point clouds, 3D ply files) for the DTU dataset can be downloaded from Google Drive.

[results figure]

Installation

To check out the source code:

git clone https://github.com/tejaskhot/unsup_mvs/

Install CUDA 9.0, cuDNN 7.0, and Python 2.7. Note that this code has not been tested with other versions and will likely need changes to run with them. The code is also not optimized for multi-GPU execution.

Recommended conda environment:

conda create -n mvs python=2.7 pip
conda activate mvs
pip install -r requirements.txt

Parts of the code for this project are borrowed and modified from the excellent MVSNet repository. Please follow the instructions detailed there for downloading and processing the data, and for installing fusibile, which is used to fuse the depth maps into a point cloud.

Training

  • Download the preprocessed DTU training data.
  • Create directories for saving logs, model checkpoints and intermediate outputs.
  • Enter the code/unsup_mvsnet/ folder.
  • Train the model on DTU, passing the appropriate paths via the flags:
python train_dtu.py --dtu_data_root <YOUR_PATH> --log_dir <YOUR_PATH> --save_dir <YOUR_PATH> --save_op_dir <YOUR_PATH>

Note:

  • Specify dtu_data_root to be the folder where you downloaded the training data. If the data was downloaded to the MVS folder, the path here will be MVS/mvs_training/dtu/.
  • Specify log_dir, save_dir and save_op_dir to the corresponding directories that you created above for saving logs, model checkpoints and intermediate outputs.
  • To train the model at a higher depth resolution, change max_d and interval_scale inversely, i.e. doubling max_d requires halving interval_scale (2 * max_d --> 0.5 * interval_scale); see the example command after this list.
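
For example, assuming max_d and interval_scale are exposed as command-line flags with MVSNet-style defaults of 192 and 1.06 (an assumption; check the flag definitions in train_dtu.py), doubling the depth resolution would look like:

python train_dtu.py --dtu_data_root <YOUR_PATH> --log_dir <YOUR_PATH> --save_dir <YOUR_PATH> --save_op_dir <YOUR_PATH> --max_d 384 --interval_scale 0.53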

Testing

  • Download the preprocessed DTU testing data.
  • Enter the code/unsup_mvsnet/ folder.
  • In order to generate the depth predictions from the model, run the test script with appropriate paths.
python test_dtu.py --dense_folder <YOUR_PATH> --output_folder <YOUR_PATH> --model_dir <YOUR_PATH> --ckpt_step 45000

Note:

  • Specify dense_folder to be the folder where you downloaded the testing data. If the data was downloaded to the MVS folder, the path here will be MVS/mvs_testing/dtu/.
  • Specify output_folder to be the folder where you would like to store the generated depth map outputs.
  • Specify model_dir to be the folder where the model checkpoints are stored.
  • You can specify ckpt_step to choose the checkpoint step at which to test the model; the default is 45000.
  • Next, install fusibile as follows:
    • Check out the modified version of fusibile with git clone https://github.com/YoYo000/fusibile.
    • Enter the cloned repository folder.
    • Install fusibile with cmake . and make, which generates the executable in the cloned repository folder; this path is the FUSIBILE_EXE_PATH referenced below.
    • Next, perform post-processing in the form of depth fusion to create the final 3D point cloud. In the same folder, run the depth fusion script as shown below.
python depthfusion_dtu.py --model_folder <YOUR_PATH> --image_folder <YOUR_PATH> --cam_folder <YOUR_PATH> --fusibile_exe_path <YOUR_PATH>

Note:

  • Specify model_folder to be the folder where the model checkpoints are stored.
  • Specify image_folder and cam_folder to be the same path as that for dense_folder above.
  • Specify fusibile_exe_path to be the FUSIBILE_EXE_PATH noted during the fusibile installation.
  • The final point cloud is stored in MVS/mvs_testing/dtu/points_mvsnet/consistencyCheck-TIME/final3d_model.ply.
  • The final ply files can be visualized in an application such as MeshLab, or loaded programmatically as in the sketch below.
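
As a hypothetical alternative to MeshLab (an assumption; the repository itself does not use this library), the fused point cloud can also be inspected with the Open3D Python package in a separate Python 3 environment:

import open3d as o3d

# Load the fused point cloud produced by the depth fusion step.
pcd = o3d.io.read_point_cloud("final3d_model.ply")
print(pcd)  # reports the number of points loaded
# Open an interactive viewer window.
o3d.visualization.draw_geometries([pcd])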

Citation

If you find this work useful, please cite the paper.

@article{khot2019learning,
  title={Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency},
  author={Khot, Tejas and Agrawal, Shubham and Tulsiani, Shubham and Mertz, Christoph and Lucey, Simon and Hebert, Martial},
  journal={arXiv preprint arXiv:1905.02706},
  year={2019}
}

Please also consider citing MVSNet if you find the code useful.


unsup_mvs's Issues

How to use multi-GPU

Hello author, I am trying to train the model and noticed there is a variable for the number of GPUs. I want to train with 2 GPUs, but changing that number does not work. How can I use multiple GPUs? Thank you very much.

Failed to load model

Hello,

I followed the instructions in the README to install the dependencies and download the pretrained model. However, when I run the script to test on the DTU dataset, I get this error message:

2020-09-28 10:26:57.000521: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key 3dconv0_1/batch_normalization/moving_mean not found in checkpoint

I suspect the pretrained model fails to load because the code was updated at some point. Can you help?

Open source of the code

Thank you for submitting this work!
It was noted nearly 12 months ago that the project would be open-sourced; when do you plan to provide the training and testing scripts?
Many thanks!

The training speed is very slow

Hi Tejas,
I am trying to train this excellent work on a single Titan Xp GPU, but the speed seems very slow: one training step takes about 1.3 seconds. Do you know why this might happen? Thank you!
