deep_blind_pnp's Introduction

Learning 2D–3D Correspondences To Solve The Blind Perspective-n-Point Problem

This repository contains the datasets and code for training the deep blind PnP method described in "Learning 2D–3D Correspondences To Solve The Blind Perspective-n-Point Problem". It also serves as the foundation for the method described in "Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization" (ECCV 2020, Oral).

Contribution in Sentences

Given two point sets (one 3D and one 2D in this paper), the goal is to estimate the matches between them: 1) extract point-wise features; 2) estimate a joint probability matrix; 3) take the top-K matches from the probability matrix.
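Steps 2 and 3 can be illustrated with a minimal sketch. It assumes the point-wise features from step 1 are already available as plain lists, and it uses a global softmax over negative squared feature distances as a stand-in for the joint probability estimate; that choice, the function names and the toy features are assumptions for illustration, not the estimator used in the paper or the code in this repository.

```python
import heapq
import math


def joint_probability(feats_3d, feats_2d):
    """Build a joint probability matrix over all (3D, 2D) point pairs.

    Stand-in estimator (an assumption): softmax over every pair of the
    negative squared Euclidean distance between their features.
    """
    scores = [
        [-sum((a - b) ** 2 for a, b in zip(f3, f2)) for f2 in feats_2d]
        for f3 in feats_3d
    ]
    peak = max(max(row) for row in scores)  # subtract the max for stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def top_k_matches(prob, k):
    """Return the k most probable (index_3d, index_2d, probability) triples."""
    entries = ((p, i, j) for i, row in enumerate(prob) for j, p in enumerate(row))
    return [(i, j, p) for p, i, j in heapq.nlargest(k, entries)]


# Toy point-wise features (hypothetical values, two dimensions each).
feats_3d = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
feats_2d = [[1.1, 0.9], [0.1, -0.1], [9.0, 9.0]]

prob = joint_probability(feats_3d, feats_2d)
matches = top_k_matches(prob, 2)
for i, j, p in matches:
    print(f"3D point {i} <-> 2D point {j} (p = {p:.3f})")
```

In this toy example the two retained matches pair each of the first two 3D points with the 2D point whose feature is closest, and the third 3D point is left unmatched because K is 2.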

Datasets

Please download the synthetic ModelNet40, NYU-RGBD, and real-world MegaDepth datasets.

Codes and Models

Prerequisites

PyTorch 1.1.0 (this is the version on my PC; other versions will probably work too)

numpy

opencv

tensorboardX

easydict

logging (Python standard library)

json (Python standard library)

If you find that any prerequisites are missing, please look them up and install them using conda or pip.

Overview

Our model is implemented in both TensorFlow and PyTorch. Currently, only the PyTorch code is uploaded. All our models are trained from scratch, so please run the training code to obtain models.

For pre-trained models, please refer to the PreTrainModel folder. Under each dataset's folder there is a subfolder named preTrained that contains the model.

Training

Run:

python main_blindPnP.py

Before running the script, please modify the configuration, either directly in the script or in config.py. Specifically:

Select the dataset to use (configs.dataset) and set the directory of the dataset (configs.data_dir).
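As a rough illustration of those two settings, the sketch below checks them before a long training run starts. Only the attribute names configs.dataset and configs.data_dir come from this README; the dataset name strings, the example path, the check_configs helper and the use of SimpleNamespace in place of the repository's own config object are all assumptions.

```python
from pathlib import Path
from types import SimpleNamespace

# Hypothetical dataset names; check config.py for the values the code accepts.
KNOWN_DATASETS = ("modelnet40", "nyu_rgbd", "megadepth")


def check_configs(configs):
    """Return a list of human-readable problems with the two settings."""
    problems = []
    if configs.dataset not in KNOWN_DATASETS:
        problems.append(f"unknown dataset: {configs.dataset!r}")
    if not Path(configs.data_dir).is_dir():
        problems.append(f"data_dir is not a directory: {configs.data_dir!r}")
    return problems


# Stand-in for the real config object; the path is a placeholder.
configs = SimpleNamespace(dataset="megadepth", data_dir="/path/to/megadepth")
for problem in check_configs(configs):
    print(problem)
```

Running a check like this first makes a wrong path fail immediately rather than partway through data loading.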

Testing

Run:

python main_test.py

If you have questions, please first refer to the comments in the scripts.

Modifications

  1. The learning rate is set to 1e-3 for the MegaDepth dataset and 1e-4 for ModelNet40 and NYU-RGBD. I found that simply changing the learning rate gives better performance than using 1e-5 as reported in the paper.

  2. If you want to use a classification network (Section 3.4, Correspondence Set Refinement) to polish the 3D-2D matches, please modify the network proposed in "Learning to Find Good Correspondences", following the guidelines in tf_pose_loss.py.
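The per-dataset learning rates from item 1 could be kept in one lookup table, as in the sketch below. The rates are the ones stated above; the dictionary keys and the learning_rate_for helper are hypothetical names and do not come from the repository.

```python
# Learning rates from item 1; the key strings are hypothetical.
LEARNING_RATES = {
    "megadepth": 1e-3,
    "modelnet40": 1e-4,
    "nyu_rgbd": 1e-4,
}


def learning_rate_for(dataset):
    """Look up the learning rate for a dataset name, ignoring case."""
    key = dataset.lower()
    if key not in LEARNING_RATES:
        raise KeyError(f"no learning rate recorded for dataset {dataset!r}")
    return LEARNING_RATES[key]


print(learning_rate_for("MegaDepth"))
```

Raising on an unknown name avoids silently falling back to a rate that was tuned for a different dataset.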

Q/A

  1. Can the method in this paper be used for partial-to-partial matching? Yes.
  2. Can the method in this paper be used in a real-world localization scenario? Yes.

I have run experiments to verify the answers above.

Publications

If you like, you can cite the following publication:

Liu, Liu, Dylan Campbell, Hongdong Li, Dingfu Zhou, Xibin Song, and Ruigang Yang. "Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem." arXiv preprint arXiv:2003.06752 (2020).

@article{liu2020learning,
  title={Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem},
  author={Liu, Liu and Campbell, Dylan and Li, Hongdong and Zhou, Dingfu and Song, Xibin and Yang, Ruigang},
  journal={arXiv preprint arXiv:2003.06752},
  year={2020}
}

And also the following publication:

Dylan Campbell*, Liu Liu*, and Stephen Gould. "Solving the Blind Perspective-n-Point Problem End-To-End with Robust Differentiable Geometric Optimization." In Proceedings of the European Conference on Computer Vision (ECCV), 2020. (* indicates equal contribution)

@inproceedings{CampbellAndLiu:ECCV2020,
  author = {Dylan Campbell$^\ast$ and Liu Liu$^\ast$ and Stephen Gould},
  title = {Solving the Blind Perspective-n-Point Problem End-To-End with Robust Differentiable Geometric Optimization},
  booktitle = {ECCV},
  year = {2020},
  note = {$^\ast$ equal contribution},
}

Contact

If you have any questions (other than those you can answer via Google), drop me an email ([email protected]).
