Stereo-Odometry-SOFT

This repository is a MATLAB implementation of SOFT: Stereo Odometry based on careful Feature selection and Tracking. The code is released under the MIT License.

The code has been tested on MATLAB R2018a and depends on the following toolboxes:

  • Parallel Computing Toolbox
  • Computer Vision Toolbox

On a laptop with Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz and 16GB RAM, the following average timings were observed:

  • Time taken for feature processing (in ms): 261.4
  • Time taken for feature matching (in ms): 3650.5
  • Time taken for feature selection (in ms): 6.5
  • Time taken for motion estimation (in ms): 1.1

How to run the repository?

  1. Clone the repository using the following command:
git clone https://github.com/Mayankm96/Stereo-Odometry-SOFT.git
  2. Import the dataset into the folder data. If you wish to use the KITTI dataset, for example a sequence from the Residential category, the following commands might be useful:
cd PATH/TO/Stereo-Odometry-SOFT
## For Residential sequence 61, use drive ID 2011_09_26_drive_0061;
## as written, the URL below fetches 2011_09_26_drive_0009
# synced+rectified data
wget -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0009/2011_09_26_drive_0009_sync.zip -P data
# calibration files
wget -c https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_calib.zip -P data
  3. Change the corresponding parameters in the configuration file configFile.m to suit your setup.

  4. Run the script main.m to get a plot of the estimated odometry.

Proposed Implementation of the Algorithm

The inputs to the algorithm at each time step are the left and right images at the current time instant and the pair from the previous time instant.

Keypoints Detection

Keypoint detection and matching is divided into the following separate stages:

  • feature processing: each image is searched for locations that are likely to match well in other images
  • feature matching: likely matching candidates are searched for efficiently in other images
  • feature tracking: unlike the second stage, correspondences are searched for in a small neighborhood around each detected feature and across frames at different time steps

Feature processing

Corner and blob features are extracted for each image using the following steps:

  1. First, blob and checkerboard kernels are convolved over the input image.

  2. Efficient Non-Maximum Suppression is applied to the filter responses to produce keypoints that belong to one of the following classes: blob maxima, blob minima, corner maxima, and corner minima. To speed up feature matching, correspondences are only sought between features belonging to the same class.

  3. The feature descriptors are constructed from a set of 16 locations within an 11 x 11 block around each keypoint in the gradients of the input image. The gradient images are computed by convolving 5 x 5 Sobel kernels over the input image. Sampling the 16 locations in both the horizontal and the vertical gradient image gives a descriptor with a total length of 32.
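The suppression and classification step can be illustrated with a small sketch. It is written in Python for brevity rather than in the repository's MATLAB, and it uses a naive window scan instead of the Efficient Non-Maximum Suppression algorithm. The function name and the `radius` and `threshold` parameters are hypothetical and do not come from the repository.

```python
def non_max_suppression(response, radius=1, threshold=0.0):
    """Naive NMS on a 2-D filter response given as a list of rows.

    Returns (maxima, minima) as lists of (row, col). A pixel is a maximum if
    it exceeds `threshold` and is strictly larger than every neighbour within
    `radius`; minima are the mirrored case. Border pixels are skipped.
    """
    rows, cols = len(response), len(response[0])
    maxima, minima = [], []
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            centre = response[r][c]
            neighbours = [
                response[rr][cc]
                for rr in range(r - radius, r + radius + 1)
                for cc in range(c - radius, c + radius + 1)
                if (rr, cc) != (r, c)
            ]
            if centre > threshold and all(centre > v for v in neighbours):
                maxima.append((r, c))
            elif centre < -threshold and all(centre < v for v in neighbours):
                minima.append((r, c))
    return maxima, minima


# Toy 5 x 5 blob-filter response with one clear peak and one clear valley.
blob_response = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 0, -1, 0],
    [0, 0, -1, -8, 0],
    [0, 0, 0, 0, 0],
]
blob_max, blob_min = non_max_suppression(blob_response, radius=1, threshold=2.0)

# Tag each keypoint with its class so matching can be restricted per class.
keypoints = [(r, c, "blob_max") for r, c in blob_max]
keypoints += [(r, c, "blob_min") for r, c in blob_min]
```

Running the same function on the corner-filter response would give the corner maxima and corner minima classes.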

Feature matching

This part of the algorithm is concerned with finding the features for egomotion estimation. It is based on the process described in the paper StereoScan: Dense 3d reconstruction in real-time. The process can be summarized as follows:

  1. Correspondences between two images are found by computing the Sum of Absolute Differences (SAD) score between a feature in the first image and each feature of the same class in the second image.

  2. This matching is done in a circular fashion between the left and right frames at time instants t-1 and t: a match is kept only if following it around the loop of four images leads back to the starting feature.

  3. To remove remaining outliers, Normalized Cross-Correlation (NCC) is applied, again in a circular fashion, using templates of size 21 x 21 pixels around the features that were matched successfully in the previous step.
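The three steps above can be sketched roughly as follows. This is Python rather than the repository's MATLAB. It makes several assumptions: features are dictionaries with hypothetical keys `cls` and `desc`, the descriptors are shortened to 4 elements instead of 32, and the loop order (left t-1, right t-1, right t, left t) is one possible choice, not necessarily the one used in the code.

```python
import math


def sad(desc_a, desc_b):
    """Sum of Absolute Differences between two equal-length descriptors."""
    return sum(abs(a - b) for a, b in zip(desc_a, desc_b))


def best_match(query, candidates):
    """Index of the same-class candidate with the lowest SAD, or None."""
    best_idx, best_cost = None, None
    for idx, cand in enumerate(candidates):
        if cand["cls"] != query["cls"]:
            continue  # only compare features of the same class
        cost = sad(query["desc"], cand["desc"])
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx


def circular_matches(loop):
    """Follow each feature of loop[0] through the four images and back."""
    kept = []
    for start in range(len(loop[0])):
        chain = [start]
        for step in range(4):
            source = loop[step][chain[-1]]
            nxt = best_match(source, loop[(step + 1) % 4])
            if nxt is None:
                break
            chain.append(nxt)
        # Keep the match only if the loop closes on the starting feature.
        if len(chain) == 5 and chain[4] == start:
            kept.append(tuple(chain[:4]))
    return kept


def ncc(patch_a, patch_b):
    """Normalized Cross-Correlation of two equal-size patches (flat lists)."""
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


left_prev = [
    {"cls": "blob_max", "desc": [10, 20, 30, 40]},
    {"cls": "corner_min", "desc": [5, 5, 5, 5]},
    {"cls": "blob_max", "desc": [50, 50, 50, 50]},  # has no true counterpart
]
right_prev = [
    {"cls": "corner_min", "desc": [5, 6, 5, 5]},
    {"cls": "blob_max", "desc": [11, 20, 30, 41]},
]
right_cur = [
    {"cls": "blob_max", "desc": [12, 21, 30, 40]},
    {"cls": "corner_min", "desc": [6, 5, 5, 4]},
]
left_cur = [
    {"cls": "blob_max", "desc": [10, 21, 31, 40]},
    {"cls": "corner_min", "desc": [5, 5, 6, 5]},
]

matches = circular_matches([left_prev, right_prev, right_cur, left_cur])
```

Each tuple in `matches` holds the indices of one feature in the four images. The third feature of `left_prev` is dropped because the loop returns to a different feature. In a full pipeline, `ncc` would then be evaluated on the image patches around each surviving match and low-scoring matches discarded.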

Feature Selection

To ensure a uniform distribution of features across the image, the entire image is divided into buckets of size 50 x 50 pixels, and only the strongest features in each bucket are kept.
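A minimal bucketing sketch is shown below, in Python rather than the repository's MATLAB. The `max_per_bucket` limit and the use of a scalar `strength` per feature (for example the magnitude of the filter response) are assumptions made for illustration. Only the 50 x 50 bucket size comes from the description above.

```python
def bucket_features(features, bucket_size=50, max_per_bucket=2):
    """Keep the strongest features in each bucket of the image grid.

    `features` is a list of (x, y, strength) tuples in pixel coordinates.
    """
    buckets = {}
    for x, y, strength in features:
        key = (int(x // bucket_size), int(y // bucket_size))
        buckets.setdefault(key, []).append((x, y, strength))

    selected = []
    for key in sorted(buckets):
        strongest = sorted(buckets[key], key=lambda f: f[2], reverse=True)
        selected.extend(strongest[:max_per_bucket])
    return selected


features = [
    (10, 10, 0.9),
    (20, 30, 0.5),   # weakest of three features in bucket (0, 0): dropped
    (40, 45, 0.7),
    (60, 10, 0.3),
    (120, 80, 0.8),
]
selected = bucket_features(features, bucket_size=50, max_per_bucket=2)
```

Exposing `max_per_bucket` as a parameter also gives a simple way to switch bucketing off in practice, by setting it larger than the number of features.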

Egomotion Estimation

The incremental rotation and translation between consecutive frames are estimated using the P3P algorithm together with RANSAC.
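To plot a trajectory, the incremental estimates have to be chained into a global pose. The sketch below is Python rather than the repository's MATLAB and uses hypothetical function names. It assumes that each increment (R, t) expresses the pose of the current camera in the frame of the previous camera. If the estimator returns the opposite direction, the increment has to be inverted before it is chained.

```python
def mat_mul(a, b):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Product of a 3 x 3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def accumulate(increments):
    """Chain per-frame (R, t) increments into a global pose.

    Returns the final global rotation and the list of global positions,
    starting from the identity pose at the origin.
    """
    rot = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    pos = [0, 0, 0]
    trajectory = [list(pos)]
    for r_inc, t_inc in increments:
        # Express the step in the world frame, then advance the position.
        step = mat_vec(rot, t_inc)
        pos = [p + s for p, s in zip(pos, step)]
        # Right-multiply: the increment is defined in the previous camera frame.
        rot = mat_mul(rot, r_inc)
        trajectory.append(list(pos))
    return rot, trajectory


# Toy motion: move one unit forward along x, then turn 90 degrees about z.
turn_90_about_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
increments = [(turn_90_about_z, [1, 0, 0])] * 4
final_rot, trajectory = accumulate(increments)
```

After four such steps the toy camera has traced a unit square and is back at the origin with its initial orientation, which makes this a convenient sanity check for the chosen multiplication order.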

To-Dos

  • fix parfor and for loops to enable running without parallel processing
  • read the camera calibration parameters from calibration file directly
  • add sub-pixel refinement using parabolic fitting
  • add feature selection based on feature tracking i.e. the age of features
  • implement Nister's algorithm and SLERP for rotation estimation
  • use Gauss-Newton optimization to estimate translation from weighted reprojection error

stereo-odometry-soft's Issues

global pose integration

Hi, you obtain the camera orientation and location in world coordinates, but these are actually the relative orientation and location between frame t and frame t-1.
So you have used
pos = pos + Rpos*tr';
Rpos = R * Rpos;
to get the global orientation and location relative to the original pose. But it doesn't give me a correct trajectory.

Can I use R and tr as a relative pose that can be added to a poseGraph3D?
Like here:
relativePose = [tr, rotm2quat(R)];
addRelativePose(posegraph3D, relativePose, informationmatrix);
And then do some further loop closure.
Are you familiar with poseGraph3D? Can I directly use R and t to construct the relative camera pose?

I also need to get poses in the form [x y z qw qx qy qz] to build an occupancy map. What should I do with R and tr? I have been confused about this for a long time.

Could you please explain it to me? Thank you very much.

SOFT paper reference

Hi @Mayankm96,

First of all, thanks for contributing the code as open source; it is very helpful for research purposes.

I want to know whether this is a complete implementation of SOFT, and whether you were able to get the same error rates as reported in the SOFT paper.

Please let me know, it's urgent. I am currently implementing it in C++, but I could not get the same results; they differ a lot. That is why I am referring to your implementation.

I suspect that either the paper leaves out some details or I am missing something.

Regards
Gaurav

Problems with the code base.

You have a few problems with this code:

One, your code makes it seem that the location of a feature point is the same in each stereo pair. This leads MATLAB to think that no translation exists, so it prints an error message to the console saying that the "." operation is not valid for an empty object.

Two, you remake the entire graph at each time instant, making it really slow.

Three, there's no tracking of results.

Four, there's no way to turn bucketing on and off easily.
