Comments (8)

BorisLerouxFox commented on August 19, 2024

OK, I went a little further in my research. The second error seems to come from the FAST algorithm: I changed fast_threshold from 20 to 10, and it now runs until the end of the sequence.
I couldn't find any documentation about fast_threshold or what it represents. Is it the number of points?

Now I'm running into an error about the scale, but I think I know how to solve it: I need to convert geographic coordinates into Cartesian ones. Am I right?
Best regards
Boris

from mono-vo.

BJMH commented on August 19, 2024

Hi Boris,
The KITTI dataset that is being used appears to be the colour one from here (KITTI website). It is sequence 00. You also need the 00.txt which is in a different folder.

For your first error, it looks like the colour images are not being loaded at all. That is most likely due to a misspelling of the filename. The second error is because you have a grey image but you are trying to convert it from colour to grey (it is already grey). If you want to use greyscale images, you will need to remove all the lines that call cvtColor(..., COLOR_BGR2GRAY);

According to the OpenCV documentation, the threshold parameter is a "threshold on difference between intensity of the central pixel and pixels of a circle around this pixel." I'm not really sure if that's a minimum threshold or a maximum threshold, but I do know that FAST features work best around areas with sharp and distinctive changes in intensity, that is, strong gradients around corners.
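To make the role of the threshold concrete, here is a minimal, simplified sketch of the FAST segment test in plain C++ (no OpenCV). It assumes the common FAST-9 variant on a 16-pixel circle; the function name isFastCorner is hypothetical, and OpenCV's real implementation is far more optimised. In this formulation a circle pixel only counts if it differs from the centre by more than the threshold, so the threshold behaves as a minimum contrast, and lowering it (for example from 20 to 10) lets more points through.

```cpp
#include <array>
#include <cassert>
#include <initializer_list>

// Simplified FAST-9 segment test (illustrative sketch, not OpenCV's code).
// center:    intensity of the candidate pixel
// ring:      intensities of the 16 pixels on the circle around it, in order
// threshold: minimum intensity difference for a ring pixel to count
bool isFastCorner(int center, const std::array<int, 16>& ring, int threshold) {
    assert(threshold >= 0);
    const int needed = 9;  // contiguous ring pixels required (FAST-9 assumption)
    // Check the "all brighter" case (sign = +1) and the "all darker" case (sign = -1).
    for (int sign : {+1, -1}) {
        int run = 0;
        // Walk the ring twice so that runs wrapping past index 15 are found.
        for (int i = 0; i < 32; ++i) {
            const int diff = sign * (ring[i % 16] - center);
            run = (diff > threshold) ? run + 1 : 0;
            if (run >= needed) return true;
        }
    }
    return false;
}
```

With a made-up patch where ten neighbouring ring pixels are 15 grey levels brighter than the centre, isFastCorner rejects the pixel at threshold 20 but accepts it at threshold 10.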

Scale errors come from the fact that monocular VO cannot determine scene scale. A scene that is twice as large and twice as far away will look the same to a single camera, so this code uses information from the 00.txt ground truth data file to change the scale of estimates. If you are running this and not getting correct scale, you might be using a file in a different format. Try to figure out which parts of each line in your ground truth data relate to the x, y, and z translations, then change getAbsoluteScale(...) to use that instead.

JonnySme commented on August 19, 2024

@BJMH Hello. Can you explain to me how to use my own dataset?

BJMH commented on August 19, 2024

@JonnySme If your dataset is a sequence of image files then you need to change lines 85, 86, and 139 to load your images instead. Also you need to change getAbsoluteScale() to load your ground truth data. It currently expects 12 numbers per line with the 4th, 8th, and 12th being the x, y, and z translation. You can remove the call to that function if you don't have a ground truth, but then each frame-to-frame translation will be estimated as 1 unit long.
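As a rough sketch of what such a ground-truth reader could look like, the following standard-library C++ parses one line of 12 numbers and takes the 4th, 8th, and 12th as x, y, and z. The names Translation, parseTranslation, and absoluteScale are hypothetical and not taken from mono-vo, and the scale is assumed here to be the Euclidean distance between consecutive ground-truth positions.

```cpp
#include <array>
#include <cmath>
#include <sstream>
#include <string>

// Translation part of one ground-truth line (illustrative helper type).
struct Translation { double x, y, z; };

// Parse a line of 12 whitespace-separated numbers (a row-major 3x4 pose matrix)
// and return the 4th, 8th, and 12th values as x, y, z.
// Returns false if fewer than 12 numbers can be read.
bool parseTranslation(const std::string& line, Translation& out) {
    std::istringstream in(line);
    std::array<double, 12> v{};
    for (double& value : v) {
        if (!(in >> value)) return false;
    }
    out = {v[3], v[7], v[11]};
    return true;
}

// Assumed scale for one frame-to-frame step: the distance between
// the previous and current ground-truth positions.
double absoluteScale(const Translation& prev, const Translation& curr) {
    const double dx = curr.x - prev.x;
    const double dy = curr.y - prev.y;
    const double dz = curr.z - prev.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```

If your ground truth uses a different layout, only the three indices in parseTranslation should need to change.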

JonnySme commented on August 19, 2024

@BJMH Thank you for your answer!
How can I get ground truth data myself? If the ground truth data is missing, will the end result be very different from the truth?

I tried removing the call "scale = getAbsoluteScale(numFrame, 0, t.at<double>(2));", but no x, y, z trajectory is built after the deletion.

Thank you for your answer!

BJMH commented on August 19, 2024

@JonnySme It depends on your dataset. If it's video that you recorded yourself then you will have to measure it. If you downloaded a dataset it should come with its own GT.

This type of VO cannot determine the magnitude of any translations, nor does it have a consistent scale between measurements. If you think about it, any object that the camera sees would appear the same in the video if it were twice as large and twice as far away. Because of that, every frame-to-frame translation is normalised, so you need to scale it by the correct magnitude from a GT.

Without the GT, your rotations and the direction of translation are still correct. Only the magnitudes of the translations are incorrect, and they are also inconsistent. To make them consistent you would need to add in some extra alignment across multiple frames with Structure from Motion and/or Bundle Adjustment. Both of those are pretty deep rabbit holes, and I don't know of any easy-to-digest code examples floating around online.

JonnySme commented on August 19, 2024

@BJMH
Thank you!
I have a set of frames taken with a camera at 22 frames per second. How can I get a GT for this dataset?

BJMH commented on August 19, 2024

@JonnySme If it's video you've taken yourself, you'll have to measure it while taking the footage, perhaps with GPS, an IMU, Vive Trackers, a calibration checkerboard present in all frames, or some other method.
