Comments (5)

daniyar-niantic commented on September 7, 2024

@wuyujack the post-rectification projection matrix has an identity rotation matrix. So the projection matrix can be decomposed into an intrinsics matrix and an extrinsics matrix, where the rotation component of the extrinsics matrix is the identity.

In general, you need to calibrate the cameras to estimate the intrinsics and extrinsics. You can then rectify these matrices for the two views and compute projection matrices in a new space where the optical axes of the two cameras are parallel to each other (i.e. the rotation matrix is the identity). This OpenCV function does the rectification:
https://docs.opencv.org/3.3.1/d9/d0c/group__calib3d.html#ga617b1685d4059c6040827800e72ad2b6
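
As a rough illustration of the decomposition described above, the sketch below splits a rectified 3x4 projection matrix into an intrinsics matrix and a translation, assuming P = K [I | t] with identity rotation. The numbers are the KITTI P_rect_03 values quoted in this thread; the function name `decompose_rectified` is hypothetical and not part of monodepth2 or OpenCV.

```python
import numpy as np

# P_rect_03 as quoted in this thread (from KITTI calib_cam_to_cam.txt).
P = np.array([
    [7.215377e+02, 0.000000e+00, 6.095593e+02, -3.395242e+02],
    [0.000000e+00, 7.215377e+02, 1.728540e+02, 2.199936e+00],
    [0.000000e+00, 0.000000e+00, 1.000000e+00, 2.729905e-03],
])


def decompose_rectified(P):
    """Split P = K [I | t] into (K, t).

    Assumption: the rotation is the identity, which is what the
    post-rectification matrices are described as having.
    """
    K = P[:, :3].copy()              # left 3x3 block is the intrinsics
    t = np.linalg.solve(K, P[:, 3])  # last column is K @ t
    return K, t


K, t = decompose_rectified(P)
print("K =\n", K)
print("t =", t)

# Sanity check: recomposing K [I | t] should reproduce P.
P_again = K @ np.hstack([np.eye(3), t[:, None]])
print("recomposed matches:", np.allclose(P_again, P))
```

With identity rotation the left 3x3 block can be read off directly as the intrinsics, which is why the thread treats the projection matrix and the intrinsics almost interchangeably.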

from monodepth2.

wuyujack commented on September 7, 2024

This confuses me. As far as I know, the projection matrix should be composed of the camera intrinsic matrix and the extrinsic matrix, but from your reply it seems you are using the projection matrix directly as the intrinsic matrix. Is that the default usage for the KITTI dataset? I am also checking this because, so far, I have only obtained the camera intrinsic matrix from the focal length and principal point, and I wonder whether I still need the extrinsic matrix, which is composed of the translation vector and the rotation matrix.

mrharicot commented on September 7, 2024

All our code actually assumes a single intrinsics matrix, so you just need to change self.K.
Just beware that the flip augmentation assumes the principal point is at the centre of the image.
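
To see why the flip caveat matters, here is a small sketch of what a horizontal flip does to the principal point. It assumes integer pixel-index coordinates, so column u moves to column (width - 1) - u; the function name is hypothetical, and the width of 1242 and the c_x value are taken from the KITTI numbers quoted in this thread.

```python
def flipped_principal_point(cx, width):
    """x-coordinate of the principal point after a horizontal flip.

    Assumption: pixel-index convention, where column u maps to
    (width - 1) - u when the image is mirrored.
    """
    return (width - 1) - cx


width = 1242
for cx in (609.5593, (width - 1) / 2):
    cx_flipped = flipped_principal_point(cx, width)
    print(f"cx = {cx:.4f} -> flipped cx = {cx_flipped:.4f} "
          f"(shift {cx_flipped - cx:+.4f} px)")
```

Only a principal point at the image centre is left unchanged by the flip. For an off-centre principal point, reusing the same intrinsics on flipped images introduces a small horizontal offset.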

mdfirman commented on September 7, 2024

@mrharicot will be able to correct me if I'm wrong here, but I think the KITTI intrinsics you should be looking at are the post-rectification matrices. These are P_rect_02 and P_rect_03 in calib_cam_to_cam.txt. The maths using these matrices should work out:

P_rect_03: 7.215377e+02 0.000000e+00 6.095593e+02 -3.395242e+02 0.000000e+00 7.215377e+02 1.728540e+02 2.199936e+00 0.000000e+00 0.000000e+00 1.000000e+00 2.729905e-03

Then for example looking at f_x:

7.215377e+02 / 1242 = 0.58094....

...which is approximately what we have in self.K.
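
The same normalisation can be applied to all four intrinsic parameters. The sketch below is illustrative only: the helper name `normalised_intrinsics` is hypothetical, the 4x4 layout and the division of the y terms by the image height are assumptions about how self.K is stored, and the height of 375 is an assumed KITTI image height (only the width of 1242 appears in this thread).

```python
import numpy as np


def normalised_intrinsics(fx, fy, cx, cy, width, height):
    """Build a 4x4 intrinsics matrix normalised by image size.

    Assumption: x terms are divided by the image width and y terms by
    the image height, so the result is independent of resolution.
    """
    K = np.eye(4, dtype=np.float32)
    K[0, 0] = fx / width
    K[1, 1] = fy / height
    K[0, 2] = cx / width
    K[1, 2] = cy / height
    return K


# Pixel-unit values read from P_rect_03 as quoted above.
K = normalised_intrinsics(
    fx=721.5377, fy=721.5377, cx=609.5593, cy=172.854,
    width=1242, height=375,  # height is an assumption
)
print(K)
```

For a single, self-calibrated camera the same helper could be called with that camera's own focal lengths, principal point and image size, and the result used in place of self.K.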

wuyujack commented on September 7, 2024

Thank you for your detailed reply. If the post-rectification projection matrix has an identity rotation matrix, then it makes sense to me now.

By the way, I only use a single camera to capture my own color image sequence, so there is no need to rectify, and I can just use my camera intrinsic matrix to replace the one in your code (self.K). Is that correct?
