Comments (4)

AlessioTonioni commented on June 3, 2024

> Theoretically, if I use the same images for the adaptation that I used for training, then the network should perform the same, if not better.

Not really, because the loss used for the adaptation is far from perfect and is actually very noisy, so what you are experiencing is completely reasonable. When you train with ground truth you have perfect supervision and no noise in the training process; when you use the self-supervised loss you inject noise into training and the performance starts to decrease. Self-supervision makes sense only if you don't have any other source of ground truth.

> The interesting point is that if I use my own stereo endoscope images for the Full mode adaptation, then after a few hundred iterations the result starts to look good.

Again, this is expected: you deploy the network on completely different images, so the initial performance is quite bad. Even if the self-supervised loss is very noisy, it will still provide a reasonable training signal that should improve the prediction from really bad to something better.

> So I am not sure pretraining the network on the Blender images makes any improvement in depth estimation compared with just using an unsupervised approach such as Godard 2017.

In my experience, pretraining helps the network converge faster once you start training in a new environment. Think of it like pretraining a classification network on ImageNet and then fine-tuning on a different dataset. If you start from scratch on real images without any kind of precise supervision, I think you will get worse performance, but you can definitely try it.
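The convergence argument can be illustrated with a deliberately tiny warm-start experiment. This is only an analogy on a one-parameter linear model, not a stereo network; the data, learning rate, tolerance, and initial weights are all hypothetical.

```python
def steps_to_converge(w_init, data, lr=0.05, tol=1e-3, max_steps=10_000):
    """Gradient descent on mean squared error for y = w * x.

    Returns the number of updates needed before the loss drops below tol.
    """
    w = w_init
    for step in range(max_steps):
        loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
        if loss < tol:
            return step
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return max_steps


# Hypothetical target domain whose true slope is 3.0.
target_data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0)]

pretrained_w = 2.5  # learned on a related source domain, already close
scratch_w = -4.0    # arbitrary initialisation, far from the solution

print("steps from pretrained init:", steps_to_converge(pretrained_w, target_data))
print("steps from scratch init:   ", steps_to_converge(scratch_w, target_data))
```

The pretrained start needs fewer updates because it begins closer to the solution. With a noisy self-supervised loss on a real network the gap may matter more, since a poor starting point can also lead to a worse final result.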

from real-time-self-adaptive-deep-stereo.

YJonmo commented on June 3, 2024

Thanks for the feedback!

By the way, I am also trying to estimate the pose of the camera in the environment.
Is there any approach for finding the camera pose that is similar to yours, i.e. pretraining on ground truth and then adapting to real stereo images without pose labels?
Monodepth2 by Godard estimates the pose, but it is unsupervised.

Thanks

AlessioTonioni commented on June 3, 2024

As far as I know, no. The simplest thing would be to take something similar to the pose estimation network from Monodepth2 by Godard and add a supervised loss on the camera pose, assuming that you have ground truth for the camera pose in a training set.
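A minimal sketch of what such a supervised pose term could look like, assuming the pose is represented as a translation vector plus a 3x3 rotation matrix. The function names, the choice of a geodesic rotation error, and the `rot_weight` parameter are illustrative assumptions, not something taken from Monodepth2 or from this repository.

```python
import math


def rotation_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_between(r_pred, r_gt):
    """Geodesic distance between two rotations: the angle of R_pred^T R_gt."""
    # trace(R_pred^T R_gt) equals the sum of element-wise products.
    trace = sum(r_pred[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    return math.acos(cos_angle)


def supervised_pose_loss(t_pred, r_pred, t_gt, r_gt, rot_weight=1.0):
    """Euclidean translation error plus weighted rotation angle error (radians)."""
    trans_err = math.sqrt(sum((p - g) ** 2 for p, g in zip(t_pred, t_gt)))
    return trans_err + rot_weight * rotation_angle_between(r_pred, r_gt)


# Hypothetical prediction versus ground truth for one frame pair.
loss = supervised_pose_loss(
    t_pred=[0.0, 0.0, 0.0], r_pred=rotation_z(0.0),
    t_gt=[3.0, 4.0, 0.0], r_gt=rotation_z(0.5),
)
print(loss)
```

During supervised pretraining this term would be added to whatever loss the pose network already uses. When adapting on real images without pose labels, it would simply be dropped and only the self-supervised term kept.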

YJonmo commented on June 3, 2024

thank you
