Comments (4)
Theoretically, if I use the same images for the adaptation that I used for training, then the network should perform the same, if not better.
Not really, because the loss used for the adaptation is far from perfect and is actually very noisy, so what you are experiencing is completely reasonable. When you train with ground truth (gt) you have perfect supervision and no noise in the training process; when you use the self-supervised loss you inject noise into the training process and the performance starts to decrease. Self-supervision makes sense only if you don't have any other source of ground truth.
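To make the difference between the two kinds of supervision concrete, here is a minimal sketch on a single image row. It is not the loss used in this repository: `supervised_l1` and `photometric_l1` are hypothetical helper names, disparities are integers, and the reprojection is a plain L1 difference with border clamping. On a textureless row the photometric loss is zero even for a wrong disparity, which is one way the self-supervised signal can mislead the network.

```python
def supervised_l1(pred_disp, gt_disp):
    """Mean absolute error between predicted and ground-truth disparity."""
    return sum(abs(p - g) for p, g in zip(pred_disp, gt_disp)) / len(pred_disp)


def photometric_l1(left_row, right_row, pred_disp):
    """Mean absolute difference between the left row and the right row
    warped by the predicted disparity (assumption: integer disparities,
    left pixel x matches right pixel x - d, indices clamped to the row)."""
    total = 0.0
    for x, d in enumerate(pred_disp):
        src = min(max(x - d, 0), len(right_row) - 1)
        total += abs(left_row[x] - right_row[src])
    return total / len(left_row)


# Textured row whose true disparity is 2 everywhere.
right_textured = [1, 2, 3, 4, 5, 6, 7, 8]
left_textured = [1, 1, 1, 2, 3, 4, 5, 6]
# Textureless row: every pixel has the same intensity.
flat = [5] * 8

gt = [2] * 8      # ground-truth disparity
wrong = [0] * 8   # a clearly wrong prediction

print("supervised, wrong prediction:", supervised_l1(wrong, gt))
print("photometric, textured, correct:", photometric_l1(left_textured, right_textured, gt))
print("photometric, textured, wrong:", photometric_l1(left_textured, right_textured, wrong))
print("photometric, flat, wrong:", photometric_l1(flat, flat, wrong))
```

The supervised loss penalises the wrong prediction regardless of image content, while the photometric loss only does so where the image has texture.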
The interesting point is that if I use my own stereo endoscope images for the Full mode adaptation, then after a few hundred iterations the result starts to look good.
Again, this is expected: you are deploying the network on completely different images, so the initial performance is quite bad. Even though the self-supervised loss is very noisy, it still provides a reasonable training signal that should improve the predictions from really bad to something better.
So I am not sure that pretraining the network on the Blender images makes any improvement in depth estimation compared with just using an unsupervised approach such as Godard 2017.
In my experience, pretraining helps the network converge faster once you start to train in a new environment. Think of it like pretraining a classification network on ImageNet and then fine-tuning it for a different dataset. If you start from scratch on real images without any kind of precise supervision, I think you will get worse performance, but you can definitely try it.
from real-time-self-adaptive-deep-stereo.
Thanks for the feedback!
By the way, I am also trying to get the pose of the camera in the environment.
Is there any approach for finding the camera pose that is similar to yours, i.e. pretraining on ground truth and then adapting to real stereo images without pose labels?
Monodepth2 by Godard estimates the pose, but it is unsupervised.
Thanks
As far as I know, no, but the simplest thing would be to take something similar to the pose estimation network from Monodepth2 by Godard and add a supervised loss on the camera pose, assuming that you have ground truth for the camera pose in a training set.
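A rough sketch of what such a supervised pose loss could look like follows. It is not taken from Monodepth2 or from this repository. It assumes the network outputs a 6-DoF relative pose as three translation values followed by three axis-angle rotation values; `supervised_pose_loss` and `rot_weight` are hypothetical names chosen for illustration.

```python
def supervised_pose_loss(pred_pose, gt_pose, rot_weight=1.0):
    """L1 loss on a 6-DoF pose given as (tx, ty, tz, rx, ry, rz).

    Assumption: the first three entries are translation and the last
    three are an axis-angle rotation. rot_weight balances the two terms,
    since translation and rotation are measured in different units.
    """
    trans_err = sum(abs(p - g) for p, g in zip(pred_pose[:3], gt_pose[:3])) / 3
    rot_err = sum(abs(p - g) for p, g in zip(pred_pose[3:], gt_pose[3:])) / 3
    return trans_err + rot_weight * rot_err


# Hypothetical ground-truth and predicted relative poses between two frames.
gt_pose = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
pred_pose = (0.1, 0.0, 0.0, 0.0, 0.0, 0.02)

print("pose loss:", supervised_pose_loss(pred_pose, gt_pose, rot_weight=10.0))
```

During pretraining this term would be added to the depth loss wherever pose ground truth is available, and dropped again when adapting to real images that have no pose labels.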
thank you