Comments (5)

RuiTianHIT commented on June 9, 2024

Thank you very much for the author's reply and guidance, through your help we have located the problem.

  1. We compared the author's baseline checkpoint with our baseline checkpoint, and our baseline checkpoint produced a large error.
  2. In recent days, we have reproduced the baseline, and the results are similar to those of the baseline checkpoint you provided. We attach two results: one from the baseline we trained in recent days, and the other from the baseline provided by the author. We think the reproduction quality is very high.
    eval-baseline-80m-nuscenes-val_result_metrics_14October2023at18_39_20CST.csv
    eval-baseline-80m-nuscenes-val_result_metrics_14October2023at21_58_46CST.csv

Our problem has been solved. The initial problem was likely that we did not use last.ckpt when distilling md4all. We will double-check whether this was the cause of the error, and we hope to keep following this high-quality work.
Finally, thank you again for your help.

from md4all.

morbi25 commented on June 9, 2024

Hey @RuiTianHIT,

Thank you for your interest in our work and your questions!

  1. As you already pointed out, our training is a two-stage process. First, one needs to train an arbitrary baseline model in ideal conditions. In our work, we used monodepth2 plus a velocity loss (introduced by the PackNet paper) to achieve scale-aware depth estimates. You can try one of the following commands to train it, e.g., for nuScenes.

    Docker (recommended):

    make docker-train-baseline-nuscenes NAME=train-baseline-nuscenes

    Conda:

    python train.py --config <PATH_TO_MD4ALL>/config/train_baseline_nuscenes.yaml

    Then you can train md4allDDa with a mix of images in ideal conditions and translated adverse conditions (provided on our project page). The baseline's depth prediction, which always operates on the corresponding ideal-condition image, supervises the depth model being trained via the day distillation loss. For nuScenes, for example, that should be possible using one of the following commands:

    Docker (recommended):

    make docker-train-md4allDDa-nuscenes NAME=train_md4allDDa_nuscenes

    Conda:

    python train.py --config <PATH_TO_MD4ALL>/config/train_md4allDDa_nuscenes.yaml

    If you are interested in the specific details, I recommend section 3.1 of our paper. :)

  2. In the files you shared, it seems that the md4allDDa you trained is not able to predict scale-aware depth estimates, unlike the checkpoint that we shared: the evaluation metrics without the postfix 'gt' are quite bad (close to 0 for accuracies, high for errors). Therefore, it would be interesting to check the performance of the baseline that you trained in the first step, because md4allDDa depends heavily on the performance of the baseline model. You can do this with one of the following commands, using the configuration files that I uploaded today and adapting the checkpoint path to your specific baseline model:

    Docker (recommended):

    make docker-eval-baseline-80m-nuscenes-val NAME=eval-baseline-80m-nuscenes-val

    Conda:

    python evaluation/evaluate_depth.py --config <PATH_TO_MD4ALL>/config/eval_baseline_80m_nuscenes_val.yaml

    Alternatively, to simplify the training of md4allDDa for you, I also uploaded the checkpoint baseline_nuscenes.ckpt to the drive that you can directly use for training md4allDDa on nuScenes. Please be aware that this is not the same baseline checkpoint that we reported in Table 1 of the paper as it originates from an older code base version.

  3. The dataset size should be correct. We doubled the indices that are used for training, e.g., to enforce that validation is only done every 2 epochs, since it takes a bit of time for nuScenes. This is controlled by the TRAINING.REPEAT parameter. By setting it to 1, you should see 30258 / 2 = 15129. However, it should be possible to achieve the same behavior without doubling the indices by using the check_val_every_n_epoch argument of the pl trainer and setting it to 2.

  4. This view should provide an overview of how many scenes are included in a particular split with a specific weather condition.
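
To illustrate point 3, here is a minimal pure-Python sketch of the arithmetic. It does not use the md4all code or PyTorch Lightning; the helper names `repeated_indices` and `validation_epochs` are hypothetical and only mimic the effect of TRAINING.REPEAT and of checking validation every n epochs.

```python
def repeated_indices(num_samples, repeat):
    """Concatenate the sample indices `repeat` times, mimicking a repeated dataset."""
    return [i for _ in range(repeat) for i in range(num_samples)]


def validation_epochs(num_epochs, every_n):
    """1-based epochs after which validation runs when checking every `every_n` epochs."""
    return [e for e in range(1, num_epochs + 1) if e % every_n == 0]


unique_samples = 15129  # 30258 / 2, the numbers quoted in point 3

doubled = repeated_indices(unique_samples, repeat=2)  # behaves like REPEAT = 2
single = repeated_indices(unique_samples, repeat=1)   # behaves like REPEAT = 1
print(len(doubled), len(single))

# One "doubled" epoch covers as many samples as two plain epochs, so validating
# after every doubled epoch corresponds to validating after every 2nd plain epoch.
print(validation_epochs(6, every_n=2))
```

With repeat=1 and validation every 2 epochs, the model sees the same number of samples between validation runs as with doubled indices and validation every epoch; only the reported epoch count and dataset length differ.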

I hope the answers help you. Do not hesitate to reach out again if you have further questions! Have a nice weekend! :)
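
As a side note on point 2, the sketch below shows why metrics can look very bad without scaling yet fine with it. It assumes that the 'gt' postfix denotes metrics computed after ground-truth median scaling, which is a common convention in self-supervised depth evaluation but is an assumption here. The functions and the toy depth values are hypothetical and are not taken from the md4all evaluation code.

```python
from statistics import median


def abs_rel(pred, gt):
    """Mean absolute relative error between predicted and ground-truth depths."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def delta_acc(pred, gt, thresh=1.25):
    """Fraction of points whose ratio max(pred/gt, gt/pred) is below `thresh`."""
    return sum(max(p / g, g / p) < thresh for p, g in zip(pred, gt)) / len(gt)


def median_scale(pred, gt):
    """Rescale predictions so that their median matches the ground-truth median."""
    scale = median(gt) / median(pred)
    return [p * scale for p in pred]


gt = [5.0, 10.0, 20.0, 40.0]   # toy ground-truth depths in meters
pred = [0.1 * g for g in gt]   # correct structure, but wrong metric scale

# Without scaling: large error, accuracy near 0 (not scale-aware).
print(abs_rel(pred, gt), delta_acc(pred, gt))

# With median scaling: the same prediction scores well.
scaled = median_scale(pred, gt)
print(abs_rel(scaled, gt), delta_acc(scaled, gt))
```

A model whose unscaled metrics are poor while its median-scaled metrics are good has learned depth up to an unknown scale factor, which is consistent with a distillation target (the baseline) that was itself not scale-aware.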

sgasperini commented on June 9, 2024

Hi @RuiTianHIT, thank you for your interest in our work!

Following up on your request and complementing @morbi25's extensive answer, please note that we recently shared the weights of the pre-trained baseline model (here). By using these pre-trained weights (and the translated images we offer here), you can skip the first stages of our method, which eases the training of the -DD models and facilitates the reproduction of our final results.

Please let us know if you encounter additional issues reproducing the results after following our comments.

RuiTianHIT commented on June 9, 2024

@sgasperini @morbi25 Thank you very much for your help. Our current work focuses on distilling md4all through the trained baseline.

sgasperini commented on June 9, 2024

Dear @RuiTianHIT, that is great to hear!
Thanks a lot for the update. Then I will close this issue. Feel free to reopen it if you experience other problems.
We wish you good luck with your research.
