zhkkke / dualstudent

Code for the paper "Dual Student: Breaking the Limits of the Teacher in Semi-Supervised Learning" [ICCV 2019]

Python 100.00%

domain-adaptation iccv2019 pytorch-implementation semi-supervised-learning

dualstudent's Issues

Questions about initialization

Hi and thanks for your interesting work!

I went through your code and I am wondering how the two students should be initialized. Should they use different seeds?

Thanks in advance,

best regards,
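
For illustration only, here is a minimal sketch of one way to give two students the same architecture but different starting weights: each gets its own seeded random generator. The helper `init_weights`, the seeds, and the standard deviation are hypothetical and are not taken from this repository. In PyTorch, constructing the same model class twice normally also yields different weights, because the global random generator advances between the two constructions.

```python
import random


def init_weights(n_weights, seed):
    """Draw a small random weight vector from its own seeded generator."""
    rng = random.Random(seed)  # independent generator per student
    return [rng.gauss(0.0, 0.01) for _ in range(n_weights)]


# Hypothetical seeds; any two distinct values would do.
left_student = init_weights(5, seed=1)
right_student = init_weights(5, seed=2)

# Same shape (same architecture), different starting points.
print(len(left_student) == len(right_student))
print(left_student != right_student)
```

Using explicit per-student seeds keeps each initialization reproducible on its own, which can help when comparing runs.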

About the weight of consistency losses

Hi!
I like your idea and want to use your code as a baseline for studying SSL.
However, I have a question about the scale of the consistency losses.
You set the weights to 1:10:100.
(I also found that Mean Teacher, which is your baseline, sets the consistency weight to 100.)
These values seem unusual to me, since people usually weight the losses equally.

In my opinion, a small weight on the supervised loss can help prevent overfitting, but a ratio of 1:100 seems excessive.

I'd like to ask why you chose these values.

Thanks!
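
To make the arithmetic behind the question concrete, here is a rough sketch of a weighted loss sum. It assumes the 1:10:100 ratio refers to supervised : consistency : stabilization terms and that the unsupervised terms are scaled by a sigmoid ramp-up of the kind used in Mean Teacher-style training. The function names, the ramp-up length, and the example loss magnitudes are all hypothetical.

```python
import math


def sigmoid_rampup(epoch, rampup_epochs):
    """Ramp from exp(-5) up to 1.0 following exp(-5 * (1 - t)^2)."""
    if rampup_epochs == 0:
        return 1.0
    t = min(max(epoch / rampup_epochs, 0.0), 1.0)
    return math.exp(-5.0 * (1.0 - t) ** 2)


def total_loss(sup, cons, stab, epoch, rampup_epochs=5,
               w_sup=1.0, w_cons=10.0, w_stab=100.0):
    """Weighted sum; the unsupervised terms are scaled by the ramp-up."""
    ramp = sigmoid_rampup(epoch, rampup_epochs)
    return w_sup * sup + ramp * (w_cons * cons + w_stab * stab)


# Hypothetical magnitudes: squared-error terms on probabilities are often
# much smaller than a cross-entropy term.
for epoch in (0, 2, 5):
    print(epoch, total_loss(sup=1.0, cons=0.01, stab=0.001, epoch=epoch))
```

With magnitudes like these, a nominal weight of 100 does not mean the term dominates the gradient; the effective contribution depends on the raw scale of each loss and on the ramp-up factor.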

Question about the paper

How should I understand the following sentence?
" the outputs of these two models may vary widely, and applying the consistency constraint directly will cause them to collapse into each other by exchanging the wrong knowledge"

What does "cause them to collapse into each other" mean here?

Does it mean that imposing consistency on the outputs of the two networks will cause them to converge to the same model? If so, why doesn't EMA have this problem?
I don't understand this very well.
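
As background for the EMA part of the question, the exponential moving average update used in Mean Teacher-style methods can be sketched as follows. This is a toy version on plain lists, with a hypothetical decay of 0.99 and a student that is held fixed; it is not code from this repository.

```python
def ema_update(teacher, student, alpha=0.99):
    """Move teacher weights toward the student: alpha * t + (1 - alpha) * s."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


teacher = [0.0, 0.0]
student = [1.0, -2.0]  # held fixed here, as if the student had stopped changing

for _ in range(1000):
    teacher = ema_update(teacher, student)

# Largest remaining difference between teacher and student weights.
gap = max(abs(t - s) for t, s in zip(teacher, student))
print(gap)
```

In this toy setting the gap shrinks by a factor of `alpha` per step, so the teacher ends up very close to the student. That is the kind of coupling another issue on this page refers to when it says the teacher and the student converge to the same point.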

How to read the training.msgpack file after training?

Questions about the paper

Hello, your paper is exciting and insightful, but after reading it I still have some unresolved questions.
I would appreciate it if you could spend some time answering the following:

First, you proved that the teacher and the student eventually converge to the same point, but why is this convergence bad, and why can't they converge to a reasonably good solution?
Secondly, I wonder why the idea of "stable samples" works. Sometimes we do not care much about such samples: in active learning, for example, people tend to focus on samples near the decision boundary, which are "unstable samples", because they carry more information than stable ones. With all due respect, the paper speculates that "applying the consistency constraint directly will cause them to collapse into each other by exchanging the wrong knowledge", which is not supported by any evidence, and I wonder how the proposed "stable points", on which the consistency constraint is applied, help solve the collapse problem you mention. Do you have any further derivation for that?
Finally, the ablation study is rather brief, and I'm curious whether you have numerical results for training without stable samples.

Thanks again for your time! Overall I think the idea is very insightful!
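
For readers unfamiliar with the term, here is a rough sketch of one reading of the "stable sample" idea: a sample counts as stable for a model if the predicted label does not change under a small perturbation and the prediction is confident. The function name, the 0.8 threshold, and the exact form of the confidence test are assumptions made for illustration and should be checked against the paper and the repository.

```python
def is_stable(probs_clean, probs_noisy, threshold=0.8):
    """Return True if both views agree on the label and one is confident.

    probs_clean / probs_noisy: class probabilities for the original and
    the perturbed input (hypothetical inputs for this sketch).
    """
    pred_clean = max(range(len(probs_clean)), key=probs_clean.__getitem__)
    pred_noisy = max(range(len(probs_noisy)), key=probs_noisy.__getitem__)
    same_label = pred_clean == pred_noisy
    confident = max(probs_clean) > threshold or max(probs_noisy) > threshold
    return same_label and confident


print(is_stable([0.9, 0.1], [0.85, 0.15]))   # agrees and confident
print(is_stable([0.9, 0.1], [0.4, 0.6]))     # label flips under noise
print(is_stable([0.55, 0.45], [0.6, 0.4]))   # agrees but near the boundary
```

Under this reading, samples near the decision boundary are exactly the ones excluded from the constraint, which is the contrast with active learning that the question raises.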

Question about the Dual Student model

Hi and thanks for your interesting work!

I went through your code and I am wondering why the CNN13 model has two outputs.

Thanks in advance,

best regards,

Loss value equals NaN!

Thanks for your marvelous work!
I am trying to use your method as my baseline, but I find that if I set the number of epochs larger than 300 (the value set in your original script), the loss becomes NaN after epoch 200.
I can't figure out what's wrong. Do you have any idea?
Thanks!
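
The cause is not established here, but a small guard can help locate the first step at which the loss stops being finite. The sketch below is generic and hypothetical (the name `check_finite` is not from this repository); in a PyTorch loop it would be called on `loss.item()`.

```python
import math


def check_finite(loss_value, epoch, step):
    """Return the loss unchanged, or raise as soon as it is NaN or infinite."""
    if not math.isfinite(loss_value):
        raise FloatingPointError(
            f"non-finite loss {loss_value} at epoch {epoch}, step {step}"
        )
    return loss_value


# Usage with hypothetical values.
print(check_finite(0.42, epoch=1, step=10))
try:
    check_finite(float("nan"), epoch=200, step=3)
except FloatingPointError as err:
    print("caught:", err)
```

Once the failing step is known, it is easier to inspect the learning rate and the individual loss terms at that point. As an unverified guess, any schedule length that is tied to the epoch count is worth checking when the number of epochs is changed.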

Hi, I'm trying to run the code in Colab with PyTorch 1.9 and CUDA 10.2

Hi, I'm trying to run this code in Colab with PyTorch 1.9 and CUDA 10.2, but I need your help fine-tuning the Mean Teacher part of the code.

Do l_input and r_input belong to the same image but augmented differently?

Hi, I was looking through the code but couldn't figure out whether l_input and r_input are the same image augmented differently (with different noise applied). Can you clarify? Thanks in advance!
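
For context, a common pattern in Mean Teacher-style code is a wrapper that applies the same random transform twice to one sample, producing two differently perturbed views. The sketch below shows that pattern on a toy list of pixel values. Whether this repository builds `l_input` and `r_input` this way is an assumption to verify in its data loading code; `add_noise` and the noise level are hypothetical.

```python
import random


class TransformTwice:
    """Apply one random transform twice to the same sample."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, sample):
        # Two independent calls give two different random perturbations.
        return self.transform(sample), self.transform(sample)


rng = random.Random(0)  # seeded only to make the sketch reproducible


def add_noise(pixels):
    """Toy augmentation: add Gaussian noise to every pixel value."""
    return [p + rng.gauss(0.0, 0.1) for p in pixels]


image = [0.5, 0.5, 0.5]
l_input, r_input = TransformTwice(add_noise)(image)
print(l_input != r_input)
```

Both views come from the same underlying image and differ only in the random noise drawn for each call.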
