dualstudent's Issues

Questions about initialization

Hi and thanks for your interesting work!

I went through your code and I am wondering how the two students are initialized. Do you set a different seed for each one?

Thanks in advance,

best regards,
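
For readers with the same question, here is a minimal sketch of seed-based initialization. It is an illustration under stated assumptions, not the repository's confirmed behavior: `init_weights` and the seed values are hypothetical names chosen for this example, and the "student" is just a flat list of weights. In PyTorch, two models constructed one after another already draw different random weights from the global generator unless the seed is reset between the two constructions.

```python
import random


def init_weights(n_weights, seed, scale=0.1):
    """Return a list of Gaussian-initialized weights (hypothetical helper)."""
    rng = random.Random(seed)  # independent generator, so each student is reproducible
    return [rng.gauss(0.0, scale) for _ in range(n_weights)]


# Two students with the same architecture (same length) but different seeds.
student_a = init_weights(8, seed=1)
student_b = init_weights(8, seed=2)

print(student_a != student_b)  # the two initializations differ
```

Using one generator per student keeps each initialization reproducible on its own, which helps when comparing runs.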

About the weight of the consistency losses

Hi!
I like your idea and I want to use your code as a baseline to study SSL.
However, I have a question about the scale of the consistency losses.
You set the weights to 1:10:100.
(I also found that Mean Teacher, which is your baseline, sets the consistency weight to 100.)
I think these values are unusual, since people typically weight the losses equally.

In my opinion, a small weight on the supervised loss can help prevent overfitting, but a ratio of 1:100 seems excessive.

I'd like to ask why you chose these values.

Thanks !
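
To make the 1:10:100 ratio concrete, the following sketch combines three loss terms with those weights and a sigmoid-shaped ramp-up of the form exp(-5(1 - t)^2), which is commonly used in consistency-based SSL. The term names (`sup`, `cons`, `stab`), the function names, and the choice to ramp up only the two unsupervised terms are assumptions made for this example; the repository may combine its losses differently.

```python
import math


def sigmoid_rampup(step, rampup_length):
    """Ramp from exp(-5) (about 0.0067) up to 1.0 over `rampup_length` steps."""
    if rampup_length == 0:
        return 1.0
    t = min(max(step / rampup_length, 0.0), 1.0)
    return math.exp(-5.0 * (1.0 - t) ** 2)


def total_loss(sup, cons, stab, step, rampup_length, weights=(1.0, 10.0, 100.0)):
    """Weighted sum of three loss terms (hypothetical names, assumed 1:10:100)."""
    w_sup, w_cons, w_stab = weights
    ramp = sigmoid_rampup(step, rampup_length)
    # Only the unsupervised terms are ramped up in this sketch.
    return w_sup * sup + ramp * (w_cons * cons + w_stab * stab)


# Early in training the unsupervised terms contribute very little.
print(total_loss(1.0, 0.1, 0.01, step=0, rampup_length=10))
# After the ramp-up they are weighted in full.
print(total_loss(1.0, 0.1, 0.01, step=10, rampup_length=10))
```

One thing the sketch makes visible: a large weight does not by itself mean a large contribution, because the weighted term also depends on the raw magnitude of the loss it multiplies.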

Question about the paper

I would like to know how to understand this sentence:
"the outputs of these two models may vary widely, and applying the consistency constraint directly will cause them to collapse into each other by exchanging the wrong knowledge"

What does "cause them to collapse into each other" mean here?

Does it mean that imposing consistency on the outputs of the two networks will cause them to converge to the same model? If so, why doesn't EMA have this problem?
I don't understand it very well.
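
A toy illustration of what "converging to the same point" can look like is sketched below. It is not the paper's setting or proof: each "model" is reduced to a single scalar, and `couple` and `ema_track` are hypothetical helpers. The first applies gradient descent on a squared consistency loss between two scalars; the second applies an exponential moving average toward a fixed student value.

```python
def couple(a, b, lr=0.1, steps=50):
    """Gradient descent on (a - b)^2 applied to both scalars at once."""
    for _ in range(steps):
        grad_a = 2.0 * (a - b)
        grad_b = 2.0 * (b - a)
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


def ema_track(teacher, student, alpha=0.9, steps=200):
    """EMA update of a teacher scalar toward a fixed student scalar."""
    for _ in range(steps):
        teacher = alpha * teacher + (1.0 - alpha) * student
    return teacher


a, b = couple(0.0, 1.0)
print(a, b)                 # both end up near the midpoint of the start values
print(ema_track(0.0, 1.0))  # the teacher ends up near the student
```

In this toy, the direct constraint pulls both scalars to their midpoint no matter which one started closer to a correct answer, while the EMA teacher simply follows the student.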

Questions about the paper

Hello, your paper is exciting and insightful, but after reading it I still have some unresolved questions.
I would appreciate it if you could spend some time answering the following:

  1. You proved that the teacher and the student eventually converge to the same point, but why is this convergence bad, and why can't they converge to a good solution?

  2. Secondly, I wonder why the idea of "stable samples" works. Sometimes I think we don't care much about them: in active learning, for example, people tend to care about samples near the decision boundary, which are "unstable samples", because they carry more information than stable ones. With all due respect, your paper speculates that "applying the consistency constraint directly will cause them to collapse into each other by exchanging the wrong knowledge", but this is not supported by any evidence. I also wonder why applying the consistency constraint only on the proposed "stable samples" would help solve the collapsing problem you mention. Do you have any further derivation for that?

  3. Finally, the ablation study is rather brief, and I'm curious whether you have numerical results for training without stable samples.

Thanks for your time again! In general I think the overall idea is very insightful!
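
For readers following the second question, one plausible reading of a "stable sample" can be sketched as a simple predicate. This is an assumption made for illustration, not a definition confirmed in this thread: the function name `is_stable`, the `threshold` parameter, and its default value are hypothetical. Under this reading, a sample counts as stable for one model when the clean and perturbed inputs get the same predicted label and at least one of the two predictions is confident.

```python
def argmax(probs):
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def is_stable(p_clean, p_noisy, threshold=0.8):
    """Assumed stability test on class probabilities for x and a perturbed x."""
    same_label = argmax(p_clean) == argmax(p_noisy)
    confident = max(max(p_clean), max(p_noisy)) > threshold
    return same_label and confident


print(is_stable([0.90, 0.10], [0.85, 0.15]))  # same label, confident
print(is_stable([0.55, 0.45], [0.60, 0.40]))  # same label, but near the boundary
print(is_stable([0.90, 0.10], [0.40, 0.60]))  # label flips under perturbation
```

Under this reading, the samples that active learning would query (low confidence or flipping labels) are exactly the ones the predicate rejects.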

Question about the Dual Student model

Hi and thanks for your interesting work!

I went through your code and I am wondering why the CNN13 model has two outputs.

Thanks in advance,

best regards,

M

Loss value becomes NaN!

Thanks for your marvelous work!
I am trying to use your method as my baseline, but I find that if I set the number of epochs larger than 300 (the value originally set in your script), the loss becomes NaN after epoch 200.
I can't figure out what's wrong. Do you have any idea about that?
Thx!
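
A first debugging step for a problem like this is to find out which loss term goes non-finite first. The sketch below is a generic check, not a diagnosis of this repository: `find_non_finite` and the term names are hypothetical, and it assumes the individual loss values have already been converted to Python floats.

```python
import math


def find_non_finite(terms):
    """Return the names of loss terms whose value is NaN or infinite."""
    return [name for name, value in terms.items() if not math.isfinite(value)]


# Example values for one training step (made up for illustration).
step_terms = {"sup": 0.31, "cons": float("nan"), "stab": float("inf")}
bad = find_non_finite(step_terms)
if bad:
    print("non-finite loss terms:", bad)
```

Logging this per step, together with the current learning rate, narrows down whether the problem starts in one loss term or in the schedule.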
