Comments (11)

rcamino commented on August 17, 2024

Thank you for the quick answer.

Besides the convergence question, I still have a doubt about the 0.5 value.

From the paper, I understood that the hint indicates:

  • 1 -> known original value
  • 0 -> known imputed value
  • 0.5 -> unknown

And the discriminator has to decide whether a 0.5 corresponds to an original or an imputed value.

But in this implementation, the hint shows:

  • 1 -> known original value
  • 0 -> unknown

So the hint only helps with the known original values and gives no information about the missing values?
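
For readers comparing the two encodings described in this comment, here is a minimal NumPy sketch. It is an illustration only, not the repository's code: the function names `hint_paper_style` and `hint_code_style` are hypothetical, and using a Bernoulli reveal mask with a 0.9 rate for both variants is an assumption made to keep the comparison simple.

```python
import numpy as np

def hint_paper_style(mask, reveal):
    # Revealed entries show the mask itself (1 = original, 0 = imputed);
    # entries that are not revealed are set to 0.5 ("unknown").
    return reveal * mask + 0.5 * (1.0 - reveal)

def hint_code_style(mask, reveal):
    # Only revealed original entries become 1; everything else is 0,
    # so a 0 can mean "imputed" or "original but not revealed".
    return reveal * mask

rng = np.random.default_rng(0)
mask = (rng.uniform(size=(4, 5)) > 0.2).astype(float)        # 1 = observed
reveal = (rng.uniform(size=mask.shape) < 0.9).astype(float)  # assumed hint rate

h_paper = hint_paper_style(mask, reveal)
h_code = hint_code_style(mask, reveal)
```

In both variants the positions marked 1 are the same; the difference is only in how the remaining positions are labelled.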

from gain.

jsyoon0823 commented on August 17, 2024

In practice, providing 90% of the mask vector as the hint gives the best performance. (The hint is only given for the known features.)
In theory (in the paper), providing one feature as a hint converges to the optimal solution in the MCAR setting.

jsyoon0823 commented on August 17, 2024

Yes.
In this code, the hint is only provided to the known variables.
Therefore, the discriminator has to determine if the 0 is an original or an imputed value.
We don't provide the imputed variables as the hint; therefore, we don't need to introduce 0.5 here.
Thanks.

ElApseR commented on August 17, 2024

I tested two types of hint, the original paper's versus this code's,
and found that the performance of the two models was almost the same.
Although the variance of the MSE test loss with the original paper's design (using 0.5 in the hint) was a bit higher, the difference did not seem meaningful.
[Attached screenshot: 2019-01-01 7 32 32]

jsyoon0823 commented on August 17, 2024

Usually, in the missing completely at random (MCAR) setting, the hint does not have a big impact on the results.

guoliangxie123 commented on August 17, 2024

1. Is the imputed matrix equal to Hat_New_X?
2. When I print Hat_New_X, I find that some positions are still 0 and appear not to be imputed. Are they 0 in the original data?
I look forward to your reply.

jsyoon0823 commented on August 17, 2024

  1. Yes. G_sample is the output of the generator, and Hat_New_X is the matrix in which only the missing values are replaced by G_sample.
  2. Yes. Some of them have 0 as the original value.
    Thanks!
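
The relationship between G_sample and Hat_New_X described above can be sketched in NumPy as follows. This is a hedged illustration rather than the repository's code: the helper name `fill_missing` and the toy values are hypothetical, and the mask is assumed to use 1 for observed entries and 0 for missing ones.

```python
import numpy as np

def fill_missing(x, mask, g_sample):
    """Keep observed entries of x; use the generator output where mask == 0."""
    return mask * x + (1.0 - mask) * g_sample

# Toy data: x[0, 0] is an observed value that happens to be 0.
x = np.array([[0.0, 0.7, 0.0],
              [0.3, 0.0, 0.9]])
mask = np.array([[1.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0]])
g_sample = np.full_like(x, 0.5)  # stand-in for the generator output

hat_new_x = fill_missing(x, mask, g_sample)
```

Under these assumptions, a 0 that remains in `hat_new_x` sits at a position where the mask is 1, which means it was an observed 0 in the original data.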

guoliangxie123 commented on August 17, 2024

Thank you for the quick answer.

  1. Does the letter data in your code contain missing values that have been filled with 0?
  2. I can't compare the imputed data with the original dataset because there is no raw dataset.
    I recently wrote a paper that cites yours and uses it to impute data, but the results are not ideal.

jsyoon0823 commented on August 17, 2024

  1. No. The letter data is complete data.
  • I introduce the missing values in lines 51-59 and 210.
  • Please check those lines.
  2. The original raw data is always available, so you can compare against it.
  • Please see lines 233 and 186.
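
As a rough illustration of the two points above (removing values from complete data, then scoring against the retained original), here is a minimal NumPy sketch. It does not reproduce the referenced lines of the script: `introduce_mcar`, `rmse_on_missing`, the 0.2 missing rate, and the column-mean baseline are all assumptions chosen for the example.

```python
import numpy as np

def introduce_mcar(data, p_miss, rng):
    """Drop entries completely at random; returns (mask, data with zeros at missing)."""
    mask = (rng.uniform(size=data.shape) > p_miss).astype(float)  # 1 = kept
    return mask, mask * data

def rmse_on_missing(data, imputed, mask):
    """RMSE computed only over the entries that were removed."""
    missing = 1.0 - mask
    n_missing = missing.sum()
    if n_missing == 0:
        return 0.0
    return float(np.sqrt(np.sum(missing * (data - imputed) ** 2) / n_missing))

rng = np.random.default_rng(0)
complete = rng.uniform(size=(100, 4))  # stand-in for a complete dataset
mask, observed = introduce_mcar(complete, p_miss=0.2, rng=rng)

# Simple baseline: fill each missing entry with its column mean over observed values.
col_means = (mask * complete).sum(axis=0) / mask.sum(axis=0)
mean_imputed = mask * complete + (1.0 - mask) * col_means
score = rmse_on_missing(complete, mean_imputed, mask)
```

Because the complete matrix is kept alongside the mask, any imputed matrix can be scored the same way by passing it in place of `mean_imputed`.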

ainilaha commented on August 17, 2024

In the paper, Figure 1 shows three matrices being fed in: the data matrix, the random matrix, and the mask matrix. However, I do not see the random matrix being fed to the generator. What is the random matrix?

jsyoon0823 commented on August 17, 2024

You can see how we use the random matrix at this link (https://github.com/jsyoon0823/GAIN/blob/master/gain.py#L168-L169)
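
One common way to use a random matrix in this kind of model is to place the noise in the missing positions of the data matrix before it reaches the generator, so it does not appear as a separate input. The sketch below shows that idea in NumPy under stated assumptions; it is not a copy of the linked lines. The helper names `sample_noise` and `generator_input`, and the noise range of 0 to 0.01, are hypothetical.

```python
import numpy as np

def sample_noise(rows, cols, rng, high=0.01):
    # Low-magnitude uniform noise; the upper bound is an illustrative assumption.
    return rng.uniform(0.0, high, size=(rows, cols))

def generator_input(x, mask, noise):
    # Observed entries keep their values; missing entries are filled with noise.
    filled = mask * x + (1.0 - mask) * noise
    # The mask is concatenated so the generator knows which entries were filled.
    return np.concatenate([filled, mask], axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(size=(3, 4))
mask = (rng.uniform(size=x.shape) > 0.2).astype(float)  # 1 = observed
z = sample_noise(3, 4, rng)
g_in = generator_input(x, mask, z)
```

With this arrangement the generator receives a single array twice as wide as the data, which may explain why no separate random input is visible.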
