
Comments (4)

YugeTen commented on August 17, 2024

Hi, thank you for the interest in our work!

Could you possibly share the args you ran the experiments on and the results you are getting? Thanks!


Faye3321 commented on August 17, 2024

For training, I used your suggested settings. With your provided trained model, the joint coherence on CUB I get is around 0.1 (vs. 0.263 in Table 4).


YugeTen commented on August 17, 2024

Updates

Thanks for bringing the issues to our attention. We've now tracked down the reason for the discrepancies between the released code and the reported results: in the effort to clean up and publish our code, a couple of minor things were missed.

  1. The (fixed) scales for the individual-modality likelihoods were transferred incorrectly: they should have been 0.75 instead of 0.1;
  2. We didn't include the trained FastText embeddings used to compute the CCA and cross/joint coherence scores.

We have fixed the above in the most recent commit. Along with the code update, we have also uploaded new pretrained models for both MNIST-SVHN and CUB that reproduce results similar to those reported in Table 2 and Table 4 of our paper; see the README for more details.
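To make fix (1) concrete, here is a minimal sketch, not the repository's code, of how a fixed likelihood scale enters the reconstruction log-likelihood, assuming a Laplace likelihood for the image modality; the constant and function names below are illustrative only.

```python
import torch
from torch.distributions import Laplace

# Illustrative constant only: the fixed likelihood scale weights the
# reconstruction term in the ELBO; the fix changes it from 0.1 to 0.75.
LIKELIHOOD_SCALE = 0.75

def reconstruction_log_prob(recon_loc: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Log-likelihood of `target` under a Laplace with a fixed scale (sketch)."""
    px_z = Laplace(loc=recon_loc, scale=torch.full_like(recon_loc, LIKELIHOOD_SCALE))
    # Sum over channel/height/width, keep the batch dimension.
    return px_z.log_prob(target).sum(dim=(-3, -2, -1))
```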

We do apologise for the confusion this inconsistency caused, and thank you again for bringing this forward.

Explanation for CUB discrepancy

To expand a bit on (2): to measure cross/joint coherence on CUB, we use off-the-shelf ResNets for the cub-image data, but train FastText embeddings for the cub-sentence data, since its vocabulary is quite different from what FastText is typically trained on.

We then use these features (ResNet, trained FastText) to fit CCA on the ground-truth image and sentence training data, and use the learnt projections to compute the correlation for samples generated by our model.
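As a rough sketch of that two-step procedure (illustrative only, not the analyse_cub.py implementation; the feature dimensions, component count, and random stand-in features are all assumptions), using scikit-learn's CCA:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Hypothetical stand-ins for the pre-extracted features: ResNet features for
# ground-truth images and FastText embeddings for ground-truth sentences.
img_feats = np.random.randn(500, 512)
txt_feats = np.random.randn(500, 300)

# Step 1: fit CCA on ground-truth pairs to learn the two projections.
cca = CCA(n_components=10)  # number of components is an assumption
cca.fit(img_feats, txt_feats)

# Step 2: project features of generated image/sentence pairs with the learnt
# projections and average the per-component correlations.
gen_img_feats = np.random.randn(200, 512)
gen_txt_feats = np.random.randn(200, 300)
u, v = cca.transform(gen_img_feats, gen_txt_feats)
corrs = [np.corrcoef(u[:, k], v[:, k])[0, 1] for k in range(u.shape[1])]
print(f"mean canonical correlation of generated pairs: {np.mean(corrs):.3f}")
```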

The learnt embeddings, however, can vary quite a bit due to the limited dataset size. The embeddings used to report results in the paper were not saved with our models, so re-computing them as part of the analyses can yield different numeric values, including for the baseline. Note that the relative performance of our model against the baseline remains the same; only the absolute numbers can differ.

Reproducing CUB results (Table 4)

We have done a quick search for FastText embeddings that reproduce the baseline results reported in the paper, and re-computed the CCA and cross/joint coherence scores for our model with them. To obtain results similar to those reported in the paper, download the zip file here and do the following:

  1. Move cub.all, cub.emb, cub.pc under data/cub/oc:3_sl:32_s:300_w:3/;
  2. Move the rest of the files, i.e. emb_mean.pt, emb_proj.pt, images_mean.pt, im_proj.pt to path/to/trained/model/folder/;
  3. Set the RESET variable in src/report/analyse_cub.py to False (line 21).

With these fixes, the results from the code match those in the paper (with even improved cross-coherence scores on CUB).
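If it helps to sanity-check step 2 above, here is a minimal sketch of loading the provided files, assuming they are plain tensors saved with torch.save; this is illustrative and not the repository's analysis code.

```python
import torch

# Illustrative check only: load the provided statistics/projections from the
# trained-model folder (path placeholder as in the instructions above).
model_dir = "path/to/trained/model/folder"

projections = {name: torch.load(f"{model_dir}/{name}.pt")
               for name in ("emb_mean", "emb_proj", "images_mean", "im_proj")}

# With RESET = False in src/report/analyse_cub.py, the analysis is expected to
# reuse these saved quantities instead of re-fitting them from scratch.
for name, tensor in projections.items():
    print(name, tuple(tensor.shape))
```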


Faye3321 commented on August 17, 2024

Thanks for your clarifications.

