iffsid / mmvae

Multimodal Mixture-of-Experts VAE

License: GNU General Public License v3.0

Emacs Lisp 0.06% Python 99.94%

mmvae's Issues

umap library version and embed_umap()

Thanks for sharing your model implementation. I installed all dependencies as specified in the README, but something is wrong. The function embed_umap() passes an argument called transform_seed, which seems to have been introduced in umap-learn==0.3.0 and is not available in umap-learn==0.1.1. Installing umap-learn==0.3.0 does not help, because it triggers an error caused by a joblib dependency in sklearn. So sklearn must be pinned to an older version, but I am not sure which one. I managed to run python main.py --model mnist_svhn by replacing UMAP with TSNE in the function embed_umap().

Another issue is that the code seems to run very slowly, even though it runs on a GPU. Any idea what might be happening? Could it be the function embed_umap(), which is triggered at each epoch?
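One way to sidestep the version mismatch described above, offered only as a sketch, is to pass transform_seed just when the installed constructor declares it. The helper name supported_kwargs and the two reducer classes below are hypothetical stand-ins used for illustration; they are not the umap-learn API or the repository's code.

```python
import inspect


def supported_kwargs(factory, **kwargs):
    """Keep only the keyword arguments that `factory` declares in its signature."""
    params = inspect.signature(factory).parameters
    # If the callable accepts **kwargs we cannot tell what it supports, so pass everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-ins for an older and a newer reducer class (not the real umap API).
class OldReducer:
    def __init__(self, n_neighbors=15, metric="euclidean"):
        self.n_neighbors, self.metric = n_neighbors, metric


class NewReducer(OldReducer):
    def __init__(self, n_neighbors=15, metric="euclidean", transform_seed=42):
        super().__init__(n_neighbors, metric)
        self.transform_seed = transform_seed


requested = dict(n_neighbors=5, metric="euclidean", transform_seed=2)
old = OldReducer(**supported_kwargs(OldReducer, **requested))  # transform_seed is dropped
new = NewReducer(**supported_kwargs(NewReducer, **requested))  # transform_seed is kept
```

With the real library, the same helper would be called on the imported UMAP class, assuming its constructor is a regular Python __init__ that inspect can read.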

detaching qz_x params in dreg objective

Hi,

In the dreg function, why do you detach the encoder params using the following line of code?

qz_x = model.qz_x(*[p.detach() for p in model.qz_x_params]) # stop-grad for \phi

Does it mean that the encoder parameters are not updated during training?

Thanks,
VR
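A note on the stop-gradient asked about above, assuming the dreg function follows the doubly reparameterised gradient (DReG) estimator of Tucker et al. (2018), which its name suggests but which this thread does not confirm. With reparameterised samples $z_k = z(\epsilon_k, \phi)$ and importance weights $w_k = p(x, z_k) / q_\phi(z_k \mid x)$, that estimator's encoder gradient is:

```latex
\nabla_\phi \, \mathbb{E}_{\epsilon_{1:K}}\!\left[ \log \frac{1}{K} \sum_{k=1}^{K} w_k \right]
  = \mathbb{E}_{\epsilon_{1:K}}\!\left[ \sum_{k=1}^{K}
      \left( \frac{w_k}{\sum_{j=1}^{K} w_j} \right)^{2}
      \frac{\partial \log w_k}{\partial z_k} \,
      \frac{\partial z_k}{\partial \phi} \right]
```

Here $\phi$ enters only through $\partial z_k / \partial \phi$, that is, through the samples. The direct dependence of $\log q_\phi(z_k \mid x)$ on $\phi$ is held fixed, which is what detaching the distribution parameters would achieve. Under this reading the encoder is still updated, as long as the samples themselves are drawn with reparameterisation.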

Query regarding Mixture-of-Experts

Hello, thanks for sharing the code.

This is not an issue, but rather a question about where the Mixture-of-Experts operation takes place in the mmvae.py code. I thought it would be used in forward(), but I cannot clearly identify it in these lines:

    def forward(self, x, K=1):
        qz_xs, zss = [], []
        # initialise cross-modal matrix
        px_zs = [[None for _ in range(len(self.vaes))] for _ in range(len(self.vaes))]
        for m, vae in enumerate(self.vaes):
            qz_x, px_z, zs = vae(x[m], K=K)
            qz_xs.append(qz_x)
            zss.append(zs)
            px_zs[m][m] = px_z  # fill-in diagonal
        for e, zs in enumerate(zss):
            for d, vae in enumerate(self.vaes):
                if e != d:  # fill-in off-diagonal
                    px_zs[e][d] = vae.px_z(*vae.dec(zs))
        return qz_xs, px_zs, zss

It seems like the qz_xs are never mixed, and the px_zs are computed only from the posterior distributions of single modalities. Don't you need a mixture of experts here to combine the posterior distributions qz_xs?

Thanks,
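For context on the question above: a mixture-of-experts posterior does not have to be fused inside forward(). One common pattern is to sample from each expert separately (stratified sampling) and to combine the experts only when the density of each sample is evaluated in the objective. The plain-Python sketch below illustrates that idea with 1-D Gaussians. All names and numbers are hypothetical, and whether the repository does exactly this is for the authors to confirm.

```python
import math


def normal_log_prob(z, mean, std):
    """Log-density of a 1-D Gaussian N(mean, std^2) at z."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((z - mean) / std) ** 2


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def moe_log_prob(z, experts):
    """Log-density of an equally weighted mixture of (mean, std) experts at z."""
    return log_mean_exp([normal_log_prob(z, mean, std) for mean, std in experts])


# Hypothetical unimodal posteriors, one (mean, std) pair per modality.
experts = [(-1.0, 0.5), (2.0, 1.0)]

# Stratified sampling: one sample per expert (the means stand in for random draws),
# each scored under the full mixture rather than under its own expert alone.
samples = [mean for mean, _ in experts]
mixture_scores = [moe_log_prob(z, experts) for z in samples]
```

In this pattern the per-modality distributions stay separate in the forward pass, and the mixing appears only in the log-density term of the loss.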

Reproduce results in Table 2 and Table 4

Hi! Thanks for sharing this great project!
I trained the model with your suggested settings and also evaluated the trained model you provide, but in neither case can I reproduce the results reported in Table 2 and Table 4, especially the joint coherence.
Can you give any hints?
Thanks.

Query regarding scale of individual modality likelihood

Hi,

Thanks for sharing the code.

This is not an issue, rather a query.
I was wondering how you came up with the scale of the individual modality likelihood (e.g., 0.75 for MNIST). Also, how should I choose that scale for a new dataset?

Thanks,

Where is cub.vocab?

Hi, I can't find the cub.vocab file for the Caltech-UCSD Birds (CUB) dataset. Could you upload the file? Thanks!

Where is VAE_cubISft?

python main.py --model cubISft

# load model

modelC = getattr(models, 'VAE_{}'.format(args.model))
model = modelC(args).to(device)
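The getattr lookup quoted above raises an AttributeError when no class named VAE_<model> exists. A minimal sketch of a more informative lookup follows. The function resolve_model_class and the demo classes are hypothetical, and the SimpleNamespace only stands in for the real models package, whose actual class names may differ.

```python
from types import SimpleNamespace


def resolve_model_class(namespace, name, prefix="VAE_"):
    """Return namespace.<prefix><name>, or raise an error listing the names that exist."""
    attr = prefix + name
    if hasattr(namespace, attr):
        return getattr(namespace, attr)
    available = sorted(a[len(prefix):] for a in vars(namespace) if a.startswith(prefix))
    raise ValueError(f"unknown model {name!r}; available: {available}")


# Hypothetical stand-in for the `models` package; the real class names may differ.
class VAE_demo_a: ...
class VAE_demo_b: ...
models = SimpleNamespace(VAE_demo_a=VAE_demo_a, VAE_demo_b=VAE_demo_b)

model_class = resolve_model_class(models, "demo_a")
```

Calling resolve_model_class(models, "cubISft") on this stand-in raises a ValueError that lists demo_a and demo_b, which makes it quicker to see which --model values are actually defined.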

Number of IWAE samples used for training

Thanks for sharing the code. Can you please tell me how many IWAE samples you used for training the model?

Also, did you try training the model with the vanilla IWAE estimator? If so, could you please share the likelihoods you got from it and from the DReG estimator?

Thanks a lot.
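For readers unfamiliar with the quantity being asked about: the IWAE bound for one data point is the log of the average of K importance weights. The sketch below computes it from log-weights in plain Python. The numbers are made up for illustration and say nothing about the settings or results of this repository.

```python
import math


def iwae_bound(log_weights):
    """log(1/K * sum_k exp(log w_k)), computed stably, for one data point."""
    top = max(log_weights)
    total = sum(math.exp(lw - top) for lw in log_weights)
    return top + math.log(total) - math.log(len(log_weights))


# Hypothetical log importance weights, log w_k = log p(x, z_k) - log q(z_k | x).
log_w = [-105.2, -101.7, -103.9, -110.4]

elbo_estimate = sum(log_w) / len(log_w)  # average of single-sample (K = 1) bounds
iwae_estimate = iwae_bound(log_w)        # one K = 4 bound from the same samples
```

By Jensen's inequality the log of the average weight is never below the average of the log-weights on the same samples, which is why the number of samples K matters when comparing reported likelihoods.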

error unzipping pre-trained models

Hi! Thanks for sharing this amazing project.
I'm having a problem unzipping the pre-trained models you uploaded.
Could you check it?
Thanks.

Sorry, problem solved. Thanks.

Caltech-UCSD Birds (CUB) dataset

Dear authors,

I'm following your work and found that the link to the cleaned-up version of the Caltech-UCSD Birds (CUB) dataset has expired. Could you please share a new link to that dataset?

Thank you!

"ft"

What is the difference between "mmvae_cub_images_sentences_ft" and "mmvae_cub_images_sentences"?
