
Comments (6)

pmorerio avatar pmorerio commented on June 2, 2024

Hi,

It doesn't work with batch size 0; why is there no workaround? Is that part of the method?

I assume you mean batch_size=1, there is no such thing as batch_size=0.
Anyway, since the covariance is calculated batch-wise, there cannot be a covariance matrix for a single element. In fact, the bigger the batch, the better, since it better approximates the statistics of the source and target domains. Also, batches which are significantly smaller than the dimension of the latent space may lead to numerical issues.
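To illustrate the point about batch-wise covariance, here is a minimal NumPy sketch (not the repository's TensorFlow code; the centering step and names are my own). With batch_size=1 the 1/(batch_size - 1) factor divides by zero, so a per-sample covariance simply does not exist, and a batch much smaller than the latent dimension yields a rank-deficient (singular) estimate:

```python
import numpy as np

def batch_covariance(h):
    """Batch-wise covariance of a (batch_size, hidden_size) feature matrix."""
    batch_size = h.shape[0]
    assert batch_size > 1, "covariance is undefined for a single sample"
    h_centered = h - h.mean(axis=0, keepdims=True)
    return (1.0 / (batch_size - 1)) * h_centered.T @ h_centered

rng = np.random.default_rng(0)
cov_big = batch_covariance(rng.standard_normal((256, 64)))   # well-conditioned
cov_small = batch_covariance(rng.standard_normal((8, 64)))   # rank <= 7: singular
```

The larger the batch relative to hidden_size, the closer this estimate gets to the true domain statistics, which is the empirical recommendation given below.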

Is the cov result supposed to be complex?

The covariance matrix is symmetric and real, thus guaranteed to have real eigenvalues by the spectral theorem (I hope I am interpreting your question correctly).
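A quick numerical check of that spectral-theorem point (a NumPy sketch of my own, not repo code): symmetrizing any real matrix yields real eigenvalues, which is why `np.linalg.eigvalsh` returns a plain float array.

```python
import numpy as np

# Symmetrize a random real matrix; its eigenvalues must be real.
a = np.random.default_rng(1).standard_normal((5, 5))
sym = (a + a.T) / 2
eigs = np.linalg.eigvalsh(sym)  # real-valued by the spectral theorem
```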

Did you try yet to apply this method to latent spaces? Or are you aware of anyone doing it?

The method is indeed applied to the latent space spanned by the penultimate feature layer of the considered CNN.

I hope my answer can help, glad to clarify further if you need it.
Best,
P.

from minimal-entropy-correlation-alignment.

bjajoh avatar bjajoh commented on June 2, 2024

Thanks for the quick reply @pmorerio !

My bad, I was counting like a computer ;) of course I mean batch_size=1.
Just wondering, because (1. / (batch_size - 1)) results in a division by zero for batch size 1.
Is there any recommended minimum batch size?

Sorry, I was referring to the log_cov which gets complex in my case.

Can you point me in the right direction as to who has used it on latent spaces?
I saw it is used by many, but only saw it on output logits.

Thanks for your help!
Bjarne


pmorerio avatar pmorerio commented on June 2, 2024

Hi @bjajoh,

Is there any recommended minimum batch size?

Empirically, I would recommend having it at least equal to the size of the latent space (hidden_size). In the example provided, hidden_size=64 and batch_size=256, i.e., four times that.

Sorry, I was referring to the log_cov which gets complex in my case.

It may get complex because of numerical issues arising exactly from having a small batch size. You can try larger batches and/or regularizing the covariance matrices by adding small values on the diagonal (uncomment the last part of the following line):

cov_source = (1./(batch_size-1)) * tf.matmul( h_src, h_src, transpose_a=True) #+ gamma * tf.eye(self.hidden_repr_size)
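Here is a NumPy sketch of why that diagonal regularization keeps log(cov) real (the value of `gamma` is my own choice, not one from the repo). A batch smaller than hidden_size gives a rank-deficient covariance whose zero (or numerically slightly negative) eigenvalues break the matrix logarithm; adding gamma * I lifts every eigenvalue above zero:

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.standard_normal((32, 64))         # batch_size=32 < hidden_size=64
cov = (1.0 / (32 - 1)) * h.T @ h          # rank <= 32: singular
gamma = 1e-3
cov_reg = cov + gamma * np.eye(64)        # all eigenvalues now >= ~gamma

# Matrix logarithm via eigendecomposition of the symmetric matrix:
w, v = np.linalg.eigh(cov_reg)
log_cov = (v * np.log(w)) @ v.T           # real, since every w > 0
```

Without the `gamma * np.eye(64)` term, some entries of `w` would be zero (or tiny negative due to round-off) and `np.log(w)` would produce -inf or NaN, which matches the complex/ill-behaved log_cov described above.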

Can you point me in the right direction as to who has used it on latent spaces?
I saw it is used by many, but only saw it on output logits.

As you can see, I am actually applying it on the latent space spanned by the feature layer before the logits layer. In principle you can apply it to any layer of the network:

self.domain_loss = self.alpha * self.log_coral_loss(self.src_hidden, self.trg_hidden)

Let me know if this clarifies better!
Best,
P.


bjajoh avatar bjajoh commented on June 2, 2024

Hey @pmorerio ,

thanks for the clarification!

My latent space is, for example, 28x28x64; currently I'm averaging the 3rd axis down to a 2D grid. It results in a stable loss even with smaller batch sizes. Is this a viable method? Or is it breaking the underlying meaning of the method?

Thanks for your help!


pmorerio avatar pmorerio commented on June 2, 2024

Hi,
what the loss expects as input are matrices of size (batch_size, hidden_space_size). Your 2D grid adds an extra dimension, so you should vectorize it; however, this will result in a high-dimensional vector (784, which could be too much, but you can try). Alternatively, you can average along the spatial dimensions in order to get a vector of length 64.
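The two options above can be sketched in NumPy for a feature map of shape (batch_size, 28, 28, 64); the tensor and names here are illustrative, not from the repo:

```python
import numpy as np

feat = np.random.default_rng(0).standard_normal((256, 28, 28, 64))

# Option 1: vectorize the 28x28 grid (after averaging channels) -> 784 dims.
vec_784 = feat.mean(axis=3).reshape(feat.shape[0], -1)   # shape (256, 784)

# Option 2: average over the spatial axes (global average pooling) -> 64 dims,
# a much smaller input for the batch-wise covariance.
vec_64 = feat.mean(axis=(1, 2))                          # shape (256, 64)
```

Option 2 keeps hidden_size at 64, so the batch-size recommendation above (batch at least as large as the latent dimension) is far easier to satisfy.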

A stable loss could also just mean that the weight of the loss term is very low.

Hope this helps.
P.


bjajoh avatar bjajoh commented on June 2, 2024

Hi @pmorerio ,

thank you sooo much!
This is extremely helpful!

Best,
Bjarne

