Hi,
It doesn't work with batch size 0. Why is there no workaround? Is that part of the method?
I assume you mean batch_size=1; there is no such thing as batch_size=0.
Anyway, since the covariance is calculated batch-wise, there cannot be a covariance matrix for a single element. In fact, the bigger the batch, the better, since it better approximates the statistics of the source and target domains. Also, batches which are significantly smaller than the dimension of the latent space may lead to numerical issues.
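To make the batch-wise point concrete, here is a minimal NumPy sketch (illustrative only, not the repository's actual TensorFlow code) of the sample covariance of a batch of latent features; the 1/(n-1) factor is exactly why a single-element batch cannot work:

```python
import numpy as np

def batch_covariance(feats):
    """Sample covariance of a batch of latent features.
    feats: (batch_size, hidden_size); needs batch_size >= 2."""
    n = feats.shape[0]
    centered = feats - feats.mean(axis=0, keepdims=True)  # remove the batch mean
    return centered.T @ centered / (n - 1)                # unbiased estimator

x = np.arange(12.0).reshape(4, 3)   # toy batch of 4 samples, hidden size 3
c = batch_covariance(x)
print(np.allclose(c, np.cov(x, rowvar=False)))  # matches NumPy's own estimator
```

With batch_size=1 the denominator is zero and the centered matrix is all zeros, so there is simply no second-order statistic to align.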
Is the cov result supposed to be complex?
The covariance matrix is symmetric and real, and thus guaranteed to have real eigenvalues by the spectral theorem (I hope I am interpreting your question correctly).
Did you try yet to apply this method to latent spaces? Or are you aware of anyone doing it?
The method is indeed applied to the latent space spanned by the penultimate feature layer of the considered CNN.
I hope my answer can help, glad to clarify further if you need it.
Best,
P.
from minimal-entropy-correlation-alignment.
Thanks for the quick reply @pmorerio !
My bad, I was counting like a computer ;) ofc I mean batch_size=1.
Just wondering, because (1. / (batch_size - 1)) will result in a division by zero in the case of batch size 1.
Is there any recommended minimum batch size?
Sorry, I was referring to the log_cov which gets complex in my case.
Can you point me in the right direction who used it on latent spaces?
I saw it is used by many, but only saw it on output logits.
Thanks for your help!
Bjarne
Hi @bjajoh,
Is there any recommended minimum batch size?
Empirically, I would recommend making it at least equal to the size of the latent space (hidden_size). In the example provided, hidden_size=64 and batch_size=256, i.e. four times as large.
Sorry, I was referring to the log_cov which gets complex in my case.
It may get complex because of numerical issues arising exactly from having a small batch size. You can try larger batches and/or regularize the covariance matrices by adding small values on the diagonal (uncomment the last part of ).
Can you point me in the right direction who used it on latent spaces?
I saw it is used by many, but only saw it on output logits.
As you can see, I am actually applying it on the latent space spanned by the feature layer before the logits layer. In principle, you can apply it to any layer of the network.
Let me know if this clarifies better!
Best,
P.
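To illustrate the diagonal regularization suggested above, a NumPy sketch (not the repository's actual TensorFlow code; the toy sizes and eps value are my own assumptions) showing how a small ridge on the diagonal keeps the matrix logarithm real when the batch is smaller than the latent dimension:

```python
import numpy as np

def log_cov(feats, eps=0.0):
    """Log of the batch covariance via eigendecomposition.
    feats: (batch_size, hidden_size); eps: diagonal regularizer."""
    n = feats.shape[0]
    centered = feats - feats.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (n - 1)
    cov = cov + eps * np.eye(cov.shape[0])  # small ridge on the diagonal
    w, v = np.linalg.eigh(cov)              # symmetric matrix: real eigenpairs
    return v @ np.diag(np.log(w)) @ v.T     # real only if every eigenvalue > 0

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 64))       # batch (16) < hidden_size (64)

# With batch < hidden size the covariance is rank-deficient: most eigenvalues
# are numerically zero (or slightly negative), so log(cov) is ill-defined
# without regularization and can come out complex or non-finite.
w_raw = np.linalg.eigvalsh(np.cov(feats, rowvar=False))
print((w_raw < 1e-8).sum())                 # at least 64 - 15 = 49 tiny eigenvalues

reg = log_cov(feats, eps=1e-3)              # ridge pushes every eigenvalue above zero
print(np.isfinite(reg).all())               # True: the log is finite and real
```

The same effect explains why larger batches help: with batch_size above hidden_size the covariance is generically full-rank and the log is well defined even without the ridge.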
Hey @pmorerio ,
thanks for the clarification!
My latent space is, for example, 28x28x64; currently I'm averaging the 3rd axis down to a 2D grid. It results in a stable loss even with smaller batch sizes. Is this a viable method, or is it breaking the underlying meaning of the method?
Thanks for your help!
Hi,
What the loss expects as input are matrices of size (batch_size, hidden_space_size). Your 2D grid adds an extra dimension, so you should vectorize it; however, this will result in a high-dimensional vector (784, which could be too much, but you can try). Alternatively, you can average along the spatial dimensions in order to get a vector of length 64.
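To make the shapes concrete, a NumPy sketch of the two options for a 28x28x64 feature map (the batch size of 8 is just an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 28, 28, 64))   # (batch, H, W, channels)

# Option 1: average over the spatial axes -> one 64-d vector per sample.
spatial_avg = fmap.mean(axis=(1, 2))
print(spatial_avg.shape)                      # (8, 64)

# Option 2: average the channel axis (as in the question) and vectorize
# the remaining 28x28 grid -> one 784-d vector per sample.
grid_flat = fmap.mean(axis=3).reshape(fmap.shape[0], -1)
print(grid_flat.shape)                        # (8, 784)
```

Either way, the loss then receives a 2D matrix of shape (batch_size, hidden_space_size), as required.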
The stable loss could simply be because the weight given to the loss is very low.
Hope this helps.
P.
Hi @pmorerio ,
thank you sooo much!
This is extremely helpful!
Best,
Bjarne