Hi,
It doesn't work with batch size 0. Why is there no workaround? Is that part of the method?
I assume you mean batch_size=1; there is no such thing as batch_size=0.
Anyway, since the covariance is calculated batch-wise, there cannot be a covariance matrix for a single element. In fact, the bigger the batch, the better, since it better approximates the statistics of the source and target domains. Also, batches which are significantly smaller than the dimension of the latent space may lead to numerical issues.
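To make the batch-wise point concrete, here is a minimal NumPy sketch (illustrative only, not the repository's actual TensorFlow code) of the sample covariance of a batch of latent features; the 1/(n-1) factor is exactly why a single-element batch cannot work:

```python
import numpy as np

def batch_covariance(feats):
    """Sample covariance of a batch of latent features.
    feats: (batch_size, hidden_size); needs batch_size >= 2."""
    n = feats.shape[0]
    centered = feats - feats.mean(axis=0, keepdims=True)  # remove the batch mean
    return centered.T @ centered / (n - 1)                # unbiased estimator

x = np.arange(12.0).reshape(4, 3)   # toy batch of 4 samples, hidden size 3
c = batch_covariance(x)
print(np.allclose(c, np.cov(x, rowvar=False)))  # matches NumPy's own estimator
```

With batch_size=1 the denominator is zero and the centered matrix is all zeros, so there is simply no second-order statistic to align.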
Is the cov result supposed to be complex?
The covariance matrix is symmetric and real, and thus guaranteed to have real eigenvalues by the spectral theorem (I hope I am interpreting your question correctly).
Did you try yet to apply this method to latent spaces? Or are you aware of anyone doing it?
The method is indeed applied to the latent space spanned by the penultimate feature layer of the considered CNN.
I hope my answer can help, glad to clarify further if you need it.
Best,
P.
from minimal-entropy-correlation-alignment.
Thanks for the quick reply @pmorerio !
My bad, I was counting like a computer ;) ofc I mean batch_size=1.
Just wondering, because (1. / (batch_size - 1)) will result in a division by zero in the case of batch size 1.
Is there any recommended minimum batch size?
Sorry, I was referring to the log_cov which gets complex in my case.
Can you point me in the right direction who used it on latent spaces?
I saw it is used by many, but only saw it on output logits.
Thanks for your help!
Bjarne
Hi @bjajoh,
Is there any recommended minimum batch size?
Empirically, I would recommend making it at least equal to the size of the latent space (hidden_size). In the example provided, hidden_size=64 and batch_size=256, i.e. four times as large.
Sorry, I was referring to the log_cov which gets complex in my case.
It may get complex because of numerical issues arising exactly from having a small batch size. You can try larger batches and/or regularize the covariance matrices by adding small values on the diagonal (uncomment the last part of ).
Can you point me in the right direction who used it on latent spaces?
I saw it is used by many, but only saw it on output logits.
As you can see, I am actually applying it on the latent space spanned by the feature layer before the logits layer. In principle, you can apply it to any layer of the network.
Let me know if this clarifies better!
Best,
P.
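To illustrate the diagonal regularization suggested above, a NumPy sketch (not the repository's actual TensorFlow code; the toy sizes and eps value are my own assumptions) showing how a small ridge on the diagonal keeps the matrix logarithm real when the batch is smaller than the latent dimension:

```python
import numpy as np

def log_cov(feats, eps=0.0):
    """Log of the batch covariance via eigendecomposition.
    feats: (batch_size, hidden_size); eps: diagonal regularizer."""
    n = feats.shape[0]
    centered = feats - feats.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (n - 1)
    cov = cov + eps * np.eye(cov.shape[0])  # small ridge on the diagonal
    w, v = np.linalg.eigh(cov)              # symmetric matrix: real eigenpairs
    return v @ np.diag(np.log(w)) @ v.T     # real only if every eigenvalue > 0

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 64))       # batch (16) < hidden_size (64)

# With batch < hidden size the covariance is rank-deficient: most eigenvalues
# are numerically zero (or slightly negative), so log(cov) is ill-defined
# without regularization and can come out complex or non-finite.
w_raw = np.linalg.eigvalsh(np.cov(feats, rowvar=False))
print((w_raw < 1e-8).sum())                 # at least 64 - 15 = 49 tiny eigenvalues

reg = log_cov(feats, eps=1e-3)              # ridge pushes every eigenvalue above zero
print(np.isfinite(reg).all())               # True: the log is finite and real
```

The same effect explains why larger batches help: with batch_size above hidden_size the covariance is generically full-rank and the log is well defined even without the ridge.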
Hey @pmorerio ,
thanks for the clarification!
My latent space is, for example, 28x28x64; currently I'm averaging the 3rd axis down to a 2D grid. It results in a stable loss even with smaller batch sizes. Is this a viable method, or is it breaking the underlying meaning of the method?
Thanks for your help!
Hi,
What the loss expects as input are matrices of size (batch_size, hidden_space_size). Your 2D grid adds an extra dimension, so you should vectorize it; however, this will result in a high-dimensional vector (784, which could be too much, but you can try). Alternatively, you can average along the spatial dimensions in order to get a vector of length 64.
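To make the shapes concrete, a NumPy sketch of the two options for a 28x28x64 feature map (the batch size of 8 is just an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 28, 28, 64))   # (batch, H, W, channels)

# Option 1: average over the spatial axes -> one 64-d vector per sample.
spatial_avg = fmap.mean(axis=(1, 2))
print(spatial_avg.shape)                      # (8, 64)

# Option 2: average the channel axis (as in the question) and vectorize
# the remaining 28x28 grid -> one 784-d vector per sample.
grid_flat = fmap.mean(axis=3).reshape(fmap.shape[0], -1)
print(grid_flat.shape)                        # (8, 784)
```

Either way, the loss then receives a 2D matrix of shape (batch_size, hidden_space_size), as required.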
The stable loss could simply be because the weight given to the loss is very low.
Hope this helps.
P.
Hi @pmorerio ,
thank you sooo much!
This is extremely helpful!
Best,
Bjarne