
Comments (23)

mongoose54 commented on August 28, 2024

@ykwon0407 Thank you for the reply. I am posting here my reply to follow the thread properly.

A couple of clarification questions:

  1. For multi-label segmentation with K classes, do we then need to perform the matrix algebra (transposing, etc.)?

  2. In your paper, the proof of Eq. 4 lies in Appendix A. Is that correct?

from uq_bnn.

mongoose54 commented on August 28, 2024

@ykwon0407 Thanks again for the wonderful explanation.
Regarding the KxK matrix for K classes: what does each element represent (is it the degree of uncertainty between two classes), and what is the best way to get a single uncertainty value?


ShellingFord221 commented on August 28, 2024

Thanks again for your kind reply! But prediction.shape[0] should be the number of classes, not the number of samples.


ShellingFord221 commented on August 28, 2024

Besides, if I want to calculate the total uncertainty (i.e. the sum of the two uncertainties), should I:

  1. first sum the diagonal elements of the aleatoric matrix, sum the diagonal elements of the epistemic matrix, and then add these two numbers, or
  2. directly add the aleatoric matrix and the epistemic matrix to get a final matrix, then sum the diagonal elements of this final matrix?


ykwon0407 commented on August 28, 2024

Dear redsadaf,

Thank you for your interest! Eq. 4 in the paper is defined for multi-label segmentation, so you can apply the equation not only to binary segmentation but also to multi-class segmentation problems. Please note that Eq. 4 will produce a K by K matrix if there are K categories in your dataset.
In the case of binary classification (K=2), Eq. 4 produces a 2 by 2 matrix. However, the two diagonal elements are identical to each other, and similarly the two off-diagonal terms are equal. Thus we obtain numeric values, not matrices, for the uncertainty maps.
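This collapse can be checked numerically. The sketch below is mine, not from the uq_bnn code; the random probabilities merely stand in for T=10 MC-dropout outputs:

```python
# Sketch (not from the uq_bnn repo): for K=2, the Eq. 4 aleatoric matrix
# collapses to a single number because its entries are pairwise equal.
import numpy as np

rng = np.random.default_rng(0)
T = 10                                          # number of MC-dropout samples
p1 = rng.uniform(0.1, 0.9, size=T)              # sampled probabilities of class 1
p_hat = np.stack([1 - p1, p1], axis=1)          # shape (T, 2); rows sum to 1

prediction = p_hat.mean(axis=0)
aleatoric = np.diag(prediction) - p_hat.T @ p_hat / T

assert np.isclose(aleatoric[0, 0], aleatoric[1, 1])   # equal diagonal entries
assert np.isclose(aleatoric[0, 1], aleatoric[1, 0])   # equal off-diagonal entries
```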
Please let me know if you have any further questions, and I hope this is informative!!

(I copy and paste the reply in #1 ).


ykwon0407 commented on August 28, 2024

@mongoose54 Hi~ Here is the point-by-point response.

  1. Yes, it is. You need to use the transpose. Please note that the resulting uncertainty matrix is a K by K matrix.

  2. Quite close, but not exactly. Appendix A shows the derivation of Equation (2), which is the population version of the uncertainties. In contrast, Equation (4) is an estimator of Equation (2)!! That is, we can only compute Equation (4) from data, and its converging point is Equation (2), the variance of the variational predictive distribution.


ykwon0407 commented on August 28, 2024

@mongoose54 Hello! :)

  1. The K by K matrix can be considered a proxy for the covariance matrix of a multinomial distribution. So each element of the uncertainty matrix is nothing but the variance of one component of the outcome or the covariance between two components.
    [Additional information for 1.] Writing the dependent variable Y in its one-hot encoding, Y is a K-length vector that can be assumed to follow a multinomial distribution. I believe you can easily find almost the same thing in the following wiki: Multinomial Wiki

  2. First of all, as explained in the previous point, each element has a meaning, so picking a specific element may give you some information. Many examples can be made, but the most interesting one is 'the sum of the diagonal elements of the aleatoric uncertainty matrix', which can be shown to be very similar to Shannon's entropy.
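The flavor of that last claim can be checked numerically. In this sketch of mine (assuming T=10 samples and K=3 classes), the trace of the Eq. 4 aleatoric matrix equals the average Gini impurity sum_k p_k(1 - p_k), an entropy-like dispersion measure:

```python
# Sketch: the sum of the diagonal of the aleatoric matrix equals the average
# Gini impurity of the sampled probability vectors (an entropy-like quantity).
import numpy as np

rng = np.random.default_rng(1)
T, K = 10, 3                                   # assumed sample count and class count
p_hat = rng.dirichlet(np.ones(K), size=T)      # (T, K) sampled probability vectors

prediction = p_hat.mean(axis=0)
aleatoric = np.diag(prediction) - p_hat.T @ p_hat / T

gini = np.mean(np.sum(p_hat * (1 - p_hat), axis=1))
assert np.isclose(np.trace(aleatoric), gini)
```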


ShellingFord221 commented on August 28, 2024

Sorry, why does Eq. 4 provide a K*K matrix when there are K classes?


ShellingFord221 commented on August 28, 2024

Besides, p_hat is just a list of 10 probabilities of a certain class (according to line 63 in /retina/utils.py), so why is there a diagonal matrix?


ykwon0407 commented on August 28, 2024

@ShellingFord221 Hi~~ Here is the point-by-point response.

Sorry, why does Eq. 4 provide a K*K matrix when there are K classes?
-> If you are solving a K-class classification problem, then a probability estimate (p_hat) will be represented as a K-length vector. The proposed uncertainties, which can be considered a naive variance, are then nothing but a K by K matrix.

Besides, p_hat is just a list of 10 probabilities of a certain class (according to line 63 in /retina/utils.py), so why is there a diagonal matrix?
-> If you run with the number of random draws T set to 10, then p_hat will be a (10,) numpy array. I have no idea about the diagonal matrix...


ShellingFord221 commented on August 28, 2024

The diagonal matrix is mentioned in Eq. 4 in your paper: diag(p_hat).


ShellingFord221 commented on August 28, 2024

Emmm... p_hat should be a matrix of size (num_samples, num_classes) (i.e. (10, 3) in my settings)?


ykwon0407 commented on August 28, 2024

@ShellingFord221

The diagonal matrix is mentioned in Eq. 4 in your paper: diag(p_hat)
-> Ah, I got it. The diagonal matrix comes from the covariance matrix of the multinomial distribution. Please find the link.

Emmm... p_hat should be a matrix of size (num_samples, num_classes) (i.e. (10, 3) in my settings)?
-> Yes, it is. Sorry for my binary-classification code... (it assumes a lot...)

Let me clarify all the details below.

In /retina/utils.py

p_hat = np.array(p_hat) # line number 64
prediction = np.mean(p_hat, axis=0) # line number 67

p_hat should be a numpy array of size (num_samples, num_classes)
prediction should be a numpy array of size (num_classes, )

Then the aleatoric and epistemic matrix will be as follows.

aleatoric = np.diag(prediction) - p_hat.T.dot(p_hat)/p_hat.shape[0] # 3 by 3 matrix # I corrected an error after the discussion with ShellingFord221
tmp = p_hat - prediction  # 10 by 3 matrix
epistemic = tmp.T.dot(tmp)/tmp.shape[0]
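The snippet above can be run end to end. This check is my own sketch (the random p_hat merely stands in for T=10 MC-dropout outputs over K=3 classes); it also verifies the known decomposition that aleatoric + epistemic equals the total predictive covariance diag(prediction) - prediction prediction^T:

```python
# Sketch: end-to-end shapes and the aleatoric/epistemic decomposition.
import numpy as np

rng = np.random.default_rng(42)
p_hat = rng.dirichlet(np.ones(3), size=10)     # (num_samples, num_classes) = (10, 3)

prediction = np.mean(p_hat, axis=0)            # (num_classes,)

aleatoric = np.diag(prediction) - p_hat.T.dot(p_hat) / p_hat.shape[0]
tmp = p_hat - prediction                       # (10, 3) deviations via broadcasting
epistemic = tmp.T.dot(tmp) / tmp.shape[0]

assert aleatoric.shape == (3, 3) and epistemic.shape == (3, 3)
# Sanity: the two matrices sum to the total predictive covariance.
total = np.diag(prediction) - np.outer(prediction, prediction)
assert np.allclose(aleatoric + epistemic, total)
```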

Hope this information helps you!


ShellingFord221 commented on August 28, 2024

Thank you so much!!! But there's still a little question. In Eq. 4 of your paper, the diag is applied to p_hat, but in your code above, it seems that the diag is applied to prediction (the mean of p_hat).


ShellingFord221 commented on August 28, 2024

And why should the dot product be divided by shape[0]? (p_hat.T.dot(p_hat)/prediction.shape[0])


ykwon0407 commented on August 28, 2024

@ShellingFord221 You're welcome! :)
In Eq. 4 of your paper, the diag is applied to p_hat, but in your code above, it seems that the diag is applied to prediction (the mean of p_hat).
-> Because the average of the diag(p_hat) matrices over the T samples equals the diagonal matrix of prediction (diag is linear).

And why should the dot product be divided by shape[0]? (p_hat.T.dot(p_hat)/prediction.shape[0])
-> In Eq. 4 we need to divide by the number of random samples T, so I divide p_hat.T.dot(p_hat) by prediction.shape[0].
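Both points can be verified numerically. In this sketch of mine (assuming T=10 and K=3): (1) averaging the per-sample diag(p_t) matrices gives diag(prediction), and (2) p_hat.T.dot(p_hat) already contains the sum over the T samples, so dividing by T yields the average required by Eq. 4:

```python
# Sketch: (1) the mean of diag(p_t) over samples equals diag(prediction),
# since diag() is linear; (2) p_hat.T.dot(p_hat) is the sum of the T outer
# products p_t p_t^T.
import numpy as np

rng = np.random.default_rng(7)
T, K = 10, 3
p_hat = rng.dirichlet(np.ones(K), size=T)      # (T, K)
prediction = p_hat.mean(axis=0)

avg_of_diags = np.mean([np.diag(p_t) for p_t in p_hat], axis=0)
assert np.allclose(avg_of_diags, np.diag(prediction))

sum_of_outers = sum(np.outer(p_t, p_t) for p_t in p_hat)
assert np.allclose(p_hat.T.dot(p_hat), sum_of_outers)
```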


ykwon0407 commented on August 28, 2024

@ShellingFord221 You are right! My bad.
It should be p_hat.shape[0], not prediction.shape[0].
I corrected the above code as well. Thanks!!!


ShellingFord221 commented on August 28, 2024

The sum of the diagonal elements of the aleatoric uncertainty matrix is meaningful; is the sum of the diagonal elements of the epistemic uncertainty matrix meaningful, too?
Besides, does aleatoric uncertainty mean the uncertainty about the test data, and epistemic uncertainty the uncertainty about the model?


ykwon0407 commented on August 28, 2024

@ShellingFord221

The sum of the diagonal elements of the aleatoric uncertainty matrix is meaningful; is the sum of the diagonal elements of the epistemic uncertainty matrix meaningful, too?
-> I guess... somehow yes.

Besides, does aleatoric uncertainty mean the uncertainty about the test data, and epistemic uncertainty the uncertainty about the model?
-> They are not exactly separated that way, but they can roughly be interpreted as such.


ShellingFord221 commented on August 28, 2024

The claim that aleatoric uncertainty means the uncertainty about the test data and epistemic uncertainty means the uncertainty about the model also comes from this paper, Bayesian Convolutional Neural Networks with Variational Inference (the paragraph above Section 6, Experiments). But I have read his code: he makes a mistake in the uncertainty calculation for binary and multi-label classification, so his result for these two uncertainties is a single number rather than a K*K matrix (Table 2 in his paper).


ykwon0407 commented on August 28, 2024

@ShellingFord221 Either way is fine!
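Indeed, since the trace is linear, both orderings from the earlier question give the same number. A quick check (my own sketch with an assumed T=10 and K=3):

```python
# Sketch: trace(aleatoric) + trace(epistemic) == trace(aleatoric + epistemic),
# so summing traces first or summing matrices first is equivalent.
import numpy as np

rng = np.random.default_rng(5)
p_hat = rng.dirichlet(np.ones(3), size=10)     # (T, K) = (10, 3)
prediction = p_hat.mean(axis=0)

aleatoric = np.diag(prediction) - p_hat.T @ p_hat / p_hat.shape[0]
tmp = p_hat - prediction
epistemic = tmp.T @ tmp / tmp.shape[0]

assert np.isclose(np.trace(aleatoric) + np.trace(epistemic),
                  np.trace(aleatoric + epistemic))
```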


ShellingFord221 commented on August 28, 2024

Hi, after half a year it seems that I am confused again about the code above o(╥﹏╥)o.
The diag of p_hat is averaged, but the p_hat.T.dot(p_hat) part seems to be only divided by the number of samples. In Eq. 4, however, this part should first be summed and then divided (i.e. first \sum_{t=1}^T p_hat.T.dot(p_hat), then divide by the number of samples). The situation is the same for the tmp.T.dot(tmp) part (it is also divided by the number of samples in the code above, but there is no sum over the T parts).


ykwon0407 commented on August 28, 2024

@ShellingFord221 Hi again! The dot product operation already performs the sum over the T samples: p_hat.T.dot(p_hat) is exactly \sum_{t=1}^T p_t p_t^T. Please see this link as well: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html.
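Concretely (a sketch of mine), the same holds for the epistemic part: tmp.T.dot(tmp) already contains the sum over t of the outer products (p_t - prediction)(p_t - prediction)^T:

```python
# Sketch: the matrix product performs the sum over the T samples implicitly.
import numpy as np

rng = np.random.default_rng(3)
p_hat = rng.dirichlet(np.ones(3), size=10)     # (T, K) = (10, 3)
tmp = p_hat - p_hat.mean(axis=0)               # per-sample deviations

explicit_sum = sum(np.outer(row, row) for row in tmp)
assert np.allclose(tmp.T.dot(tmp), explicit_sum)
```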

