
uq_bnn's Introduction

Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation

This repository provides the Keras implementation of the paper "Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation", published in Computational Statistics & Data Analysis. This paper extends the version accepted at MIDL 2018. If you want to cite this work, please cite the extended version.

In this repo, we demonstrate the proposed method on two biomedical image segmentation datasets: ISLES and DRIVE. For more detailed information, please see ISLES and DRIVE.

I also strongly recommend the excellent implementation on the DRIVE dataset by Walter de Back [notebook].

Example

Once you have a trained Bayesian neural network, the proposed uncertainty quantification method is simple! For binary segmentation, given a NumPy array p_hat with shape (number of estimates, number of features), the epistemic and aleatoric uncertainties can be obtained with the following code.

import numpy as np
# p_hat: stochastic predictions with shape (number of estimates, number of features)
epistemic = np.mean(p_hat**2, axis=0) - np.mean(p_hat, axis=0)**2
aleatoric = np.mean(p_hat*(1-p_hat), axis=0)
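
To obtain p_hat in the first place, one common option is Monte Carlo dropout: run several stochastic forward passes with the dropout layers kept active at test time. A minimal sketch, assuming model is a Keras model trained with dropout and x is a single input image with a batch dimension (these names, and T = 50, are placeholders rather than code from this repository):

import numpy as np

T = 50  # number of stochastic forward passes (Monte Carlo estimates)
# training=True keeps dropout active, so every call returns a different sample
p_hat = np.array([np.asarray(model(x, training=True)).ravel() for _ in range(T)])
# p_hat now has shape (T, number of pixels) and can be fed to the two lines above
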
  • Example of visual comparison between the proposed method and the Kendall and Gal (2017) method

A directory tree

.
├── ischemic
│   ├── input
│   │   └── (train/test datasets)
│   └── src
│       ├── configs (empty)
│       ├── data.py
│       ├── models.py
│       ├── settings.py
│       ├── train.py
│       ├── utils.py
│       └── weights (empty)
├── README.md
└── retina
    ├── fig
    ├── input
    │   └── (train/test datasets)
    ├── model.py
    ├── UQ_DRIVE_stochastic_sample_2000.ipynb
    ├── UQ_DRIVE_stochastic_sample_200.ipynb
    ├── utils.py
    └── weights (empty)

References

  • ISLES website
  • DRIVE website
  • Oskar Maier et al., "ISLES 2015 - A public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI", Medical Image Analysis, available online 21 July 2016, ISSN 1361-8415, http://dx.doi.org/10.1016/j.media.2016.07.009.
  • J.J. Staal, M.D. Abramoff, M. Niemeijer, M.A. Viergever, B. van Ginneken, "Ridge-based vessel segmentation in color images of the retina", IEEE Transactions on Medical Imaging, 2004, vol. 23, pp. 501-509.
  • Alex Kendall and Yarin Gal, "What uncertainties do we need in Bayesian deep learning for computer vision?", Advances in Neural Information Processing Systems, 2017.

Author

Yongchan Kwon, Ph.D. student, Department of Statistics, Seoul National University

uq_bnn's Issues

Deriving the decomposition into aleatoric and epistemic uncertainty

Dear Mr. Kwon,

We enjoyed reading your papers on decomposing the predictive variance into aleatoric and epistemic uncertainty in classification settings without the need for an extra output layer. Thank you also for sharing your code online.

After reading your derivation of the decomposition of the predictive variance, as given in Appendix A of your paper "Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation", we had difficulty understanding the step from the second-to-last line to the last. I'm pasting the two lines here:

[image: decomposition step]

Can you please clarify why the outer product (denoted as $\otimes 2$) can be moved outside of the difference? I think sums and outer products cannot be interchanged in general, as demonstrated by the example below, but I might be missing some trick or assumption here.

[image: outer product counterexample]

Thank you for your help!

about p_hat

Hi,
According to your code, p_hat is the matrix of probability vectors (after softmax) for all test data, e.g.
[[0.3, 0.2, 0.5],
[0.1, 0.8, 0.1],
[0.4, 0.2, 0.4],
[0.7, 0.1, 0.2]] (assuming there are 3 classes, two test samples, and 2 stochastic dropout passes), and the aleatoric and epistemic uncertainties are calculated from p_hat. Am I correct?
Thanks!
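
For reference, a minimal multi-class sketch of the same two quantities, assuming p_hat holds T stochastic softmax outputs for a single pixel with shape (T, K); the dummy p_hat below is made up for illustration and is not code from this repository:

import numpy as np

p_hat = np.random.dirichlet(np.ones(3), size=100)  # dummy (T=100, K=3) softmax samples
p_bar = p_hat.mean(axis=0)                                                 # (K,)
epistemic = (p_hat - p_bar).T @ (p_hat - p_bar) / p_hat.shape[0]           # (K, K) spread across samples
aleatoric = np.mean([np.diag(p) - np.outer(p, p) for p in p_hat], axis=0)  # (K, K)
# The diagonals give per-class uncertainties; with two classes they reduce to
# the binary expressions in the README.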

another question about p_hat

Why do you only use element [0] of the prediction results at line 63 of /retina/utils.py? Shouldn't p_hat be made up of the probabilities of all classes?

About the eq.4

Could you please let me know whether eq. 4 in the paper is applicable to multi-label segmentation or just binary segmentation?

Originally posted by @redsadaf in #1 (comment)

Questions about previous work implementation

Hi @ykwon0407 ,
Thank you for your interesting work!

I am studying your paper, with Kendall's paper as a reference.
I was wondering how your method and Kendall's method were implemented, so I looked at the code. I have several questions about the implementation of Kendall's work.

In Kendall's paper, they propose a loss function for estimating uncertainty in classification tasks, as shown below:
[image: classification loss from Kendall and Gal (2017)]

I found where the logit vector x is created in SampleNormal, but I'm curious about the implementation of the loss function below:

ll = K.mean(y_true_f * linear_predictor_f - K.log(1.+K.exp(linear_predictor_f)))

Can you explain K.log(1.+K.exp(linear_predictor_f))?
Also, you used log_variance and an exponential function when estimating the variance. Is there any reason for that?
In addition, is there any inference code for Kendall's work?

Thank you in advance!
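
A side note on the quoted loss: for a binary label y in {0, 1} and logit x, the Bernoulli log-likelihood can be written as y*x - log(1 + exp(x)), which is exactly the y_true_f * linear_predictor_f - K.log(1. + K.exp(linear_predictor_f)) pattern above. A small NumPy check of this identity (a sketch, not code from either repository):

import numpy as np

def loglik_from_logit(y, x):
    # y * x - log(1 + exp(x)); logaddexp(0, x) is a numerically stable log(1 + exp(x))
    return y * x - np.logaddexp(0.0, x)

def loglik_from_prob(y, x):
    # the usual form y*log(p) + (1-y)*log(1-p) with p = sigmoid(x)
    p = 1.0 / (1.0 + np.exp(-x))
    return y * np.log(p) + (1 - y) * np.log(1 - p)

x = np.array([-2.0, 0.5, 3.0])   # logits
y = np.array([0.0, 1.0, 1.0])    # binary labels
print(np.allclose(loglik_from_logit(y, x), loglik_from_prob(y, x)))  # True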

a discussion about the inference method

Hi,
In your code, you use MC dropout to infer the outputs for a new input. But recently I read another paper, Bayesian Convolutional Neural Networks with Variational Inference, where the author uses the local reparameterization trick for the convolutional layers to sample at inference time (see lines 132-146 in https://github.com/felix-laumann/Bayesian_CNN/blob/master/utils/BBBlayers.py), i.e. output = mean + std * (random(mean)), where (I think) w is drawn from N(mean, std). I can't tell which method is better for sampling; if we can have a discussion about this, I'll be very thankful!! (By the way, he also introduces aleatoric and epistemic uncertainty in his work :) )

about uncertainties

Hi, can I rewrite your equations to get the aleatoric and epistemic uncertainty of each CLASS of the model, rather than of a single sample? I think it may show that the model is not good at some classes, or that for some classes the model doesn't have enough good data.

about y*

Hi, in your appendix,
[image: equation from the appendix]
Why is y* one-hot encoded? Shouldn't y* be the probabilities of all classes? (e.g., with 3 classes, y* might be [0.1, 0.6, 0.3])

If it is a label, how do you get it by averaging over T samples?
[image]

Thanks

inference code

Dear Yongchan,

I've been reading your paper and code with great interest. Seems like a very interesting way to assess predictive uncertainty in DNN models for segmentation, based purely on inference-time Dropout. However, I can't seem to find the code in which you do inference and actually compute the aleatoric and epistemic uncertainties.

Specifically, the implementation of the key eq. 4 in the paper seems to be missing. Could you provide this or point me to it?

Thanks for this interesting work!

Uncertainty Model

Your paper on the uncertainty model was very helpful. Thank you :)

the diag problem

Thank you for sharing your code. I have some questions.
In your paper, Appendix A, for a categorical variable $y^*$ you write
$\mathrm{Var}_{p(y^*|x^*,\omega)}(y^*) = E_{p(y^*|x^*,\omega)}(y^{*\otimes 2}) - E_{p(y^*|x^*,\omega)}(y^*)^{\otimes 2}$, which equals $E_{p(y^*|x^*,\omega)}\{\mathrm{diag}(y^*)\} - E_{p(y^*|x^*,\omega)}(y^*)^{\otimes 2}$ because $y^*$ is one-hot encoded.

However, as you know, the output of the softmax is not a one-hot encoded variable; each position is the probability of a class. For example, if the output is [0.9, 0.1, 0.1] (not one-hot) and the ground truth is [1, 0, 0], can we really use "$E_{p(y^*|x^*,\omega)}\{\mathrm{diag}(y^*)\} - E_{p(y^*|x^*,\omega)}(y^*)^{\otimes 2}$ because $y^*$ is one-hot encoded"?

Feng
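
The step in question only uses a property of one-hot vectors: if $y^*$ has exactly one entry equal to 1, then $y^* y^{*\top} = \mathrm{diag}(y^*)$, so the identity holds for the random label $y^*$ inside the expectation, even though the expected value of $y^*$ (the softmax output) is not one-hot. A small NumPy check (illustrative only; the probabilities and sample size are made up):

import numpy as np

y = np.array([0.0, 1.0, 0.0])                 # a one-hot label
print(np.array_equal(np.outer(y, y), np.diag(y)))            # True: y y^T == diag(y)

p = np.array([0.9, 0.05, 0.05])               # categorical probabilities (softmax output)
samples = np.eye(3)[np.random.choice(3, size=10000, p=p)]    # one-hot draws from p
# The Monte Carlo estimate of E[y y^T] agrees with diag(p) up to sampling error.
print((samples.T @ samples / len(samples)).round(2))
print(np.diag(p))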
