
Comments (25)

drphilmarshall avatar drphilmarshall commented on July 24, 2024 2

I think if we are talking about samples from a PDF, then you always want to
normalize each histogram to 1. However, if you want to use corner to
visualize number density rather than probability density then I can see
how one might want to specify the relative normalizations of the datasets
being overlaid. I have never needed to normalize to equal peak height...


from corner.py.

dfm avatar dfm commented on July 24, 2024 1

The first thing that we need is a convincing argument for why you want this
feature. I'm currently skeptical that it's actually something that we want
and I'm hesitant to add features that might be misleading so I'd love to
hear the specific use case and the story that you're trying to tell.

kbarbary avatar kbarbary commented on July 24, 2024

Regarding weights, a single weight applies to a single sample (a single sample being a position in an n-dimensional space). So, weights[i] is the weight for the sample samples[i, :]. It wouldn't make sense to have different weights for different dimensions of a single sample.
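A minimal sketch of this weighting convention (the array names and shapes here are illustrative, not from corner's API):

```python
import numpy as np

# Hypothetical sample array: 1000 samples drawn in a 3-dimensional space.
samples = np.random.default_rng(0).normal(size=(1000, 3))

# One weight per sample, not per dimension: weights[i] applies to samples[i, :].
weights = np.ones(samples.shape[0])

print(weights.shape)  # (1000,)
```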


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Thank you very much for that. So how do I create such a weight array to normalize the y axis of the diagonal 1D PDFs for my sample?
Let's assume my sample has a shape of (1000, 3), so the corresponding weights should have shape (1000,), right? I am confused about how to construct such a weight array to do the normalization.


kbarbary avatar kbarbary commented on July 24, 2024

I'm a bit confused: I thought the y axes of the diagonals are not labelled anyway, so I don't see what normalizing these 1-d PDFs would do (other than perhaps changing the y-axis tick locations, which don't mean much).

Maybe you're using an option to get y labels on the diagonals, or maybe defaults have changed since I last looked at it?


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Let me explain more. I am overplotting two sets of samples on top of each other, so without normalizing you can't clearly see the PDFs of the two samples in the 1D histogram diagonals.
If you look at the attached plot, I want to normalize the red and blue so that they appear at the same scale.

[Attached figure: corner plot overlaying red and blue samples; the 1D histograms on the diagonal have very different heights.]


kbarbary avatar kbarbary commented on July 24, 2024

Ah, I didn't understand that you were plotting two sets of samples.

Dan will know better, but I think passing a weight array will indeed affect the relative scaling of the two sets of samples. Assuming you have 1000 samples, try passing 2.0 * np.ones(1000) for weights for one set and np.ones(1000) for the other set and see if it changes the relative scaling.
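A quick way to check this, sketched here with plain NumPy histograms rather than corner itself (the sample arrays are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
set1 = rng.normal(size=1000)  # hypothetical first sample set
set2 = rng.normal(size=1000)  # hypothetical second sample set

# Doubling the weights of one set doubles its histogram mass
# relative to the other, changing the relative scaling of the overlays.
bins = np.linspace(-4.0, 4.0, 21)
n1, _ = np.histogram(set1, bins=bins, weights=2.0 * np.ones(1000))
n2, _ = np.histogram(set2, bins=bins, weights=np.ones(1000))

print(n1.sum() / n2.sum())  # roughly 2
```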


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Yes, it does change the scaling, but it would be great to find something consistent to use for all samples, such that the area under each PDF equals one, or some way to normalize them without guessing: 2.0 * np.ones(1000), and if not then 3.0 * np.ones(1000), etc. Thanks for helping out.


kbarbary avatar kbarbary commented on July 24, 2024

Is the problem that the two sets have different numbers of samples? If so, setting weights=np.ones(nsamples)/nsamples for each set should make the areas under the PDF the same regardless of the value of nsamples.
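Sketched with plain NumPy (the two sample sets of different sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=2000)  # hypothetical set with 2000 samples
b = rng.normal(0.5, 2.0, size=500)   # hypothetical set with 500 samples

# Weights of 1/nsamples give each set a total histogram mass of 1,
# so the overlaid PDFs are comparable regardless of sample count.
na, _ = np.histogram(a, bins=30, weights=np.ones(len(a)) / len(a))
nb, _ = np.histogram(b, bins=30, weights=np.ones(len(b)) / len(b))

print(na.sum(), nb.sum())  # both 1.0
```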


sultan-hassan avatar sultan-hassan commented on July 24, 2024

The nsamples is the same! I tried this, but it doesn't help. There must be a way :( I will keep playing around; let me know if you find something.


dfm avatar dfm commented on July 24, 2024

If nsamples is the same then it is normalized! The integral under each of those histograms is the same.


drphilmarshall avatar drphilmarshall commented on July 24, 2024

Agreed - the blue and red histograms look like good approximations to
normalized PDFs to me!



sultan-hassan avatar sultan-hassan commented on July 24, 2024

So if I want the heights of the 1D histogram diagonals (blue and red) to be the same, should I increase the number of bins, or should I use the same bin width? It should be possible to have them at the same height! The question is how? Still playing around...


drphilmarshall avatar drphilmarshall commented on July 24, 2024


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Great, thanks for that :). Could you point me to those options, 'equalize_peak_heights=True' and 'normalization=1512'? Are these kwargs for hist or hist2d?


drphilmarshall avatar drphilmarshall commented on July 24, 2024


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Cool, I would be very happy to contribute and modify the code, as this routine + emcee have already been a great help in my research. Many thanks to the owner.


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Well, the only reason is that it makes for a much better visualisation when comparing different samples in terms of distribution shape and width. However, if this doesn't seem useful as part of the routine, that's fine with me. But I would still like to know how to do such a thing for myself. Any ideas?


dfm avatar dfm commented on July 24, 2024

Let's get some more details – what are the samples that you're comparing and why is your suggestion better for comparison? If the histograms are properly normalized then a wider distribution will also be "shorter". I think that this actually makes the visualization clearer! I expect that this is also the same reason why neither matplotlib nor numpy has native support for this.

If you want to mock up a change, it will be easiest to do with weights = np.ones(n) and modifying this line (https://github.com/dfm/corner.py/blob/master/corner/corner.py#L248):

y0 = np.array(list(zip(n, n))).flatten()

to

y0 = np.array(list(zip(n, n))).flatten() / np.max(n)

Note that I still stand by my opinion that this would lead to a misleading result!
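To see what the suggested change does, here is that step-plot construction in isolation (the histogram inputs are illustrative; only the final lines mirror the snippet above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, _ = np.histogram(rng.normal(size=1000), bins=20)  # illustrative bin counts

# corner draws the 1D histogram as a step function: zip(n, n) duplicates
# each bin count for the left and right edge of its step.
y0 = np.array(list(zip(n, n))).flatten() / np.max(n)

# Dividing by max(n) forces every histogram to peak at exactly 1,
# i.e. the "equal peak height" behavior under discussion.
print(y0.max())  # 1.0
```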


dfm avatar dfm commented on July 24, 2024

Totally! The current default behavior is actually to normalize to the number density and you can add hist_args=dict(normed=True) to get the PDF behavior.


drphilmarshall avatar drphilmarshall commented on July 24, 2024

That is good to know indeed! I went back and checked the API docs: this
default behavior is not explained anywhere, but perhaps it should be. Also, to
get the PDF behavior do you need to specify hist2d_args=dict(normed=True)
as well? I can't think of any reason you would want to normalize the 2D and
1D histograms differently, can you? Maybe we need a normed=True kwarg
in corner.corner that turns on both 1D and 2D normalizations?

In the meantime, Sultan, it sounds as though you can get the behavior you
want through judicious choice of sample sizes...


dfm avatar dfm commented on July 24, 2024

I agree that it's worth saying something about the default behavior in the docs – I am actually inclined to change the default to density normalization!

I'm not sure what you mean about the "normalization" of the 2D histograms. The contours are always at percentiles of the sample mass. I guess you could choose to have the contours defined in terms of numbers but that's some craziness! I don't want to go there.

I also don't think that you'll ever be able to get the requested behavior by changing the sample size because the peak height in each panel actually depends on the bin sizes and the shape of the distribution. That's the whole reason why it's meaningless to give the "peaks" equal heights!


drphilmarshall avatar drphilmarshall commented on July 24, 2024

Yeah - the word "judicious" can cover a lot of fiddling around... :-)

I'd support a move to probability density normalization by default,
especially since the contour levels are defined in terms of probability
mass! However if someone really was trying to visualize number density, I
guess they might want contours in absolute number density, but I agree
it's better to wait for that to be requested... In the meantime they could
still have the 2D grayscale, 2D scatter plot and 1D histograms all
represent absolute number density (which I think is the current default).
I bet they would still find it useful to be able to switch easily from
"number density" to "probability density" and back, though.



sultan-hassan avatar sultan-hassan commented on July 24, 2024

Well, here's a plot taken from the Greig & Mesinger (2015) 21CMMC paper, where different PDFs are shown with equal peak heights. I thought this was a good way to compare different PDFs and that I might be able to do the same with corner....
[Attached screenshot: PDFs plotted with equal peak heights.]


jtlz2 avatar jtlz2 commented on July 24, 2024

normed is now deprecated in matplotlib; use density instead.
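A minimal NumPy sketch of what density=True does (with corner itself you would pass it through the histogram keyword arguments, e.g. hist_kwargs=dict(density=True) in recent versions; the exact kwarg spelling is an assumption and may differ between versions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)  # illustrative samples

# density=True (the replacement for the deprecated normed=True) rescales
# the counts so the histogram integrates to 1, i.e. estimates a PDF.
counts, edges = np.histogram(x, bins=25, density=True)
area = np.sum(counts * np.diff(edges))

print(area)  # 1.0
```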

