
Comments (25)

drphilmarshall avatar drphilmarshall commented on July 24, 2024 2

I think if we are talking about samples from a PDF, then you always want to
normalize each histogram to 1. However, if you want to use corner to
visualize number density rather than probability density then I can see
how one might want to specify the relative normalizations of the datasets
being overlaid. I have never needed to normalize to equal peak height...


from corner.py.

dfm avatar dfm commented on July 24, 2024 1

The first thing that we need is a convincing argument for why you want this
feature. I'm currently skeptical that it's actually something that we want
and I'm hesitant to add features that might be misleading so I'd love to
hear the specific use case and the story that you're trying to tell.

kbarbary avatar kbarbary commented on July 24, 2024

Regarding weights, a single weight applies to a single sample (a single sample being a position in an n-dimensional space). So, weights[i] is the weight for the sample samples[i, :]. It wouldn't make sense to have different weights for different dimensions of a single sample.
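A minimal sketch of this weighting convention (the array names and shapes here are illustrative, not from corner's API):

```python
import numpy as np

# Hypothetical sample array: 1000 samples drawn in a 3-dimensional space.
samples = np.random.default_rng(0).normal(size=(1000, 3))

# One weight per sample, not per dimension: weights[i] applies to samples[i, :].
weights = np.ones(samples.shape[0])

print(weights.shape)  # (1000,)
```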


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Thank you very much for that. So how do I create such a weight array to normalize the y axis of the diagonal 1D PDFs for my sample?
Let's assume my sample has a shape of (1000, 3), so the corresponding weights should have shape (1000,), right? I am confused about how to construct such a weight array to do the normalization.


kbarbary avatar kbarbary commented on July 24, 2024

I'm a bit confused: I thought the y axes of the diagonals are not labelled anyway, so I don't see what normalizing these 1-d PDFs would do (other than perhaps changing the y-axis tick locations, which don't mean much).

Maybe you're using an option to get y labels on the diagonals, or maybe defaults have changed since I last looked at it?


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Let me explain more. I am overplotting two sets of samples on top of each other, so without normalizing you can't clearly see the PDFs of the two samples in the 1D histogram diagonals.
If you look at the attached plot, I want to normalize the red and blue so that they appear at the same scale.

[Attached figure: corner plot overlaying red and blue samples; the 1D histograms on the diagonal have very different heights.]


kbarbary avatar kbarbary commented on July 24, 2024

Ah, I didn't understand that you were plotting two sets of samples.

Dan will know better, but I think passing a weight array will indeed affect the relative scaling of the two sets of samples. Assuming you have 1000 samples, try passing 2.0 * np.ones(1000) for weights for one set and np.ones(1000) for the other set and see if it changes the relative scaling.
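A quick way to check this, sketched here with plain NumPy histograms rather than corner itself (the sample arrays are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
set1 = rng.normal(size=1000)  # hypothetical first sample set
set2 = rng.normal(size=1000)  # hypothetical second sample set

# Doubling the weights of one set doubles its histogram mass
# relative to the other, changing the relative scaling of the overlays.
bins = np.linspace(-4.0, 4.0, 21)
n1, _ = np.histogram(set1, bins=bins, weights=2.0 * np.ones(1000))
n2, _ = np.histogram(set2, bins=bins, weights=np.ones(1000))

print(n1.sum() / n2.sum())  # roughly 2
```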


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Yes, it does change the scaling, but it would be great to find something consistent to use for all samples, such that the area under each PDF equals one, or some way to normalize them without guessing: 2.0 * np.ones(1000), and if not then 3.0 * np.ones(1000), etc. Thanks for helping out.


kbarbary avatar kbarbary commented on July 24, 2024

Is the problem that the two sets have different numbers of samples? If so, setting weights=np.ones(nsamples)/nsamples for each set should make the areas under the PDF the same regardless of the value of nsamples.
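Sketched with plain NumPy (the two sample sets of different sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=2000)  # hypothetical set with 2000 samples
b = rng.normal(0.5, 2.0, size=500)   # hypothetical set with 500 samples

# Weights of 1/nsamples give each set a total histogram mass of 1,
# so the overlaid PDFs are comparable regardless of sample count.
na, _ = np.histogram(a, bins=30, weights=np.ones(len(a)) / len(a))
nb, _ = np.histogram(b, bins=30, weights=np.ones(len(b)) / len(b))

print(na.sum(), nb.sum())  # both 1.0
```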


sultan-hassan avatar sultan-hassan commented on July 24, 2024

The nsamples is the same! I tried this, but it doesn't help. There must be a way :( I will keep playing around; let me know if you find something.


dfm avatar dfm commented on July 24, 2024

If nsamples is the same then it is normalized! The integral under each of those histograms is the same.


drphilmarshall avatar drphilmarshall commented on July 24, 2024

Agreed - the blue and red histograms look like good approximations to
normalized PDFs to me!



sultan-hassan avatar sultan-hassan commented on July 24, 2024

So if I want the heights of the 1D histogram diagonals (blue and red) to be the same, should I increase the number of bins, or should I use the same bin width? It should be possible to have them at the same height! The question is how? Still playing around...


drphilmarshall avatar drphilmarshall commented on July 24, 2024


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Great, thanks for that :). Could you point me to those options, 'equalize_peak_heights=True' and 'normalization=1512'? Are these kwargs for hist or hist2d?


drphilmarshall avatar drphilmarshall commented on July 24, 2024


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Cool, I would be very happy to contribute and modify the code, as this routine + emcee have already been a great help in my research. Many thanks to the owner.


sultan-hassan avatar sultan-hassan commented on July 24, 2024

Well, the only reason is that it makes for a much better visualisation when comparing different samples in terms of distribution shape and width. However, if this doesn't seem useful as part of the routine, that's fine with me. But I would still like to know how to do such a thing for myself. Any ideas?


dfm avatar dfm commented on July 24, 2024

Let's get some more details – what are the samples that you're comparing and why is your suggestion better for comparison? If the histograms are properly normalized then a wider distribution will also be "shorter". I think that this actually makes the visualization clearer! I expect that this is also the same reason why neither matplotlib nor numpy has native support for this.

If you want to mock up a change, it will be easiest to do with weights = np.ones(n) and modifying this line (https://github.com/dfm/corner.py/blob/master/corner/corner.py#L248):

y0 = np.array(list(zip(n, n))).flatten()

to

y0 = np.array(list(zip(n, n))).flatten() / np.max(n)

Note that I still stand by my opinion that this would lead to a misleading result!
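To see what the suggested change does, here is that step-plot construction in isolation (the histogram inputs are illustrative; only the final lines mirror the snippet above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, _ = np.histogram(rng.normal(size=1000), bins=20)  # illustrative bin counts

# corner draws the 1D histogram as a step function: zip(n, n) duplicates
# each bin count for the left and right edge of its step.
y0 = np.array(list(zip(n, n))).flatten() / np.max(n)

# Dividing by max(n) forces every histogram to peak at exactly 1,
# i.e. the "equal peak height" behavior under discussion.
print(y0.max())  # 1.0
```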


dfm avatar dfm commented on July 24, 2024

Totally! The current default behavior is actually to normalize to the number density and you can add hist_args=dict(normed=True) to get the PDF behavior.


drphilmarshall avatar drphilmarshall commented on July 24, 2024

That is good to know indeed! I went back and checked the API docs: this
default behavior is not explained anywhere, but perhaps it should be. Also, to
get the PDF behavior do you need to specify hist2d_args=dict(normed=True)
as well? I can't think of any reason you would want to normalize the 2D and
1D histograms differently, can you? Maybe we need a normed=True kwarg
in corner.corner that turns on both 1D and 2D normalizations?

In the meantime, Sultan, it sounds as though you can get the behavior you
want through judicious choice of sample sizes...


dfm avatar dfm commented on July 24, 2024

I agree that it's worth saying something about the default behavior in the docs – I am actually inclined to change the default to density normalization!

I'm not sure what you mean about the "normalization" of the 2D histograms. The contours are always at percentiles of the sample mass. I guess you could choose to have the contours defined in terms of numbers but that's some craziness! I don't want to go there.

I also don't think that you'll ever be able to get the requested behavior by changing the sample size because the peak height in each panel actually depends on the bin sizes and the shape of the distribution. That's the whole reason why it's meaningless to give the "peaks" equal heights!


drphilmarshall avatar drphilmarshall commented on July 24, 2024

Yeah - the word "judicious" can cover a lot of fiddling around... :-)

I'd support a move to probability density normalization by default,
especially since the contour levels are defined in terms of probability
mass! However if someone really was trying to visualize number density, I
guess they might want contours in absolute number density, but I agree
it's better to wait for that to be requested... In the meantime they could
still have the 2D grayscale, 2D scatter plot and 1D histograms all
represent absolute number density (which I think is the current default).
I bet they would still find it useful to be able to switch easily from
"number density" to "probability density" and back, though.



sultan-hassan avatar sultan-hassan commented on July 24, 2024

Well, here's a plot taken from the Greig & Mesinger (2015) 21CMMC paper, where different PDFs are shown with equal peak heights. I thought this was a good way to compare different PDFs and that I might be able to do the same with corner....
[Attached screenshot: PDFs plotted with equal peak heights.]


jtlz2 avatar jtlz2 commented on July 24, 2024

normed is now deprecated in matplotlib; use density instead.
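A minimal NumPy sketch of what density=True does (with corner itself you would pass it through the histogram keyword arguments, e.g. hist_kwargs=dict(density=True) in recent versions; the exact kwarg spelling is an assumption and may differ between versions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)  # illustrative samples

# density=True (the replacement for the deprecated normed=True) rescales
# the counts so the histogram integrates to 1, i.e. estimates a PDF.
counts, edges = np.histogram(x, bins=25, density=True)
area = np.sum(counts * np.diff(edges))

print(area)  # 1.0
```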

