jackbenn / graphing_tools
Assorted tools I've written for general graphing
Create a plot that shows the KDE together with a confidence interval for the density at each point, so that there are lower and upper curves as well as the base curve. Presumably the interval would be wider in low-density areas; the overall distance between the upper and lower bounds would also be a good measure of the uncertainty.
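One possible way to get such a band is a pointwise percentile bootstrap: resample the data, recompute the KDE on a fixed grid, and take quantiles of the resampled curves at each grid point. The sketch below is only an illustration under stated assumptions. The names `gaussian_kde_values` and `kde_with_band` are hypothetical, the kernel is Gaussian, and the bandwidth is a simple rule of thumb held fixed across resamples. Note that a bootstrap band of this kind reflects sampling variability only, not smoothing bias.

```python
import numpy as np

def gaussian_kde_values(sample, grid, bandwidth):
    """Evaluate a Gaussian KDE of `sample` at each point of `grid`."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    norm = len(sample) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm

def kde_with_band(sample, grid, n_boot=200, level=0.95, seed=0):
    """Return (base, lower, upper) curves; the band is a pointwise bootstrap interval."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: rule-of-thumb bandwidth, computed once and reused for every resample.
    bandwidth = sample.std(ddof=1) * len(sample) ** (-1 / 5)
    base = gaussian_kde_values(sample, grid, bandwidth)
    boots = np.empty((n_boot, len(grid)))
    for i in range(n_boot):
        resample = rng.choice(sample, size=len(sample), replace=True)
        boots[i] = gaussian_kde_values(resample, grid, bandwidth)
    tail = (1 - level) / 2
    lower, upper = np.quantile(boots, [tail, 1 - tail], axis=0)
    return base, lower, upper
```

With matplotlib, the band could then be drawn with `ax.fill_between(grid, lower, upper)` underneath `ax.plot(grid, base)`.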
Add an option to encourage the bins to line up on integers, powers of ten, or (less frequently) small powers of two or five times those. There's a tradeoff between getting the right number of bins and minimizing the power of two or five.
Possible multiples of powers of 10:
0: 1
1: 2, 5
2: 2.5, 4
3: 1.25, 8
4: 1.6, 6.25
5: 3.125, 3.2
We then choose the step size that minimizes the sum of
alpha times the number of steps (the level in the list above), plus
the square of the log of the ratio between the requested number of bins and the number of bins that step size produces.
Not sure of the proper value of alpha yet; that will take experimentation, and maybe it will be adjustable. Likely the default will prevent anything past level 1 or 2, and if adjustable the level will probably be capped at 5.
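The selection rule above can be sketched as follows. This is a rough illustration, not the repository's implementation: the function names `mantissas` and `choose_step`, the default `alpha`, the use of the natural log, and the way bins are counted (aligned to multiples of the step) are all assumptions.

```python
import math

def mantissas(level):
    """Multipliers in [1, 10) obtained from 2**level and 5**level (level 0 gives 1)."""
    if level == 0:
        return [1.0]
    out = []
    for base in (2, 5):
        value = float(base ** level)
        while value >= 10:  # rescale by powers of ten into [1, 10)
            value /= 10
        out.append(value)
    return sorted(out)

def choose_step(lo, hi, requested_bins, alpha=1.0, max_level=2):
    """Return (step, n_bins, level) minimizing alpha*level + log(requested/actual)**2."""
    if hi <= lo or requested_bins < 1:
        raise ValueError("need hi > lo and at least one requested bin")
    ideal = (hi - lo) / requested_bins
    exponent = math.floor(math.log10(ideal))
    best = None
    for level in range(max_level + 1):
        for mantissa in mantissas(level):
            # Try the neighbouring powers of ten around the ideal step size.
            for e in (exponent - 1, exponent, exponent + 1):
                step = mantissa * 10.0 ** e
                # Bins are aligned to multiples of step and must cover [lo, hi].
                n_bins = math.ceil(hi / step) - math.floor(lo / step)
                if n_bins < 1:
                    continue
                cost = alpha * level + math.log(requested_bins / n_bins) ** 2
                if best is None or cost < best[0]:
                    best = (cost, step, n_bins, level)
    return best[1:]
```

For data on [0, 100] with 20 requested bins, a small alpha such as 0.1 picks a step of 5, while alpha = 1 makes the level-1 penalty outweigh the mismatch in bin count and falls back to a step of 10.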
Add options for a best-fit line. Consider all the statsmodels options (https://www.statsmodels.org/stable/generated/statsmodels.graphics.gofplots.qqplot.html), in particular the 45-degree line and the regression line, and also a PCA line (since linear regression treats the horizontal and vertical axes differently).
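As an illustration of the PCA line, here is a minimal sketch that takes the direction of the first principal component of the centered points and passes the line through their mean. The function name `pca_line` is hypothetical, and vertical lines are simply rejected here rather than handled.

```python
import numpy as np

def pca_line(x, y):
    """Return (slope, intercept) of the first-principal-component line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.cov(x, y)                      # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    dx, dy = eigvecs[:, -1]                 # direction of largest variance
    if np.isclose(dx, 0.0):
        raise ValueError("principal direction is vertical; slope is undefined")
    slope = dy / dx
    intercept = y.mean() - slope * x.mean()
    return slope, intercept
```

Unlike ordinary least squares, this line is symmetric in the two axes: swapping `x` and `y` gives the reciprocal slope.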
The histogram should be able to do bounded KDEs, bounded on either one side or both. It might offer a choice between reflection and transformation, though there may be a better option. Possible parameters:
kde_bounds: tuple of lower and upper bounds, with None meaning that side is unbounded.
kde_bounding: 'reflect', 'transform'
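A minimal sketch of the 'reflect' option, assuming a Gaussian kernel and a caller-supplied bandwidth: the sample is mirrored across each finite bound, the mirrored points are added to the kernel sum, and the density is set to zero outside the bounds. The function name `reflected_kde` is hypothetical; only `kde_bounds` comes from the proposal above. With a single bound the result integrates to one; with two bounds a single reflection per side is an approximation that works when the bandwidth is small relative to the interval.

```python
import numpy as np

def reflected_kde(sample, grid, bandwidth, kde_bounds=(None, None)):
    """Gaussian KDE on `grid` with reflection at the finite entries of `kde_bounds`."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    lower, upper = kde_bounds
    pieces = [sample]
    if lower is not None:
        pieces.append(2 * lower - sample)  # mirror image across the lower bound
    if upper is not None:
        pieces.append(2 * upper - sample)  # mirror image across the upper bound
    augmented = np.concatenate(pieces)
    z = (grid[:, None] - augmented[None, :]) / bandwidth
    # Normalize by the original sample size, not the augmented one.
    norm = len(sample) * bandwidth * np.sqrt(2 * np.pi)
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / norm
    inside = np.ones(grid.shape, dtype=bool)
    if lower is not None:
        inside &= grid >= lower
    if upper is not None:
        inside &= grid <= upper
    return np.where(inside, density, 0.0)
```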
Numpy has some algorithms for choosing bins. Matplotlib uses them, but multihist doesn't recognize them.
https://numpy.org/devdocs/reference/generated/numpy.histogram_bin_edges.html
Also, it should have an option to detect discrete values and plot those as a bar chart instead.
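The two ideas could be combined roughly as below. This is a sketch under assumptions: `discrete_counts` and `bars_or_bins` are hypothetical names, and the discreteness test (few distinct values, with at least one repeat) is just one plausible heuristic. `np.histogram_bin_edges` is the NumPy function linked above and accepts estimator names such as `"auto"` or `"fd"` for `bins`.

```python
import numpy as np

def discrete_counts(values, max_unique=20):
    """Return (uniques, counts) if the data look discrete, else None."""
    values = np.asarray(values)
    uniques, counts = np.unique(values, return_counts=True)
    # Assumed heuristic: few distinct values and at least one repeated value.
    if len(uniques) <= max_unique and len(uniques) < values.size:
        return uniques, counts
    return None

def bars_or_bins(values, bins="auto"):
    """Return ("bar", uniques, counts) for discrete data, else ("hist", edges)."""
    found = discrete_counts(values)
    if found is not None:
        return ("bar",) + found
    # Fall back to NumPy's bin-selection algorithms for continuous data.
    return ("hist", np.histogram_bin_edges(values, bins=bins))
```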
The QQ and PP plots handle distributions, but the CDF plots and the QP matrix do not.
As part of that, the various plots should switch from scatter to plot so that a line can be drawn between points (appropriate for distribution-vs-distribution plots).
The probability plots aren't quite aligned right. In particular, the spacing on matching distributions should probably be at the midpoints of the bins, and the probability plot shouldn't use a strict inequality.
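One reading of this, sketched below, is to place the i-th of n sorted points at probability (i - 0.5)/n, the midpoint of its bin, and to compute the empirical CDF with a non-strict inequality (the fraction of values less than or equal to x). Both function names are hypothetical, and this may not match exactly what the fix in the repository would look like.

```python
import numpy as np

def midpoint_positions(n):
    """Plotting positions (i - 0.5) / n for i = 1..n, i.e. the midpoints of n equal bins."""
    return (np.arange(1, n + 1) - 0.5) / n

def ecdf(sample, x, strict=False):
    """Fraction of `sample` that is <= x (or < x when strict=True)."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    # side="right" counts values <= x; side="left" counts values < x.
    side = "left" if strict else "right"
    return np.searchsorted(ordered, x, side=side) / len(ordered)
```

For the sample `[1, 2, 2, 3]`, the non-strict version gives 0.75 at x = 2 while the strict one gives 0.25, which shows how much ties can shift the plot.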
The CDF would be more useful with quantile lines (horizontal to the left of the curve, vertical under the curve)
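The geometry of those quantile lines can be sketched without any plotting code: for each quantile q, a horizontal segment runs from the left edge of the data to the curve at height q, and a vertical segment drops from the curve to the axis. The function name `quantile_guides` and the default quantiles are assumptions, and `np.quantile` with its default interpolation is used as a stand-in for inverting the plotted CDF.

```python
import numpy as np

def quantile_guides(sample, quantiles=(0.25, 0.5, 0.75)):
    """Return, per quantile, the endpoints of the horizontal and vertical guide lines."""
    sample = np.asarray(sample, dtype=float)
    x_left = float(sample.min())
    guides = []
    for q in quantiles:
        x_q = float(np.quantile(sample, q))
        guides.append({
            "q": q,
            # Horizontal: from the left edge to the curve, at height q.
            "horizontal": ((x_left, q), (x_q, q)),
            # Vertical: from the axis up to the curve, at x = x_q.
            "vertical": ((x_q, 0.0), (x_q, q)),
        })
    return guides
```

With matplotlib, these segments could be drawn with `ax.hlines(q, x_left, x_q)` and `ax.vlines(x_q, 0, q)`.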
Also it should match multihist better in its options.