Comments (2)
The shape of a histogram depends on how the bins are defined. You can see in the lets-plot graph that the leftmost bin starts at around -2.5 and the rightmost bin ends at about 40. For the pandas plot, the bins range from 0 to about 38. While this hardly changes the overall shape, it changes how cases are distributed between the first two bins, which explains the observed reversal in the column heights.
If I compare the two graphs, I would say that the bins lets-plot chooses are not appropriate for this data set. In the lets-plot graph, the leftmost bin runs from -2.5 to 2.5, but the population count can only be a positive number, so half of that bin's range does not represent any data.
The documentation for geom_histogram (https://lets-plot.org/pages/api/lets_plot.geom_histogram.html) shows that there are keyword arguments that let you control the positioning of the bins. Try, for example, boundary=0 and see how that changes the overall shape.
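To illustrate why the bin boundary matters, here is a minimal pure-Python sketch. The data values, the bin width of 5, and the helper `bin_counts` are all made up for illustration; it does not reproduce lets-plot's internal binning, only the general effect of shifting bin edges on strictly positive data.

```python
import math
from collections import Counter

def bin_counts(values, binwidth, boundary):
    """Count values per bin, keyed by each bin's left edge.

    Bins are half-open intervals [boundary + k*binwidth, boundary + (k+1)*binwidth).
    This convention is an assumption of the sketch, not taken from lets-plot.
    """
    counts = Counter()
    for v in values:
        k = math.floor((v - boundary) / binwidth)
        counts[boundary + k * binwidth] += 1
    return dict(sorted(counts.items()))

# Hypothetical, strictly positive data (not the data set from the issue)
values = [0.5, 1.0, 2.0, 3.0, 3.5, 4.0, 4.5, 6.0, 7.0, 9.0, 12.0]

# First bin centered on 0, so it spans -2.5 to 2.5 and half of it is empty
centered = bin_counts(values, binwidth=5, boundary=-2.5)

# First bin starts at 0, so its whole range can hold data
aligned = bin_counts(values, binwidth=5, boundary=0)

print(centered)  # {-2.5: 3, 2.5: 6, 7.5: 2}
print(aligned)   # {0: 7, 5: 3, 10: 1}
```

With the centered bins the second column is taller than the first; with bins starting at 0 the order flips, which is the same kind of reversal seen between the two plots.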
If the datasets aren't too large, I like to use density plots. They often give a more informative picture of distributions.
Examples:
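As a rough sketch of what a density plot computes, the following pure-Python code builds a Gaussian kernel density estimate. The data values, the bandwidth of 1.5, and the helper `gaussian_kde` are assumptions chosen for illustration; plotting libraries normally pick the bandwidth automatically and may use a different estimator.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

# Hypothetical data, not the data set from the issue
values = [0.5, 1.0, 2.0, 3.0, 3.5, 4.0, 4.5, 6.0, 7.0, 9.0, 12.0]
density = gaussian_kde(values, bandwidth=1.5)

# Evaluate on a grid from -10 to 30 in steps of 0.1
step = 0.1
grid = [i * step for i in range(-100, 301)]
curve = [density(x) for x in grid]

# The area under the curve should be close to 1
area = sum(curve) * step
print(round(area, 3))
```

Unlike a histogram, the curve does not depend on where bin edges fall, only on the bandwidth. Note that the estimate still spreads a little mass below 0 even though the data are positive.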
from practical-statistics-for-data-scientists.
Hi @gedeck, thanks for replying to the issue so quickly. Oh dear, how careless of me to miss that detail :( I will close this issue since it's just an error in my code.