gedeck / mistat-code-solutions

Code repository for "Modern Statistics: A Computer Based Approach with Python" and "Industrial Statistics: A Computer Based Approach with Python"

Jupyter Notebook 99.06% Python 0.90% Makefile 0.01% SCSS 0.01% HTML 0.03%

python statistics

mistat-code-solutions's Issues

Add information on how to create a pywash environment with docker

Clarify Granger causality test

Describe the problem
The Granger causality test needs clarification of why the p-values are greater than the significance level. The statsmodels documentation states:

The Null hypothesis for grangercausalitytests is that the time series in the second column, x2, does NOT Granger cause the time series in the first column, x1. Granger causality means that past values of x2 have a statistically significant effect on the current value of x1, taking past values of x1 into account as regressors. We reject the null hypothesis that x2 does not Granger cause x1 if the pvalues are below a desired size of the test.


Be more specific about required packages to install

Modern Statistics: PMF vs PDF

Describe the problem

Chapter 2, Page 55, section 2.2.1.1. The bolded words after equation 2.13: "probability distribution function," shouldn't that be called "probability mass function"?

MS-chapter 3: qqplot now creates only 5 lines

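Change to the code that creates the figures:

import matplotlib.pyplot as plt
import numpy as np
import pingouin as pg
from scipy import stats

np.random.seed(1)
x = stats.norm(loc=10, scale=1).rvs(50)
fig, ax = plt.subplots(figsize=[5, 5])
pg.qqplot(x, ax=ax)
# Restyle the lines drawn by pg.qqplot (indices as in the original snippet)
ax.get_lines()[0].set_color('grey')
ax.get_lines()[0].set_markerfacecolor('none')
ax.get_lines()[1].set_color('black')
ax.get_lines()[2].set_color('grey')
ax.get_lines()[3].set_color('grey')
plt.show()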

Python: update dtreeviz dependent code to version 2.0

Version 2.0 caused breaking changes to the code

Industrial Statistics: Require new solution for kriging code

Describe the problem
The pylibkriging package causes issues during installation. Find an alternative.

Industrial Statistics: Add notebook to install all packages

Describe the problem
File is missing for Industrial statistics

Consider adding --user to the pip install commands

Modern Statistics: Se clarification

Chapter 4, page 251, states: "The square roots of these variances estimates are the 'std err'.... The Se value is shown in the regression summary output as Scale."
Page 250 states that Se^2 = 5.8869.
Page 248 shows Scale: 5.8832 in the results summary.
The small difference between 5.8869 and 5.8832 is probably rounding error. Nonetheless, sqrt(5.8832) = 2.426, so Se =/= Scale.

Problem:
The Scale output cannot be both variance and standard error.

https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLSResults.scale.html
Note that the square root of scale is often called the standard error of the regression.

Solution:
The square roots of these variances estimates are the "std err".... The Se^2 value is shown in the regression summary output as Scale.

Modern Statistics: Chapter 8 code

Describe the problem
Chapter 8 has deprecation warnings and failures.

Suggested change
Check and fix

Modern Statistics: pandas iteritems deprecated

Describe the problem
The pandas method iteritems is deprecated, which causes the code in Example 7.3 to fail.

Suggested change
Replace

  df = pd.DataFrame([
    {satisfaction: counts for satisfaction, counts
      in response.value_counts().iteritems()},
    {satisfaction: counts for satisfaction, counts
      in response[q1_5].value_counts().iteritems()},
  ])

with

  df = pd.DataFrame([
    {satisfaction: counts for satisfaction, counts
      in response.value_counts().items()},
    {satisfaction: counts for satisfaction, counts
      in response[q1_5].value_counts().items()},
  ])
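
A self-contained illustration of the same replacement on a toy Series. The data and the name `counts_by_level` are made up for this sketch; only the switch from `iteritems()` to `items()` comes from the issue.

```python
import pandas as pd

# Toy survey responses (illustrative assumption, not the data from Example 7.3)
response = pd.Series(["high", "low", "high", "medium", "high", "low"])

# Series.items() yields (index, value) pairs and replaces the older iteritems()
counts_by_level = {
    level: count for level, count in response.value_counts().items()
}

print(counts_by_level)
```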

Add information on how to install pywash using docker

Modern Statistics: ...

Describe the problem
Ask Springer to change this line on their website:

The mistat Python package can be accessed at https://gedeck.github.io/mistat-code-solutions/ModernStatistics/

Link should be clickable and should reference as source for code and solutions.

Suggested change
???

Modern Statistics: ...

Describe the problem
Chapter 3 Page 162 Equation 3.30

The denominator of the lower limit is missing "/2" in the subscript of the chi-square symbol. It should appear as $\chi^2_{1-\alpha/2}[n-1]$.

Chapter 3 page 152

Describe the problem
The P-value is explained at the top of page 152, but there is no reference to it in the index.

Suggested change
Add "p. 152" to the P-value entry in the index on page 437.

Remove Latex formatting from code

latex=True
table formatting

Modern Statistics: Updates required

Describe the problem
The dtreeviz API has changed.

Improve information about installing graphviz on Windows

https://iotespresso.com/how-to-install-graphviz-on-windows/

Industrial Statistics: Find alternative to Gaussian Process models

Describe the problem
The package used in the book seems to have low support. Check alternative GP models.

Add notebook that installs required packages

Industrial Statistics: Prepare new zip files and make sure that

Describe the problem
Code has changed. New zip files are required. Also update the Install_packages notebook.

Modern Statistics: statsmodels API for anova has changed

Describe the problem
Error below when using mistat.stepwise_regression(outcome=y, all_vars=X, data=df3)

Software Error
File ~\mistat\regression\stepwiseRegression.py:19, in find_best_model_partialF(outcome, variable_sets, data, old_model, opt_max)
17 with warnings.catch_warnings():
18 warnings.simplefilter("ignore")
---> 19 comparison = sms.anova.anova_lm(old_model, new_model)
20 if optF * partialF < optF * comparison.F[1]:
21 best_vars = variables

AttributeError: module 'statsmodels.stats' has no attribute 'anova'

Suggested change
import statsmodels.api as sm
---> 19 comparison = sm.stats.anova_lm(old_model, new_model)

Software (if the problem is with the code):
OS: Windows
Release: 10
Python implementation: CPython
Python version : 3.11.0
IPython version : 8.12.0
statsmodels: 0.14.0

Industrial Statistics: Minor typo in Example 9.17

Describe the problem
Example 9.17 Using the censored data from Exercise 9.16, we estimate the ....

should be:

Example 9.17 Using the censored data from Example 9.16, we estimate the ....

Errata page

Modern Statistics: section 3.4, correct confidence level in equation 3.30

Describe the problem
Equation 3.30 is incorrect

Suggested change
The term $\chi_{1-\alpha}^2[n-1]$ needs to be changed to $\chi_{1-\alpha/2}^2[n-1]$.
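
For reference, a sketch of the standard two-sided confidence interval for a normal variance, which uses the quantiles at 1 - alpha/2 and alpha/2. This is the textbook form of the interval, assumed here to be what equation 3.30 is meant to express; the function name and the sample are hypothetical.

```python
import numpy as np
from scipy import stats


def variance_confidence_interval(sample, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for the variance of a normal sample."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    s2 = sample.var(ddof=1)  # unbiased sample variance
    # The lower limit divides by the upper quantile chi2_{1 - alpha/2}[n - 1]
    upper_quantile = stats.chi2.ppf(1 - alpha / 2, df=n - 1)
    # The upper limit divides by the lower quantile chi2_{alpha/2}[n - 1]
    lower_quantile = stats.chi2.ppf(alpha / 2, df=n - 1)
    return (n - 1) * s2 / upper_quantile, (n - 1) * s2 / lower_quantile


# Hypothetical sample for illustration
rng = np.random.default_rng(3)
sample = rng.normal(loc=10, scale=2, size=30)
lower, upper = variance_confidence_interval(sample)
print(f"S^2 = {sample.var(ddof=1):.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```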

Special layout for each book

Make it clearer which book each directory refers to.

Industrial Statistics: Clarify interaction plot

Describe the problem
Improve description of interaction plot figure.

Hints on how to deal with Errors and Warnings

Add information at the top of the files on what to do when Errors or Warnings occur.

Warnings can usually be ignored
Errors - google for the error and recommend fixes
Check the repository for updated code

Include date of last change in notebook exports

Add date of last change to notebook exports
Find way of having a last update date on the github.io pages

Industrial Statistics: Make binder more prominent

Make binder link similar to Modern Statistics

Add links to Google colab

Add links like:

https://colab.research.google.com/github/gedeck/mistat-code-solutions/blob/main/IndustrialStatistics/notebooks/Chap002.ipynb

Example:
https://github.com/UVADS/DS1001/blob/master/ddsbook/analytics-lab-III.ipynb

Will require adding pip install statements at the start
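
The link pattern above could be generated for every notebook with a small helper. This is only a sketch: the function name `colab_url` and its defaults are assumptions, derived from the one example URL given in this issue.

```python
def colab_url(book, notebook, user="gedeck", repo="mistat-code-solutions", branch="main"):
    """Build a Google Colab link for a notebook stored in a GitHub repository.

    The path layout <book>/notebooks/<notebook> is assumed from the example link.
    """
    return (
        "https://colab.research.google.com/github/"
        f"{user}/{repo}/blob/{branch}/{book}/notebooks/{notebook}"
    )


# Reproduces the example link for chapter 2 of Industrial Statistics
print(colab_url("IndustrialStatistics", "Chap002.ipynb"))
```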

Industrial Statistics: p. 135 description of constraints for matrix W

Describe the problem
Error on page 135 in definition of W matrix constraints

Suggested change

Modern Statistics: Ch. 6 pg. 334 code inconsistency.

For the model_3 object in the code block, the formula argument of smf.ols() references the wrong formula object. It should read formula instead of poly_formula, or vice versa.

Looking at the notebook at the following link: https://github.com/gedeck/mistat-code-solutions/blob/main/ModernStatistics/notebooks/Chap006.ipynb, it appears poly_formula is the selected object name.

This is just an object naming inconsistency in the textbook.

File with all data sets

Both books

Industrial Statistics: Add installpackages.ipynb to zip file

InstallPackages.ipynb for Industrial Statistics

Review all notebooks

Some code breaks. Review all notebooks.

Modern Statistics: Proportional Sample Allocation Ch. 5 pg 317 Equation Clarification

The equation on page 317 in the top section under V$_{\text{simple}}$ ...

where
$\tilde{\sigma}_N^2 = \frac{N}{N-1}\sigma_N^2$

Should it read:
$\tilde{\sigma}_{N_i}^2 = \frac{N_i}{N_i-1}\sigma_{N_i}^2$

I get the sense that it should because of how the equation for $\bar{\sigma}_N^2$ on page 318 is solved.

Add page: Where to go from here

Add a page to guide readers on what they can do next (see predictive analytics FAQ for a start)

Github copilot
Chat-GPT2

Modern Statistics: Check for warnings during tests and schedule running tests

Describe the problem
Deprecation warnings are not identified during tests. See if we can add code to identify warnings to address early on.
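
One possible way to surface such warnings is to record them while the code under test runs. The sketch below uses only the standard library; `deprecated_helper` and `collect_deprecations` are hypothetical names, not functions from the mistat package. If the tests run under pytest, the option `-W error::DeprecationWarning` is another route, since it turns these warnings into failures.

```python
import warnings


def deprecated_helper():
    """Stand-in for library code that emits a deprecation warning (hypothetical)."""
    warnings.warn("use new_helper instead", DeprecationWarning, stacklevel=2)
    return 42


def collect_deprecations(func):
    """Run func and return the deprecation-style warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        func()
    return [
        w for w in caught
        if issubclass(w.category, (DeprecationWarning, FutureWarning))
    ]


found = collect_deprecations(deprecated_helper)
for w in found:
    print(f"{w.category.__name__}: {w.message}")
```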
