Section Recap

Introduction

This short lesson summarizes the topics we covered in section 20 and why they'll be important to you as a data scientist.

Objectives

You will be able to:

  • Understand and explain what was covered in this section
  • Understand and explain why this section will help you become a data scientist

Key Takeaways

Some of the key takeaways from this section include:

  • It's important to have a sound approach to experimental design to be able to determine the significance of your findings
  • Start by examining any existing research to see if it can shed light on the problem you're studying
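  • Start with clearly stated null and alternative hypotheses for your experiment to test; the data can lead you to reject or fail to reject the null hypothesis, but it never strictly "proves" either one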
  • It's important to have a thoughtfully selected control group from the same population for your trial to distinguish effect from variations based on population, time or other factors
  • Sample size needs to be selected carefully so that your test has a good chance of detecting a real effect as statistically significant
  • Your results should be reproducible by other people and with different samples from the population
  • The p-value is the probability of observing an outcome at least as extreme as the one you saw, assuming the null hypothesis is true
  • The alpha value is the significance threshold: we reject the null hypothesis when the p-value falls below it
  • An alpha of 0.05 is a common choice for many experiments
  • Effect size measures just the size of the difference between two groups under observation, whereas statistical significance depends on both effect size and sample size
  • A one-sample t-test is used to determine whether a sample comes from a population with a specific mean
  • A two-sample t-test is used to determine if two population means are equal
  • Type 1 errors (false positives) are when we reject a null hypothesis which is actually true
  • The alpha that we pick is the probability of making a type 1 error when the null hypothesis is true
  • Type 2 errors (false negatives) are when we fail to reject a null hypothesis which is actually false
  • Beta is the probability of making a type 2 error when the null hypothesis is false; unlike alpha, it isn't picked directly but follows from alpha, the effect size and the sample size
  • The power of a statistical test is defined as the probability of rejecting the null hypothesis, given that it is indeed false
  • You should avoid swimming pools any year that Nicolas Cage appears in multiple films (kidding!)
  • Spurious correlation is always a risk, particularly when comparing many variables at once
  • It's important to use corrections such as the Bonferroni Correction to deal with multiple comparisons
  • Goodhart's law: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." --Charles Goodhart
  • ANOVA (Analysis of Variance) tests the hypothesis that the means of two or more populations are equal
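
A few of the takeaways above (effect size, the two-sample t-test, and the Bonferroni correction) can be sketched in a handful of lines of Python. This is only an illustrative sketch using the standard library: the function names and the `control` / `treatment` numbers are made up for the example and are not from the lesson, and the t statistic shown is the equal-variance (Student's) form. In practice you would more likely reach for a statistics library to get the p-value as well.

```python
import math
import statistics

# Hypothetical measurements, invented purely for illustration.
control = [10, 12, 11, 13, 14]
treatment = [12, 14, 13, 15, 16]


def cohens_d(a, b):
    """Effect size: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def two_sample_t(a, b):
    """Student's two-sample t statistic (assumes equal population variances).

    Written in terms of Cohen's d to show that the test statistic combines
    effect size with sample size.
    """
    return cohens_d(a, b) / math.sqrt(1 / len(a) + 1 / len(b))


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni correction: compare each p-value to alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


if __name__ == "__main__":
    # The t statistic here is approximately 2.0.
    print("Cohen's d:", cohens_d(treatment, control))
    print("t statistic:", two_sample_t(treatment, control))

    # Doubling the sample (same values repeated) leaves the mean difference
    # unchanged but increases the t statistic.
    print("t with doubled sample:", two_sample_t(treatment * 2, control * 2))

    # With four comparisons, only p-values below 0.05 / 4 = 0.0125 are rejected.
    print(bonferroni_reject([0.001, 0.02, 0.04, 0.3]))
```

Note how the same difference in means yields a larger t statistic once the sample grows, which is why a tiny effect can still come out "significant" with enough data, and why effect size is worth reporting alongside the p-value.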
