Comments (6)
Agreed. Should we use the nycflights13 data? It's a good one for a lab that doesn't involve inference.
from oilabs-base-r.
You'd have to change the Normal distribution lab as well. And I feel the data set is fine; perhaps just choose a different outcome variable.
@norcalbiostat we could use different datasets for the two labs though, so I don't think we need to feel limited to variables that are normally distributed for the intro to data lab.
But I think @norcalbiostat's point is that the body dimensions data in the Normal Distribution lab has the same problem.
I should admit I haven't used the normal distribution lab in a while, so I should first correct myself - the two labs don't use the same dataset anyway.
I feel like the issue with the intro to data lab is the wdiff variable, which we then compare between men and women. The normal distribution lab compares heights, briefly, but beyond that doesn't go into comparing people's desired weights, so perhaps it's a bit more factual and a bit less about body image?
I'm completely on board with changing the dataset for the intro to data lab, as I think that lab can be enhanced to be more about data wrangling skills (in addition to resolving the issue @beanumber raised). And I'm also on board with changing the data in the normal distribution lab because it's not that exciting (likely the reason why I haven't been doing that lab lately...). But if we're prioritizing, it seems like the intro to data lab has the more urgent issue to address.
I'm all for refreshing data sets, but the challenge is always finding a replacement that is better. And there's often that unfortunate trade-off between data that clearly illustrate a statistical principle and data that is most interesting (please oh please, let us find a population level data set so we can replace the ames data).
I think a data wrangling lab based on the nycflights13 data would be terrific. It has heterogeneous data types and is interesting enough to naturally motivate several different questions and analyses. It also has that nice opportunity to define on-time performance in multiple ways, so it's an improvement on wdiff that way. If this lab were to replace lab 1, it's important that it cover some of the key points of chapter 1. It could also be cool to have it go off in its own data sciency direction, but then it would probably work best as an extra lab.
If I remember correctly, the main thing in favor of the bdims data set is that it's a collection of continuous variables that exhibit a mix of symmetric and skewed distributions. I think we should keep our eyes out for a more interesting replacement, but I have nothing on hand right now.