
dataedu


The goal of {dataedu} is to provide readers of Data Science in Education Using R with a package with useful functions, data, and references from the book.

Installation

1. Install {remotes}

First, let’s install {remotes}. If you already have {remotes} installed, you can move on to the next step.

install.packages("remotes")

2. Install {dataedu}

You can install the development version of {dataedu} by running this in your RStudio console:

remotes::install_github("data-edu/dataedu")

Important Notes on Installation

  • {dataedu} requires R 3.6 or above to be installed.

  • {dataedu} has other packages that it needs to be able to run. You can see the full list under “Imports” (imported when downloading the package) and “Suggests” (we think you should include these too!) in the DESCRIPTION file.

  • We recommend first checking to see if your packages are all up-to-date if you are running into issues with installation. If you have installed the imported/suggested packages previously and have not updated them in a while, RStudio may prompt you to update them. You can choose to (1) ignore this prompt, (2) exit the prompt and update your packages, or (3) try to update your packages through the prompt. It’s usually easier to exit and update your packages outside the prompt (one way to do this is to go to the RStudio Packages pane and click Update, then select the packages you’d like to update).
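
The version requirement and the update advice above can be checked from the console before installing. The following is a minimal sketch using only base R; the helper name `meets_min_r()` is hypothetical and not part of {dataedu}.

```r
# Hypothetical helper (not part of {dataedu}): TRUE when the running
# R version is at least `min_version`.
meets_min_r <- function(min_version = "3.6.0") {
  getRversion() >= min_version
}

if (!meets_min_r()) {
  message("R ", as.character(getRversion()),
          " is older than 3.6; please upgrade R first.")
}

# To update installed packages outside the RStudio prompt, run this
# interactively (it needs a CRAN mirror and network access):
# update.packages(ask = FALSE)
```

The update call is left commented out so that the check itself has no side effects.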

3. Call the Package

Before you can use the package, make sure to call it using library():

library(dataedu)

Package Contents

We created this package so that readers can jump into R however they see fit. It provides:

  1. Mass installation of all the packages used in the book
  2. Reproducible code for the walkthroughs
  3. Access to the data used in each of the walkthroughs
  4. The dataedu theme and color palette for reuse

Mass Installation of Packages

We strived to use packages that we use in our daily work when creating the walkthroughs in the book. Because we covered a variety of subjects, that means we used a lot of packages! As described in the Foundational Skills chapter, you can install the packages individually as they suit your needs.

However, if you want to get started quickly and download all the packages at once, please use install_dataedu().

dataedu::install_dataedu()

To see the packages used in the book, run:

dataedu::dataedu_packages
#>  [1] "apaTables"   "caret"       "dummies"     "e1071"       "ggraph"     
#>  [6] "here"        "janitor"     "lme4"        "lubridate"   "performance"
#> [11] "ranger"      "readxl"      "rtweet"      "randomNames" "sjPlot"     
#> [16] "textdata"    "tidygraph"   "tidylog"     "tidyverse"   "tidytext"
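
If you would rather install only what is missing, a sketch along these lines may help. It assumes the package names are copied by hand from the output above, so it runs even before {dataedu} is installed; `book_packages` and `missing_packages` are illustrative names.

```r
# Package names copied from the `dataedu::dataedu_packages` output above.
book_packages <- c(
  "apaTables", "caret", "dummies", "e1071", "ggraph",
  "here", "janitor", "lme4", "lubridate", "performance",
  "ranger", "readxl", "rtweet", "randomNames", "sjPlot",
  "textdata", "tidygraph", "tidylog", "tidyverse", "tidytext"
)

# Names of packages already present in the local library.
installed <- rownames(installed.packages())

# Book packages that still need to be installed (may be empty).
missing_packages <- setdiff(book_packages, installed)
missing_packages

# Install only the missing ones (uncomment to run; needs network access):
# install.packages(missing_packages)
```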

A special note on {tabulizer}: One of the walkthroughs uses {tabulizer}, a package created by rOpenSci to read PDFs. {tabulizer} requires the installation of rJava, which can be a tricky process. {tabulizer} is not included in install_dataedu(), and we recommend reading through the notes on its GitHub repo before installing it.

Reproducible Code for Walkthroughs

Coming soon!

Accessing the Walkthrough Data

To get the data, run dataedu:: then the dataset as it is named in the book:

dataedu::course_data

To see all the datasets available in the package, run data(package = "dataedu").

# this is to print the results for the README
# only `data(package = "dataedu")` is needed to see this list
a <- data(package = "dataedu")
a$results[, 3:4]
#>       Item                                 
#>  [1,] "all_files"                          
#>  [2,] "bchildcountandedenvironments2012"   
#>  [3,] "bchildcountandedenvironments2013"   
#>  [4,] "bchildcountandedenvironments2014"   
#>  [5,] "bchildcountandedenvironments2015"   
#>  [6,] "bchildcountandedenvironments2016"   
#>  [7,] "bchildcountandedenvironments2017_18"
#>  [8,] "child_counts"                       
#>  [9,] "course_data"                        
#> [10,] "course_minutes"                     
#> [11,] "district_merged_df"                 
#> [12,] "district_tidy_df"                   
#> [13,] "frpl_pdf"                           
#> [14,] "ma_data_init"                       
#> [15,] "pre_survey"                         
#> [16,] "race_pdf"                           
#> [17,] "sci_mo_processed"                   
#> [18,] "sci_mo_with_text"                   
#> [19,] "tt_tweets"                          
#>       Title                                                                     
#>  [1,] "Walkthrough 04 - Students with Disabilities Counts - Combined List"      
#>  [2,] "Walkthrough 04 - Students with Disabilities Counts - 2012"               
#>  [3,] "Walkthrough 04 - Students with Disabilities Counts - 2013"               
#>  [4,] "Walkthrough 04 - Students with Disabilities Counts - 2014"               
#>  [5,] "Walkthrough 04 - Students with Disabilities Counts - 2015"               
#>  [6,] "Walkthrough 04 - Students with Disabilities Counts - 2016"               
#>  [7,] "Walkthrough 04 - Students with Disabilities Counts - 2017-18"            
#>  [8,] "Walkthrough 04 - Students with Disabilities Counts - Combined Data Frame"
#>  [9,] "Walkthrough 01 - Course Data"                                            
#> [10,] "Walkthrough 01 - Course Minutes"                                         
#> [11,] "Walkthrough 03 - Merged Ethnicity and FRPL District Data"                
#> [12,] "Walkthrough 03 - Merged and Tidy Ethnicity and FRPL District Data"       
#> [13,] "Walkthrough 03 - Tabulizer Output from FRPL PDF"                         
#> [14,] "Foundational Skills Data"                                                
#> [15,] "Walkthrough 01 - Pre-Survey"                                             
#> [16,] "Walkthrough 03 - Tabulizer Output from Race PDF"                         
#> [17,] "Walkthrough 01 - Student Motivation (Processed)"                         
#> [18,] "Walkthrough 01 - Student Motivation (Processed and With Text)"           
#> [19,] "Walkthrough 12 - Tweet Data"
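
The same listing pattern works for any installed package. The sketch below uses the built-in {datasets} package so that it runs without {dataedu}; replacing "datasets" with "dataedu" should give the table shown above, assuming {dataedu} is installed.

```r
# `data(package = ...)` returns an object whose `results` element is a
# character matrix with columns Package, LibPath, Item, and Title.
listing <- data(package = "datasets")

# Select the columns by name rather than by position.
datasets_table <- listing$results[, c("Item", "Title")]
head(datasets_table)
```

Selecting by column name avoids relying on the position of the Item and Title columns.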

If you would like to download the data in a format other than .Rds (RData), CSV and JSON versions are available under inst/extdata. Please note that all_files is not included because the file would be too large.
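
Once the package is installed, the contents of inst/extdata sit in the package's top-level extdata folder. The sketch below lists whatever CSV and JSON files are found there; it makes no assumption about individual file names, and it returns an empty vector if {dataedu} is not installed.

```r
# system.file() returns "" when the package or folder cannot be found.
extdata_dir <- system.file("extdata", package = "dataedu")

# List the CSV and JSON files, searching subfolders as well.
extdata_files <- if (nzchar(extdata_dir)) {
  list.files(extdata_dir, pattern = "\\.(csv|json)$",
             recursive = TRUE, full.names = TRUE)
} else {
  character(0)
}
extdata_files
```

Any path in `extdata_files` ending in .csv can then be passed to read.csv().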

Using the {dataedu} Theme and Palette

Add the theme and palette to ggplot2-based plots using theme_dataedu() and scale_*_dataedu().

  • Note: The DataEdu theme uses {showtext} to render the font. If you would like to use it in an R Markdown chunk, please ensure that the chunk lists fig.showtext = TRUE. If you would like to use it in a standalone R script, then you will need to use a different graphics device. More information is available in the documentation here.

library(ggplot2)
library(dataedu)

ggplot(midwest, aes(x = area, y = popdensity, color = state)) +
  geom_point() +
  theme_dataedu() +
  scale_color_dataedu()

The font for the DSIEUR graphs is Cabin and available here. The code to load the font with the package is heavily based on the code from Guangchuang Yu’s extrafont package - thank you!

dataedu's Issues

Error in installation using devtools::install_github("data-edu/dataedu")

Brief description of the problem

> devtools::install_github("data-edu/dataedu")
Downloading GitHub repo data-edu/dataedu@master
sh: 1: /bin/gtar: not found
sh: 1: /bin/gtar: not found
Error: Failed to install 'dataedu' from GitHub:
  error in running command
In addition: Warning messages:
1: In system(cmd) : error in running command
2: In utils::untar(tarfile, ...) : /bin/gtar -xf '/tmp/RtmpCrd0sR/file33f81e7ec26a.tar.gz' -C '/tmp/RtmpCrd0sR/remotes33f8291625ea' returned error code 127

please solve this
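
One workaround often reported for this kind of error is sketched below. It is an assumption, not something confirmed in this thread, that the TAR environment variable points at /bin/gtar on the affected machine; utils::untar() consults that variable to find the tar program.

```r
# Assumption: TAR is set to /bin/gtar, which does not exist on this system.
Sys.getenv("TAR")

# Point TAR at whichever tar is on the PATH, if there is one.
tar_path <- Sys.which("tar")
if (nzchar(tar_path)) {
  Sys.setenv(TAR = unname(tar_path))
}

# Then retry the installation (needs network access):
# remotes::install_github("data-edu/dataedu")
```

The setting lasts only for the current R session.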

Add the textdata package?

Should we add the {textdata} package to the list of packages installed with the mass installation function? It is used in chapter 11 of the DSIEUR book.

folder with grades missing

Walkthrough 2: There's no gradebooks folder in 'data' or anywhere else. Hence, there are no files with grades.

Gill Sans not loading on Macs

Referencing post: data-edu/data-science-in-education#519

Gill Sans MT is a font that is bundled only on some Macs, and {extrafont} doesn't necessarily load it unless it's already on the user's computer. To be able to use theme_dataedu(), some Mac users have to download Gill Sans MT, use extrafont::font_import(), then loadfonts(). Not sure if this is also the case with Windows users.

Need to decide how to fix this in the package, either by directing users to download the font, throwing an informative warning if the font is not installed on their computers and letting users render the plot with the default font, or changing Gill Sans to a font that doesn't have these issues.

Installation throws an error if you don't have `simstudy` already installed

When I used devtools::install_github("data-edu/dataedu") I got this error:

Error in library(simstudy) : there is no package called ‘simstudy’
Error: unable to load R code in package ‘dataedu’
Execution halted

I installed simstudy and that seemed to fix the problem. Is there a way we can require simstudy when installing the package?

errors on sample codes

The code for roster on page 50 does not execute, and neither does the sample code on page 57. The error suggests that group_by is not applicable to the character variable District Name. This is frustrating when learning R; I thought the sample code had been tested. The whole idea of the book seems very practical.

minor - add date to DESCRIPTION?

Just noticed this warning when generating citations for the packages used in the book:

knitr::write_bib(c("bookdown", "tidyverse", "dplyr", "tidyr", 
                 "ggplot2", "sjPlot", "lme4", "ggraph", "tidygraph", 
                 "caret", "readxl", "here", "lubridate", "dummies",
                 "janitor", "dataedu", "tidyverse", dataedu:::all_packages), "packages.bib")
#> Warning in citation(pkg, auto = if (pkg == "base") NULL else TRUE): no date
#> field in DESCRIPTION file of package 'dataedu'

Created on 2020-03-01 by the reprex package (v0.3.0)

R CMD check issues

Starting from the latest Travis build #35, there are several remaining items to address in R CMD check:

  • Warning: Several LaTeX errors with .Rd files ending in \end{description}, possibly because @description is meant to be the second paragraph of each documentation block. Lines:
  • Warning: Several LaTeX errors for 'Overfull \hbox'. Lines:
  • Error: Error in texi2dvi, unable to run ‘pdflatex’ on ‘Rd2.tex’
    • L7319 says ‘Error in running tools::texi2pdf()’. Since L699 doesn’t list {tools} as an attached base package and it’s not listed on L825 as one of the 94 installed packages, we might need to add {tools} to .travis.yml.
  • Warning: Non-standard license specification, CCBY-4.0
  • Note: No visible global function definitions for several functions from L8360 to L8375
  • Note: Undefined global functions or variables from L8376 to L8378
  • Warning: Missing documentation entry for wt05_aggdat_merged_dat on L8386
  • Warning: Code/documentation mismatch for tt_tweets on L8394
  • Warning: ‘processed’ and ‘raw’ folders in the ‘data’ directory
    • Travis suggests moving these to ‘inst/extdata’
