learntidymodels's Introduction

tidymodels

Overview

tidymodels is a “meta-package” for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse.

It includes a core set of packages that are loaded on startup:

  • broom takes the messy output of built-in functions in R, such as lm, nls, or t.test, and turns it into tidy data frames.

  • dials has tools to create and manage values of tuning parameters.

  • dplyr contains a grammar for data manipulation.

  • ggplot2 implements a grammar of graphics.

  • infer is a modern approach to statistical inference.

  • parsnip is a tidy, unified interface to creating models.

  • purrr is a functional programming toolkit.

  • recipes is a general data preprocessor with a modern interface. It can create model matrices that incorporate feature engineering, imputation, and other help tools.

  • rsample has infrastructure for resampling data so that models can be assessed and empirically validated.

  • tibble is a modern re-imagining of the data frame.

  • tune contains the functions to optimize model hyper-parameters.

  • workflows has methods to combine pre-processing steps and models into a single object.

  • yardstick contains tools for evaluating models (e.g. accuracy, RMSE).

A list of all tidymodels functions across different CRAN packages can be found at https://www.tidymodels.org/find/.
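As a rough illustration of how several of the packages listed above fit together, here is a minimal sketch that fits and evaluates a linear regression on the built-in mtcars data. The dataset, the chosen predictors, and the object names (car_split, car_recipe, and so on) are assumptions made for this example and do not come from the tidymodels documentation.

```r
library(tidymodels)

set.seed(123)

# rsample: hold out a quarter of the rows for testing
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# recipes: centre and scale the numeric predictors
car_recipe <- recipe(mpg ~ wt + hp + disp, data = car_train) %>%
  step_normalize(all_numeric_predictors())

# parsnip: an ordinary least squares model specification
lm_spec <- linear_reg() %>%
  set_engine("lm")

# workflows: bundle the preprocessing and the model into one object
car_wflow <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec)

car_fit <- fit(car_wflow, data = car_train)

# yardstick: compute RMSE on the held-out rows
car_preds <- augment(car_fit, new_data = car_test)
car_rmse <- rmse(car_preds, truth = mpg, estimate = .pred)
car_rmse
```

Bundling the recipe and the model in a workflow means the preprocessing is estimated on the training rows only and then reapplied to the test rows automatically.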

You can install the released version of tidymodels from CRAN with:

install.packages("tidymodels")

Install the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/tidymodels")

When loading the package, the versions and conflicts are listed:

library(tidymodels)
#> ── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
#> ✔ broom        1.0.5      ✔ recipes      1.0.10
#> ✔ dials        1.2.1      ✔ rsample      1.2.0 
#> ✔ dplyr        1.1.4      ✔ tibble       3.2.1 
#> ✔ ggplot2      3.5.0      ✔ tidyr        1.3.1 
#> ✔ infer        1.0.6      ✔ tune         1.2.0 
#> ✔ modeldata    1.3.0      ✔ workflows    1.1.4 
#> ✔ parsnip      1.2.1      ✔ workflowsets 1.1.0 
#> ✔ purrr        1.0.2      ✔ yardstick    1.3.1
#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
#> ✖ purrr::discard() masks scales::discard()
#> ✖ dplyr::filter()  masks stats::filter()
#> ✖ dplyr::lag()     masks stats::lag()
#> ✖ recipes::step()  masks stats::step()
#> • Learn how to get started at https://www.tidymodels.org/start/
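The conflicts listed at startup can be inspected again later, and tidymodels provides a helper for resolving them in favour of the tidymodels packages. The following is a small sketch; the filter() call on mtcars is only an assumed example to show which function wins after the preference is set.

```r
library(tidymodels)

# Re-print the conflicts reported when the package was attached
tidymodels_conflicts()

# Prefer tidymodels (and tidyverse) functions when names clash
tidymodels_prefer()

# filter() now resolves to dplyr::filter() rather than stats::filter()
six_cyl <- filter(mtcars, cyl == 6)
nrow(six_cyl)
```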

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

learntidymodels's People

Contributors

andthewings, apreshill, czeildi, friesewoudloper, hfrick, juliasilge, karaesmen, maxdrohde, petzi53, topepo

learntidymodels's Issues

function to extract explained variance from a PCA recipe in tidymodels

Hi!

I just wanted to contribute this simple function to extract explained variance:

library(tidymodels)  # attaches recipes, dplyr, purrr, and tibble

# Return the percentage of variance explained by each principal component
# of a recipe that contains a PCA step.
extract_explained_variance <- function(pca_recipe, n_comp = NULL) {
  if (!tune::is_recipe(pca_recipe)) {
    stop("Input must be a recipe.")
  }
  pca_result <- pca_recipe %>%
    prep() %>%
    # Assumes the PCA step is the second step of the recipe
    pluck("steps", 2, "res", "sdev") %>%
    tibble(sdev = .) %>%
    mutate(var_expl = round((sdev^2 / sum(sdev^2)) * 100, 3), .keep = "unused") %>%
    mutate(
      PC = paste0("PC", row_number()),
      PC = factor(PC, levels = unique(PC)),
      .before = 1
    )
  # Optionally keep only the first n_comp components
  if (!is.null(n_comp)) {
    pca_result <- slice_head(pca_result, n = n_comp)
  }
  pca_result
}

Thanks a lot for the amazing work done with tidymodels!

Best wishes!
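For comparison, a similar table can probably be obtained without reaching into the recipe's internals, because the tidy() method for step_pca() in recipes accepts type = "variance". The sketch below is an alternative under stated assumptions, not the function proposed in this issue: the mtcars data, the step id "pca", and the output column names are chosen here for illustration.

```r
library(tidymodels)

# Prep a recipe that normalizes all columns and then runs PCA
pca_prepped <- recipe(~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), id = "pca") %>%
  prep()

# tidy() with type = "variance" reports several variance summaries
# per component; keep only the percentage explained
explained <- tidy(pca_prepped, id = "pca", type = "variance") %>%
  filter(terms == "percent variance") %>%
  transmute(PC = paste0("PC", component), var_expl = value)

explained
```

Selecting the step by id avoids hard-coding its position in the recipe, so the code keeps working if steps are added before the PCA step.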

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

  • Help us firm up the list of targeted repositories
  • Make sure all maintainers are aware of what's coming
  • Give us an issue to close when the job is done
  • Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog

Ordering categories in `plot_top_loadings()` does not work anymore

I don't believe that the munging we have here in the learntidymodels package to order bars works anymore:

library(learntidymodels)
#> Loading required package: tidyverse
#> Loading required package: tidymodels
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(tidymodels)
library(ggplot2)

data("cells", package = "modeldata")
cell_pca <-
   recipe(class ~ ., data = cells %>% dplyr::select(-case)) %>%
   step_center(all_predictors()) %>%
   step_scale(all_predictors()) %>%
   step_pca(all_predictors())
cell_pca <- prep(cell_pca)

plot_top_loadings(cell_pca, grepl("ch_1", terms) & component_number <= 4, n = 10)

Created on 2021-08-25 by the reprex package (v2.0.1)

I do have different versions of this type of functionality in tidytext, if we want to use that instead, but tidytext seems like a heavy and ridiculous dependency for this.
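One dependency-free way to order bars within each facet is to build a key that combines the term with its facet, reorder that key, and strip the suffix again in the axis labels. The following is only a sketch of that idea, not the code used in learntidymodels or tidytext; the toy loadings tibble, the order_key column, and the "___" separator are all assumptions made for this example.

```r
library(dplyr)
library(ggplot2)

# Toy loadings: three terms in each of two components (made-up values)
loadings <- tibble::tibble(
  component = rep(c("PC1", "PC2"), each = 3),
  terms     = rep(c("a", "b", "c"), times = 2),
  value     = c(0.5, -0.2, 0.1, -0.15, 0.4, 0.3)
)

# Combine term and component into one key, then order the key by
# absolute loading so each facet gets its own ordering
ordered <- loadings %>%
  mutate(
    order_key = paste(terms, component, sep = "___"),
    order_key = reorder(order_key, abs(value))
  )

p <- ggplot(ordered, aes(abs(value), order_key, fill = value > 0)) +
  geom_col() +
  facet_wrap(~component, scales = "free_y") +
  # Drop the "___<component>" suffix so only the term name is shown
  scale_y_discrete(labels = function(x) sub("___.+$", "", x)) +
  labs(x = "Absolute loading", y = NULL, fill = "Positive?")

p
```

The free y scale is what lets each panel show only its own keys; with a shared scale every key would appear in every facet.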

Error on trying to start tutorial

The problem

When I run learnr::run_tutorial("pca_recipes", package = "learntidymodels"), I get the error:

Quitting from lines 13-80 (pca_recipes.Rmd) 
Error: package or namespace load failed for 'gradethis':
 .onLoad failed in loadNamespace() for 'gradethis', details:
  call: (function (exercise.cap = "Code", exercise.eval = FALSE, exercise.timelimit = 30, 
  error: unused argument (exercise.error.check.code = "gradethis_error_checker()")

Versions:

R itself is 4.0.4
{learnr} is 0.10.1
{learntidymodels} is 0.0.0.9001
