
mlr3viz's Introduction

mlr3viz

Package website: release | dev


mlr3viz is the visualization package of the mlr3 ecosystem. It features plots for mlr3 objects such as tasks, learners, predictions, benchmark results, tuning instances and filters via the autoplot() generic of ggplot2. The package draws plots with the viridis color palette and applies the minimal theme. Visualizations include barplots, boxplots, histograms, ROC curves, and Precision-Recall curves.

The gallery features a showcase post of the plots in mlr3viz.

Installation

Install the latest release from CRAN:

install.packages("mlr3viz")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3viz")

Resources

The gallery features a showcase post of the visualization functions in mlr3viz.

Short Demo

library(mlr3)
library(mlr3viz)

task = tsk("pima")
learner = lrn("classif.rpart", predict_type = "prob")
rr = resample(task, learner, rsmp("cv", folds = 3), store_models = TRUE)

# Default plot for task
autoplot(task, type = "target")

# ROC curve for resample result
autoplot(rr, type = "roc")

For more example plots you can have a look at the pkgdown references of the respective functions.

mlr3viz's People

Contributors

be-marc, ck37, damirpolat, github-actions[bot], giuseppec, henrifnk, jakob-r, larskotthoff, lorenzwalthert, mb706, mllg, pat-s, pre-commit-ci[bot], raphaels1, sebffischer


mlr3viz's Issues

`autoplot()` for cluster learner fails

Description

Source: "Introducing mlr3cluster: Cluster Analysis Package" gallery post

@damirpolat Would you mind having a look?

Reproducible example

library(mlr3)
library(mlr3cluster)
library(mlr3viz)

task <- mlr_tasks$get("usarrests")
learner <- mlr_learners$get("clust.agnes")
learner$train(task)

# Simple dendrogram
# autoplot(learner)

# More advanced options from `factoextra::fviz_dend`
autoplot(learner,
  k = learner$param_set$values$k, rect_fill = TRUE,
  rect = TRUE, rect_border = c("red", "cyan")
)
#> Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
#> "none")` instead.
#> Error in if (color == "cluster") color <- "default": the condition has length > 1

Created on 2022-07-09 by the reprex package (v2.0.1)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Monterey 12.4
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Zurich
#>  date     2022-07-09
#>  pandoc   2.18 @ /opt/homebrew/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version     date (UTC) lib source
#>  abind            1.4-5       2016-07-21 [1] CRAN (R 4.2.0)
#>  assertthat       0.2.1       2019-03-21 [1] CRAN (R 4.2.0)
#>  backports        1.4.1       2021-12-13 [1] CRAN (R 4.2.0)
#>  broom            1.0.0       2022-07-01 [1] CRAN (R 4.2.0)
#>  car              3.1-0       2022-06-15 [1] CRAN (R 4.2.0)
#>  carData          3.0-5       2022-01-06 [1] CRAN (R 4.2.0)
#>  checkmate        2.1.0       2022-04-21 [1] CRAN (R 4.2.0)
#>  cli              3.3.0       2022-04-25 [1] CRAN (R 4.2.0)
#>  clue             0.3-61      2022-05-30 [1] CRAN (R 4.2.0)
#>  cluster          2.1.3       2022-03-28 [1] CRAN (R 4.2.0)
#>  clusterCrit      1.2.8       2018-07-26 [1] CRAN (R 4.2.0)
#>  codetools        0.2-18      2020-11-04 [3] CRAN (R 4.2.1)
#>  colorspace       2.0-3       2022-02-21 [1] CRAN (R 4.2.0)
#>  crayon           1.5.1       2022-03-26 [1] CRAN (R 4.2.0)
#>  data.table       1.14.2      2021-09-27 [1] CRAN (R 4.2.0)
#>  DBI              1.1.3       2022-06-18 [1] CRAN (R 4.2.0)
#>  dendextend       1.16.0      2022-07-04 [1] CRAN (R 4.2.0)
#>  digest           0.6.29      2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr            1.0.9       2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis         0.3.2       2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate         0.15        2022-02-18 [1] CRAN (R 4.2.0)
#>  factoextra       1.0.7       2020-04-01 [1] CRAN (R 4.2.0)
#>  fansi            1.0.3       2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap          1.1.0       2021-01-25 [1] CRAN (R 4.2.0)
#>  fs               1.5.2       2021-12-08 [1] CRAN (R 4.2.0)
#>  future           1.26.1      2022-05-27 [1] CRAN (R 4.2.0)
#>  generics         0.1.3       2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2          3.3.6       2022-05-03 [1] CRAN (R 4.2.0)
#>  ggpubr           0.4.0       2020-06-27 [1] CRAN (R 4.2.0)
#>  ggrepel          0.9.1       2021-01-15 [1] CRAN (R 4.2.0)
#>  ggsignif         0.6.3       2021-09-09 [1] CRAN (R 4.2.0)
#>  globals          0.15.1      2022-06-24 [1] CRAN (R 4.2.0)
#>  glue             1.6.2       2022-02-24 [1] CRAN (R 4.2.0)
#>  gridExtra        2.3         2017-09-09 [1] CRAN (R 4.2.0)
#>  gtable           0.3.0       2019-03-25 [1] CRAN (R 4.2.0)
#>  highr            0.9         2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools        0.5.2       2021-08-25 [1] CRAN (R 4.2.0)
#>  knitr            1.39        2022-04-26 [1] CRAN (R 4.2.0)
#>  lgr              0.4.3       2021-09-16 [1] CRAN (R 4.2.0)
#>  lifecycle        1.0.1       2021-09-24 [1] CRAN (R 4.2.0)
#>  listenv          0.8.0       2019-12-05 [1] CRAN (R 4.2.0)
#>  magrittr         2.0.3       2022-03-30 [1] CRAN (R 4.2.0)
#>  mlr3           * 0.13.3-9000 2022-07-06 [1] Github (mlr-org/mlr3@f753fd9)
#>  mlr3cluster    * 0.1.3       2022-07-09 [1] Github (mlr-org/mlr3cluster@67eed02)
#>  mlr3misc         0.10.0      2022-01-11 [1] CRAN (R 4.2.0)
#>  mlr3viz        * 0.5.9       2022-05-30 [1] local
#>  munsell          0.5.0       2018-06-12 [1] CRAN (R 4.2.0)
#>  palmerpenguins   0.1.0       2020-07-23 [1] CRAN (R 4.2.0)
#>  paradox          0.9.0.9000  2022-07-06 [1] Github (mlr-org/paradox@3ce3c77)
#>  parallelly       1.32.0      2022-06-07 [1] CRAN (R 4.2.0)
#>  pillar           1.7.0       2022-02-01 [1] CRAN (R 4.2.0)
#>  pkgconfig        2.0.3       2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr            0.3.4       2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache          0.15.0      2021-04-30 [1] CRAN (R 4.2.0)
#>  R.methodsS3      1.8.2       2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo             1.25.0      2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils          2.12.0      2022-06-28 [1] CRAN (R 4.2.0)
#>  R6               2.5.1       2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp             1.0.8.3     2022-03-17 [1] CRAN (R 4.2.0)
#>  rematch2         2.1.2       2020-05-01 [1] CRAN (R 4.2.0)
#>  reprex           2.0.1       2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang            1.0.3       2022-06-27 [1] CRAN (R 4.2.0)
#>  rmarkdown        2.14        2022-04-25 [1] CRAN (R 4.2.0)
#>  rstatix          0.7.0       2021-02-13 [1] CRAN (R 4.2.0)
#>  scales           1.2.0       2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo      1.2.2       2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi          1.7.6       2021-11-29 [1] CRAN (R 4.2.0)
#>  stringr          1.4.0       2019-02-10 [1] CRAN (R 4.2.0)
#>  styler           1.7.0       2022-03-13 [1] CRAN (R 4.2.0)
#>  tibble           3.1.7       2022-05-03 [1] CRAN (R 4.2.0)
#>  tidyr            1.2.0       2022-02-01 [1] CRAN (R 4.2.0)
#>  tidyselect       1.1.2       2022-02-21 [1] CRAN (R 4.2.0)
#>  utf8             1.2.2       2021-07-24 [1] CRAN (R 4.2.0)
#>  uuid             1.1-0       2022-04-19 [1] CRAN (R 4.2.0)
#>  vctrs            0.4.1       2022-04-13 [1] CRAN (R 4.2.0)
#>  viridis          0.6.2       2021-10-13 [1] CRAN (R 4.2.0)
#>  viridisLite      0.4.0       2021-04-13 [1] CRAN (R 4.2.0)
#>  withr            2.5.0       2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun             0.31        2022-05-10 [1] CRAN (R 4.2.0)
#>  yaml             2.3.5       2022-02-21 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Users/pjs/Library/R/arm64/4.2/library
#>  [2] /opt/R/4.2.1-arm64/Resources/site-library
#>  [3] /opt/R/4.2.1-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

plot for feature importance

Would it be possible to add a plot to visualize feature importances?

I have created such a function here, that takes the named feature importance vector and returns a barplot.
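
A minimal sketch of what such a function could look like, assuming the importance values arrive as a named numeric vector. The helper name plot_importance and the example values are made up for illustration; this is not the linked function and not part of mlr3viz.

```r
library(ggplot2)

# Hypothetical helper (not part of mlr3viz): bar plot of a named importance vector.
plot_importance = function(importance) {
  stopifnot(is.numeric(importance), !is.null(names(importance)))
  data = data.frame(
    feature = names(importance),
    importance = unname(importance)
  )
  # Order the factor levels by importance so that, after coord_flip(),
  # the most important feature ends up at the top.
  data$feature = factor(data$feature, levels = data$feature[order(data$importance)])
  ggplot(data, aes(x = feature, y = importance)) +
    geom_col() +
    coord_flip() +
    labs(x = "Feature", y = "Importance")
}

# Illustrative values only, not results from a real model.
imp = c(glucose = 0.42, mass = 0.21, age = 0.18, pressure = 0.05)
p = plot_importance(imp)
```

Printing p draws the barplot; since the result is a regular ggplot object, further layers and themes can be added with +.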

precision-recall (and ROC) curve shows different values than the score function

When plotting the precision-recall curve (same problem with the ROC curve), the displayed values differ from the performance estimates computed with the score() function.
Is this a bug or did I miss any obvious mistake on my side?

library(mlr3verse)
library(ggplot2)
set.seed(3)

task = tsk("german_credit")
learner = lrn("classif.log_reg", predict_type = "prob")
rr = resample(task, learner, rsmp("holdout"))
p = rr$prediction()

p$set_threshold(0.5)
s = p$score(measures = list(msr("classif.precision"), msr("classif.recall")))

autoplot(p, type = "prc") + 
  geom_segment(aes(x = 0, y = s[1], xend = s[2], yend = s[1])) +
  geom_segment(aes(x = s[2], y = 0, xend = s[2], yend = s[1]))


> s
classif.precision    classif.recall 
        0.7984496         0.8841202 
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] mlr3_0.11.0     ggplot2_3.3.3   mlr3verse_0.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6           paradox_0.7.1        lattice_0.20-41     
 [4] listenv_0.8.0        palmerpenguins_0.1.0 assertthat_0.2.1    
 [7] digest_0.6.27        utf8_1.2.1           parallelly_1.24.0   
[10] R6_2.5.0             mlr3measures_0.3.1   backports_1.2.1     
[13] evaluate_0.14        pillar_1.5.1         rlang_0.4.10        
[16] mlr3fselect_0.5.1    uuid_0.1-4           rstudioapi_0.13     
[19] data.table_1.14.0    distr6_1.5.0         Matrix_1.3-2        
[22] checkmate_2.0.0      rmarkdown_2.7        labeling_0.4.2      
[25] splines_4.0.4        mlr3pipelines_0.3.4  munsell_0.5.0       
[28] compiler_4.0.4       set6_0.2.1           xfun_0.22           
[31] pkgconfig_2.0.3      globals_0.14.0       mlr3tuning_0.8.0    
[34] htmltools_0.5.1.1    tidyselect_1.1.0     tibble_3.1.0        
[37] mlr3data_0.3.1       lgr_0.4.2            mlr3cluster_0.1.1   
[40] mlr3misc_0.8.0       codetools_0.2-18     clusterCrit_1.2.8   
[43] fansi_0.4.2          future_1.21.0        crayon_1.4.1        
[46] dplyr_1.0.5          withr_2.4.1          grid_4.0.4          
[49] gtable_0.3.0         lifecycle_1.0.0      magrittr_2.0.1      
[52] scales_1.1.1         mlr3learners_0.4.5   mlr3proba_0.3.2     
[55] future.apply_1.7.0   farver_2.1.0         mlr3viz_0.5.3       
[58] renv_0.13.1          mlr3filters_0.4.1    ellipsis_0.3.1      
[61] bbotk_0.3.2          vctrs_0.3.6          generics_0.1.0      
[64] tools_4.0.4          glue_1.4.2           purrr_0.3.4         
[67] parallel_4.0.4       survival_3.2-7       yaml_2.2.1          
[70] clue_0.3-58          colorspace_2.0-0     cluster_2.1.0       
[73] R62S3_1.4.1          knitr_1.31           precrec_0.12.1      

Problem with plot_learner_prediction: "object 'response' not found"

We are currently in the i2ml class and some students run into an error when running this code. On my machine it works. We were not able to figure out what the reason is.

Please help 😬 🆘

library(mlbench)
#> Warning: package 'mlbench' was built under R version 3.6.3
library(mlr3)
library(mlr3learners)
library(mlr3viz)
#> Warning: package 'mlr3viz' was built under R version 3.6.3
library(ggplot2)
library(grid)

set.seed(123L)

spirals <- data.frame(mlbench.spirals(n = 500, sd = 0.1))

spirals_task <- TaskClassif$new(id = "spirals", backend = spirals, target = "classes")

lda_learner <-  lrn("classif.lda", predict_type = "prob")

plot_learner_prediction(lda_learner, spirals_task)
#> INFO  [14:36:05.013] Applying learner 'classif.lda' on task 'spirals' (iter 1/1)
#> Error in eval(bysub, x, parent.frame()): object 'response' not found

Created on 2020-03-17 by the reprex package (v0.3.0)

Session info
sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
#> [5] LC_TIME=German_Germany.1252    
#> 
#> attached base packages:
#> [1] grid      stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] ggplot2_3.2.1           mlr3viz_0.1.1           mlr3learners_0.1.6-9000
#> [4] mlr3_0.1.8-9000         mlbench_2.1-1          
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.2         compiler_3.6.1     pillar_1.4.2      
#>  [4] highr_0.8          mlr3misc_0.1.8     tools_3.6.1       
#>  [7] digest_0.6.22      uuid_0.1-4         evaluate_0.14     
#> [10] tibble_2.1.3       checkmate_1.9.4    gtable_0.3.0      
#> [13] pkgconfig_2.0.3    rlang_0.4.1        yaml_2.2.0        
#> [16] xfun_0.10          withr_2.1.2        stringr_1.4.0     
#> [19] dplyr_0.8.3        knitr_1.25         tidyselect_0.2.5  
#> [22] glue_1.3.1         data.table_1.12.6  R6_2.4.0          
#> [25] rmarkdown_1.17     purrr_0.3.3        lgr_0.3.3         
#> [28] magrittr_1.5       mlr3measures_0.1.2 MASS_7.3-51.4     
#> [31] backports_1.1.5    scales_1.0.0       htmltools_0.4.0   
#> [34] assertthat_0.2.1   colorspace_1.4-1   paradox_0.1.0     
#> [37] stringi_1.4.3      lazyeval_0.2.2     munsell_0.5.0     
#> [40] crayon_1.3.4

ROC curves under diagonal with benchmark mlr3

The autoplot() function returns something wrong, e.g. a ROC curve under the diagonal.

library(mlr3verse)
library(mlr3viz)
task = tsk("german_credit")
design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"),
                  predict_type = "prob", predict_sets = c("train", "test")),
  resamplings = rsmps("cv", folds = 3)
)

### Roc curve under diagonal ----------------------------
bmr = benchmark(design)
autoplot(bmr, type = "roc")

### Roc curve above diagonal ----------------------------
learner = lrn("classif.rpart", predict_type = "prob")
pred = learner$train(task)$predict(task)
autoplot(pred, type = "roc")

### The error may be here ----------------------------
as_precrec.BenchmarkResult = function(object) {
  # ...
  posclass = levels(data$labels)[1L] # returns NULL ----------------------------
  # ...
}

Plot benchmark result and identical learner with different hyperpars

This came up today and was confusing for a few people:

library(mlr3)
library(mlr3viz)
lrn_list = list(
  lrn("classif.log_reg", predict_type = "prob"),
  lrn("classif.kknn", predict_type = "prob", k = 3),
  lrn("classif.kknn", predict_type = "prob", k = 10)
)

bm_design = benchmark_grid(task = tsk("sonar"), resamplings = rsmp("cv", folds = 10), learners = lrn_list)
bench_result = benchmark(bm_design)
autoplot(bench_result)

Both knn learners are grouped together. I know that this can be fixed by setting ids, but it is still easy to overlook, as two different learners are now combined in a single boxplot.

IMO there should be 3 boxplots, two of them having the same name. (Maybe do the faceting by the learner hash and overwrite the labels with the learner id afterward?)
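
For reference, the id workaround mentioned above could look roughly like this. It is a sketch that uses classif.rpart instead of classif.kknn to avoid extra dependencies, and the ids "rpart_shallow" and "rpart_deep" are made up for the example.

```r
library(mlr3)

# Two configurations of the same learner, told apart by explicit ids.
# The ids below are arbitrary names chosen for this example.
lrn_list = list(
  lrn("classif.rpart", id = "rpart_shallow", maxdepth = 1),
  lrn("classif.rpart", id = "rpart_deep", maxdepth = 10)
)

design = benchmark_grid(
  tasks = tsk("sonar"),
  learners = lrn_list,
  resamplings = rsmp("cv", folds = 3)
)
bmr = benchmark(design)

# With distinct ids, each configuration gets its own row in the aggregate.
agg = bmr$aggregate(msr("classif.ce"))
agg$learner_id
```

With distinct ids, autoplot(bmr) should then draw one boxplot per configuration, since the grouping appears to follow the learner id.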

Documentation on autoplot.PredictionRegr

Another question from the I2ML students:

In the plot we see a blue and grey line which do not necessarily overlap. The documentation does not say which line is which (nor does the book). Can you add this info and let us know?

Rename to mlr3vis?

Because "vizualization" is not a valid word.
This also applies to DESCRIPTION.

Autoplot for Benchmarking (Change order of boxplots/tasks)

Hi!

Plotting my benchmark results, I would like to change the order of the boxplots from alphabetical order (default in autoplot function?) to another order.
So far, I was able to change the respective task's boxplots the following way (see code).
However, since recently, changing the levels of the task_id-factor variable, only changes the order of the title, but not the respective boxplots themselves.

How can I change the order of the boxplots?
Any help highly appreciated! Thanks a lot


library(mlr3)
library(mlr3viz)

tasks = tsks(c("pima", "sonar"))
learner = lrns(c("classif.featureless", "classif.rpart"),
               predict_type = "prob")
resampling = rsmps("cv")
bmr = benchmark(benchmark_grid(tasks, learner, resampling))

# AUC
bmr_boxplots_auc <- autoplot(bmr, measure = msr("classif.auc")) +
  ggplot2::theme(axis.text.x = ggplot2::element_text(hjust = .5)) +
  ggplot2::scale_x_discrete(labels = c("FL", "RF", "LR")) +
  ggplot2::ylab("AUC") +
  ggplot2::xlab("Learner") +
  ggplot2::theme_bw() +
  ggplot2::theme(panel.grid = ggplot2::element_blank(),
                 strip.background = ggplot2::element_blank()
  )
bmr_boxplots_auc
bmr_boxplots_auc$data$task_id <- as.factor(bmr_boxplots_auc$data$task_id) 
levels(bmr_boxplots_auc$data$task_id) = c("sonar", "pima")
bmr_boxplots_auc
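
One possible explanation, not confirmed in this thread: in base R, assigning to levels() renames the existing levels in place, while factor(x, levels = ...) reorders them and leaves the values alone. The sketch below uses a toy task_id vector (not the benchmark data) to show the difference.

```r
# Toy stand-in for the task_id column of the plot data.
task_id = factor(c("pima", "pima", "sonar"))

# Assigning to levels() relabels: every former "pima" row now reads "sonar".
relabeled = task_id
levels(relabeled) = c("sonar", "pima")
as.character(relabeled)

# Rebuilding the factor with an explicit level order keeps the values
# and only changes the order in which the levels are listed.
reordered = factor(task_id, levels = c("sonar", "pima"))
as.character(reordered)
levels(reordered)
```

If this is what happens here, rebuilding task_id with factor(..., levels = c("sonar", "pima")) would be the variant that moves the panels together with their titles.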


Inverse ROC

Thanks for a really great and useful package!

It seems that something might have changed in the way that the ROC curve is plotted using the autoplot() function - it's currently inverse of what you would expect whenever I use autoplot(rr, type = "roc").

If you take a look at the mlr3 book it also seems that it is happening here https://mlr3book.mlr-org.com/benchmarking.html

Not working with R 3.6.1

Error message from install.packages():

package ‘mlr3viz’ is not available (for R version 3.6.1)

Incorrect ordering of x-labels in autoplot.BenchmarkResult

library(mlr3learners)
library(mlr3viz)
library(mlr3)
task = tsk("boston_housing")

learners = lrns(c("regr.featureless", "regr.rpart", "regr.ranger"))
measure = msr("regr.rmse")
set.seed(1)
bmr = benchmark(benchmark_grid(task, learners, rsmp("cv", folds = 3)))
bmr$score(measure)[,c("learner_id", "regr.rmse")]
#>          learner_id regr.rmse
#> 1: regr.featureless  8.455316
#> 2: regr.featureless 10.222279
#> 3: regr.featureless  8.974639
#> 4:       regr.rpart  1.729365
#> 5:       regr.rpart  1.811437
#> 6:       regr.rpart  1.608334
#> 7:      regr.ranger  1.636241
#> 8:      regr.ranger  2.125320
#> 9:      regr.ranger  1.901388
mlr3viz:::autoplot.BenchmarkResult(bmr, measure = measure)

Created on 2020-10-01 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.3      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-10-01                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date       lib source                               
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                       
#>  backports      1.1.9      2020-08-24 [1] CRAN (R 4.0.2)                       
#>  callr          3.4.4      2020-09-07 [1] CRAN (R 4.0.2)                       
#>  checkmate      2.0.0      2020-02-06 [1] CRAN (R 4.0.0)                       
#>  cli            2.0.2      2020-02-28 [1] CRAN (R 4.0.0)                       
#>  codetools      0.2-16     2018-12-24 [1] CRAN (R 4.0.2)                       
#>  colorspace     1.4-1      2019-03-18 [1] CRAN (R 4.0.0)                       
#>  crayon         1.3.4      2017-09-16 [1] CRAN (R 4.0.0)                       
#>  curl           4.3        2019-12-02 [1] CRAN (R 4.0.0)                       
#>  data.table     1.13.0     2020-07-24 [1] CRAN (R 4.0.2)                       
#>  desc           1.2.0      2018-05-01 [1] CRAN (R 4.0.0)                       
#>  devtools       2.3.0      2020-04-10 [1] CRAN (R 4.0.0)                       
#>  digest         0.6.25     2020-02-23 [1] CRAN (R 4.0.0)                       
#>  dplyr          1.0.2      2020-08-18 [1] CRAN (R 4.0.2)                       
#>  ellipsis       0.3.1      2020-05-15 [1] CRAN (R 4.0.2)                       
#>  evaluate       0.14       2019-05-28 [1] CRAN (R 4.0.0)                       
#>  fansi          0.4.1      2020-01-08 [1] CRAN (R 4.0.0)                       
#>  farver         2.0.3      2020-01-16 [1] CRAN (R 4.0.0)                       
#>  fs             1.4.1      2020-04-04 [1] CRAN (R 4.0.0)                       
#>  future         1.18.0     2020-07-09 [1] CRAN (R 4.0.2)                       
#>  future.apply   1.6.0      2020-07-01 [1] CRAN (R 4.0.2)                       
#>  generics       0.0.2      2018-11-29 [1] CRAN (R 4.0.0)                       
#>  ggplot2        3.3.2      2020-06-19 [1] CRAN (R 4.0.2)                       
#>  globals        0.12.5     2019-12-07 [1] CRAN (R 4.0.0)                       
#>  glue           1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                       
#>  gtable         0.3.0      2019-03-25 [1] CRAN (R 4.0.0)                       
#>  highr          0.8        2019-03-20 [1] CRAN (R 4.0.0)                       
#>  htmltools      0.4.0      2019-10-04 [1] CRAN (R 4.0.0)                       
#>  httr           1.4.1      2019-08-05 [1] CRAN (R 4.0.0)                       
#>  knitr          1.29       2020-06-23 [1] CRAN (R 4.0.2)                       
#>  labeling       0.3        2014-08-23 [1] CRAN (R 4.0.0)                       
#>  lattice        0.20-41    2020-04-02 [1] CRAN (R 4.0.2)                       
#>  lgr            0.3.4      2020-03-20 [1] CRAN (R 4.0.0)                       
#>  lifecycle      0.2.0      2020-03-06 [1] CRAN (R 4.0.0)                       
#>  listenv        0.8.0      2019-12-05 [1] CRAN (R 4.0.0)                       
#>  magrittr       1.5        2014-11-22 [1] CRAN (R 4.0.0)                       
#>  Matrix         1.2-18     2019-11-27 [1] CRAN (R 4.0.2)                       
#>  memoise        1.1.0      2017-04-21 [1] CRAN (R 4.0.0)                       
#>  mime           0.9        2020-02-04 [1] CRAN (R 4.0.0)                       
#>  mlr3         * 0.6.0-9000 2020-09-14 [1] Github (mlr-org/mlr3@f7fb636)        
#>  mlr3learners * 0.3.0      2020-09-12 [1] Github (mlr-org/mlr3learners@50b4169)
#>  mlr3measures   0.2.0      2020-06-27 [1] CRAN (R 4.0.2)                       
#>  mlr3misc       0.5.0      2020-08-13 [1] CRAN (R 4.0.2)                       
#>  mlr3viz      * 0.2.0-9000 2020-09-10 [1] Github (mlr-org/mlr3viz@8d1f98c)     
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.0.0)                       
#>  paradox        0.4.0-9000 2020-09-09 [1] Github (mlr-org/paradox@e10d740)     
#>  pillar         1.4.6      2020-07-10 [1] CRAN (R 4.0.2)                       
#>  pkgbuild       1.1.0      2020-07-13 [1] CRAN (R 4.0.2)                       
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.0.0)                       
#>  pkgload        1.1.0      2020-05-29 [1] CRAN (R 4.0.2)                       
#>  prettyunits    1.1.1      2020-01-24 [1] CRAN (R 4.0.0)                       
#>  processx       3.4.4      2020-09-03 [1] CRAN (R 4.0.2)                       
#>  ps             1.3.4      2020-08-11 [1] CRAN (R 4.0.2)                       
#>  purrr          0.3.4      2020-04-17 [1] CRAN (R 4.0.0)                       
#>  R6             2.4.1      2019-11-12 [1] CRAN (R 4.0.0)                       
#>  ranger         0.12.1     2020-01-10 [1] CRAN (R 4.0.0)                       
#>  Rcpp           1.0.5      2020-07-06 [1] CRAN (R 4.0.0)                       
#>  remotes        2.1.1      2020-02-15 [1] CRAN (R 4.0.0)                       
#>  rlang          0.4.7      2020-07-09 [1] CRAN (R 4.0.2)                       
#>  rmarkdown      2.3        2020-06-18 [1] CRAN (R 4.0.2)                       
#>  rpart          4.1-15     2019-04-12 [1] CRAN (R 4.0.2)                       
#>  rprojroot      1.3-2      2018-01-03 [1] CRAN (R 4.0.0)                       
#>  scales         1.1.1      2020-05-11 [1] CRAN (R 4.0.2)                       
#>  sessioninfo    1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                       
#>  stringi        1.4.6      2020-02-17 [1] CRAN (R 4.0.0)                       
#>  stringr        1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                       
#>  testthat       2.3.2      2020-03-02 [1] CRAN (R 4.0.0)                       
#>  tibble         3.0.3      2020-07-10 [1] CRAN (R 4.0.2)                       
#>  tidyselect     1.1.0      2020-05-11 [1] CRAN (R 4.0.2)                       
#>  usethis        1.6.0      2020-04-09 [1] CRAN (R 4.0.0)                       
#>  uuid           0.1-4      2020-02-26 [1] CRAN (R 4.0.0)                       
#>  vctrs          0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                       
#>  withr          2.2.0      2020-04-20 [1] CRAN (R 4.0.0)                       
#>  xfun           0.16       2020-07-24 [1] CRAN (R 4.0.2)                       
#>  xml2           1.3.1      2020-04-09 [1] CRAN (R 4.0.0)                       
#>  yaml           2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                       
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Plot learner prediction broken

This is supposed to work, right?

library("mlr3")
library("mlr3viz")
gen = tgen("2dnormals")
gen$plot()

task = gen$generate(100)
plot_learner_prediction(lrn("classif.rpart"), task)

Move ggplot2 to Depends?

Currently I need to load ggplot2 before autoplot() is available.

I assume that all functions will use ggplot2 in the background anyways? Having it in Depends would trigger an auto-load when loading mlr3viz.

Threshold plot type in autoplot

Hello,

The threshold plot type in autoplot fails with the error "Error: Unknown plot type 'threshold'". I was searching for a way to plot a LIFT curve and came across this issue.

library(mlr3)
library(mlr3viz)

task = tsk("spam")
learner = lrn("classif.rpart", predict_type = "prob")
object = learner$train(task)$predict(task)

head(fortify(object))
autoplot(object)
autoplot(object, type = "roc")
autoplot(object, type = "prc")
autoplot(object, type = "threshold") # Error

Best regards,
Mathieu

LIFT curve

Hello,

I was wondering if there is a way to plot the LIFT curve associated with a binary classification model. It is possible to plot ROC curves using autoplot() but I could not find a way to plot LIFT curves.

Best regards,
Mathieu
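
Until such a plot type exists, the lift values can be computed by hand from predicted scores and true labels. The following is a base R sketch; the function name lift_curve and the toy data are made up for illustration and are not part of mlr3viz.

```r
# Hypothetical helper: cumulative lift for a binary classifier.
# Lift at the top k observations = (share of positives among the k
# highest-scored observations) / (overall share of positives).
lift_curve = function(scores, truth, positive) {
  stopifnot(length(scores) == length(truth))
  ord = order(scores, decreasing = TRUE)
  is_pos = truth[ord] == positive
  n = length(scores)
  k = seq_len(n)
  precision_at_k = cumsum(is_pos) / k
  base_rate = mean(is_pos)
  data.frame(fraction = k / n, lift = precision_at_k / base_rate)
}

# Toy data for illustration only.
scores = c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)
truth = c("spam", "spam", "nonspam", "spam", "nonspam", "nonspam")
lift = lift_curve(scores, truth, positive = "spam")
```

For a PredictionClassif object, the scores would be the probability column of the positive class and truth the true labels; plotting lift against fraction then gives the curve.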

plot function for learners

Each learner should offer a simple plot function in its API.
This should be an S3 method plot.class in mlr3viz.
The function displays a "standard view" of the model, in ggplot or base R plot or ....
If someone creates a local learner or a learner in the mlr3 learners online repo, this function
can also be defined locally / there.

plotLearnerPrediction

In our intro2ml course we have made heavy use of plotLearnerPrediction() from mlr to visualize how certain classifiers perform and how their decision boundaries differ, but there is no equivalent feature in mlr3.

bmr plot broken

library(mlr3)
library(mlr3viz)

# benchmarking with benchmark_grid()
design = benchmark_grid(tsk("iris"), lrn("classif.rpart"), rsmp("cv", folds = 3))
bmr = benchmark(design)

design2 = benchmark_grid(tsk("sonar"), lrn("classif.featureless"), rsmp("cv", folds = 3))
bmr2 = benchmark(design2)

bmr$combine(bmr2)
plot(bmr)


This also happens in more realistic scenarios.

I tried to fix it here:

learner_labels = unique(tab$learner_id)

by adding names(learner_labels) = unique(tab$nr), which would work for the given example, but the test fails.
The test failure indicates that there is no bijective mapping between tab$nr and learner_labels. If there is no such mapping, it points to a further problem: the labels cannot be correct.

ROC curve for `PredictionClassif` objects switches the classes.

Hi there,

consider the following example from the book: https://mlr3book.mlr-org.com/binary-classification.html

library(mlr3)
library(mlr3viz)


data("Sonar", package = "mlbench")
task = as_task_classif(Sonar, target = "Class", positive = "R")

sum(Sonar$Class == "R") #97 instances of positive class
#> [1] 97

learner = lrn("classif.rpart", predict_type = "prob")
pred = learner$train(task)$predict(task)

pred$confusion #87/97
#>         truth
#> response  R  M
#>        R 87 16
#>        M 10 95

pred$score(msr("classif.sensitivity"))
#> classif.sensitivity 
#>           0.8969072

autoplot(pred, type = "roc")


Created on 2021-07-01 by the reprex package (v2.0.0)

While there are 97 instances of the positive class, the roc curve plot suggests there are 111 instances of the positive class because it treats the negative class as the positive.

I think the problem is in the mlr3viz:::roc_data function, specifically the line

data.table(scores = prediction$prob[, 2L], labels = prediction$truth)

which implies the second column of the prob slot is the positive class, however in PredictionClassif objects the first column corresponds to the positive class:

head(pred$prob)
#output

             R         M
[1,] 0.8939394 0.1060606
[2,] 0.2666667 0.7333333
[3,] 1.0000000 0.0000000
[4,] 0.8939394 0.1060606
[5,] 0.0750000 0.9250000
[6,] 1.0000000 0.0000000

This is most likely related to: #72
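
A sketch of one way to avoid relying on the column position: select the score column by the name of the positive class. It uses the built-in sonar task rather than the mlbench data above, and the name roc_input is made up for the example. This is not necessarily how the package resolved the issue.

```r
library(mlr3)

task = tsk("sonar")
learner = lrn("classif.rpart", predict_type = "prob")
pred = learner$train(task)$predict(task)

# Select the score column by the name of the positive class
# rather than by its position in the probability matrix.
positive = task$positive
scores = pred$prob[, positive]

# Scores and labels in the shape expected by a ROC computation.
roc_input = data.frame(scores = scores, labels = pred$truth)
head(roc_input)
```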

Thank you,

Milan

ggplot updates

Some ggplot-related code should be updated in this package, i.e. see example below:

library(mlr3verse)
#> Loading required package: mlr3

task = tsk('spam')
learner = lrn('classif.rpart')

p = learner$train(task)$predict(task)
autoplot(p)
#> Warning: The dot-dot notation (`..count..`) was deprecated in ggplot2 3.4.0.
#> ℹ Please use `after_stat(count)` instead.
#> ℹ The deprecated feature was likely used in the mlr3viz package.
#>   Please report the issue at <https://github.com/mlr-org/mlr3viz/issues>.

Created on 2022-11-25 with reprex v2.0.2
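
The replacement suggested by the warning can be sketched on a small standalone example. The data frame below is toy data, and this is not the actual mlr3viz source.

```r
library(ggplot2)

# Toy data for illustration only.
df = data.frame(response = c("spam", "nonspam", "spam", "spam"))

# Deprecated since ggplot2 3.4.0: aes(y = ..count..)
# Current notation: wrap the computed variable in after_stat().
p = ggplot(df, aes(x = response)) +
  geom_bar(aes(y = after_stat(count)))
```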

Make autoplot more modular

We should set up a dictionary with plot functions so that we can easily extend the provided functionality.
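
A rough base R sketch of the idea, assuming a simple environment serves as the dictionary. All names here (plot_registry, register_plot, draw) are hypothetical and not part of mlr3viz, and the registered functions only compute plot data so that the example runs without a graphics device.

```r
# A plain environment used as a dictionary from plot type to plot function.
plot_registry = new.env()

# Add (or overwrite) a plot function under the given type.
register_plot = function(type, fun) {
  stopifnot(is.character(type), length(type) == 1L, is.function(fun))
  assign(type, fun, envir = plot_registry)
  invisible(type)
}

# Look up the plot function for `type` and call it on `object`.
draw = function(object, type, ...) {
  if (!exists(type, envir = plot_registry, inherits = FALSE)) {
    stop(sprintf("Unknown plot type '%s'", type))
  }
  get(type, envir = plot_registry, inherits = FALSE)(object, ...)
}

register_plot("histogram", function(object, ...) hist(object, plot = FALSE, ...))
register_plot("boxplot", function(object, ...) boxplot(object, plot = FALSE, ...))

x = c(1, 2, 2, 3, 5, 8)
h = draw(x, "histogram")
```

New plot types can then be added with another register_plot() call, without touching the dispatch code.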

autoplot.PredictionRegr type xy, residual plot instead?

The current autoplot for regression generates this plot:

library(mlr3)
library(mlr3learners)
library(mlr3viz)
task3 = tsk("boston_housing")
learner = lrn("regr.ranger", min.node.size = 50)$train(task3)
p = learner$predict(task3)
autoplot(p)

(image: current plot)
but the following would be more helpful:
(image: suggested plot)

Is there any advantage of the plot above over the suggested one? The only one I can think of is that the latter is a tiny bit harder to understand. But if we want to keep the first one, we should at least add a y=x line.
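
Both variants can be sketched with plain ggplot2. The example below uses a linear model on mtcars instead of the ranger learner to stay dependency-free, and the axis assignment is an assumption that may differ from what autoplot() draws.

```r
library(ggplot2)

# Stand-in model: linear regression on a built-in data set.
fit = lm(mpg ~ wt, data = mtcars)
data = data.frame(truth = mtcars$mpg, response = unname(fitted(fit)))

# Variant 1: truth vs. response with a dashed y = x reference line.
p_xy = ggplot(data, aes(x = response, y = truth)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed")

# Variant 2: residuals vs. response with a dashed zero line.
data$residual = data$truth - data$response
p_resid = ggplot(data, aes(x = response, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed")
```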
