mlr-org / mlr3viz

Visualizations for mlr3

Home Page: https://mlr3viz.mlr-org.com

License: GNU Lesser General Public License v3.0

R 100.00%

mlr3 visualization r ggplot2 visualizations r-package

mlr3viz's Introduction

mlr3viz

Package website: release | dev

mlr3viz is the visualization package of the mlr3 ecosystem. It features plots for mlr3 objects such as tasks, learners, predictions, benchmark results, tuning instances and filters via the autoplot() generic of ggplot2. The package draws plots with the viridis color palette and applies the minimal theme. Visualizations include barplots, boxplots, histograms, ROC curves, and Precision-Recall curves.

The gallery features a showcase post of the plots in mlr3viz.

Installation

Install the latest release from CRAN:

install.packages("mlr3viz")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3viz")

Resources

The gallery features a showcase post of the visualization functions in mlr3viz.

Short Demo

library(mlr3)
library(mlr3viz)

task = tsk("pima")
learner = lrn("classif.rpart", predict_type = "prob")
rr = resample(task, learner, rsmp("cv", folds = 3), store_models = TRUE)

# Default plot for task
autoplot(task, type = "target")

# ROC curve for resample result
autoplot(rr, type = "roc")

For more example plots you can have a look at the pkgdown references of the respective functions.

mlr3viz's People

tpielok imarcello wulixin damirpolat minghao2016 henrifnk lorenzwalthert ck37

mlr3viz's Issues

`autoplot()` for cluster learner fails

Description

Source: "Introducing mlr3cluster: Cluster Analysis Package" gallery post

@damirpolat Would you mind having a look?

Reproducible example

library(mlr3)
library(mlr3cluster)
library(mlr3viz)

task <- mlr_tasks$get("usarrests")
learner <- mlr_learners$get("clust.agnes")
learner$train(task)

# Simple dendrogram
# autoplot(learner)

# More advanced options from `factoextra::fviz_dend`
autoplot(learner,
  k = learner$param_set$values$k, rect_fill = TRUE,
  rect = TRUE, rect_border = c("red", "cyan")
)
#> Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
#> "none")` instead.
#> Error in if (color == "cluster") color <- "default": the condition has length > 1

Created on 2022-07-09 by the reprex package (v2.0.1)

Session info

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Monterey 12.4
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Zurich
#>  date     2022-07-09
#>  pandoc   2.18 @ /opt/homebrew/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version     date (UTC) lib source
#>  abind            1.4-5       2016-07-21 [1] CRAN (R 4.2.0)
#>  assertthat       0.2.1       2019-03-21 [1] CRAN (R 4.2.0)
#>  backports        1.4.1       2021-12-13 [1] CRAN (R 4.2.0)
#>  broom            1.0.0       2022-07-01 [1] CRAN (R 4.2.0)
#>  car              3.1-0       2022-06-15 [1] CRAN (R 4.2.0)
#>  carData          3.0-5       2022-01-06 [1] CRAN (R 4.2.0)
#>  checkmate        2.1.0       2022-04-21 [1] CRAN (R 4.2.0)
#>  cli              3.3.0       2022-04-25 [1] CRAN (R 4.2.0)
#>  clue             0.3-61      2022-05-30 [1] CRAN (R 4.2.0)
#>  cluster          2.1.3       2022-03-28 [1] CRAN (R 4.2.0)
#>  clusterCrit      1.2.8       2018-07-26 [1] CRAN (R 4.2.0)
#>  codetools        0.2-18      2020-11-04 [3] CRAN (R 4.2.1)
#>  colorspace       2.0-3       2022-02-21 [1] CRAN (R 4.2.0)
#>  crayon           1.5.1       2022-03-26 [1] CRAN (R 4.2.0)
#>  data.table       1.14.2      2021-09-27 [1] CRAN (R 4.2.0)
#>  DBI              1.1.3       2022-06-18 [1] CRAN (R 4.2.0)
#>  dendextend       1.16.0      2022-07-04 [1] CRAN (R 4.2.0)
#>  digest           0.6.29      2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr            1.0.9       2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis         0.3.2       2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate         0.15        2022-02-18 [1] CRAN (R 4.2.0)
#>  factoextra       1.0.7       2020-04-01 [1] CRAN (R 4.2.0)
#>  fansi            1.0.3       2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap          1.1.0       2021-01-25 [1] CRAN (R 4.2.0)
#>  fs               1.5.2       2021-12-08 [1] CRAN (R 4.2.0)
#>  future           1.26.1      2022-05-27 [1] CRAN (R 4.2.0)
#>  generics         0.1.3       2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2          3.3.6       2022-05-03 [1] CRAN (R 4.2.0)
#>  ggpubr           0.4.0       2020-06-27 [1] CRAN (R 4.2.0)
#>  ggrepel          0.9.1       2021-01-15 [1] CRAN (R 4.2.0)
#>  ggsignif         0.6.3       2021-09-09 [1] CRAN (R 4.2.0)
#>  globals          0.15.1      2022-06-24 [1] CRAN (R 4.2.0)
#>  glue             1.6.2       2022-02-24 [1] CRAN (R 4.2.0)
#>  gridExtra        2.3         2017-09-09 [1] CRAN (R 4.2.0)
#>  gtable           0.3.0       2019-03-25 [1] CRAN (R 4.2.0)
#>  highr            0.9         2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools        0.5.2       2021-08-25 [1] CRAN (R 4.2.0)
#>  knitr            1.39        2022-04-26 [1] CRAN (R 4.2.0)
#>  lgr              0.4.3       2021-09-16 [1] CRAN (R 4.2.0)
#>  lifecycle        1.0.1       2021-09-24 [1] CRAN (R 4.2.0)
#>  listenv          0.8.0       2019-12-05 [1] CRAN (R 4.2.0)
#>  magrittr         2.0.3       2022-03-30 [1] CRAN (R 4.2.0)
#>  mlr3           * 0.13.3-9000 2022-07-06 [1] Github (mlr-org/mlr3@f753fd9)
#>  mlr3cluster    * 0.1.3       2022-07-09 [1] Github (mlr-org/mlr3cluster@67eed02)
#>  mlr3misc         0.10.0      2022-01-11 [1] CRAN (R 4.2.0)
#>  mlr3viz        * 0.5.9       2022-05-30 [1] local
#>  munsell          0.5.0       2018-06-12 [1] CRAN (R 4.2.0)
#>  palmerpenguins   0.1.0       2020-07-23 [1] CRAN (R 4.2.0)
#>  paradox          0.9.0.9000  2022-07-06 [1] Github (mlr-org/paradox@3ce3c77)
#>  parallelly       1.32.0      2022-06-07 [1] CRAN (R 4.2.0)
#>  pillar           1.7.0       2022-02-01 [1] CRAN (R 4.2.0)
#>  pkgconfig        2.0.3       2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr            0.3.4       2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache          0.15.0      2021-04-30 [1] CRAN (R 4.2.0)
#>  R.methodsS3      1.8.2       2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo             1.25.0      2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils          2.12.0      2022-06-28 [1] CRAN (R 4.2.0)
#>  R6               2.5.1       2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp             1.0.8.3     2022-03-17 [1] CRAN (R 4.2.0)
#>  rematch2         2.1.2       2020-05-01 [1] CRAN (R 4.2.0)
#>  reprex           2.0.1       2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang            1.0.3       2022-06-27 [1] CRAN (R 4.2.0)
#>  rmarkdown        2.14        2022-04-25 [1] CRAN (R 4.2.0)
#>  rstatix          0.7.0       2021-02-13 [1] CRAN (R 4.2.0)
#>  scales           1.2.0       2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo      1.2.2       2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi          1.7.6       2021-11-29 [1] CRAN (R 4.2.0)
#>  stringr          1.4.0       2019-02-10 [1] CRAN (R 4.2.0)
#>  styler           1.7.0       2022-03-13 [1] CRAN (R 4.2.0)
#>  tibble           3.1.7       2022-05-03 [1] CRAN (R 4.2.0)
#>  tidyr            1.2.0       2022-02-01 [1] CRAN (R 4.2.0)
#>  tidyselect       1.1.2       2022-02-21 [1] CRAN (R 4.2.0)
#>  utf8             1.2.2       2021-07-24 [1] CRAN (R 4.2.0)
#>  uuid             1.1-0       2022-04-19 [1] CRAN (R 4.2.0)
#>  vctrs            0.4.1       2022-04-13 [1] CRAN (R 4.2.0)
#>  viridis          0.6.2       2021-10-13 [1] CRAN (R 4.2.0)
#>  viridisLite      0.4.0       2021-04-13 [1] CRAN (R 4.2.0)
#>  withr            2.5.0       2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun             0.31        2022-05-10 [1] CRAN (R 4.2.0)
#>  yaml             2.3.5       2022-02-21 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Users/pjs/Library/R/arm64/4.2/library
#>  [2] /opt/R/4.2.1-arm64/Resources/site-library
#>  [3] /opt/R/4.2.1-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

plot for feature importance

Would it be possible to add a plot to visualize feature importances?

I have created such a function here, that takes the named feature importance vector and returns a barplot.
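A minimal sketch of what such a helper could look like, assuming the importance scores arrive as a named numeric vector (for example, the result of `learner$importance()` for learners that support it). The function names `importance_table()` and `plot_importance()` are hypothetical and not part of mlr3viz, and the values are made up.

```r
library(ggplot2)

# Hypothetical helper: turn a named importance vector into a sorted table.
importance_table = function(importance) {
  stopifnot(is.numeric(importance), !is.null(names(importance)))
  tab = data.frame(
    feature = names(importance),
    importance = unname(importance),
    stringsAsFactors = FALSE
  )
  tab = tab[order(tab$importance, decreasing = TRUE), ]
  # Fix the factor levels so that ggplot2 keeps the sorted order.
  tab$feature = factor(tab$feature, levels = rev(tab$feature))
  rownames(tab) = NULL
  tab
}

# Hypothetical helper: horizontal barplot of the sorted importances.
plot_importance = function(importance) {
  tab = importance_table(importance)
  ggplot(tab, aes(x = feature, y = importance)) +
    geom_col() +
    coord_flip() +
    labs(x = "Feature", y = "Importance") +
    theme_minimal()
}

# Toy values, invented for illustration only.
imp = c(glucose = 0.42, age = 0.17, mass = 0.31)
tab = importance_table(imp)
p = plot_importance(imp)
```

Reversing the factor levels before `coord_flip()` puts the most important feature at the top of the chart.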

Add default plot for regr.km

precision-recall (and ROC) curve shows different values than the score function

When plotting the precision-recall curve (same problem with the ROC curve), the displayed values differ from the performance estimates computed with the score() function.
Is this a bug or did I miss any obvious mistake on my side?

library(mlr3verse)
library(ggplot2)
set.seed(3)

task = tsk("german_credit")
learner = lrn("classif.log_reg", predict_type = "prob")
rr = resample(task, learner, rsmp("holdout"))
p = rr$prediction()

p$set_threshold(0.5)
s = p$score(measures = list(msr("classif.precision"), msr("classif.recall")))

autoplot(p, type = "prc") + 
  geom_segment(aes(x = 0, y = s[1], xend = s[2], yend = s[1])) +
  geom_segment(aes(x = s[2], y = 0, xend = s[2], yend = s[1]))

> s
classif.precision    classif.recall 
        0.7984496         0.8841202

> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] mlr3_0.11.0     ggplot2_3.3.3   mlr3verse_0.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6           paradox_0.7.1        lattice_0.20-41     
 [4] listenv_0.8.0        palmerpenguins_0.1.0 assertthat_0.2.1    
 [7] digest_0.6.27        utf8_1.2.1           parallelly_1.24.0   
[10] R6_2.5.0             mlr3measures_0.3.1   backports_1.2.1     
[13] evaluate_0.14        pillar_1.5.1         rlang_0.4.10        
[16] mlr3fselect_0.5.1    uuid_0.1-4           rstudioapi_0.13     
[19] data.table_1.14.0    distr6_1.5.0         Matrix_1.3-2        
[22] checkmate_2.0.0      rmarkdown_2.7        labeling_0.4.2      
[25] splines_4.0.4        mlr3pipelines_0.3.4  munsell_0.5.0       
[28] compiler_4.0.4       set6_0.2.1           xfun_0.22           
[31] pkgconfig_2.0.3      globals_0.14.0       mlr3tuning_0.8.0    
[34] htmltools_0.5.1.1    tidyselect_1.1.0     tibble_3.1.0        
[37] mlr3data_0.3.1       lgr_0.4.2            mlr3cluster_0.1.1   
[40] mlr3misc_0.8.0       codetools_0.2-18     clusterCrit_1.2.8   
[43] fansi_0.4.2          future_1.21.0        crayon_1.4.1        
[46] dplyr_1.0.5          withr_2.4.1          grid_4.0.4          
[49] gtable_0.3.0         lifecycle_1.0.0      magrittr_2.0.1      
[52] scales_1.1.1         mlr3learners_0.4.5   mlr3proba_0.3.2     
[55] future.apply_1.7.0   farver_2.1.0         mlr3viz_0.5.3       
[58] renv_0.13.1          mlr3filters_0.4.1    ellipsis_0.3.1      
[61] bbotk_0.3.2          vctrs_0.3.6          generics_0.1.0      
[64] tools_4.0.4          glue_1.4.2           purrr_0.3.4         
[67] parallel_4.0.4       survival_3.2-7       yaml_2.2.1          
[70] clue_0.3-58          colorspace_2.0-0     cluster_2.1.0       
[73] R62S3_1.4.1          knitr_1.31           precrec_0.12.1

Add plots for TuningInstances

Blurring when plotting densities

There should be some blurring when plotting densities in autoplot such that the density in the background is also visible. I have seen this problem here: https://mlr3book.mlr-org.com/tasks.html in 2.2.5

There is also some overlapping of numbers here: https://mlr3book.mlr-org.com/train-predict.html in 2.4.6

Problem with plot_learner_prediction: "Objekt 'response' nicht gefunden" (object 'response' not found)

We are currently in the i2ml class and some students run into an error when running this code. On my machine it works. We were not able to figure out what the reason is.

Please help 😬 🆘

library(mlbench)
#> Warning: Paket 'mlbench' wurde unter R Version 3.6.3 erstellt
library(mlr3)
library(mlr3learners)
library(mlr3viz)
#> Warning: Paket 'mlr3viz' wurde unter R Version 3.6.3 erstellt
library(ggplot2)
library(grid)

set.seed(123L)

spirals <- data.frame(mlbench.spirals(n = 500, sd = 0.1))

spirals_task <- TaskClassif$new(id = "spirals", backend = spirals, target = "classes")

lda_learner <-  lrn("classif.lda", predict_type = "prob")

plot_learner_prediction(lda_learner, spirals_task)
#> INFO  [14:36:05.013] Applying learner 'classif.lda' on task 'spirals' (iter 1/1)
#> Error in eval(bysub, x, parent.frame()): Objekt 'response' nicht gefunden

Created on 2020-03-17 by the reprex package (v0.3.0)

Session info

sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
#> [5] LC_TIME=German_Germany.1252    
#> 
#> attached base packages:
#> [1] grid      stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] ggplot2_3.2.1           mlr3viz_0.1.1           mlr3learners_0.1.6-9000
#> [4] mlr3_0.1.8-9000         mlbench_2.1-1          
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.2         compiler_3.6.1     pillar_1.4.2      
#>  [4] highr_0.8          mlr3misc_0.1.8     tools_3.6.1       
#>  [7] digest_0.6.22      uuid_0.1-4         evaluate_0.14     
#> [10] tibble_2.1.3       checkmate_1.9.4    gtable_0.3.0      
#> [13] pkgconfig_2.0.3    rlang_0.4.1        yaml_2.2.0        
#> [16] xfun_0.10          withr_2.1.2        stringr_1.4.0     
#> [19] dplyr_0.8.3        knitr_1.25         tidyselect_0.2.5  
#> [22] glue_1.3.1         data.table_1.12.6  R6_2.4.0          
#> [25] rmarkdown_1.17     purrr_0.3.3        lgr_0.3.3         
#> [28] magrittr_1.5       mlr3measures_0.1.2 MASS_7.3-51.4     
#> [31] backports_1.1.5    scales_1.0.0       htmltools_0.4.0   
#> [34] assertthat_0.2.1   colorspace_1.4-1   paradox_0.1.0     
#> [37] stringi_1.4.3      lazyeval_0.2.2     munsell_0.5.0     
#> [40] crayon_1.3.4

ROC curves under diagonal with benchmark mlr3

The `autoplot` function returns something wrong, e.g. a ROC curve under the diagonal.

library(mlr3verse)
library(mlr3viz)
task = tsk("german_credit")
design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"),
                  predict_type = "prob", predict_sets = c("train", "test")),
  resamplings = rsmps("cv", folds = 3)
)

### Roc curve under diagonal ----------------------------
bmr = benchmark(design)
autoplot(bmr, type = "roc")

### Roc curve above diagonal ----------------------------
learner = lrn("classif.rpart", predict_type = "prob")
pred = learner$train(task)$predict(task)
autoplot(pred, type = "roc")

Possibly the error is here:

as_precrec.BenchmarkResult = function(object) {
  # ... (rest of the function is elided in the report)
  posclass = levels(data$labels)[1L] # returns NULL
}

Add default plot for regr.rpart/classif.rpart/surv.rpart

I'd suggest using ggtree for this.

Plot benchmark result and identical learner with different hyperpars

This came up today and was confusing for a few people:

library(mlr3)
library(mlr3learners) # provides classif.log_reg and classif.kknn
library(mlr3viz)
lrn_list = list(
  lrn("classif.log_reg", predict_type = "prob"),
  lrn("classif.kknn", predict_type = "prob", k = 3),
  lrn("classif.kknn", predict_type = "prob", k = 10)
)

bm_design = benchmark_grid(task = tsk("sonar"), resamplings = rsmp("cv", folds = 10), learners = lrn_list)
bench_result = benchmark(bm_design)
autoplot(bench_result)

Both knn learners are grouped together. I know that this can be fixed by setting ids, but it is still easy to overlook, as two different learners are now combined in a single boxplot.

IMO there should be three boxplots, two of them with the same name (maybe do the faceting by the learner hash and overwrite the labels with the learner id afterward?).
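The difference between grouping by id and grouping by hash can be sketched in base R on a toy score table. The column names `learner_id` and `learner_hash`, the hash strings, and all scores are assumptions made up for illustration; this is not the code mlr3viz uses.

```r
# Toy score table, values invented for illustration.
scores = data.frame(
  learner_id = rep(c("classif.log_reg", "classif.kknn", "classif.kknn"), each = 2),
  learner_hash = rep(c("a1", "b2", "c3"), each = 2),
  classif.ce = c(0.25, 0.27, 0.18, 0.20, 0.14, 0.16),
  stringsAsFactors = FALSE
)

# Grouping by id merges both kknn configurations into one group.
by_id = split(scores$classif.ce, scores$learner_id)

# Grouping by hash keeps them apart; labels are mapped back afterwards.
by_hash = split(scores$classif.ce, scores$learner_hash)
labels = scores$learner_id[match(names(by_hash), scores$learner_hash)]

length(by_id)   # number of groups when grouping by id
length(by_hash) # number of groups when grouping by hash
labels
```

Grouping on the hash yields one group per configured learner, and the duplicated label makes it visible that two boxplots belong to the same learner id.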

Add default plot for classif.ranger/regr.ranger

Documentation on autoplot.PredictionRegr

Another question from the I2ML students:

In the plot we see a blue and grey line which do not necessarily overlap. The documentation does not say which line is which (nor does the book). Can you add this info and let us know?

Rename to mlr3vis?

Because "vizualization" is not a valid word.
This also applies to DESCRIPTION.

Incorrect count label positions in "2.4.6 Plotting Predictions" bar chart?

As seen in:
https://mlr3book.mlr-org.com/mlr3book.pdf

Add default plot for classif.log_reg

Autoplot for Benchmarking (Change order of boxplots/tasks)

Hi!

When plotting my benchmark results, I would like to change the order of the boxplots from alphabetical order (the default in the autoplot function?) to a different order.
Until recently, I was able to reorder the boxplots of the respective tasks as shown in the code below.
Now, however, changing the levels of the task_id factor variable only changes the order of the titles, not the boxplots themselves.

How can I change the order of the boxplots?
Any help highly appreciated! Thanks a lot


library(mlr3)
library(mlr3viz)

tasks = tsks(c("pima", "sonar"))
learner = lrns(c("classif.featureless", "classif.rpart"),
               predict_type = "prob")
resampling = rsmps("cv")
bmr = benchmark(benchmark_grid(tasks, learner, resampling))

# AUC
bmr_boxplots_auc <- autoplot(bmr, measure = msr("classif.auc")) +
  ggplot2::theme(axis.text.x = ggplot2::element_text(hjust = .5)) +
  ggplot2::scale_x_discrete(labels = c("FL", "RPART")) + # one label per learner
  ggplot2::ylab("AUC") +
  ggplot2::xlab("Learner") +
  ggplot2::theme_bw() +
  ggplot2::theme(panel.grid = ggplot2::element_blank(),
                 strip.background = ggplot2::element_blank()
  )
bmr_boxplots_auc
bmr_boxplots_auc$data$task_id <- as.factor(bmr_boxplots_auc$data$task_id) 
levels(bmr_boxplots_auc$data$task_id) = c("sonar", "pima")
bmr_boxplots_auc
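One base R detail may be relevant here, although the issue does not confirm it as the cause: assigning to `levels()` renames the existing levels instead of reordering them. The sketch below uses a toy factor, not the plot data from the report.

```r
task_id = factor(c("pima", "pima", "sonar", "sonar"))

# Assigning to levels() renames the levels in place: every former "pima"
# row is now labelled "sonar" and vice versa.
renamed = task_id
levels(renamed) = c("sonar", "pima")
as.character(renamed)

# factor(..., levels = ) changes the order and leaves each row's value alone.
reordered = factor(task_id, levels = c("sonar", "pima"))
as.character(reordered)
levels(reordered)
```

If reordering is the goal, the `factor(..., levels = )` form is the one that keeps the data attached to the right task name.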

Add default plot for classif.svm/regr.svm

Implement autoplot.Task(, type = "corrplot")

Inverse ROC

Thanks for a really great and useful package!

It seems that something might have changed in the way that the ROC curve is plotted using the autoplot() function - it's currently inverse of what you would expect whenever I use autoplot(rr, type = "roc").

If you take a look at the mlr3 book it also seems that it is happening here https://mlr3book.mlr-org.com/benchmarking.html

Add default plot for classif.xgboost/regr.xgboost

Not working with R 3.6.1

Error message from install.packages():

package ‘mlr3viz’ is not available (for R version 3.6.1)

Add default plot for classif.multinom

Add default plot for regr.lm

Drop reshape2 dependency

Pulled in via #97, but should be replaced before release.

Incorrect ordering of x-labels in autoplot.BenchmarkResult

library(mlr3learners)
library(mlr3viz)
library(mlr3)
task = tsk("boston_housing")

learners = lrns(c("regr.featureless", "regr.rpart", "regr.ranger"))
measure = msr("regr.rmse")
set.seed(1)
bmr = benchmark(benchmark_grid(task, learners, rsmp("cv", folds = 3)))
bmr$score(measure)[,c("learner_id", "regr.rmse")]
#>          learner_id regr.rmse
#> 1: regr.featureless  8.455316
#> 2: regr.featureless 10.222279
#> 3: regr.featureless  8.974639
#> 4:       regr.rpart  1.729365
#> 5:       regr.rpart  1.811437
#> 6:       regr.rpart  1.608334
#> 7:      regr.ranger  1.636241
#> 8:      regr.ranger  2.125320
#> 9:      regr.ranger  1.901388
mlr3viz:::autoplot.BenchmarkResult(bmr, measure = measure)

Created on 2020-10-01 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.3      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-10-01                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date       lib source                               
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                       
#>  backports      1.1.9      2020-08-24 [1] CRAN (R 4.0.2)                       
#>  callr          3.4.4      2020-09-07 [1] CRAN (R 4.0.2)                       
#>  checkmate      2.0.0      2020-02-06 [1] CRAN (R 4.0.0)                       
#>  cli            2.0.2      2020-02-28 [1] CRAN (R 4.0.0)                       
#>  codetools      0.2-16     2018-12-24 [1] CRAN (R 4.0.2)                       
#>  colorspace     1.4-1      2019-03-18 [1] CRAN (R 4.0.0)                       
#>  crayon         1.3.4      2017-09-16 [1] CRAN (R 4.0.0)                       
#>  curl           4.3        2019-12-02 [1] CRAN (R 4.0.0)                       
#>  data.table     1.13.0     2020-07-24 [1] CRAN (R 4.0.2)                       
#>  desc           1.2.0      2018-05-01 [1] CRAN (R 4.0.0)                       
#>  devtools       2.3.0      2020-04-10 [1] CRAN (R 4.0.0)                       
#>  digest         0.6.25     2020-02-23 [1] CRAN (R 4.0.0)                       
#>  dplyr          1.0.2      2020-08-18 [1] CRAN (R 4.0.2)                       
#>  ellipsis       0.3.1      2020-05-15 [1] CRAN (R 4.0.2)                       
#>  evaluate       0.14       2019-05-28 [1] CRAN (R 4.0.0)                       
#>  fansi          0.4.1      2020-01-08 [1] CRAN (R 4.0.0)                       
#>  farver         2.0.3      2020-01-16 [1] CRAN (R 4.0.0)                       
#>  fs             1.4.1      2020-04-04 [1] CRAN (R 4.0.0)                       
#>  future         1.18.0     2020-07-09 [1] CRAN (R 4.0.2)                       
#>  future.apply   1.6.0      2020-07-01 [1] CRAN (R 4.0.2)                       
#>  generics       0.0.2      2018-11-29 [1] CRAN (R 4.0.0)                       
#>  ggplot2        3.3.2      2020-06-19 [1] CRAN (R 4.0.2)                       
#>  globals        0.12.5     2019-12-07 [1] CRAN (R 4.0.0)                       
#>  glue           1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                       
#>  gtable         0.3.0      2019-03-25 [1] CRAN (R 4.0.0)                       
#>  highr          0.8        2019-03-20 [1] CRAN (R 4.0.0)                       
#>  htmltools      0.4.0      2019-10-04 [1] CRAN (R 4.0.0)                       
#>  httr           1.4.1      2019-08-05 [1] CRAN (R 4.0.0)                       
#>  knitr          1.29       2020-06-23 [1] CRAN (R 4.0.2)                       
#>  labeling       0.3        2014-08-23 [1] CRAN (R 4.0.0)                       
#>  lattice        0.20-41    2020-04-02 [1] CRAN (R 4.0.2)                       
#>  lgr            0.3.4      2020-03-20 [1] CRAN (R 4.0.0)                       
#>  lifecycle      0.2.0      2020-03-06 [1] CRAN (R 4.0.0)                       
#>  listenv        0.8.0      2019-12-05 [1] CRAN (R 4.0.0)                       
#>  magrittr       1.5        2014-11-22 [1] CRAN (R 4.0.0)                       
#>  Matrix         1.2-18     2019-11-27 [1] CRAN (R 4.0.2)                       
#>  memoise        1.1.0      2017-04-21 [1] CRAN (R 4.0.0)                       
#>  mime           0.9        2020-02-04 [1] CRAN (R 4.0.0)                       
#>  mlr3         * 0.6.0-9000 2020-09-14 [1] Github (mlr-org/mlr3@f7fb636)        
#>  mlr3learners * 0.3.0      2020-09-12 [1] Github (mlr-org/mlr3learners@50b4169)
#>  mlr3measures   0.2.0      2020-06-27 [1] CRAN (R 4.0.2)                       
#>  mlr3misc       0.5.0      2020-08-13 [1] CRAN (R 4.0.2)                       
#>  mlr3viz      * 0.2.0-9000 2020-09-10 [1] Github (mlr-org/mlr3viz@8d1f98c)     
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.0.0)                       
#>  paradox        0.4.0-9000 2020-09-09 [1] Github (mlr-org/paradox@e10d740)     
#>  pillar         1.4.6      2020-07-10 [1] CRAN (R 4.0.2)                       
#>  pkgbuild       1.1.0      2020-07-13 [1] CRAN (R 4.0.2)                       
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.0.0)                       
#>  pkgload        1.1.0      2020-05-29 [1] CRAN (R 4.0.2)                       
#>  prettyunits    1.1.1      2020-01-24 [1] CRAN (R 4.0.0)                       
#>  processx       3.4.4      2020-09-03 [1] CRAN (R 4.0.2)                       
#>  ps             1.3.4      2020-08-11 [1] CRAN (R 4.0.2)                       
#>  purrr          0.3.4      2020-04-17 [1] CRAN (R 4.0.0)                       
#>  R6             2.4.1      2019-11-12 [1] CRAN (R 4.0.0)                       
#>  ranger         0.12.1     2020-01-10 [1] CRAN (R 4.0.0)                       
#>  Rcpp           1.0.5      2020-07-06 [1] CRAN (R 4.0.0)                       
#>  remotes        2.1.1      2020-02-15 [1] CRAN (R 4.0.0)                       
#>  rlang          0.4.7      2020-07-09 [1] CRAN (R 4.0.2)                       
#>  rmarkdown      2.3        2020-06-18 [1] CRAN (R 4.0.2)                       
#>  rpart          4.1-15     2019-04-12 [1] CRAN (R 4.0.2)                       
#>  rprojroot      1.3-2      2018-01-03 [1] CRAN (R 4.0.0)                       
#>  scales         1.1.1      2020-05-11 [1] CRAN (R 4.0.2)                       
#>  sessioninfo    1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                       
#>  stringi        1.4.6      2020-02-17 [1] CRAN (R 4.0.0)                       
#>  stringr        1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                       
#>  testthat       2.3.2      2020-03-02 [1] CRAN (R 4.0.0)                       
#>  tibble         3.0.3      2020-07-10 [1] CRAN (R 4.0.2)                       
#>  tidyselect     1.1.0      2020-05-11 [1] CRAN (R 4.0.2)                       
#>  usethis        1.6.0      2020-04-09 [1] CRAN (R 4.0.0)                       
#>  uuid           0.1-4      2020-02-26 [1] CRAN (R 4.0.0)                       
#>  vctrs          0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                       
#>  withr          2.2.0      2020-04-20 [1] CRAN (R 4.0.0)                       
#>  xfun           0.16       2020-07-24 [1] CRAN (R 4.0.2)                       
#>  xml2           1.3.1      2020-04-09 [1] CRAN (R 4.0.0)                       
#>  yaml           2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                       
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Plot learner prediction broken

This is supposed to work, right?

library("mlr3")
library("mlr3viz")
gen = tgen("2dnormals")
gen$plot()

task = gen$generate(100)
plot_learner_prediction(lrn("classif.rpart"), task)

Move ggplot2 to Depends?

Currently I need to load ggplot2 before autoplot() becomes available.

I assume that all functions will use ggplot2 in the background anyways? Having it in Depends would trigger an auto-load when loading mlr3viz.

Add default plot for classif.naive_bayes

Threshold plot type in autoplot

Hello,

The threshold plot type in autoplot raises the error "Error: Unknown plot type 'threshold'". I was searching for a way to plot a LIFT curve and came across this issue.

library(mlr3)
library(mlr3viz)

task = tsk("spam")
learner = lrn("classif.rpart", predict_type = "prob")
object = learner$train(task)$predict(task)

head(fortify(object))
autoplot(object)
autoplot(object, type = "roc")
autoplot(object, type = "prc")
autoplot(object, type = "threshold") # Error

Best regards,
Mathieu

Add option to control show_cb argument in precrec objects

Would it be possible to control the show_cb argument of the autoplot function for precrec objects by passing it through the ellipsis?

Add default plot for classif.lda

roc curves with mlr3::autoplot() for benchmark with “holdout” resampling

Hi,

This SO question highlights a problem with autoplotting ROC curves from a benchmark result when holdout resampling is used.

If needed I can copy the example here?

All the best,

Milan

Testthat approach for plots

https://github.com/r-lib/vdiffr

LIFT curve

Hello,

I was wondering if there is a way to plot the LIFT curve associated with a binary classification model. It is possible to plot ROC curves using autoplot() but I could not find a way to plot LIFT curves.

Best regards,
Mathieu
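Until such a plot type exists, a cumulative lift curve can be computed by hand. The following is a base R sketch on invented scores; `lift_table()` is a hypothetical helper, not an mlr3viz function. With an mlr3 prediction, the scores would presumably be the positive-class column of `prediction$prob`.

```r
# Hypothetical helper: cumulative lift from scores and a logical truth vector.
lift_table = function(score, is_positive) {
  stopifnot(length(score) == length(is_positive), any(is_positive))
  ord = order(score, decreasing = TRUE)
  hits = cumsum(is_positive[ord])
  n = seq_along(ord)
  data.frame(
    fraction = n / length(ord),           # share of observations targeted
    lift = (hits / n) / mean(is_positive) # precision so far / base rate
  )
}

# Toy scores and labels, invented for illustration.
score = c(0.9, 0.8, 0.7, 0.4, 0.3, 0.2)
is_positive = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
lt = lift_table(score, is_positive)

plot(lt$fraction, lt$lift, type = "b",
     xlab = "Fraction of observations", ylab = "Cumulative lift")
abline(h = 1, lty = 2) # a random ranking has lift 1
```

By construction the curve ends at a lift of 1 once all observations are included.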

plot function for learners

Each learner should offer a simple plot function in its API.
This should be an S3 method plot.class in mlr3viz.
The function displays a "standard view" of the model, in ggplot, base R plot, or something else.
If someone creates a local learner or a learner in the mlr3 learners online repo, this function
can also be defined locally or there.
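A minimal sketch of the proposed pattern in base R. The class `toy_learner` and its constructor are hypothetical stand-ins for a real learner class; the point is only that a locally defined `plot.<class>` method is picked up by the `plot()` generic.

```r
# Hypothetical learner-like object with a stored model; the class name is made up.
new_toy_learner = function(x, y) {
  structure(list(model = lm(y ~ x), x = x, y = y), class = "toy_learner")
}

# S3 method: the "standard view" here is the data with the fitted line.
plot.toy_learner = function(x, ...) {
  plot(x$x, x$y, xlab = "x", ylab = "y", ...)
  abline(x$model)
  invisible(x)
}

set.seed(1)
x = 1:20
learner = new_toy_learner(x, 2 * x + rnorm(20))
plot(learner, main = "toy_learner: fitted line")
```

Because dispatch happens on the class attribute, a user who defines a new learner class can ship the matching plot method alongside it without touching the package that owns the generic.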

Add generic S3 counterparts for `autoplot()` functions

plotLearnerPrediction

In our intro2ml course we have made heavy use of plotLearnerPrediction() from mlr to visualize how certain classifiers perform and how their decision boundaries differ, but there is no equivalent feature in mlr3.

bmr plot broken

library(mlr3)
library(mlr3viz)

# benchmarking with benchmark_grid()
design = benchmark_grid(tsk("iris"), lrn("classif.rpart"), rsmp("cv", folds = 3))
bmr = benchmark(design)

design2 = benchmark_grid(tsk("sonar"), lrn("classif.featureless"), rsmp("cv", folds = 3))
bmr2 = benchmark(design2)

bmr$combine(bmr2)
plot(bmr)

This also happens in more realistic scenarios.

I tried to fix it here:

mlr3viz/R/BenchmarkResult.R

Line 57 in 777fb80

learner_labels = unique(tab$learner_id)

by adding names(learner_labels) = unique(tab$nr), which would work for the given example, but the test fails.
The test failure indicates that there is no bijective mapping between tab$nr and learner_labels. If there is no such mapping, this points to a further problem: the labels cannot be correct.
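Whether such a one-to-one mapping exists can be checked directly. The sketch below uses a toy table modelled on the description; the helper `is_bijective()` and all values are hypothetical.

```r
# Toy table modelled on the issue; `nr` and `learner_id` values are invented.
tab = data.frame(
  nr = c(1, 1, 1, 2, 2, 2),
  learner_id = c(rep("classif.rpart", 3), rep("classif.featureless", 3)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if each `a` pairs with exactly one `b` and vice versa.
is_bijective = function(a, b) {
  pairs = unique(data.frame(a = a, b = b, stringsAsFactors = FALSE))
  !anyDuplicated(pairs$a) && !anyDuplicated(pairs$b)
}

ok = is_bijective(tab$nr, tab$learner_id)

# Two experiments that share one learner id break the one-to-one mapping.
tab2 = rbind(tab, data.frame(nr = 3, learner_id = "classif.rpart",
                             stringsAsFactors = FALSE))
broken = is_bijective(tab2$nr, tab2$learner_id)
```

In the second table the label "classif.rpart" belongs to two different experiment numbers, which is the situation where labels keyed only by learner id become ambiguous.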

Fix CI for older R versions

Add default plot for classif.kknn/regr.kknn

We could visualize clusters for numeric 2d problems, so basically plot_learner_prediction(). Maybe not worth it.

ROC curve for `PredictionClassif` objects switches the classes.

Hi there,

consider the following example from the book: https://mlr3book.mlr-org.com/binary-classification.html

library(mlr3)
library(mlr3viz)


data("Sonar", package = "mlbench")
task = as_task_classif(Sonar, target = "Class", positive = "R")

sum(Sonar$Class == "R") #97 instances of positive class
#> [1] 97

learner = lrn("classif.rpart", predict_type = "prob")
pred = learner$train(task)$predict(task)

pred$confusion #87/97
#>         truth
#> response  R  M
#>        R 87 16
#>        M 10 95

pred$score(msr("classif.sensitivity"))
#> classif.sensitivity 
#>           0.8969072

autoplot(pred, type = "roc")

Created on 2021-07-01 by the reprex package (v2.0.0)

While there are 97 instances of the positive class, the roc curve plot suggests there are 111 instances of the positive class because it treats the negative class as the positive.

I think the problem is in the mlr3viz:::roc_data function, specifically the line

data.table(scores = prediction$prob[, 2L], labels = prediction$truth)

which implies the second column of the prob slot is the positive class, however in PredictionClassif objects the first column corresponds to the positive class:

head(pred$prob)
#output

             R         M
[1,] 0.8939394 0.1060606
[2,] 0.2666667 0.7333333
[3,] 1.0000000 0.0000000
[4,] 0.8939394 0.1060606
[5,] 0.0750000 0.9250000
[6,] 1.0000000 0.0000000
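The effect of positional indexing can be reproduced without mlr3. The sketch below rebuilds the first rows of the matrix above and contrasts selecting the second column with selecting by the positive class name; it illustrates the report and is not necessarily how the package addresses it.

```r
# Probabilities copied from the first three rows of the output above.
prob = matrix(
  c(0.8939394, 0.1060606,
    0.2666667, 0.7333333,
    1.0000000, 0.0000000),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("R", "M"))
)
positive = "R" # positive class as set in the task

by_position = prob[, 2L]   # picks "M" here, i.e. the negative class
by_name = prob[, positive] # picks the positive class regardless of column order

by_position
by_name
```

Indexing by name stays correct even if the column order of the probability matrix changes.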

This is most likely related to: #72

Thank you,

Milan

ggplot updates

Some ggplot-related code should be updated in this package, i.e. see example below:

library(mlr3verse)
#> Loading required package: mlr3

task = tsk('spam')
learner = lrn('classif.rpart')

p = learner$train(task)$predict(task)
autoplot(p)
#> Warning: The dot-dot notation (`..count..`) was deprecated in ggplot2 3.4.0.
#> ℹ Please use `after_stat(count)` instead.
#> ℹ The deprecated feature was likely used in the mlr3viz package.
#>   Please report the issue at <https://github.com/mlr-org/mlr3viz/issues>.

Created on 2022-11-25 with reprex v2.0.2
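The replacement that the warning asks for looks like this in isolation. The data frame is a toy example; which geom inside mlr3viz triggered the warning is not stated in the report, so the bar chart is only an assumption.

```r
library(ggplot2)

# Toy predictions, invented for illustration.
df = data.frame(response = c("spam", "nonspam", "spam", "spam", "nonspam"))

# Deprecated since ggplot2 3.4.0: aes(y = ..count..)
# Current form: aes(y = after_stat(count))
p = ggplot(df, aes(x = response)) +
  geom_bar(aes(y = after_stat(count), fill = response)) +
  theme_minimal()

# Build the layer data to inspect the computed counts.
counts = layer_data(p)$count
```

`after_stat()` is available from ggplot2 3.3.0 onwards, so switching to it requires at least that version.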

Apply guide how to use {ggplot2} in packages

From the devs themselves: https://ggplot2.tidyverse.org/dev/articles/ggplot2-in-packages.html

We should change our code to follow this design.

Add default plot for classif.glmnet/classif.cv_glmnet/regr.glmnet/regr.cv_glmnet/surv.glmnet/surv.cv_glmnet

There is a base R plot generic:

ROC curve for PredictionClassif objects switches the classes

This issue is related to: #72 #75

I am not sure the problem is really fixed. I still get ROC curves under the bisector after reinstalling the mlr3viz package, and also when using the development version.

Moreover, the example in the documentation might have to be updated:
https://mlr3viz.mlr-org.com/reference/autoplot.BenchmarkResult.html

Make autoplot more modular

We should set up a dictionary with plot functions so that we can easily extend the provided functionality.
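A rough sketch of what such a dictionary could look like, using a plain environment as the registry. All names (`plot_registry`, `register_plot()`, `dispatch_plot()`) are hypothetical, and base graphics stand in for the real plot functions.

```r
# Hypothetical registry: plot type -> function(data, ...).
plot_registry = new.env()

register_plot = function(type, fun) {
  stopifnot(is.character(type), length(type) == 1L, is.function(fun))
  assign(type, fun, envir = plot_registry)
  invisible(type)
}

dispatch_plot = function(data, type, ...) {
  if (!exists(type, envir = plot_registry, inherits = FALSE)) {
    stop(sprintf("Unknown plot type '%s'", type))
  }
  get(type, envir = plot_registry, inherits = FALSE)(data, ...)
}

register_plot("histogram", function(data, ...) hist(data, ...))
register_plot("boxplot", function(data, ...) boxplot(data, ...))

x = c(2.1, 3.4, 1.9, 5.0, 4.2)
dispatch_plot(x, "boxplot")

# An unregistered type fails with a clear message instead of a silent fallback.
res = tryCatch(dispatch_plot(x, "threshold"), error = function(e) conditionMessage(e))
```

New plot types are then added with a single `register_plot()` call rather than by extending a `switch()` inside each autoplot method.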

autoplot.PredictionRegr type xy, residual plot instead?

The current autoplot for regression generates this plot:

library(mlr3)
library(mlr3learners)
library(mlr3viz)
task3 = tsk("boston_housing")
learner = lrn("regr.ranger", min.node.size = 50)$train(task3)
p = learner$predict(task3)
autoplot(p)

but the following would be more helpful:

Is there any advantage of the plot above over the suggested one? The only one I can think of is that the latter is a tiny bit harder to understand. But if we want to keep the first one, we should at least add a y = x line.
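Both variants can be sketched with plain ggplot2 on toy data. The suggested plot is not preserved in this text, so the residual plot below is an assumption based on the issue title; the column names `truth` and `response` and all values are made up.

```r
library(ggplot2)

# Toy truth/response pairs, invented for illustration.
tab = data.frame(
  truth = c(10, 15, 20, 25, 30),
  response = c(12, 14, 21, 22, 31)
)
tab$residual = tab$truth - tab$response

# Truth vs. response with the identity line y = x as a reference.
p_xy = ggplot(tab, aes(x = response, y = truth)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed")

# Residual plot: points should scatter around the horizontal zero line.
p_resid = ggplot(tab, aes(x = response, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed")
```

With the identity line in place, points above the line are under-predicted and points below it are over-predicted, which is the same information the residual plot shows relative to zero.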

tsk = mlr_tasks$get("iris")
autoplot(tsk$clone()$filter(1:5), type = "pairs")
