
randomForestCI's Introduction

randomForestCI

🔴 This package is deprecated. Please use one of the following packages instead: 🔴

  • grf, which has built-in support for resampling-based confidence intervals, or
  • ranger, which has an actively maintained version of the infinitesimal jackknife for random forests.

🔴 Both packages are available from CRAN. 🔴

Confidence intervals for random forests using the infinitesimal jackknife, as developed by Efron (2014) and Wager et al. (2014).

To install this package in R, run the following commands:

install.packages("devtools")
library(devtools) 
install_github("swager/randomForestCI")

Example usage:

library(randomForest)
library(randomForestCI)

# Make some data...
n = 250
p = 100
X = matrix(rnorm(n * p), n, p)
Y = rnorm(n)
  
# Run the method
rf = randomForest(X, Y, keep.inbag = TRUE)
ij = randomForestInfJack(rf, X, calibrate = TRUE)

plot(ij)
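To turn the variance estimates into confidence intervals, a common recipe is y.hat ± 2 * sqrt(var.hat). A minimal base-R sketch, assuming (as the issues below suggest) that randomForestInfJack returns a data frame with columns y.hat and var.hat; the mock data frame here stands in for the real output:

```r
# Mock of the output shape suggested by the issues below: a data frame with
# a prediction (y.hat) and an infinitesimal-jackknife variance (var.hat).
ij <- data.frame(y.hat   = c(0.12, -0.35, 0.78),
                 var.hat = c(0.04,  0.09, 0.01))

# Approximate 95% confidence intervals: y.hat +/- 2 * sqrt(var.hat).
ij$se    <- sqrt(ij$var.hat)
ij$lower <- ij$y.hat - 2 * ij$se
ij$upper <- ij$y.hat + 2 * ij$se

print(ij)
```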

References

Efron, Bradley. Estimation and accuracy after model selection. Journal of the American Statistical Association, 109(507), 2014.

Wager, Stefan, Trevor Hastie, and Bradley Efron. Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. Journal of Machine Learning Research, 15(1), 2014.

randomForestCI's People

Contributors

alionaber, brianstock, swager, zmjones


randomForestCI's Issues

Question about Classification

Hi Professor Wager,

Thank you so much for your work. It has helped me a great deal.

I have read in your paper (Wager, Hastie, and Efron (2014)) that the infinitesimal jackknife can be applied to classification problems, such as in the email spam example. My question is: can 'randomForestCI' be used for this purpose?

Having built a random forest model to predict a categorical variable, I obtain one 'y.hat' and one 'var.hat' from running 'randomForestInfJack'. I expected a separate variance for the probability estimate of each class, so would you mind clarifying whether I may use this output in my case?

Thank you for your help.

Sincerely,
Jinzhao

prediction and variance for new data

I would like to obtain a variance estimate from randomForestInfJack for a new observation, i.e. one that was not in X when the random forest was built (randomForest(X, Y, keep.inbag = TRUE)). It is not clear to me whether 1) this is valid at all, or 2) the new observation should first be appended to the original data set X before running randomForestInfJack().
Dan O.

Standard error of transformed predictions using randomForestInfJack

I would like to get the standard error of predictions from a random forest trained on the target y = log(x + 1). The var.hat returned by randomForestInfJack is the variance of y, so se = sqrt(var.hat). Is there a way to get the variance of x from var(y), so that I can compute se(x)? Thank you.
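One standard way to approximate se(x) from var(y) is the delta method: since y = log(x + 1) means x = g(y) = exp(y) - 1 with g'(y) = exp(y), we get var(x) ≈ exp(2 · y.hat) · var(y). A base-R sketch; the y.hat and var.hat values below are made up for illustration, not output from the package:

```r
# Delta method: for x = g(y) with g(y) = exp(y) - 1 and g'(y) = exp(y),
# var(x) is approximately g'(y.hat)^2 * var(y).
y.hat   <- c(1.2, 2.5, 0.3)     # predictions on the log(x + 1) scale
var.hat <- c(0.02, 0.05, 0.01)  # IJ variance estimates on the same scale

x.hat <- exp(y.hat) - 1             # back-transformed predictions
se.x  <- exp(y.hat) * sqrt(var.hat) # approximate standard errors for x

cbind(x.hat, se.x)
```

Note this is a first-order approximation; it degrades when var.hat is large relative to the curvature of the transformation.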

randomForestInfJack question

I would like to supply a standard error and/or confidence intervals with predictions from a random forest. I found your package and have a simple question: does randomForestInfJack output the standard error or the variance in the var.hat column? Can one then just use the usual y.hat ± 2*sqrt(var.hat) to get a confidence interval? Thank you for your work and help.

Dan

CRAN?

Curious to know when/if this will end up on CRAN. I have a package which imports it (and your forked version of randomForest).

Interpolation problem

Hello,

I am using your package to get confidence intervals, but I have a problem when I calibrate my variance. I get the error:

Error in approx(x = calib.x, y = calib.y, xout = vars) :
need at least two non-NA values to interpolate

I don't understand the reason, so I tried changing ntree, but in vain.

Could someone help me, please?
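The error itself comes from stats::approx(), which the calibration step evidently calls (the error message names calib.x and calib.y): approx() needs at least two finite (x, y) pairs, so it fails when the calibration points collapse to fewer than two non-NA values. A base-R reproduction of the same error:

```r
# approx() needs at least two non-NA (x, y) pairs; with fewer it throws
# the exact error reported in this issue.
calib.x <- c(NA, NA, 1)
calib.y <- c(NA, NA, 2)

res <- tryCatch(
  approx(x = calib.x, y = calib.y, xout = 0.5),
  error = function(e) conditionMessage(e)
)
print(res)

# If this is what happens inside the calibration step, possible workarounds
# (untested suggestions, not confirmed fixes) are to pass calibrate = FALSE
# to randomForestInfJack, or to grow more trees so the calibration points
# are finite.
```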

NAs produced when converting classification predictions to numeric

In infinitesimalJackknife.R, the following code is supposed to convert classification predictions to numeric values:

predictions = predict(rf, newdata, predict.all = TRUE)
pred = predictions$individual
# in case of classification, convert character labels to numeric (!)
class(pred) = "numeric"

However, this produces NAs when I try it locally:

class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = T, prob = c(.3, .7)), nrow = 10)
head(class_matrix)
#>      [,1]  [,2]  [,3] 
#> [1,] "No"  "No"  "No" 
#> [2,] "No"  "No"  "Yes"
#> [3,] "No"  "No"  "No" 
#> [4,] "No"  "No"  "No" 
#> [5,] "Yes" "Yes" "Yes"
#> [6,] "Yes" "Yes" "Yes"
class(class_matrix) = "numeric"
#> Warning in class(class_matrix) = "numeric": NAs introduced by coercion
head(class_matrix)
#>      [,1] [,2] [,3]
#> [1,]   NA   NA   NA
#> [2,]   NA   NA   NA
#> [3,]   NA   NA   NA
#> [4,]   NA   NA   NA
#> [5,]   NA   NA   NA
#> [6,]   NA   NA   NA

Am I missing something about what this is supposed to do? Here is a quick-and-dirty way to convert a character matrix to numeric:

class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = T, prob = c(.3, .7)), nrow = 10)
numeric_matrix <- 1 * (class_matrix == class_matrix[1,1])
head(numeric_matrix)
#>      [,1] [,2] [,3]
#> [1,]    1    1    0
#> [2,]    0    1    0
#> [3,]    1    1    0
#> [4,]    1    1    0
#> [5,]    0    0    0
#> [6,]    0    0    0
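A slightly more robust variant of the conversion above pins the 0/1 coding to an explicitly chosen positive class, rather than to whatever label happens to sit in class_matrix[1, 1] (which flips the coding depending on the first draw). This is only an illustrative sketch, not the package's own fix:

```r
set.seed(1)
class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = TRUE,
                              prob = c(.3, .7)), nrow = 10)

# Code against a fixed positive class so the 0/1 meaning never flips
# between runs or data sets.
positive_class <- "Yes"   # chosen explicitly, not inferred from the data
numeric_matrix <- (class_matrix == positive_class) * 1

head(numeric_matrix)
```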
