
randomForestCI's Introduction

randomForestCI

🔴 This package is deprecated. Please use one of the following packages instead: 🔴

  • grf, which has built-in support for resampling-based confidence intervals, or
  • ranger, which has an actively maintained version of the infinitesimal jackknife for random forests.

🔴 Both packages are available from CRAN. 🔴

Confidence intervals for random forests using the infinitesimal jackknife, as developed by Efron (2014) and Wager et al. (2014).

To install this package in R, run the following commands:

install.packages("devtools")
library(devtools) 
install_github("swager/randomForestCI")

Example usage:

library(randomForest)
library(randomForestCI)

# Make some data...
n = 250
p = 100
X = matrix(rnorm(n * p), n, p)
Y = rnorm(n)
  
# Run the method
rf = randomForest(X, Y, keep.inbag = TRUE)
ij = randomForestInfJack(rf, X, calibrate = TRUE)

plot(ij)
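To turn the variance estimates into confidence intervals, a common recipe is y.hat ± 2 * sqrt(var.hat). A minimal base-R sketch, assuming (as the issues below suggest) that randomForestInfJack returns a data frame with columns y.hat and var.hat; the mock data frame here stands in for the real output:

```r
# Mock of the output shape suggested by the issues below: a data frame with
# a prediction (y.hat) and an infinitesimal-jackknife variance (var.hat).
ij <- data.frame(y.hat   = c(0.12, -0.35, 0.78),
                 var.hat = c(0.04,  0.09, 0.01))

# Approximate 95% confidence intervals: y.hat +/- 2 * sqrt(var.hat).
ij$se    <- sqrt(ij$var.hat)
ij$lower <- ij$y.hat - 2 * ij$se
ij$upper <- ij$y.hat + 2 * ij$se

print(ij)
```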

References

Efron, Bradley. Estimation and accuracy after model selection. Journal of the American Statistical Association, 109(507), 2014.

Wager, Stefan, Trevor Hastie, and Bradley Efron. Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. Journal of Machine Learning Research, 15(1), 2014.

randomForestCI's People

Contributors

alionaber, brianstock, swager, zmjones


randomForestCI's Issues

Question about Classification

Hi Professor Wager,

Thank you so much for your work. It has helped me a great deal.

I have read in your paper (Wager, Hastie, and Efron (2014)) that the infinitesimal jackknife can be applied to classification problems, such as in the email spam example. My question is: can 'randomForestCI' be used for this purpose?

Having built a random forest model to predict a categorical variable, I obtain one 'y.hat' and one 'var.hat' from running 'randomForestInfJack'. I expected a separate variance for the probability estimate of each class, so would you mind clarifying whether I may use this output in my case?

Thank you for your help.

Sincerely,
Jinzhao

prediction and variance for new data

I would like to obtain a variance estimate from randomForestInfJack for a new observation, i.e. one that was not in X when the random forest was built (randomForest(X, Y, keep.inbag = TRUE)). It is not clear to me whether 1) this is valid at all, or 2) the new observation should first be appended to the original data set X before running randomForestInfJack().
Dan O.

Standard error of transformed predictions using randomForestInfJack

I would like to get the standard error of predictions from a random forest trained on the target y = log(x + 1). The var.hat returned by randomForestInfJack is the variance of y, so se = sqrt(var.hat). Is there a way to get the variance of x from var(y), so that I can compute se(x)? Thank you.
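One standard way to approximate se(x) from var(y) is the delta method: since y = log(x + 1) means x = g(y) = exp(y) - 1 with g'(y) = exp(y), we get var(x) ≈ exp(2 · y.hat) · var(y). A base-R sketch; the y.hat and var.hat values below are made up for illustration, not output from the package:

```r
# Delta method: for x = g(y) with g(y) = exp(y) - 1 and g'(y) = exp(y),
# var(x) is approximately g'(y.hat)^2 * var(y).
y.hat   <- c(1.2, 2.5, 0.3)     # predictions on the log(x + 1) scale
var.hat <- c(0.02, 0.05, 0.01)  # IJ variance estimates on the same scale

x.hat <- exp(y.hat) - 1             # back-transformed predictions
se.x  <- exp(y.hat) * sqrt(var.hat) # approximate standard errors for x

cbind(x.hat, se.x)
```

Note this is a first-order approximation; it degrades when var.hat is large relative to the curvature of the transformation.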

randomForestInfJack question

I would like to supply a standard error and/or confidence intervals with predictions from a random forest. I found your package and have a simple question: does randomForestInfJack output the standard error or the variance in the var.hat column? Can one then just use the usual y.hat ± 2*sqrt(var.hat) to get a confidence interval? Thank you for your work and help.

Dan

CRAN?

Curious to know when/if this will end up on CRAN. I have a package which imports it (and your forked version of randomForest).

Interpolation problem

Hello,

I am using your package to get confidence intervals, but I have a problem when I calibrate my variance. I get the error:

Error in approx(x = calib.x, y = calib.y, xout = vars) :
need at least two non-NA values to interpolate

I don't understand the reason, so I tried changing ntree, but in vain.

Could someone help me, please?
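The error itself comes from stats::approx(), which the calibration step evidently calls (the error message names calib.x and calib.y): approx() needs at least two finite (x, y) pairs, so it fails when the calibration points collapse to fewer than two non-NA values. A base-R reproduction of the same error:

```r
# approx() needs at least two non-NA (x, y) pairs; with fewer it throws
# the exact error reported in this issue.
calib.x <- c(NA, NA, 1)
calib.y <- c(NA, NA, 2)

res <- tryCatch(
  approx(x = calib.x, y = calib.y, xout = 0.5),
  error = function(e) conditionMessage(e)
)
print(res)

# If this is what happens inside the calibration step, possible workarounds
# (untested suggestions, not confirmed fixes) are to pass calibrate = FALSE
# to randomForestInfJack, or to grow more trees so the calibration points
# are finite.
```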

NAs produced when converting classification predictions to numeric

In infinitesimalJackknife.R, the following code is supposed to convert classification predictions to numeric values:

predictions = predict(rf, newdata, predict.all = TRUE)
pred = predictions$individual
# in case of classification, convert character labels to numeric (!)
class(pred) = "numeric"

However, this produces NAs when I try it locally:

class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = T, prob = c(.3, .7)), nrow = 10)
head(class_matrix)
#>      [,1]  [,2]  [,3] 
#> [1,] "No"  "No"  "No" 
#> [2,] "No"  "No"  "Yes"
#> [3,] "No"  "No"  "No" 
#> [4,] "No"  "No"  "No" 
#> [5,] "Yes" "Yes" "Yes"
#> [6,] "Yes" "Yes" "Yes"
class(class_matrix) = "numeric"
#> Warning in class(class_matrix) = "numeric": NAs introduced by coercion
head(class_matrix)
#>      [,1] [,2] [,3]
#> [1,]   NA   NA   NA
#> [2,]   NA   NA   NA
#> [3,]   NA   NA   NA
#> [4,]   NA   NA   NA
#> [5,]   NA   NA   NA
#> [6,]   NA   NA   NA

Am I missing something about what this is supposed to do? Here is a quick-and-dirty way to convert a character matrix to numeric:

class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = T, prob = c(.3, .7)), nrow = 10)
numeric_matrix <- 1 * (class_matrix == class_matrix[1,1])
head(numeric_matrix)
#>      [,1] [,2] [,3]
#> [1,]    1    1    0
#> [2,]    0    1    0
#> [3,]    1    1    0
#> [4,]    1    1    0
#> [5,]    0    0    0
#> [6,]    0    0    0
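A slightly more robust variant of the conversion above pins the 0/1 coding to an explicitly chosen positive class, rather than to whatever label happens to sit in class_matrix[1, 1] (which flips the coding depending on the first draw). This is only an illustrative sketch, not the package's own fix:

```r
set.seed(1)
class_matrix <- matrix(sample(c("Yes", "No"), size = 30, replace = TRUE,
                              prob = c(.3, .7)), nrow = 10)

# Code against a fixed positive class so the 0/1 meaning never flips
# between runs or data sets.
positive_class <- "Yes"   # chosen explicitly, not inferred from the data
numeric_matrix <- (class_matrix == positive_class) * 1

head(numeric_matrix)
```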
