
pROC

An R package to display and analyze ROC curves.

For more information, see:

  1. Xavier Robin, Natacha Turck, Alexandre Hainard, et al. (2011) “pROC: an open-source package for R and S+ to analyze and compare ROC curves”. BMC Bioinformatics, 12, 77. DOI: 10.1186/1471-2105-12-77
  2. The official web page
  3. The CRAN page
  4. My blog
  5. The FAQ

Stable

The latest stable version is best installed from CRAN:

install.packages("pROC")

Getting started

If you don't want to read the manual first, try the following:

Loading

library(pROC)
data(aSAH)

Basic ROC / AUC analysis

roc(aSAH$outcome, aSAH$s100b)
roc(outcome ~ s100b, aSAH)

Smoothing

roc(outcome ~ s100b, aSAH, smooth=TRUE) 

More options: CI and plotting

roc1 <- roc(aSAH$outcome,
            aSAH$s100b, percent=TRUE,
            # arguments for auc
            partial.auc=c(100, 90), partial.auc.correct=TRUE,
            partial.auc.focus="sens",
            # arguments for ci
            ci=TRUE, boot.n=100, ci.alpha=0.9, stratified=FALSE,
            # arguments for plot
            plot=TRUE, auc.polygon=TRUE, max.auc.polygon=TRUE, grid=TRUE,
            print.auc=TRUE, show.thres=TRUE)

# Add to an existing plot. Beware of 'percent' specification!
roc2 <- roc(aSAH$outcome, aSAH$wfns,
            plot=TRUE, add=TRUE, percent=roc1$percent)

Coordinates of the curve

coords(roc1, "best", ret=c("threshold", "specificity", "1-npv"))
coords(roc2, "local maximas", ret=c("threshold", "sens", "spec", "ppv", "npv"))

Confidence intervals

# Of the AUC
ci(roc2)

# Of the curve
sens.ci <- ci.se(roc1, specificities=seq(0, 100, 5))
plot(sens.ci, type="shape", col="lightblue")
plot(sens.ci, type="bars")

# need to re-add roc2 over the shape
plot(roc2, add=TRUE)

# CI of thresholds
plot(ci.thresholds(roc2))

Comparisons

# Test on the whole AUC
roc.test(roc1, roc2, reuse.auc=FALSE)

# Test on a portion of the whole AUC
roc.test(roc1, roc2, reuse.auc=FALSE, partial.auc=c(100, 90),
         partial.auc.focus="se", partial.auc.correct=TRUE)

# With modified bootstrap parameters
roc.test(roc1, roc2, reuse.auc=FALSE, partial.auc=c(100, 90),
         partial.auc.correct=TRUE, boot.n=1000, boot.stratified=FALSE)

Sample size

# Two ROC curves
power.roc.test(roc1, roc2, reuse.auc=FALSE)
power.roc.test(roc1, roc2, power=0.9, reuse.auc=FALSE)

# One ROC curve
power.roc.test(auc=0.8, ncases=41, ncontrols=72)
power.roc.test(auc=0.8, power=0.9)
power.roc.test(auc=0.8, ncases=41, ncontrols=72, sig.level=0.01)
power.roc.test(ncases=41, ncontrols=72, power=0.9)

Getting Help

If you still can't find an answer in the manual or the FAQ, you can ask a question on Stack Overflow or report a bug on the GitHub issue tracker.

Development

Installing the development version

Download the source code from git, unzip it if necessary, and then type R CMD INSTALL pROC. Alternatively, you can use the devtools package by Hadley Wickham to automate the process (make sure you follow the full instructions to get started):

if (! requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("xrobin/pROC@develop")

Check

To run all automated tests and R checks, including slow tests:

cd .. # Run from parent directory
VERSION=$(grep Version pROC/DESCRIPTION | sed "s/.\+ //")
R CMD build pROC
RUN_SLOW_TESTS=true R CMD check pROC_$VERSION.tar.gz

Or from an R command prompt with devtools:

devtools::check()

Tests

To run automated tests only from an R command prompt:

run_slow_tests <- TRUE  # Optional, include slow tests
devtools::test()

vdiffr

The vdiffr package is used for visual tests of plots.

To run all the test cases (incl. slow ones) from the command line:

run_slow_tests <- TRUE
devtools::test() # Must run the new tests
testthat::snapshot_review()

To run the checks upon R CMD check, set environment variable NOT_CRAN=1:

NOT_CRAN=1 RUN_SLOW_TESTS=true R CMD check pROC_$VERSION.tar.gz

Release steps

  1. Update Version and Date in DESCRIPTION
  2. Update version and date in NEWS
  3. Get new version to release: VERSION=$(grep Version pROC/DESCRIPTION | sed "s/.\+ //") && echo $VERSION
  4. Build & check package: R CMD build pROC && R CMD check --as-cran pROC_$VERSION.tar.gz
  5. Check with slow tests: NOT_CRAN=1 RUN_SLOW_TESTS=true R CMD check pROC_$VERSION.tar.gz
  6. Check with R-devel: rhub::check_for_cran()
  7. Check reverse dependencies: revdepcheck::revdep_check(num_workers=8, timeout = as.difftime(60, units = "mins"))
  8. Merge into master: git checkout master && git merge develop
  9. Create a tag on master: git tag v$VERSION && git push --tags
  10. Submit to CRAN


pROC's Issues

ggroc with several aesthetics

Hi,

I am trying to plot two ROC curves in the same figure. I would like each to have a different colour as well as a different linetype. However, I can only do one at a time, not both simultaneously. That is, I can have different colours but the same linetype:

`ggroc(list(myrocglm, myrocrf), legacy.axes = T) + geom_abline(intercept = 0, slope = 1)`

or different linetypes but the same colour:

`ggroc(list(myrocglm, myrocrf), aes = "linetype", legacy.axes = T) + geom_abline(intercept = 0, slope = 1)`

And if I try to add the colour parameter to the function above, it works only for a single value, i.e. color = "red". For more values I get the following error:

"Error: Aesthetics must be either length 1 or the same as the data (353): colour"

Thanks,
John

Bug in calculating DeLong's Theta - delongPlacements(roc)

Describe the bug

While re-running a colleague's analysis with ci.auc(), I received the following error message:
pROC: error in calculating DeLong's theta: got 0.65441176470588235947 instead of 0.63622994652406417160. The message asked me to report the bug. (Sorry if this report is not perfect; it is my first bug report, written under time pressure.)

To Reproduce

  1. Session info - packages:

EDIT: posted the wrong list originally.

R version 3.5.0 (2018-04-23)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE Leap 42.3

Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] pROC_1.12.1 epiR_0.9-96 survival_2.41-3 tableone_0.9.3
[5] xtable_1.8-2 doBy_4.6-1 ggplot2_2.2.1 someR_1.5.1

  2. What command did you run?
     ci.auc()

  3. What data did you use? Use save(myData, file="data.RData") or save.image("data.RData")

pROC_bug.zip

  4. What error or output did you get?

Error in delongPlacements(roc) :
pROC: error in calculating DeLong's theta: got 0.65441176470588235947 instead of 0.63622994652406417160. Diagnostic data saved in pROC_bug.RData. Please report this bug to https://github.com/xrobin/pROC/issues.


Significance of a single ROC curve

It should be possible to calculate the significance of a single ROC curve.

This would test H_0: AUC = 0.5.

For a full AUC this should correspond to the Wilcoxon Test. For partial AUC we need to use bootstrapping. Something like this:

roc.test(roc(aSAH$outcome, aSAH$ndka))
roc.test(roc(aSAH$outcome, aSAH$ndka, partial.auc = c(1, 0.9)))
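
As a cross-check of the full-AUC case, the claimed correspondence with the Wilcoxon test can be verified directly in base R on the bundled aSAH data:

# The test of H_0: AUC = 0.5 on the full AUC should agree with a
# two-sample Wilcoxon rank-sum test of the score between outcome groups
library(pROC)
data(aSAH)
wilcox.test(ndka ~ outcome, data = aSAH)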

pROC 1.12.0 cannot deal with large datasets any longer

Describe the bug
This is a regression due to the fix in #25

To Reproduce

> response <- rbinom(1E5, 1, .5)
> predictor <- rnorm(1E5)
> rocobj <- roc(response, predictor)
Error: cannot allocate vector of size 74.5 Gb
4: outer(thresholds, predictor, `==`) at roc.utils.R#119
3: roc.utils.thresholds(c(controls, cases), direction) at roc.R#316
2: roc.default(response, predictor) at roc.R#21
1: roc(response, predictor)

This is caused by the check for identical values:

if (any(o <- outer(thresholds, predictor, `==`))) {

There must be another way to test safely for exact equality between two vectors.
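
One possible direction, assuming the check only needs to know whether any threshold coincides exactly with a predictor value: %in% performs exact matching with hashing, keeping memory linear instead of allocating a length(thresholds) x length(predictor) matrix. A sketch:

# Equivalent to any(outer(thresholds, predictor, `==`)) for the any() test,
# but in O(n) memory; the full indicator matrix is never materialized
if (any(thresholds %in% predictor)) {
  # handle thresholds that tie with predictor values
}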

multiclass.roc is confusing

There seems to be a lot of confusion around this function, what it does and how to use it.

In particular, it seems people would like to pass a "multiclass predictor": a matrix containing the probability of each data point belonging to each class. See for instance this question on Stack Overflow.

Not sure anything can be saved here.

Add tests for bootstrap operations.

This should start with simple operations like var and later ci.coords. The following sub-steps will have to be taken:

  • Define a class of tests that can be controlled by an environment variable (the tests will be slow and may fail)
  • Establish the current expectation and standard deviation of each bootstrapped statistic
  • Make sure that the results are within N standard deviations of the margin (can be controlled with environment variable?); see the sketch after this list

Additional considerations:

  • The tests may fail and shouldn't be run upon normal testing / CRAN
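
A minimal sketch of what such a test could look like; the reference expectation and standard deviation below are hypothetical placeholders, to be established as described above:

library(testthat)
library(pROC)
data(aSAH)

test_that("bootstrapped variance of the AUC is within tolerance", {
  # Only run when explicitly enabled, as discussed above
  skip_if_not(Sys.getenv("RUN_SLOW_TESTS") == "true")
  r <- roc(aSAH$outcome, aSAH$s100b)
  v <- var(r, method = "bootstrap", boot.n = 1000)
  expected <- 0.003   # hypothetical pre-established expectation
  ref.sd <- 2e-4      # hypothetical pre-established standard deviation
  n.sd <- 3           # allowed deviation, in standard deviations
  expect_lt(abs(v - expected), n.sd * ref.sd)
})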

Error in delongPlacements while calculating DeLong's theta

I was running a piece of code and it threw this error message:

Error in delongPlacements(roc) :
A problem occured while calculating DeLong's theta: got 0.50057161522129678399 instead of 0.50032663726931247972. This is a bug in pROC, please report it to the maintainer.

Does anyone know what I should do?
Thanks

smooth.method="density" doesn't work in 'roc'

roc(aSAH$outcome, aSAH$ndka, smooth=TRUE, smooth.method="density")
Error in match.fun(paste("bw", bw, sep = "."))(roc$predictor) : 
  need at least 2 data points

This is because roc$predictor is not set at the time smooth.roc is called.

Make sure to re-enable the tests by removing the skip_if call in test-roc.R once this is fixed.

NAMESPACE / S3

Hi,

we use multiclass.roc in mlr here:

https://github.com/berndbischl/mlr

It seems that auc.roc etc. are S3 methods in your package, but you do not mark them as such in your NAMESPACE, which is probably incorrect.

This now triggers a bug in mlr: we requireNamespace("pROC") and then call multiclass.roc, which fails to find auc.roc even though that function lives in the same package.

Could this please be fixed?
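
For reference, if auc.roc and friends are indeed intended as S3 methods, registering them would amount to NAMESPACE entries along these lines (a sketch; the exact list of methods would need checking against the package):

# In NAMESPACE: register S3 methods so dispatch finds them even when
# pROC is only loaded via requireNamespace(), not attached with library()
S3method(auc, roc)
S3method(auc, smooth.roc)
export(auc)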

'drop' in 'coords' doesn't drop over 'ret' direction

coords(r.s100b, c(0.51, 0.2), input = "threshold", ret = "specificity", drop = TRUE)
coords(r.s100b, "local maximas", input = "threshold", ret = "specificity", drop = TRUE)

Both return a matrix with 1 row. Note: this is tested in test-coords.R, but the test is skipped as it fails.

The documentation only mentions dropping over length(x), and it doesn't state that there is no dropping when length(ret) == 1. The doc should either be updated to mention that there is no dropping over ret, or updated to state that it drops over ret and the code changed accordingly.

This, however, is an API change and comes too close to the 1.14 release.

coords is too slow with many thresholds

response <- rbinom(1E5, 1, .5)
predictor <- rnorm(1E5)
r <- roc(response, predictor)
system.time(coords(r, "a"))
   user  system elapsed 
 47.791   0.088  47.867 

I would expect it to complete more or less instantly.

Plotting outside of the plot area

Something goes wrong when setting par(mar=...), calling plot.roc, then axis, then plot.roc again with add=TRUE. It is visible only when xlim/ylim are set (or maybe also with very large margins?).

Compare:

roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c(4, 4.5, 1, 1))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56, 0.86), xaxt="n")
axis(side=1)
plot(roc2, add=TRUE)

With:

roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c(4, 4.5, 1, 1))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56, 0.86), xaxt="n")
plot(roc2, add=TRUE)

or:

roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c(4, 4.5, 1, 1))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56, 0.86), xaxt="n")
plot(roc2, add=TRUE)
axis(side=1)

power.roc.test

Hi,

First, thank you for this wonderful package.

I am trying to use the function power.roc.test from the development version of the package. I would like to compute the sample size needed to compare a single AUC to a theoretical value.

Is it possible to specify the theoretical value of the AUC (for example, if the expected AUC is 0.9 and its theoretical value is 0.8)?

Best,

David

Infinite case value causes error

pROC generates an error whenever the list of case values contains Inf. I suspect this is related to issue #25. I am using the most recent GitHub version of pROC (as of May 11).

To Reproduce
Steps to reproduce the behavior:

  1. Started with a fresh R session:

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4 yaml_2.1.18

  2. Run the attached demo script (includes a minimal data set).

pROC-inf-bug-test.txt

  3. The error message is:

Error in delongPlacements(roc) :
pROC: error in calculating DeLong's theta: got 0.73333333333333328152 instead of 0.65000000000000002220. Diagnostic data saved in pROC_bug.RData. Please report this bug to https://github.com/xrobin/pROC/issues.

ggroc() does not show subtitle or caption label


To Reproduce

a <- 1:10
b <- rep(c(TRUE, FALSE), 5)
ggroc(roc(b ~ a)) + labs(title = "stairs", subtitle = "leading upstairs", caption = "from right to left leading downstairs")

Expected behavior
A graph of a step function displaying the contents of the subtitle and caption arguments somewhere.

ci.auc with method bootstrap that works in RStudio Cloud

Hi,

I am trying to use pROC in RStudio Cloud. The data I'm dealing with can only be accessed in a secure "datalab" environment designed by Statistics New Zealand, so using R on my personal computer is not possible, and I doubt that Stats NZ will be able to support a different implementation just for me.

Using the DeLong method to get a confidence interval works fine, but when I try something like:

ci.auc(roc_object, method = "bootstrap")

I get the error:

Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") : [tcl] invalid command name "toplevel".

This appears similar to this issue in the old RStudio community. It seems the tcltk package is the reason it doesn't work? Is that the case for the pROC package too?

Thanks for the great package; I would love any feedback on whether I'm mistaken or whether a workaround is possible!
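
If the Tk progress bar is indeed the culprit, one workaround worth trying (assuming the default progress bar is what pulls in tcltk here) is to disable it via the progress argument or the pROCProgress option:

# Disable the progress bar during bootstrapping so tcltk is never loaded
ci.auc(roc_object, method = "bootstrap", progress = "none")

# Or session-wide, for all pROC bootstrap operations
options(pROCProgress = list(name = "none"))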

Transpose coords

coords returns a matrix with thresholds in columns and the measurements in rows.

This has always been a bit odd, but it is becoming problematic with pipelines, where a data.frame in transposed form would be better suited.

Expected behavior

> library(dplyr)
> roc(aSAH, outcome, wfns) %>% coords()
Setting levels: control = Good, case = Poor
Setting direction: controls < cases
  threshold specificity sensitivity
       -Inf   0.0000000   1.0000000
        1.5   0.5138889   0.9512195
        2.5   0.7916667   0.6585366
        3.5   0.8333333   0.6341463
        4.5   0.9444444   0.4390244
        Inf   1.0000000   0.0000000

Question: Specify (hardcode) negative outcome in advance

I want to calculate the AUC for many subgroups, one at a time (in a foreach loop).
From my understanding, the direction can change each time depending on the relation of 0 vs. 1 in the outcome, so I would not notice if the AUC fell below 0.5.

Is it possible to specify the "negative outcome" or the "positive outcome" in advance?
I am currently using a workaround like

direction_i <- if(mean(df_i[["outcome"]]) < 0.5) {">"} else {"<"}

in every step, which is OK for my use case.

However, in general I would find your package much more appealing if this could be specified directly.
Am I missing something obvious?
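
For reference, roc() already exposes levels and direction arguments that pin both down explicitly; a sketch using the subgroup data frame df_i from above (the score column name is assumed):

# Fix the control/case coding and the direction so neither is auto-detected:
# levels = c(controls, cases); direction = "<" means controls < cases
r_i <- roc(df_i[["outcome"]], df_i[["score"]],
           levels = c(0, 1), direction = "<")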

Use doRNG and foreach for reproducible parallel bootstrapping

The plyr package is old, and newer, better options exist for parallel execution. The foreach package seems to be the way to go, with different backends available, plus the doRNG package for reproducible parallel calculations.

From the user's perspective, the interface would look like:

library(doParallel)
library(doRNG)
cl <- makeCluster(2) # 2 cores
registerDoParallel(cl)
registerDoRNG(1234)
ci(...)
stopCluster(cl)

Internally we would simply have:

resampled.values <- foreach(i=1:boot.n) %dopar% { stratified.bootstrap.test(...) }

instead of

resampled.values <- laply(1:boot.n, stratified.bootstrap.test, ...)

Things to consider:

  • Code should be able to run without any extra line of code from the user (but then not in parallel)
  • Progress bars?
  • What if some of the bootstrapping gets implemented in C++ in the future?

Fast Calculation for Area Under ROC curve

The area under the ROC curve can be calculated directly from a vector of predictions and a vector of binary labels using the Mann-Whitney U statistic. Since this algorithm does not require calculating the ROC curve, it can provide a significant performance increase. My benchmarks show that, on 10,000 observations, this algorithm is about 1,000 times faster than calculating the AUROC with your package (2,100 milliseconds vs. 2.3 milliseconds).
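
For reference, the rank-sum identity fits in a few lines of plain R (fast_auc is a hypothetical helper, not part of pROC):

# AUC from the Mann-Whitney U statistic: no ROC curve is ever built.
# Labels are assumed to be 0/1; average ranks handle ties.
fast_auc <- function(labels, scores) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  ranks <- rank(scores)  # mid-ranks for ties
  (sum(ranks[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}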

Would you be interested in adding a C++ implementation of this algorithm to your package? The speedup that this algorithm provides would be valuable for users who need to evaluate hundreds to thousands of models (e.g. with a grid search over a feature / hyper-parameter space).

If you are interested in this contribution to your package, please let me know.

DeLong AUC Confidence Interval

I think there may be an issue with the DeLong confidence interval for the AUC. When the sample size gets large, the CI collapses to either 0-0 or 1-1. Here is an example:

Create ROC Objects

predictor1 <- c(runif(12000,0,0.5), runif(14472-12000, 0.5,0.75))
response1 <- rbinom(14472, size=1, p=predictor1)
roc1 <- roc(response1, predictor1)

predictor2 <- c(runif(3 * 12000,0,0.5), runif(3 * (14472-12000), 0.5,0.75))
response2 <- rbinom(3 * 14472, size=1, p=predictor2)
roc2 <- roc(response2, predictor2)

predictor3 <- c(runif(10 * 12000,0,0.5), runif(10 * (14472-12000), 0.5,0.75))
response3 <- rbinom(10 * 14472, size=1, p=predictor3)
roc3 <- roc(response3, predictor3)

Calculate AUC and CI

auc(roc1)
Area under the curve: 0.7586
ci.auc(roc1)
95% CI: 0.7506-0.7667 (DeLong)

auc(roc2)
Area under the curve: 0.7584
ci.auc(roc2)
95% CI: 0.7537-0.7631 (DeLong)

auc(roc3)
Area under the curve: 0.7561
ci.auc(roc3)
95% CI: 1-1 (DeLong)

Implement CI for multiclass.roc

CI is broken for multiclass.roc:

data(aSAH)
multiclass.roc(aSAH$gos6, aSAH$s100b, ci=TRUE)
Error in roc.default(response, predictor, levels = X, percent = percent,  : 
  formal argument "ci" matched by multiple actual arguments

It is also not possible to calculate a CI on an existing object:

ci(multiclass.roc(aSAH$gos6, aSAH$s100b))
Error in roc.default(response, predictor, ...) : No valid data provided.

This should work easily for univariate multiclass.roc. The new mv.multiclass.roc might need a bit more work.

coordinates when smoothing

I think there is an issue when using coords with a smoothed curve. The format of the results differs between smoothed and unsmoothed curves, and I suspect that the threshold is being returned in place of the specificity when smoothing is used.

For example:

library(pROC)

data(aSAH)

roc_orig <- roc(aSAH$outcome, aSAH$s100b)
roc_smooth <- roc(aSAH$outcome, aSAH$s100b, smooth = TRUE)

## plots are not extremely different
plot(roc(aSAH$outcome, aSAH$s100b, smooth = TRUE))
plot(roc(aSAH$outcome, aSAH$s100b), add = TRUE, col = "red")

coord_orig <- t(coords(roc_orig, seq(0, 1, 0.01)))
coord_smooth <- t(coords(roc_smooth, seq(0, 1, 0.01)))
coord_smooth2 <- t(coords(smooth(roc_orig), seq(0, 1, 0.01)))

The results are very different:

> head(coord_orig)
     threshold specificity sensitivity
0         0.00  0.00000000   1.0000000
0.01      0.01  0.00000000   1.0000000
0.02      0.02  0.00000000   1.0000000
0.03      0.03  0.00000000   1.0000000
0.04      0.04  0.00000000   0.9756098
0.05      0.05  0.06944444   0.9756098
> head(coord_smooth)
     specificity sensitivity
0           0.00   1.0000000
0.01        0.01   0.9970265
0.02        0.02   0.9942254
0.03        0.03   0.9914151
0.04        0.04   0.9885741
0.05        0.05   0.9856905

Thanks,

Max

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pROC_1.8

loaded via a namespace (and not attached):
[1] plyr_1.8.4  tools_3.3.1 Rcpp_0.12.5

help with pROC installation on a Debian box (Rcpp related)

Hello,

I have a problem installing the pROC package on my Debian testing system:

install.packages("pROC")
Installing package into
‘/home/l/R/x86_64-pc-linux-gnu-library/3.0’
(as ‘lib’ is unspecified)
provo con l'URL
'http://cran.mirror.garr.it/mirrors/CRAN/src/contrib/pROC_1.7.1.ta
r.gz'
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
connesso a 'cran.mirror.garr.it' sulla porta 80.
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
-> GET /mirrors/CRAN/src/contrib/pROC_1.7.1.tar.gz HTTP/1.0
Host: cran.mirror.garr.it
User-Agent: R (3.0.3 x86_64-pc-linux-gnu x86_64 linux-gnu)

Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- HTTP/1.1 200 OK
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Server: nginx/1.4.7
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Date: Fri, 28 Mar 2014 16:54:04 GMT
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Content-Type: text/plain
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Content-Length: 91857
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Last-Modified: Fri, 21 Feb 2014 04:39:58 GMT
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Connection: close
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- ETag: "5306d89e-166d1"
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
<- Accept-Ranges: bytes
Warning in download.file(url, destfile, method, mode = "wb", ...)
:
Code 200, content-type 'text/plain'
Content type 'text/plain' length 91857 bytes (89 Kb)
URL aperto

downloaded 89 Kb

Loading required package: splines
Warning in library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
  there is no package called 'pROC'
Warning: package 'yapomif' in options("defaultPackages") was not found

tools:::.install_packages()

* installing *source* package 'pROC' ...
** package 'pROC' successfully unpacked and MD5 sums checked
Warning in writeLines(paste0(c(out[is_not_empty]), eor), file) :
  invalid character string in output conversion
** libs
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c RcppExports.cpp -o RcppExports.o
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c delong.cpp -o delong.o
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c perfsAll.cpp -o perfsAll.o
Loading required package: splines
Error in library.dynam(lib, package, package.lib) :
  shared object 'pROC.so' not found
Warning: package 'yapomif' in options("defaultPackages") was not found
g++ -shared -o pROC.so RcppExports.o delong.o perfsAll.o > Rcpp:::LdFlags() > > -L/usr/lib/R/lib -lR
Loading required package: splines
Error in library.dynam(lib, package, package.lib) :
  shared object 'pROC.so' not found
Warning: package 'yapomif' in options("defaultPackages") was not found
g++: error: >: No such file or directory
g++: error: Rcpp:::LdFlags(): No such file or directory
g++: error: >: No such file or directory
g++: error: >: No such file or directory
make: *** [pROC.so] Error 1
ERROR: compilation failed for package 'pROC'
* removing '/home/l/R/x86_64-pc-linux-gnu-library/3.0/pROC'
Warning in install.packages("pROC") :
  installation of package 'pROC' had non-zero exit status

It seems to be a compilation problem, but the installed Rcpp version is higher than the required one (0.10.5):

packageVersion("Rcpp")
[1] '0.11.0'

Some system information:

Sys.info()
       sysname                              "Linux"
       release                       "3.10-2-amd64"
       version "#1 SMP Debian 3.10.7-1 (2013-08-17)"
      nodename                           "np350v5c"
       machine                             "x86_64"
         login                                  "l"
          user                                  "l"
effective_user                                  "l"

R.version
platform        x86_64-pc-linux-gnu
arch            x86_64
os              linux-gnu
system          x86_64, linux-gnu
status
major           3
minor           0.3
year            2014
month           03
day             06
svn rev         65126
language        R
version.string  R version 3.0.3 (2014-03-06)
nickname        Warm Puppy

Any hints on how to solve the problem?

thank you,
Luca

Calculate AUC in pROC

I'm trying to calculate the AUC with the pROC package. I use the call:

auc(set_temp$def_woe,set_temp$total_pymnt_woe)

Unfortunately, for some variables I get an error:
Error in if (thresholds[tie.idx] == unique.candidates[tie.idx - 1]) { : argument is of length zero

ci.coords should fail more gracefully with ret = "threshold"

I believe the only supported use case for bootstrapping a threshold is with x = "best". For all other cases, pROC should produce a useful error message, not garbage like:

> ci.coords(roc1, x=0.8, input = "sensitivity", ret=c("specificity", "ppv", "tp", "thr"))
Error in apply(sapply(perfs, c), 1, quantile, probs = c(0 + (1 - conf.level)/2,  : 
  dim(X) must have a positive length
In addition: Warning message:
In ci.coords.roc(roc1, x = 0.8, input = "sensitivity", ret = c("specificity",  :
  NA value(s) produced during bootstrap were ignored.

or

> ci.coords(roc1, x=0.9, input = "sensitivity", ret="t")
95% CI (2000 stratified bootstrap replicates):
                           2.5% 50% 97.5%
sensitivity 0.9: threshold   NA  NA    NA
Warning message:
In ci.coords.roc(roc1, x = 0.9, input = "sensitivity", ret = "t") :
  NA value(s) produced during bootstrap were ignored.

In the longer term, work should continue on the interpolate branch, which will ultimately support this feature by interpolating thresholds.

Sample size calculation with wrong zalpha?

I tried to reproduce the sample size calculations in Table 4 of the Obuchowski paper (2004) for a single ROC curve. For a significance level of 0.05, an expected AUC of 0.7, a desired power of 0.9 and kappa = 1, the sample size calculation should result in 33 patients for each of the two groups.

However,
power.roc.test(auc=0.7, sig.level=0.05, power=0.9, kappa=1.0)
gives ncases = ncontrols = 40.21369 as a result.

Maybe the problem is that, inside the function, the z-value for the significance level is calculated as
zalpha <- qnorm(sig.level),
which gives the lower alpha percentile (-1.64 instead of 1.64), not the upper one. I think it should be
zalpha <- qnorm(sig.level, lower.tail = FALSE) or, of course,
zalpha <- qnorm(1 - sig.level)
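
The tail behavior is easy to confirm interactively:

qnorm(0.05)                      # -1.644854, lower tail
qnorm(0.05, lower.tail = FALSE)  #  1.644854, upper tail, as needed here
qnorm(1 - 0.05)                  #  1.644854, equivalent form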

Thank you very much for your work and for maintaining this great package!

Exporting AUC values from pROC

Hello, I've been using pROC for the last few days. It is very nice and works well for getting the AUC out. However, I can't seem to find a way to extract AUC values into txt or csv files. I was hoping to loop through several columns of an input data file and calculate the AUC for each variable.

Many thanks for your help
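
For reference, auc() returns an object that coerces cleanly with as.numeric(), so a loop plus write.csv() covers this use case; a sketch on the bundled aSAH data (the predictor columns are just examples):

library(pROC)
data(aSAH)

# Compute the AUC for several predictor columns and export them to CSV
predictors <- c("s100b", "ndka", "wfns")
aucs <- sapply(predictors, function(col) {
  as.numeric(auc(roc(aSAH$outcome, aSAH[[col]])))
})
write.csv(data.frame(variable = predictors, auc = aucs),
          "auc_values.csv", row.names = FALSE)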

Compare more than 2 ROC curves

I've probably missed this, but is there an option in pROC for comparing more than two ROC curves at the same time? I see that DeLong et al. 1988 is a reference, but it seems like pROC is missing this ability. If so, could this be considered a feature request? It would also be amazing to be able to test multiple (>2) pAUCs at the same time. Thanks for the great addition to R!

Namespacing the functions?

Currently I am unable to use pROC::ci in my package without attaching the whole package.

# somewhere in function definition
#' @importFrom pROC ci ci.auc ci.roc roc
pROC::ci(factor(c(0, 1, 0, 1)), c(0.1, 0.2, 0.3, 0.4), of = 'auc')
# WARNING: Error in UseMethod("ci") :
#  no applicable method for 'ci' applied to an object of class "factor"

library(pROC)
pROC::ci(factor(c(0, 1, 0, 1)), c(0.1, 0.2, 0.3, 0.4), of = 'auc')
#95% CI: 0.05705-1 (DeLong)

Probably an issue with namespacing within method dispatch.

Change default direction from auto to <

Users seem confused by the auto-detection of the direction of a ROC curve. See this discussion and others. This issue discusses whether to change the default to '<'.

Pros:

  • auto can bias AUCs towards higher values, which can be an issue when resampling etc. (illustrated after this list)
  • '<' is consistent with thinking of the score as "probability to be a case"

Cons:

  • Why would < be less confusing? What if score is "probability to be a control"?
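
The resampling concern in the first pro is easy to illustrate: with pure-noise scores the expected AUC is 0.5, but auto-detection (which compares group medians) tends to report the more favorable of the two directions:

library(pROC)
set.seed(42)
response <- rbinom(100, 1, 0.5)
predictor <- rnorm(100)
auc(roc(response, predictor))                   # direction chosen automatically
auc(roc(response, predictor, direction = "<"))  # fixed direction, can be < 0.5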

Handle changing xlim in plot.roc

The following piece of code in plot.roc handling legacy.axis fails to rescale with xlim:

        lab.at <- seq(1, 0, -0.2)
        if (x$percent) 
            lab.at <- lab.at * 100
        lab.labels <- lab.at
        if (legacy.axes) 
            lab.labels <- rev(lab.labels)

Implement `ret="all"` in coords

Follow-up of issue #40.

It might be useful to return every possible coordinate from coords. This could be done by adding a special ret value of "all" (verbatim).

Warning: this value cannot be abbreviated, as that would change the current behavior of ret="a", which is to return the accuracy. It cannot be mixed with any other value, and therefore only an exact match with a vector of length 1 should be allowed.

Optimize threshold determination with algorithm=2

Too much time is spent in roc.utils.R:60 in roc.utils.perfs.all.fast:

dups.sesp <- duplicated(matrix(c(se, sp), ncol=2), MARGIN=1)

There must be a better way to do it. Here is some benchmarking code:

n <- 1e6
dat <- data.frame(x = rnorm(n), y = sample(0:1, size = n, replace = TRUE))

library(profvis)
profvis({
  for (i in 1:10) {
    pROC::roc(dat$y, dat$x, algorithm = 2)
  }
})
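
One possible direction, assuming se and sp are finite doubles: pack each (se, sp) pair into a single atomic value so duplicated() can use its fast hashing path for vectors instead of the slower matrix method:

# Encode the pairs as complex numbers; duplicated() on an atomic vector
# avoids the MARGIN=1 matrix code path
dups.sesp <- duplicated(complex(real = se, imaginary = sp))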

Forthcoming release of ggplot2 and pROC

We are contacting you because you are the maintainer of pROC, which imports ggplot2 and uses vdiffr to manage visual test cases. The upcoming release of ggplot2 includes several improvements to plot rendering, including the ability to specify lineend and linejoin in geom_rect() and geom_tile(), and improved rendering of text. These improvements will result in subtle changes to your vdiffr doppelgangers when the new version is released.

Because vdiffr test cases do not run on CRAN by default, your CRAN checks will still pass. However, we suggest updating your visual test cases with the new version of ggplot2 as soon as possible to avoid confusion. You can install the development version of ggplot2 using remotes::install_github("tidyverse/ggplot2").

If you have any questions, let me know!

"DeLong's test should not be applied to ROC curves with a different direction"

Hi!

I am computing ROC curves to compare a new score to previously developed scores. Visually, when looking at the ROC curves, the new score seems to outperform all the old scores. One of the old scores performs quite badly and resembles the letter S (AUC = 0.52; half of the curve is under the line of identity and the other half is on top of it). I receive a warning when trying to analyze it with DeLong's test:

roc.test(score_new, score_old_3, method = "delong")

"Warning message:
In roc.test.roc(score_new, score_old_3, method = "delong") :
DeLong's test should not be applied to ROC curves with a different direction."

According to DeLong's test the new score is better than the other scores (p < 0.05), but against this badly performing score_old_3 the p-value is 0.08. The problem remains with the bootstrap and venkatraman methods. I do not trust the results. How would you recommend analyzing this?

Thanks for the help,
Oscar

ggroc x-axis

Is there an easy way to have ggroc plot the false positive rate (1 - specificity) on the x-axis?

By default, ggroc plots specificity on a reversed x-axis over [1, 0], instead of the perhaps more familiar 1 - specificity over [0, 1].
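
ggroc() accepts the same legacy.axes argument as plot.roc() (it already appears in the "ggroc with several aesthetics" issue above), which should do exactly this:

library(pROC)
data(aSAH)
r <- roc(aSAH$outcome, aSAH$s100b)
ggroc(r, legacy.axes = TRUE)  # x axis becomes 1 - specificity over [0, 1]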

Warning "no non-missing arguments" and nothing computed

I'm computing multiple ROC curves. Two out of 12 computations give me a warning message and won't compute the desired variables, without throwing an error.

Output of the roc function with rm.remove=TRUE for the two problematic calls:
"True Postive Rate: from Inf to -Inf
False Positive Rate: from Inf to -Inf
Area under Curve:
Maximum F1 Score: -Inf
Warning messages:
1: In min(x$tp) : no non-missing arguments to min; returning Inf
2: In max(x$tp) : no non-missing arguments to max; returning -Inf
3: In min(x$fp) : no non-missing arguments to min; returning Inf
4: In max(x$fp) : no non-missing arguments to max; returning -Inf
5: In max(x$F1) : no non-missing arguments to max; returning -Inf"

problematicRoc$auc gives "NULL".

Everything works as expected with the two computations when I explicitly state that I want auc and ci computed like "roc(....., auc=TRUE, ci=TRUE)".

I have no conflicting packages installed, and to my knowledge the data I'm running roc on in the two problematic instances is not that different from the other ten instances where it works as expected.

I'm not sure how to reproduce the error but I'm glad to provide more detail.

(Thanks for this great package by the way!)

Biased AUC estimate ?

Thank you very much for your very useful pROC package.
I've noticed a curious result I'd like to draw your attention to: when simulating samples with the same distribution of the classifier score for cases and controls, an AUC of 0.5 is expected on average. However, the auc function of pROC yields a slightly biased mean > 0.5, whereas both the ROC and fbroc packages yield an identical mean value closer to 0.5.
When comparing the individual AUCs estimated by the 3 packages, pROC sometimes gives the same AUC as the other two and sometimes differs; the other two packages always agree with each other.
The following code illustrates this:
rm(list = ls())
library(pROC)
library(ROC)
library(fbroc)

nsim <- 1000

result <- matrix(ncol = 3, nrow = nsim)
n.cases <- n.controls <- 150

for (i in 1:nsim) {
  # same distributions of scores for cases and controls
  response.cases <- rnorm(n.cases, 6, 50)
  response.controls <- rnorm(n.controls, 6, 50)

  pROC <- roc(controls = response.controls, cases = response.cases)
  ROC <- rocdemo.sca(truth = c(rep(1, n.cases), rep(0, n.controls)),
                     data = c(response.cases, response.controls))
  fbroc <- boot.roc(pred = c(response.cases, response.controls),
                    true.class = c(rep(TRUE, n.cases), rep(FALSE, n.controls)))

  result[i, ] <- c(auc(pROC), AUC(ROC), fbroc$auc)
}

# the mean is not 0.5 for pROC, whereas it is 0.5 for the other 2 packages
apply(result, 2, mean)

Many thanks if you can look into this issue.
Best regards,
Jacques
