koalaverse / sure Goto Github PK
View Code? Open in Web Editor NEWSurrogate residuals for cumulative link and general regression models in R
Home Page: https://koalaverse.github.io/sure/index.html
Surrogate residuals for cumulative link and general regression models in R
Home Page: https://koalaverse.github.io/sure/index.html
Will need to use Rcpp
or vectorize the truncated distribution functions; the latter might be the simpler approach.
This will avoid R CMD CHECK warnings about undefined globals.
Jittering on the response scale (jitter.scale = "response"
)seems to work fine, but seems problematic on the probability scale (jitter.scale = "probability"
).
Will need to add extensive tests: testthat
for typical unit tests, and longer tests stored in a slowtests
directory that checks the package against the examples from the JASA paper.
Now using:
#' @keywords internal
getMeanResponse.polr <- function(object) {
# qfun <- getQuantileFunction(object)
# fv <- object$fitted.values[, 1L, drop = TRUE]
# fv[fv == 0] <- 1e-05
# -qfun(fv)
object$lp - object$zeta[1L]
}
Something like
autoplot(fit, what = c("qq", "fitted"), ncol = 2)
The easiest way seems to be storing the plots in a list and then passing them to the grobs
argument in grid.arrange
.
plist <- list(p1, p2)
grid.arrange(grobs = plist, ncol = ncol)
Paper was accepted, so need to make sure there are no breaking changes!
Would it be possible to add support for ordinal models fit with the bayesian BRMS package:
# List required packages
Pkgs <- c("tidyverse", "ordinal", "brms", "rstan")
# Load packages ====
lapply(Pkgs, require, c = T)
# Load data
data <- ordinal::wine
# Model formula
fmla <- bf(Rating ~ temp + contact +temp:contact + 1|judge) )
# Compile and fit model
mod <- brm(fmla,
data = data,
family = cumulative("probit"),
inits = 0,
iter = 2000,
warmup = 1000,
chains = 2,
cores = 2,
sample_prior = TRUE,
save_all_pars = TRUE)
)
I found that an aliased parameter will throw an error while residual/surrogate calculation. I tried:
library(ordinal)
data(wine)
fmm1 <- clm(rating ~ 1 + judge + temp, data = wine)
fmm2 <- clm(rating ~ 1 + judge + bottle, data = wine)
fmm3 <- clm(rating ~ 1 + judge + temp + bottle, data = wine)
fmm4 <- clm(rating ~ 1 + judge + temp + contact, data = wine)
sure::resids(fmm3)
sure::surrogate(fmm3)
Only fmm3 produces the error:
Error in X[, -1L, drop = FALSE] %*% object$beta : non-conformable arguments
I could bypass the problem as following: I changed the code in generate_residuals from
drop(X[, -1L, drop = FALSE] %*% object$beta - object$alpha[1L])
to
drop(X[, -1L, drop = FALSE] %*% object$beta[!is.na(object$beta)] - object$alpha[1L])
This way I get residual values. I don't know if this is legit or if it's better to throw an error saying 'there was one unidentified parameter (aliased) so residuals are not valid'. Can we use the residuals even if we drop this value?
As far as I understood your code you intended to just drop the value in
if(sum(object$aliased$beta) > 0) { X <- X[, !c(FALSE, object$aliased$beta), drop = FALSE] }
Sincerely Simon
Will also need to eventually update the link and reference to Dungang's JASA paper.
Switch to ggplot2
for "gof"
objects for consistency.
heart <- data.frame(
ck = 0:11*40 + 20,
ha = c(2, 13, 30, 30, 21, 19, 18, 13, 19, 15, 7, 8),
ok = c(88, 26, 8, 5, 0, 1, 1, 1, 1, 0, 0, 0)
)
mod.0 <- glm(cbind(ha,ok) ~ ck, family = binomial(link = "logit"),
data = heart)
sure::resids(mod.0)
The autoplot
function doesn't work for VGAM objects (i.e., vglm and vgam fits).
@bradleyboehmke are you familiar with Travis CI? I can write the yaml
file to include, but you'll need to login with the AFIT-R GitHub account and flip the switch for the packages you want supported (ideally all of them). This will make sure that the packages are re-built and checked every time we push changes and make sure nothing is broken. We should also do the same with codecov.
autoplot
is not supporting object types. Need to change auto.default
to individual object methods (i.e. autoplot.polr
) or find another, more efficient approach.
Bring in @Andy-McCarthy's plotting functions. Following base R and ggplot2 convention, these should look similar to the following:
library(MASS)
fit <- polr(y ~ x, data = d, method = "logistic")
plot(fit, which = 1:2) # base plotting method
autoplot(fit, which = "qq") # ggplot2 method
Dear prof. Greenwell,
I am trying to use “sure” package to make diagnostics on a logit ordinal model generated with “svyolr” function (“survey” R package”), in particular i am trying to use “resids” function, but i get the error “Error in eval(predvars, data, env) : object "MR" not found”, where “MR” is the response variable (ordered factor) of the svyolr model.
I tryed to assign the class “polr” to the svyolr model (that is based on polr function) but the error persists.
I apologize if the question is trivial but I can’t get rid of the error.
Yours sincerely,
Please find below an example.
Fabio Maria Colombo
library(sure)
library(polr)
library(survey)
#Simulated dataset
RM=ordered(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4))
data=data.frame(age=rnorm(mean=7,sd=2,49),
weight=rnorm(mean=10,sd=3,49),
nose.length=rnorm(49,mean=3,sd=1.5),
head.length=rnorm(49,mean=8,sd=2),
w=abs(1+trunc(rnorm(mean=2, sd=2,49))))
#survey design
svydesign<-svydesign(id=~1,data=data,weights=~w)
#svyolr model
model=svyolr(RM~nose.length+head.length,
design=svydesign
)
#resids(sure) function
resids(model)
class(model)='polr' #assigns the class "polr" to model
resids(model)
#Error in eval(predvars, data, env) : object "nose.length" not found
####################################################################
Add new data sets that follow examples 5-6 in Dungang and Zhang (2017).
Seems that the surrogate response values can be used to develop a coefficient of determinations (i.e., R-squared) type statistic!
Package name is taken on CRAN.
Jittering (at least on the response scale) does show some promise for multinomial fits:
library(nnet)
data(df1, package = "sure")
fit <- multinom(y ~ x, data = df1)
# fit <- multinom(y ~ x + I(x ^ 2), data = df1)
y <- as.integer(df1$y)
s <- runif(length(y), min = y, max = y + 1)
prob <- fit$fitted.values
j <- seq_len(ncol(prob)) # j = 1, 2, ..., J
jmat <- matrix(rep(j, times = nrow(prob)), ncol = ncol(prob),
byrow = TRUE) # 1, 2, ..., J; 1, 2, ..., J; ...
es <- rowSums((jmat + 0.5) * prob)
r <- s - es
scatter.smooth(df1$x, r, lpars = list(lwd = 2, col = "red"))
Will need to learn all the properties of these residuals. The residual-by-covariate plots don't look right for the quadratic example.
Something like
getFittedProbs <- function(object) {
UseMethod("getFittedProbs")
}
rms
has the lrm
function for fitting proportional odds ordinal logistic regression models.
Should make use of the naresid
method. Something like:
if(!is.null(object$na.action)) {
res <- naresid(object$na.action, res)
}
Incorporate bulk of R Journal paper as a vignette
Idea is to make it easier for users to construct their own fitted-vs-residual plots. It's capitalizes to it does not interfere with R's built-in generic and the specific packages that have methods defined.
For example:
Fitted.polr <- function(object) {
object$lp - object$zeta[1L] # Xb - a1
}
Should be simple once we write extractors for the fitted probabilities:
$$
R = S - E(S|X) = S - \sum_{i = 1}^J\left(j + 0.5\right)p\left(Y = j | X\right)
$$
For example, plot(fit.polr, which = c(1, 3))
Add support for logistic, probit, loglog, cloglog, and cauchit link functions. Will need to import gumbel distribution functions from MASS
, ordinal
or VGAM
.
Write methods (i.e., "clm"
, "polr"
, and "vglm"
) for broom
s generic augment
function to add residuals to original data.
For example, getSurrogate
should be called within getResiduals
, rather than both duplicating the same code.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.