Giter Club home page Giter Club logo

adaptsim's Introduction

AdaptSim

This is a prototype R package for running "adaptive simulations". The package defines the R6 Adapt class, which takes data simulation and model fitting functions as arguments. It then has two main functions, one for running a block of simulations, and another for running an adaptive step to estimate a model and sample new parameter values for the next block of simulations. Note, the documentation is not written yet for this package.

Example: Type I error with a linear mixed model

We provide an example of estimating the type I error for a linear mixed model for a cluster randomised trial, varying the number of clusters and the intraclass correlation coefficient. We aim to replicate some of the results in Leyrat et al (2018).

Data simulation and model fitting

First we define a function that simulates data sets, which takes as an argument a vector with the values of the iteration of the simulation. We use the glmmrBase package for data simulation of GLMMs:

data_sim <- function(x){
  df <- nelder(as.formula(paste0("~cl(",x[1],") > ind(10)")))
  df$int <- 0
  df[df$cl > x[1]/2,'int'] <- 1

  mod <- Model$new(
    covariance = list(
      formula = ~(1|gr(cl)),

      parameters = c(sqrt(x[2]))
    ),
    mean = list(
      formula = ~ int,
      parameters = rep(0,2) # change the parameters as required
    ),
    data = df,
    family = gaussian(),
    verbose = FALSE,
    var_par = sqrt(1-x[2])
  )
  data <- mod$sim_data(type = "data")
  return(data)
}

then we define a function that estimates the model of interest and retruns a named vector of the statistics of interest. We use lme4 for model fitting:

fit_model <- function(data=data){
  fit1 <- lme4::lmer(y~int+(1|cl),data=data)#tryCatch(suppressWarnings(lme4::lmer(y~int+(1|cl),data=data)),error=function(i)NA)

  if(is(fit1,"lmerMod")){
    s1 <- summary(fit1)
    tstat <- s1$coefficients[2,3]
    pval <- 2*(1-pnorm(abs(tstat)))

    return(c(conv = fit1@optinfo$conv$opt,
             beta = s1$coefficients[2,1],
             se = s1$coefficients[2,2],
             p = pval,
             p1 = I(pval < 0.05)*1,
             cover = I((s1$coefficients[2,1] + qnorm(0.975)*s1$coefficients[2,2]) > 0 &&
                         (s1$coefficients[2,1] - qnorm(0.975)*s1$coefficients[2,2]) < 0)*1))
  } else {
  # return NA if the model fails to fit
    return(c(conv = 1,
             beta = NA,
             se = NA,
             p = NA,
             p1 = NA,
             cover = NA))
  }
}

Create a new Adapt class object

Creation of a new Adapt class object is relatively simple. We provide the names of the two functions defined above, the upper and lower limits of the ranges of the varying parameters in the simulation (here, the number of clusters and the ICC), the inital number of simulations (we choose 10,000), and the number of basis functions for the adaptive step:

mod <- adapt$new(data_fn = "data_sim",
                 fit_fn = "fit_model",
                 par_lower = c(4,0),
                 par_upper = c(40,0.1),
                 par_discrete = c(TRUE,FALSE),
                 n = 10000,
                 m = c(5,10))

We also indicate that the first parameter (the number of clusters) is discrete, which forces the class to round parameter values to the nearest integer. We set the priors for the adaptive Bayesian model (although, the defaults usually work well):

mod$set_priors(lengthscale = matrix(c(0,0,0.5,0.5),ncol=2),
               fvar = c(0,0.5))

Run the simulations

We can run the simulations in parallel by providing a cluster to the simulation function. We must export the relevant libraries to the cluster as well when creating it. To create the cluster:

cl <- parallel::makeCluster(parallel::detectCores()-1)
parallel::clusterEvalQ(cl,library(glmmrBase))
parallel::clusterEvalQ(cl,library(lme4))

Then, to run the simualtions we simply call:

mod$sim(cl=cl)

The results are stored in the object and can be accessed if needed using mod$sim_output.

Run the adaptive step

We aim to estimate the type I error. So, our model is a binomial model where the outcome is whether the p-value was above or below the threshold 0.05 here, which is returned in the fit_model function as p1. We also need to indicate how many new simulation parameter values to draw, we choose 10,000 again. By default, the function will generate a lattice over the domain of the simulation parameters at which to predict the value of the statistic of interest. Then new values are drawn by sampling uniformly within these cells, with the number of draws proportional to the adaptive statistic. If we instead want to specify a lattice so that predictions are made at desirable values, then we can provide these values to the function. To replicate the results of Leyrat et al, we want to extract predictions at an ICC of 0.001 and 0.05 and for all values of the number of clusters. So we first define the data frame samp_df below, which is passed to the adaptive function in this and subsequent steps:

samp_df <- expand.grid(n = 4:40, icc = c(seq(0.001,0.05,length.out=8),seq(0.05,0.1,length.out=8)))
mod$adapt(stat="p1",
          n_new = 10000,
          kappa = 1,
          model="binomial",
          samp_n = samp_df,
          sampling = 150,
          L=3.2)

We can then repeat the procedure by calling mod$sim and mod$adapt until we reach our goal.

Extracting predictions

We can extract the predictions after each adaptive step as they are stored as mod$adapt_output. We have generated plots to compare against Leyrat et al using the following ggplot2 command

ggplot(data=mod$adapt_output[mod$adapt_output$Var2%in%c(0.001,0.05),],
       aes(x=Var1,y=mean))+
  geom_hline(yintercept=0.05,lty=1)+
  geom_hline(yintercept = max(0,0.05-mod2$kappa), lty=3)+
  geom_hline(yintercept = min(1,0.05+mod2$kappa), lty=3)+
  geom_ribbon(aes(ymin=lci,ymax=uci),fill="#268bd2",alpha=0.2)+
  geom_line(color="#dc322f")+
  facet_wrap(~factor(Var2))+
  ggthemes::theme_solarized_2()+
  labs(x="Number of clusters",y="Type I error")

adaptsim's People

Contributors

samuel-watson avatar samueliwatson avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.