The alda from mccarthy-m-g

Michael McCarthy

Hi, I’m Michael, an R programmer and data scientist. My background is in psychology research, where I’ve studied topics ranging from attitude formation, to decision-making, to individual differences in brain function. Studying how people think and act has given me a deep appreciation for the human elements of data science. Much of our data is about humans. All of our data is communicated to humans. Understanding these relationships has been key to successfully navigating my role as a data scientist.

When I’m not pushing new commits to GitHub you can find me writing music in my studio, riding my bike on the local trails, or spending time with my friends and family.

Repository Overview

Community Contributions

Sites

Contact

Consulting

Simulate data for chapter 3

Consider simulating data for chapter 3, instead of just leaving it out entirely.

Code for specifying nonlinear mixed models in Chapter 6

Problem

I need help specifying the mixed logistic growth models for the alda::cognitive_growth experiment covered in Chapter 6, section 6.4, with nlme::nlme() and/or lme4::nlmer(). I'm not sure if I'm getting tripped up on the syntax, the math, or both.

Singer and Willet describe Model A and Model B as follows.

Model A

We adopt the following logistic level-1 function as the hypothesized individual change trajectory for the alda::cognitive_growth experiment:

$$Y_{ij} = 1 + \frac{19}{1 + \Pi_{0i} e^{-(\Pi_{1i} t_{ij})}} + \epsilon_{ij}, \tag{6.8}$$

where $Y_{ij}$ represents the number of moves child $i$ makes (nmoves) prior to a fatal error in game $j$ (game); $\Pi_{0i}$ and $\Pi_{1i}$ are individual growth parameters related to the "intercept" and "slope", respectively; and $\epsilon_{ij} \sim N(0, \sigma^2_\epsilon)$. This model invokes two constraints: (1) all children have identical lower and upper asymptotes; and (2) these asymptotes are set to specific values, 1 and 20, which are the minimum and maximum values of nmoves.

And the following linear level-2 model for variation in the individual growth parameters across children:

$$\begin{align} \Pi_{0i} = \gamma_{00} + \zeta_{0i} \\\ \Pi_{1i} = \gamma_{10} + \zeta_{1i} \end{align} \tag{6.9a}$$

where

$$\begin{bmatrix} \zeta_{0i} \\\ \zeta_{1i} \end{bmatrix} \sim N\begin{pmatrix} \begin{bmatrix} 0\\\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_0 & \sigma_{10} \\\ \sigma_{10} & \sigma^2_1 \end{bmatrix} \end{pmatrix}. \tag{6.9b}$$

This model stipulates that the level-1 logistic individual growth parameters, $\Pi_{0i}$ and $\Pi_{1i}$, differ across children around unknown population average values, $\gamma_{00}$ and $\gamma_{10}$, with unknown population residual variances $\zeta_{0i}$ and $\zeta_{1i}$, and covariance $\sigma_{10}$.

The table below presents the results of fitting this model to the alda::cognitive_growth data that are reported in the textbook (Table 6.6).

$\gamma_{00}$	$\gamma_{10}$	$\sigma^2_0$	$\sigma^2_1$	$\sigma^2_\epsilon$	$\sigma_{01}$
12.9551	0.1227	0.6761	0.0072	13.4005	−0.0586

In Figure 6.10 they present a prototypical trajectory for the average child using predictions from this model.

Model B

We next ask whether we can predict variation in the individual growth parameters. We illustrate this process by asking whether the level-1 individual growth parameters, $\Pi_{0i}$ and $\Pi_{1i}$, differ by the children’s reading_score on a standardized reading test.

Model B postulates the following level-2 submodel:

$$\begin{align} \Pi_{0i} &= \gamma_{00} + \gamma_{01}(\mathrm{READ}_i - \overline{\mathrm{READ}}) + \zeta_{0i} \\\ \Pi_{1i} &= \gamma_{10} + \gamma_{11}(\mathrm{READ}_i - \overline{\mathrm{READ}}) + \zeta_{1i}, \end{align} \tag{6.10a}$$

where we: (1) center reading_score ($\mathrm{READ}$) on the sample mean to preserve the comparability of level-2 parameters across models; and (2) invoke the same assumptions about the level-2 residuals articulated in equation 6.9b.

The table below presents the results of fitting this model to the alda::cognitive_growth data that are reported in the textbook (Table 6.6).

$\gamma_{00}$	$\gamma_{01}$	$\gamma_{10}$	$\gamma_{11}$	$\sigma^2_0$	$\sigma^2_1$	$\sigma^2_\epsilon$	$\sigma_{01}$
12.8840	-0.3745	0.1223	0.0405	0.5610	0.0060	13.4165	-0.0469

In Figure 6.10 they present fitted trajectories for two prototypical children with reading scores two standard deviations above and below the sample mean using predictions from this model.

Additional context

There is an example website for the book showing how to fit the example models in various programs. For the nonlinear mixed models above, the person who wrote the examples notes that:

This model does not correspond to equation (6.8) in the book. Instead, it corresponds to the following equation:
$$Y_{ij} = 1 + \frac{19}{1 + \Pi_{0} * exp(- (\Pi_{1} + u_1)t - u_0)} + \epsilon_{ij}$$

The models in the book were fit in SAS, and for the SAS code examples the estimates on the website line up with what's reported in the book (search "Table 6.6 on page 231" on the page to jump to the relevant section).

However, this model isn't explained further on the website, and it isn't obvious to me where it came from, since it isn't in the book's errata either. I don't believe this is equivalent to the textbook equations.

Based on the SAS code for Model A, the "u" terms are the unknown population residual variances $\zeta_{0i}$ and $\zeta_{1i}$; the "s" terms are the random effects variances $\sigma^2_0$, $\sigma^2_1$, $\sigma^2_\epsilon$; and the "c" term is the covariance $\sigma_{10}$:

proc nlmixed data=foxngeese1 maxiter=1000;
  title2 'Model A: Unconditional logistic growth trajectory';
  parms G00=12 G10=.1 s2e=20 s2u0=.3 cu10=0 s2u1=.01;
  num = 19;
  den = 1 + G00*(exp(-(G10*GAME +u0 +u1*GAME)));
  model NMOVES ~ normal(1+num/den,s2e);
  random u0 u1 ~ normal([0,0],[s2u0,cu10,s2u1]) subject=id;
run;

The main thing I don't understand is how the u_0 term, which I believe corresponds to $\sigma^2_0$, ended up inside the exp() function instead of being added to $\Pi_{0}$.

For Model B they provide the following SAS code:

data foxngeese1;
  set 'c:\alda\foxngeese_pp';
  read=read-1.95625;
  readgame=read*game;
run;

proc nlmixed data=foxngeese1 maxiter=1000;
  title2 'Model B: Effect of READing scores on 'intercept' and 'slope'';
  parms G00=12 G10=.12 G01=-.4 G11=.04 s2e=14 s2u0=.5 cu10=0 s2u1=.01;
  num = 19;
  den = 1 + G00*(exp(-(G01*READ + G10*GAME +G11*READGAME +u0 +u1*GAME)));
  model NMOVES ~ normal(1+num/den,s2e);
  random u0 u1 ~ normal([0,0],[s2u0,cu10,s2u1]) subject=id;
run;

This makes some sense if you apply substitution rules for $\Pi_{1}$ in the website equation (where the centred reading scores are $x$):

$$Y_{ij} = 1 + \frac{19}{1 + \gamma_{00} * e^{-((\gamma_{10} + \gamma_{11}x + u_1)t - \gamma_{01}x - u_0)}} + \epsilon_{ij},$$

but it still isn't clear to me why the u_0 term is where it is, or why the $\gamma_{01}x$ term is there in the same place now too. My intuition is that these should be added to $\gamma_{00}$ in the equation after applying substitution, so they shouldn't end up inside the exp() function.

Vignettes for chapter examples

Consider making vignettes (or pkgdown articles, depending on the size of the package) for the examples demonstrated in each chapter. Each chapter would get its own vignette.

mccarthy-m-g / alda Goto Github PK

alda's Introduction

Michael McCarthy

Repository Overview

Community Contributions

Sites

Contact

alda's People

Contributors

Watchers

alda's Issues

Simulate data for chapter 3

Code for specifying nonlinear mixed models in Chapter 6

Problem

Model A

Model B

Additional context

Vignettes for chapter examples

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent