Currently, the proposal is to look at apathy status over time via a logistic regression (where 'over time' means that temporally varying measurement-level variables are included in the model, and the prediction is per session), and testing via LOOIC.
Two other approaches we could take are:
- Something similar to the analysis in Kyla's paper, where the aim is to assess whether there is evidence that different variables can predict whether a patient has developed apathy after a time span of ∆t years.
- An even more flexible version of the above, where the aim is simply to use e.g. a machine learning approach and just train a classifier to do that (so extending beyond GLMs).
For the former, the simpler model is (arguably) a bit more intuitive: predicting what happens to a patient given their current status and age / cognitive score / etc. However, it also feels like the models are closely related: for a given time since diagnosis t, our model would give p(apathetic | t, ...), and we could get something like Kyla's metric via e.g. p(developing apathy | ∆t, ...) = (p(apathetic | t + ∆t, ...) - p(apathetic | t, ...)) / p(!apathetic | t, ...) (this is an oversimplification: it assumes, for the sake of argument, a positive β on t, and that other metrics get worse over time, so that remission is negligible). In other words, the 'development' model can be thought of as a (normalised) slice through the full model at a specific timepoint. In that context, we would then talk about e.g. the interactions between subject-level variables and years since diagnosis as being the 'risk factors' for developing apathy.
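To make the relation concrete, here's a minimal numerical sketch. The coefficients are made-up placeholders (not fitted values), and x stands in for a single extra covariate:

```python
import numpy as np

# Hypothetical fitted logistic model; the coefficients below are
# made-up placeholders, not values estimated from our data.
def p_apathetic(t, x, beta_0=-2.0, beta_t=0.15, beta_x=0.5):
    """P(apathetic | t, x) under the full model, with one extra covariate x."""
    return 1.0 / (1.0 + np.exp(-(beta_0 + beta_t * t + beta_x * x)))

def p_develop(t, dt, x):
    """Kyla-style 'development' metric over [t, t + dt], conditional on
    not being apathetic at t (only valid if remission is negligible)."""
    p_now = p_apathetic(t, x)
    return (p_apathetic(t + dt, x) - p_now) / (1.0 - p_now)
```

With a positive coefficient on t, p_develop is increasing in ∆t, which is the monotonicity assumption mentioned above.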
- Does that interpretation of the full model make sense? Or is it better to think about these as fundamentally different approaches?
- Are we likely to be losing power by trying to fit a larger model? Fitting interaction terms is potentially a harder problem, though we do have more data available for the full formulation. Similarly, the development model makes a subtly different set of assumptions about e.g. patients that develop apathy but then go into remission.
- One obvious subtlety in the full model is in the way we evaluate p(apathetic | t + ∆t, ...) in the presence of other measurement-level variables: it is really something more like p(apathetic | t + ∆t, motor_scores(t + ∆t), ...). That then has a non-trivial dependence on changes in other metrics, in a way that means it's not such a pure predictor (we could presumably marginalise over future unseen motor scores etc., but that would introduce more complexity).
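For what it's worth, the marginalisation could be done by Monte Carlo. A sketch, where both the model coefficients and the Gaussian-random-walk assumption for future motor scores are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_apathetic(t, motor):
    """Placeholder full model with made-up coefficients (illustration only)."""
    return 1.0 / (1.0 + np.exp(-(-3.0 + 0.1 * t + 0.05 * motor)))

def p_apathetic_marginal(t, dt, motor_now, drift=1.0, sd=2.0, n=5000):
    """Marginalise over the unseen motor score at t + dt, assuming (purely
    for illustration) it follows a Gaussian random walk with linear drift."""
    motor_future = motor_now + drift * dt + sd * np.sqrt(dt) * rng.normal(size=n)
    return p_apathetic(t + dt, motor_future).mean()
```

In a Bayesian fit the same samples could come from the posterior predictive for the motor trajectory, rather than an assumed random walk.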
For the classification approach, we're basically trading off interpretability for flexibility (if we went for, say, a GP / kernel regression / kernel SVM / etc. approach). Are there any obvious disadvantages, or is it redundant to have a look at that approach? (This would be more of a potential side project, and wouldn't change the core analysis.)
Thanks!!
@cleheron @zenourn @m-macaskill