Hi, I wonder how to use Cox-Time with time-dependent covariates?


time-dependent co-variates about pycox HOT 11 CLOSED

havakv commented on July 21, 2024

time-dependent co-variates

from pycox.

Comments (11)

havakv commented on July 21, 2024 3

The short answer is that the Cox-Time implementation does not explicitly support time-dependent covariates.
I do, however, believe this can be achieved by partly conditional modeling. The benefit of this approach is that it only requires preprocessing and no changes to the Cox-Time code.

In short, the idea of partly conditional modeling is that every time the covariates of an individual change, you create a new individual and consider the residual time. So for an individual with event time t and a new set of covariates x(s) at time s, you would consider this a new individual with event time t - s.
This means that if the covariates of this individual change k times, you will consider a set of k independent individuals. Your new data set will contain many "copies" of your individuals, and you can fit the Cox-Time model to this larger data set. Survival predictions should work as before.
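A minimal sketch of this preprocessing step, assuming the data is already in a long format with one row per covariate update. The column names (`id`, `s`, `x`, `duration`, `event`) and the helper `to_partly_conditional` are hypothetical and not part of pycox; only pandas is used.

```python
import pandas as pd

def to_partly_conditional(long_df, s_col="s", duration_col="duration"):
    """Turn one row per (individual, covariate update) into landmark rows.

    Each row keeps the covariates observed at time s and gets the residual
    time t - s as its new duration. The event indicator is left unchanged.
    """
    out = long_df.copy()
    out["residual_duration"] = out[duration_col] - out[s_col]
    # Drop updates recorded at or after the event/censoring time.
    out = out[out["residual_duration"] > 0]
    return out.drop(columns=[duration_col]).reset_index(drop=True)

# Toy data (made up): individual 1 has an event at t=10 with covariate
# updates at s=0, 4, 7; individual 2 is censored at t=6 with updates at s=0, 3.
long_df = pd.DataFrame({
    "id":       [1, 1, 1, 2, 2],
    "s":        [0.0, 4.0, 7.0, 0.0, 3.0],
    "x":        [0.2, 0.5, 0.9, 0.1, 0.3],
    "duration": [10.0, 10.0, 10.0, 6.0, 6.0],
    "event":    [1, 1, 1, 0, 0],
})
pc_df = to_partly_conditional(long_df)
print(pc_df)
```

In this sketch, `residual_duration` and `event` would play the role of the usual duration and event targets, and the remaining columns (except `id`) would be the covariates. Since the rows of one individual are copies of each other, it is probably safest to split into training and test sets by `id` rather than by row.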

I would recommend including s as a covariate.
Also, you are not restricted to only use the covariates at time s, and can instead create a set of covariates that are representative for the history up till time s.
The WTT-RNN is built on this principle, where the covariate history up till time s is processed by an RNN.
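As a rough illustration of covariates that summarize the history up till time s, the sketch below builds a few hand-crafted summaries with pandas. The column names and the particular summaries (running mean, last change, number of updates) are assumptions chosen for illustration; they are a simple stand-in for a learned representation, not the WTT-RNN approach itself.

```python
import pandas as pd

# Toy long-format data (made up): one row per covariate update.
long_df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "s":  [0.0, 4.0, 7.0, 0.0, 3.0],
    "x":  [0.2, 0.5, 0.9, 0.1, 0.3],
}).sort_values(["id", "s"])

grouped = long_df.groupby("id")["x"]
# Running mean of x over all updates up to and including time s.
long_df["x_mean_so_far"] = grouped.transform(lambda v: v.expanding().mean())
# Change since the previous update (0 for the first row of each individual).
long_df["x_delta"] = grouped.diff().fillna(0.0)
# Number of updates seen so far, including the current one.
long_df["n_updates"] = long_df.groupby("id").cumcount() + 1
print(long_df)
```

Every summary here only uses rows at or before s within the same individual, so no information from the future leaks into a row.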

I hope this can help you get started.

from pycox.

hgjlee commented on July 21, 2024 1

Thanks for your explanation in the previous comments. I noticed that time-varying covariates are conventionally formatted in a counting process form, with interval start and end columns in place of the event duration column (e.g. lifelines' Cox time-varying PH model).

I'm doing an experiment to compare the prediction accuracies from the standard cox time-varying PH model with data in a counting process form and CoxTime and DeepSurv with data formatted according to your suggestion. Would this be a fair comparison? Or are there any implications that I need to consider?

from pycox.

MohdSafwanAhmad commented on July 21, 2024

If the output also has multiple datapoints for each subject, then what would be the best one to consider while making the prediction?

from pycox.

havakv commented on July 21, 2024

When you say the output has multiple datapoints, what are you referring to? Are the multiple datapoints describing the survival function, such as for LogisticHazard and DeepHit? Or are the multiple datapoints containing some other information? If you are referring to the output of e.g. LogisticHazard, the same approach as described above should be fine. An example of this is the WTT-RNN, which has two outputs (alpha and beta) describing the survival function.

from pycox.

MohdSafwanAhmad commented on July 21, 2024

Sorry I meant the test data. Just like the training data, if there are multiple data points for each subject for different times (time-dependent covariates), we will get the survival probability for each data point separately. How do we read the survival probabilities in that case since one subject will have several S(t|x) vs time plots? The NASA turbofan dataset developed for RUL prediction can be taken as an example (assuming some binary values for the event column).

from pycox.

havakv commented on July 21, 2024

Ah, I think I understand. So, again specifying that I have not tried the approach, prediction on the test data should be straightforward. Evaluation of the predictions, on the other hand, might be harder.

As all time dependency is captured by the covariates up to a given time s, i.e., x(s), your predictions are conditioned on this time and you can treat it as the starting point (0) of your survival predictions. In other words, your survival predictions are S(t | x(s), t > s), so S(s | x(s), t > s) = 1. I realise that the notation here might be confusing though. It is probably better explained by the WTT-RNN blog or the WTT-RNN master's thesis.
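To make the change of time origin concrete, here is a small sketch that moves a conditional survival curve from the residual time axis u = t - s back to the original time axis. The helper `to_absolute_time` and the numbers are hypothetical; the curve is assumed to be a pandas Series indexed by residual time, such as one column of a survival DataFrame indexed by time.

```python
import numpy as np
import pandas as pd

def to_absolute_time(surv_residual, s):
    """Re-index a conditional survival curve from residual time u to t = s + u.

    surv_residual: pd.Series indexed by residual time u >= 0, holding
    S(s + u | x(s), t > s), so its value at u = 0 is 1.
    """
    surv = surv_residual.copy()
    surv.index = surv.index + s
    surv.index.name = "t"
    return surv

# Made-up predicted curve for a row with landmark time s = 4.
u = np.array([0.0, 1.0, 2.0, 3.0])
surv_residual = pd.Series([1.0, 0.9, 0.7, 0.4], index=u)
surv_abs = to_absolute_time(surv_residual, s=4.0)
print(surv_abs)
```

With several rows per subject, each row gives its own curve starting at 1 at its own s. One reasonable reading is that the curve from the latest available s is the current prediction for that subject, since it conditions on the most recent covariates.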

For evaluation of the predictions the problem is that the multiple survival predictions S(t | x(s), t > s) for different s are highly correlated, so considering them independent might be problematic. I don't know what the best approach for this evaluation would be.

from pycox.

havakv commented on July 21, 2024

@hgjlee I think it is very interesting that you are conducting these kinds of experiments, and I hope you will share your results with us in the future!

Off the top of my head, the only problematic part of comparing time-varying Cox PH with CoxTime and DeepSurv is already listed at the bottom of your lifelines link short-note-on-prediciton.
I.e., as time-varying Cox PH is not really intended for prediction, you would need to simulate the time-dependent covariate process to be able to do predictions without cheating (using covariates from the future).

from pycox.

hgjlee commented on July 21, 2024

@havakv Thank you for your reply. I'd love to share my results when they're ready. In that case, it'd make more sense to apply partly conditional modeling to all the models and then compare the results.

from pycox.

Niccolo-Ajroldi commented on July 21, 2024

I'm doing an experiment to compare the prediction accuracies from the standard cox time-varying PH model with data in a counting process form and CoxTime and DeepSurv with data formatted according to your suggestion. Would this be a fair comparison? Or are there any implications that I need to consider?

Hi Jacob, can I ask you how you finally managed the issue? Were you able to obtain meaningful results?
Thanks!

from pycox.

ymao418 commented on July 21, 2024

@hgjlee any updates on the results comparison? I am working on a problem with a covariate: transactions in the last 30 days. Obviously, this covariate will change over time, and I wonder whether not constructing it as a time-varying covariate will affect predictions much.

from pycox.

hxian commented on July 21, 2024

@havakv

I would recommend including s as a covariate.

When you said using s as a covariate, did you mean using it as a numeric covariate, thus needing to standardize it, or just putting it together with the other binary variables that don't need to be transformed?

from pycox.
