Hi! I was just wondering if it'd be possible to use the rsample package with <a href="

Combination of rsample with Amelia for missing values about rsample HOT 6 CLOSED

tidymodels commented on September 1, 2024

Combination of rsample with Amelia for missing values

from rsample.

Comments (6)

juliasilge commented on September 1, 2024

Thanks so much for your discussion! 🙌 I'm cleaning up older issues. Currently tidymodels handles imputation in the recipes package; check out recipe steps for imputation here.
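For readers who want to see what a recipe-based imputation step looks like, here is a minimal sketch. It assumes the recipes package is installed and uses the built-in airquality data (which has missing values in Ozone and Solar.R) purely as an illustration; the choice of median imputation is an assumption, not something recommended in this thread.

```r
library(recipes)

# Illustrative data only: airquality ships with base R and contains NAs
# in the Ozone and Solar.R columns.
rec <- recipe(Temp ~ ., data = airquality)

# Assumption: median imputation for every numeric predictor. Other
# step_impute_*() functions in recipes follow the same pattern.
rec <- step_impute_median(rec, all_numeric_predictors())

# Estimate the medians on the training data, then apply them to it.
prepped <- prep(rec, training = airquality)
imputed <- bake(prepped, new_data = NULL)

# Count the remaining missing values per column.
colSums(is.na(imputed))
```

Because the step lives inside the recipe, it can be re-estimated within each rsample resample rather than once on the full data set.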

topepo commented on September 1, 2024

That is an interesting prospect: can we have a tidy implementation of multiple imputation methods? I don't think that it belongs in rsample, mostly because I want to keep the scope of the package small and focused.

I think that it might be good to have a conversion function (or maybe a tidy method) that can take an imputation object and make it workable with purrr::map and other tidyverse components. Before I left Pfizer, we had a non-trivial data analysis workflow for a clinical trial that required more than a simple function call (say, to lm) to do the analysis, and we wrestled with how to do the MI with existing packages. A tidy approach would enable those types of analyses.

At first glance, it looks like Amelia (and mice and others) couple the imputation and analysis. While that gives you a simple API, it does make things difficult if you want to control or modify the process. Perhaps they are decoupled in worker functions in those packages. I don't know enough about them. Perhaps the package authors would be interested in tidy approaches.
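As a rough illustration of that idea (a sketch under stated assumptions, not an existing tidy interface): the object returned by Amelia::amelia() stores the completed data sets as a list in its imputations element, so the analysis step can be mapped over them with purrr::map and pooled afterwards with Amelia::mi.meld(). The africa example data, the model formula, and the variable names fits, coefs, ses, and pooled are illustrative choices only.

```r
library(Amelia)
library(purrr)

set.seed(123)

# Illustrative data: the africa panel data set shipped with Amelia.
data(africa, package = "Amelia")

# Step 1: imputation only. p2s = 0 silences the console output.
a.out <- amelia(x = africa, m = 5, cs = "country", ts = "year", p2s = 0)

# Step 2: analysis, decoupled from the imputation. Each element of
# a.out$imputations is one completed data frame.
fits <- map(a.out$imputations, function(d) lm(civlib ~ trade + gdp_pc, data = d))

# Step 3: pool the estimates. mi.meld() expects one row per imputation.
coefs <- t(sapply(fits, coef))
ses   <- t(sapply(fits, function(f) sqrt(diag(vcov(f)))))
pooled <- mi.meld(q = coefs, se = ses)

pooled$q.mi   # combined coefficient estimates
pooled$se.mi  # combined standard errors
```

Any function of a data frame could stand in for the lm call in step 2, which is where a more involved analysis workflow would plug in.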

I have some technical thoughts I could offer based on what I've learned in rsample, though.

(I must confess that I haven't done any multiple imputation (for inferential analysis) since graduate school; I'm usually worried about prediction, so a single imputation is usually how that's done.)

Now that I've written this, I realize that I'm rambling. What do you think?

jroberayalas commented on September 1, 2024

Thank you very much for your reply. I find your ideas quite interesting. Currently, I'm comparing different indicators of cumulative blood pressure (BP) exposure, based on historical BP measures, to assess whether they can improve the performance of CVD predictive (Cox) models relative to commonly used models. So far, I'm mostly following your examples using the recipes and rsample packages for survival analysis, since this seems a nice way to assess the importance of the cumulative BP indicators. However, the dataset I was using has some lipid variables (cholesterol, HDL, LDL, ...) with a high level of missingness (around 70%), which is why I was asking about the possibility of combining Amelia with rsample, as the two seem to share a lot of features. In the end, I opted to simply omit the lipid variables, mainly because 70% missingness is too much and I do not think the models can benefit from them at all. Your examples with recipes and survival analysis are more appropriate for what I'm working on.

I do agree that a tidy approach to MI packages may be quite useful, since a lot of health research (at least here in Oxford) seems to rely on MI to account for the uncertainty due to missing values.

zq2323 commented on September 1, 2024

Thanks a lot for your discussion! I'm very interested in the "examples with recipes and survival analysis" you mentioned in this reply, but I can't find any link to them. Would you mind sharing the example? I understand that it may no longer be available after such a long time.

jroberayalas commented on September 1, 2024

The example I'm talking about can be found here: https://rsample.tidymodels.org/articles/Applications/Survival_Analysis.html

github-actions commented on September 1, 2024

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.
