darya-chyzhyk / confound_prediction
Confound-isolating cross-validation approach to control for a confounding effect in a predictive model.
License: BSD 3-Clause "New" or "Revised" License
Make confound_isolating_cv compatible with scikit-learn cross-validation
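One possible direction, sketched here only as an illustration: scikit-learn's `cv` parameter is documented to accept either a splitter object with `split()` and `get_n_splits()` methods or an iterable of (train, test) index pairs. The class name `PrecomputedSplits` and the index values below are hypothetical, and nothing here reflects the actual signature or output of `confound_isolating_cv`; the sketch assumes the sampled indices have already been computed by some other means.

```python
class PrecomputedSplits:
    """Minimal splitter exposing the two methods scikit-learn expects
    from a cross-validation object: split() and get_n_splits()."""

    def __init__(self, splits):
        # `splits` is a list of (train_indices, test_indices) pairs,
        # assumed to come from a confound-isolating sampling step.
        self.splits = [(list(train), list(test)) for train, test in splits]

    def split(self, X=None, y=None, groups=None):
        # Yield each precomputed (train, test) pair in turn.
        for train, test in self.splits:
            yield train, test

    def get_n_splits(self, X=None, y=None, groups=None):
        return len(self.splits)


# Hypothetical index sets standing in for the sampler's output.
cv = PrecomputedSplits([([0, 1, 2, 3], [4, 5]), ([2, 3, 4, 5], [0, 1])])
folds = list(cv.split())
```

An object like this could then be passed as `cv=cv` to helpers such as `cross_val_score`, assuming the indices refer to rows of the same `X` and `y`.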
Hi Darya Chyzhyk,
Is it possible to use your package to control for categorical variables?
(I have 10-20 categories)...
If so, any tips on how to do so?
Thanks!
print doc at the beginning of the example
I would show the samples in a scatter plot to make it clear that you are sampling points from a dataset.
Fewer iterations and a smaller dataset to reduce computation time...
"Confound_prediction is a Python module to control confound effect in the prediction or classification model." -> "Confound_prediction is a Python module to control confound effect in prediction or classification models."
"Developed framework is based" -> "The developed framework is based"
"You provide us" -> "You provide"
"We return you" -> "The function returns"
Hello,
Thank you very much for tackling this issue of confounders, which seems very recurrent in clinical ML problems.
I have some questions about the project/paper:
and
a deconfounded test set (with no data leakage, of course)?
k multiple confounders is now:
The quantity
can still be estimated with kernel density estimation.
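The quantity itself is missing from the text above, so the following is only a generic sketch of kernel density estimation in more than one dimension, not the estimator used in the package or the paper. It assumes a product Gaussian kernel with a single shared bandwidth; the function name `gaussian_kde_density` and the data values are made up for illustration.

```python
import math


def gaussian_kde_density(point, samples, bandwidth):
    """Estimate the density at `point` from d-dimensional `samples`
    using a product Gaussian kernel with one shared bandwidth."""
    d = len(point)
    # Normalising constant of a d-dimensional isotropic Gaussian kernel.
    norm = (math.sqrt(2.0 * math.pi) * bandwidth) ** d
    total = 0.0
    for s in samples:
        sq_dist = sum((p - v) ** 2 for p, v in zip(point, s))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
    return total / (len(samples) * norm)


# Each row is (y, z1, z2): a target and two confounders (made-up values).
data = [(0.0, 0.1, -0.2), (0.5, 0.4, 0.3), (1.0, 0.9, 1.1), (0.2, 0.0, 0.1)]
near = gaussian_kde_density((0.2, 0.1, 0.0), data, bandwidth=0.5)
far = gaussian_kde_density((5.0, 5.0, 5.0), data, bandwidth=0.5)
```

Density estimates of this kind generally become less reliable as the number of dimensions grows for a fixed sample size, so the number of confounders that can be handled is tied to how many samples are available.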
I made some quick toy examples; the approach seems to work approximately on simple additive examples when the number of samples is sufficient:
For instance, with 1000 samples and 10 confounding factors, I got:
For instance, with 100 samples and 3 confounding factors, I got:
It would also be interesting to study the sample size N required to guarantee, at a given confidence level, the deconfounding capability for k factors, depending on the type of link.
Do you think this is a correct approach and generalization?
Thank you
Best regards
I'd rather put the simulated_data in the library, e.g. in a '_utils.py' module; otherwise, people will not understand what this stands for.
I find it very important that examples are visual.