Comments (3)
Thanks for the feedback - I agree that option 4 seems like the best approach, and we'd be happy to take a PR fixing it. I am surprised that our existing tests in test_shap.py don't catch this (because they do cover CausalForestDML and they run against shap==0.43); please add a test that fails without your fix (or modify at least one of the existing tests to do so).
Also, we'd be happy to allow a higher upper bound (say shap<0.46) if there are no other breaking changes that are hard to work around; the only reason we have a specific upper bound is that shap releases have introduced breaking changes that broke functionality in the past.
@kbattocchi I'm not able to consistently reproduce this error. See #445 for another valiant but ultimately futile effort to reproduce it consistently.
For posterity, here is (essentially) the code I was running when I (occasionally) ran into the error:
Versions:
- Python version: 3.8.17
- econml version: 0.15.0
- numpy version: 1.23.5
- scikit-learn version: 1.3.2
- scipy version: 1.10.1
- shap version: 0.43.0
import numpy as np
import sklearn.ensemble as ens
from econml import dml
seed = 1337
rng = np.random.default_rng(seed)
n = 1_000
n_x = 2
X = rng.uniform(-1.0, 1.0, size=(n, n_x))
A = rng.binomial(1, 0.5 + 0.5 * X[:, 0])
Y = rng.binomial(1, 0.3 + 0.5 * A + 0.2 * X[:, 0])
X_ = np.hstack([X, A[:, None]])
A_ = rng.binomial(1, 0.5, size=(n,))
model = dml.CausalForestDML(
    model_y=ens.RandomForestClassifier(random_state=seed),
    model_t=ens.RandomForestClassifier(random_state=seed),
    discrete_treatment=True,
    discrete_outcome=True,
    random_state=seed,
)
model.fit(Y, A_, X=X_)
shap_values = model.shap_values(X_)
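Because the failure is intermittent, one way to look for a triggering configuration is to sweep over seeds and record which ones raise. The following is a minimal sketch rather than anything from EconML: `find_failing_seeds`, `run_once`, and `flaky_demo` are hypothetical names, and `run_once(seed)` is assumed to wrap the data generation, fit, and `shap_values` call above.

```python
from typing import Callable, Dict, Iterable


def find_failing_seeds(
    run_once: Callable[[int], object],
    seeds: Iterable[int],
) -> Dict[int, Exception]:
    """Call run_once(seed) for each seed and collect any exception raised."""
    failures: Dict[int, Exception] = {}
    for seed in seeds:
        try:
            run_once(seed)
        except Exception as exc:  # broad on purpose: record every failure
            failures[seed] = exc
    return failures


# Stand-in for the real workload (hypothetical): in practice this would
# build the data, fit CausalForestDML, and call shap_values for the seed.
def flaky_demo(seed: int) -> None:
    if seed % 3 == 0:
        raise ValueError(f"simulated failure for seed {seed}")


failures = find_failing_seeds(flaky_demo, range(1, 10))
print(sorted(failures))  # → [3, 6, 9]
```

Any seed found this way would still only be a reproduction on one machine and one set of library versions.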
Part of the problem with reproducing this error is that EconML doesn't currently allow you to pass a seed to the Explainer class. I'd be happy to add that keyword argument to the shap_values method; however, that would involve changing a good number of files (e.g., the LinearCateEstimator and anything that overrides that method). This change isn't strictly necessary for this issue (although I believe it's generally desirable), and, even if I do find a configuration that triggers the error on my machine, I doubt it'll be reproducible across machines.
Any alternative ideas? I can quickly submit a PR with just the change proposed in option 4, and we could revisit adding the seed keyword argument to shap_values (if that's desirable).
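To illustrate why adding the keyword touches several files, here is a rough sketch of the pattern, assuming an optional keyword-only `seed` that defaults to `None`. The classes below are hypothetical stand-ins, not EconML code, and the random sampling only stands in for whatever the explainer does with the seed.

```python
import random
from typing import List, Optional, Sequence


class BaseEstimatorSketch:
    """Hypothetical stand-in for a base estimator; this is not EconML code."""

    def shap_values(
        self,
        X: Sequence[Sequence[float]],
        *,
        background_samples: int = 3,
        seed: Optional[int] = None,
    ) -> List[int]:
        # seed=None keeps the current nondeterministic behaviour,
        # so existing callers are unaffected by the new keyword.
        rng = random.Random(seed)
        k = min(background_samples, len(X))
        # A real implementation would hand the sampled rows (and the seed)
        # to the explainer; here we just return the chosen row indices.
        return sorted(rng.sample(range(len(X)), k))


class OverridingEstimatorSketch(BaseEstimatorSketch):
    """Every subclass that overrides shap_values must forward the keyword."""

    def shap_values(self, X, *, background_samples: int = 3,
                    seed: Optional[int] = None):
        return super().shap_values(
            X, background_samples=background_samples, seed=seed
        )


X_demo = [[float(i), float(-i)] for i in range(20)]
est = OverridingEstimatorSketch()
first = est.shap_values(X_demo, seed=1337)
second = est.shap_values(X_demo, seed=1337)
print(first == second)  # same seed, same sampled rows
```

Because the default is `None`, the change would be backward compatible; the cost is that each override has to accept and pass along the new argument.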
I agree that allowing the seed to be passed as an optional argument seems beneficial, not just for testing but for reproducibility more broadly. If you don't mind adding these changes to your PR, that would be great, but if that's too much work I'm also happy to merge it as-is.