Comments (5)
Yes, this is a bug on our end, but using a continuous outcome (even if it's really discrete) should be fine as a workaround.
from econml.
Interestingly, if I remove discrete_treatment = True and use a regressor for model_y (even though y is binary in my case), the code runs. I am not sure the result is valid, though it could be, since model_y still estimates the probability that y = 1, which is the intended behavior with discrete_treatment = True and a classifier as model_y. Please let me know whether the result would be valid in that case, or whether you see how to modify the code so that it works with discrete_treatment = True and a classifier as model_y.
from econml.
Did you mean remove discrete_outcome=True? DRTester is only designed for discrete treatments, so you should certainly not change the discrete_treatment argument. It does look like a bug that you can't use discrete_outcome=True, but I think switching to a regressor should be fine in most cases.
On an unrelated note, I would not use the DML class directly. If you want a non-parametric final model, you should either use NonParamDML if you want an arbitrary final model of your choosing (but this only supports a single treatment and outcome), or use CausalForestDML if you have an arbitrary number of treatments and outcomes and want confidence intervals (but the final model is limited to being a CausalForest).
from econml.
Yes, I meant removing discrete_outcome = True.
Do you get the same error when you run the code? Do we agree that it should work and that something unexpected is happening inside the library, outside my control, or should I dig further? Thanks!
from econml.
Hi, I wanted to flag that I am also running into an issue when discrete_outcome = True. My use case is tuning a CausalForestDML object. I can open a separate issue if appropriate, but it seems like the commonality is that discrete_outcome may not have been persisted everywhere it ought to have been. Thank you for maintaining a useful package!
Example below:
Imports and data generation
import pandas as pd
import numpy as np
from econml.dml import CausalForestDML
from xgboost import XGBClassifier, XGBRegressor
# Number of samples
n = 10000
# Treat half of the samples (integer division: np.repeat needs an integer count)
treatment = np.repeat([0, 1], n // 2)
# Create a covariate that defines heterogeneous treatment effect
covariate = np.resize([0, 1], n)
# Define outcome based on treatment and covariate
# TE is 1 when covariate==1, 0 otherwise
outcome = ((treatment==1) & (covariate==1)).astype(int)
# Store in data frame
df = pd.DataFrame({'treatment': treatment,
                   'covariate': covariate,
                   'outcome': outcome})
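As a quick sanity check on the synthetic data, the stated ground truth (treatment effect of 1 when covariate==1, 0 otherwise) can be recovered with a plain difference in means, because treatment here is assigned independently of the covariate. This is only a sketch using NumPy; the helper name diff_in_means is made up for illustration and is not part of econml.

```python
import numpy as np

# Same synthetic data as in the comment above
n = 10000
treatment = np.repeat([0, 1], n // 2)   # first half control, second half treated
covariate = np.resize([0, 1], n)        # alternates 0, 1, 0, 1, ...
outcome = ((treatment == 1) & (covariate == 1)).astype(int)

def diff_in_means(mask):
    """Treated-minus-control mean outcome among the rows selected by mask.

    Hypothetical helper for illustration only (not an econml function).
    """
    treated = outcome[mask & (treatment == 1)].mean()
    control = outcome[mask & (treatment == 0)].mean()
    return treated - control

effect_when_0 = diff_in_means(covariate == 0)
effect_when_1 = diff_in_means(covariate == 1)
print(effect_when_0, effect_when_1)
```

Any estimator fitted on this data, with or without tuning, should report conditional effects close to these two values.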
Demonstrate successful execution of fitting CausalForestDML when discrete_outcome = True, but skipping tuning
# Instantiate
cf_classifier = CausalForestDML(model_y = XGBClassifier(),
                                model_t = XGBClassifier(),
                                discrete_outcome = True,
                                discrete_treatment = True)
# Executes as expected
cf_classifier\
    .fit(Y = df['outcome'],
         T = df['treatment'],
         X = df[['covariate']])
Demonstrate issue with tuning CausalForestDML when discrete_outcome = True
cf_classifier\
    .tune(Y = df['outcome'],
          T = df['treatment'],
          X = df[['covariate']])\
    .fit(Y = df['outcome'],
         T = df['treatment'],
         X = df[['covariate']])
This raises "AttributeError: Cannot use a classifier as a first stage model when the target is continuous!" even though the target is a binary integer.
Demonstrate successful execution of tuning CausalForestDML when discrete_outcome = False
cf_regressor = CausalForestDML(model_y = XGBRegressor(),
                               model_t = XGBClassifier(),
                               discrete_outcome = False,
                               discrete_treatment = True)
cf_regressor\
    .tune(Y = df['outcome'],
          T = df['treatment'],
          X = df[['covariate']])\
    .fit(Y = df['outcome'],
         T = df['treatment'],
         X = df[['covariate']])
Similar to the suggestion above, using a regressor for model_y and passing discrete_outcome = False during CausalForestDML instantiation allows for successful tuning. As a temporary solution, using a tree-based model for the regressor should help keep predictions in the [0, 1] interval.
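To illustrate why a tree-based regressor tends to keep predictions for a binary outcome inside [0, 1], here is a rough NumPy-only sketch. It is not XGBoost or econml code: the "tree-like" model is a hand-rolled piecewise-constant fit whose leaves predict the mean of the 0/1 labels they contain, and the data-generating process is invented for the demonstration. A single tree or an averaging forest behaves this way; boosted ensembles such as XGBRegressor can still overshoot slightly, so clipping predictions is a reasonable extra safeguard.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: a binary outcome whose probability rises steeply with x
x = rng.uniform(-3, 3, size=2000)
p = 1 / (1 + np.exp(-3 * x))
y = (rng.uniform(size=x.size) < p).astype(float)

# Ordinary least squares on the binary outcome (a linear probability model)
design = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
linear_pred = design @ coef

# Piecewise-constant "tree-like" fit: each of 8 bins predicts its mean label
edges = np.linspace(-3, 3, 9)
bins = np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)
bin_means = np.array([y[bins == b].mean() for b in range(len(edges) - 1)])
tree_like_pred = bin_means[bins]

print("linear range:   ", linear_pred.min(), linear_pred.max())
print("tree-like range:", tree_like_pred.min(), tree_like_pred.max())
```

The linear fit extrapolates below 0 and above 1 at the edges of x, while every leaf mean is an average of zeros and ones and therefore stays within the unit interval.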
from econml.