clinicalml / cfrnet

Counterfactual Regression

License: MIT License

Python 99.74% Shell 0.26%

cfrnet's Issues

weighting treated / controlled for reweight_sample option.

Thank you for sharing the code. I am reading through it to understand how it works.
I am wondering whether there is a bug in the implementation of the "reweight_sample" option.

        if FLAGS.reweight_sample:
            w_t = t / (2 * p_t)
            w_c = (1 - t) / (2 * 1 - p_t)
            sample_weight = w_t + w_c
        else:
            sample_weight = 1.0

It looks like p_t is the proportion of treated instances, so if it is 0.5, w_t and w_c should give treated and control instances the same weight. If this is true, I think the following line (cfrnet/cfr/cfr_net.py, line 139 in 0377b0c):

w_c = (1-t)/(2*1-p_t)

should be w_c = (1 - t) / (2 * (1 - p_t))
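
The precedence problem described in this issue can be checked numerically. The sketch below is plain Python rather than the repository's TensorFlow code; the function names are made up for illustration, and p_t is assumed to be the fraction of treated units, as the issue supposes.

```python
def weights_as_written(t, p_t):
    """Sample weight using the expression quoted in the issue."""
    w_t = t / (2 * p_t)
    # Operator precedence: 2 * 1 - p_t is evaluated as (2 * 1) - p_t = 2 - p_t.
    w_c = (1 - t) / (2 * 1 - p_t)
    return w_t + w_c


def weights_as_proposed(t, p_t):
    """Sample weight using the correction proposed in the issue."""
    w_t = t / (2 * p_t)
    w_c = (1 - t) / (2 * (1 - p_t))
    return w_t + w_c


if __name__ == "__main__":
    # Columns: p_t, t, weight as written, weight as proposed.
    for p_t in (0.5, 0.2):
        for t in (1, 0):
            print(p_t, t, weights_as_written(t, p_t), weights_as_proposed(t, p_t))
```

With p_t = 0.5 the proposed form gives weight 1 to both groups, while the expression as written gives control units 1 / 1.5, which is the asymmetry the issue points at.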

Reported results

Hi,

I ran the simulation using the provided configuration and observed results that appear to differ from the results reported in your paper. Despite my efforts to investigate the cause of this discrepancy, I have been unable to determine why these results differ.

I have used the following configuration:

p_alpha=[0.3]
p_lambda=[1e-4]
n_in=[3]
n_out=[3]
nonlin='elu'
lrate=[1e-3]
batch_size=[100]
dim_in=[200]
dim_out=[100]
batch_norm=[0]
normalization=['divide']
imb_fun=['wass']
experiments=1000
reweight_sample=[1]
split_output=[1]
rep_weight_decay=[0]
dropout_in=1.0
dropout_out=1.0
rbf_sigma=0.1
lrate_decay=0.97
decay=0.3
optimizer='Adam'
wass_lambda=10.0
wass_iterations=10
wass_bpt=1
use_p_correction=0
iterations=3000
weight_init=[0.1]
outdir='results/example_ihdp'
datadir='data/'
dataform='ihdp_npci_1-1000.train.npz'
data_test='ihdp_npci_1-1000.test.npz'
pred_output_delay=200
loss='l2'
sparse=0
varsel=0
repetitions=1
val_part=0.3

and I got the following results:

Mode: Train
  | Pehe            | Bias_ate        | Rmse_fact       | Rmse_cfact      | Rmse_ite        | Objective       | Pehe_nn
0 | 0.785 +/- 0.052 | 0.219 +/- 0.031 | 1.135 +/- 0.045 | 1.188 +/- 0.020 | 1.164 +/- 0.021 | 0.994 +/- 0.036 | 4.895 +/- 0.613

Mode: Valid
  | Pehe            | Bias_ate        | Rmse_fact       | Rmse_cfact      | Rmse_ite        | Objective       | Pehe_nn
0 | 0.884 +/- 0.084 | 0.224 +/- 0.031 | 1.363 +/- 0.091 | 1.206 +/- 0.032 | 1.201 +/- 0.032 | 1.995 +/- 0.402 | 5.661 +/- 0.799

Mode: Test
  | Pehe            | Bias_ate        | Rmse_fact       | Rmse_cfact      | Rmse_ite        | Pehe_nn
0 | 0.833 +/- 0.056 | 0.236 +/- 0.032 | 1.287 +/- 0.065 | 1.182 +/- 0.024 | 1.183 +/- 0.022 | 5.456 +/- 0.682

Thank you for your attention to this matter, and I look forward to your response.
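
For readers comparing these columns against published numbers, here is a minimal sketch of how PEHE and ATE bias are commonly defined in the treatment-effect literature. It is an assumption that the Pehe and Bias_ate columns follow these definitions; the repository's evaluation code may compute them differently, and the function names here are illustrative.

```python
import math


def pehe(mu0, mu1, y0_pred, y1_pred):
    """Root mean squared error of the estimated individual effects."""
    sq_errors = [((m1 - m0) - (p1 - p0)) ** 2
                 for m0, m1, p0, p1 in zip(mu0, mu1, y0_pred, y1_pred)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def bias_ate(mu0, mu1, y0_pred, y1_pred):
    """Absolute error of the estimated average treatment effect."""
    n = len(mu0)
    true_ate = sum(m1 - m0 for m0, m1 in zip(mu0, mu1)) / n
    pred_ate = sum(p1 - p0 for p0, p1 in zip(y0_pred, y1_pred)) / n
    return abs(pred_ate - true_ate)
```

Note that per-unit errors can cancel in the ATE while still contributing to PEHE, so a low Bias_ate alongside a higher Pehe is not contradictory.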

There are not 747 observations in ihdp_npci_1-1000.train.npz

Hi, thanks very much for your code and paper. I find there are 672 observations (125 treated, 547 control) in ihdp_npci_1-1000.train/test.npz, but the paper reports 747 observations (139 treated, 608 control).
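
A quick arithmetic check on the counts quoted above. One possible reading, which this thread does not confirm, is that the file holds a subset of the full cohort, for example a training split. The sketch only computes ratios from the numbers in the issue; the variable names are illustrative.

```python
file_counts = {"treated": 125, "control": 547}   # counts reported for the .npz file
paper_counts = {"treated": 139, "control": 608}  # counts reported in the paper

n_file = sum(file_counts.values())
n_paper = sum(paper_counts.values())

# Fraction of the paper's cohort found in the file, overall and per arm.
ratios = {arm: file_counts[arm] / float(paper_counts[arm]) for arm in file_counts}
ratios["total"] = n_file / float(n_paper)

if __name__ == "__main__":
    print(n_file, n_paper)
    for name in sorted(ratios):
        print(name, round(ratios[name], 4))
```

All three ratios come out within 0.001 of 0.90, which would be consistent with roughly a 90% subset of each arm, though the numbers alone cannot establish why.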

usage of cfr_net_train.py

Hi, I'd like to use the cfrnet code to run a counterfactual analysis. Should I use cfr_net_train.py to train a model? If so, how should the script be invoked? I tried the default settings, but it failed with file-not-found errors. Thanks for your answer.

Jobs code

Can you share the jobs example?

'res' in cfr_net.py line 149 is not defined

'res' in cfr_net.py line 149 is not defined.

It should be
res = tf.abs(y - y_)

or just use 'risk' equivalently.

errors in the examples

Hi,
I am running Linux Mint in a virtual environment.

(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~/cfrnet $ cd ~
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ pip install --ignore-installed --upgrade $TF_BINARY_URL
Collecting tensorflow==0.10.0 from https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl
Using cached https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl
Collecting six>=1.10.0 (from tensorflow==0.10.0)
Using cached six-1.10.0-py2.py3-none-any.whl
Collecting numpy>=1.8.2 (from tensorflow==0.10.0)
Using cached numpy-1.12.1-cp27-cp27mu-manylinux1_x86_64.whl
Collecting mock>=2.0.0 (from tensorflow==0.10.0)
Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting wheel (from tensorflow==0.10.0)
Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB)
100% |████████████████████████████████| 71kB 1.9MB/s
Collecting protobuf==3.0.0b2 (from tensorflow==0.10.0)
Using cached protobuf-3.0.0b2-py2.py3-none-any.whl
Collecting funcsigs>=1; python_version < "3.3" (from mock>=2.0.0->tensorflow==0.10.0)
Downloading funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock>=2.0.0->tensorflow==0.10.0)
Using cached pbr-3.0.0-py2.py3-none-any.whl
Collecting setuptools (from protobuf==3.0.0b2->tensorflow==0.10.0)
Using cached setuptools-35.0.2-py2.py3-none-any.whl
Collecting packaging>=16.8 (from setuptools->protobuf==3.0.0b2->tensorflow==0.10.0)
Using cached packaging-16.8-py2.py3-none-any.whl
Collecting appdirs>=1.4.0 (from setuptools->protobuf==3.0.0b2->tensorflow==0.10.0)
Using cached appdirs-1.4.3-py2.py3-none-any.whl
Collecting pyparsing (from packaging>=16.8->setuptools->protobuf==3.0.0b2->tensorflow==0.10.0)
Using cached pyparsing-2.2.0-py2.py3-none-any.whl
Installing collected packages: six, numpy, funcsigs, pbr, mock, wheel, pyparsing, packaging, appdirs, setuptools, protobuf, tensorflow
Successfully installed appdirs-1.4.3 funcsigs-1.0.2 mock-2.0.0 numpy-1.12.1 packaging-16.8 pbr-3.0.0 protobuf-3.0.0b2 pyparsing-2.2.0 setuptools-35.0.2 six-1.10.0 tensorflow-0.10.0 wheel-0.29.0
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ git clone https://github.com/clinicalml/cfrnet.git
fatal: destination path 'cfrnet' already exists and is not an empty directory.
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ git clone https://github.com/clinicalml/cfrnet.git
Cloning into 'cfrnet'...
remote: Counting objects: 100, done.
remote: Total 100 (delta 0), reused 0 (delta 0), pack-reused 100
Receiving objects: 100% (100/100), 3.34 MiB | 596.00 KiB/s, done.
Resolving deltas: 100% (42/42), done.
Checking connectivity... done.
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ ./example_ihdp.sh
bash: ./example_ihdp.sh: No such file or directory
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~ $ cd cfrnet/
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~/cfrnet $ ./example_ihdp.sh

Run 1 of 20:

p_alpha: 0
Training with hyperparameters: alpha=0, lambda=0.0001
Training data: data/ihdp_npci_1-100.train.npz
Test data: data/ihdp_npci_1-100.test.npz
Loaded data with shape [672,25]
Defining graph...

Traceback (most recent call last):
File "cfr_net_train.py", line 428, in <module>
tf.app.run()
File "/home/andrewcz/miniconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "cfr_net_train.py", line 421, in main
run(outdir)
File "cfr_net_train.py", line 277, in run
CFR = cfr.cfr_net(x, t, y_, p, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/andrewcz/cfrnet/cfr/cfr_net.py", line 25, in __init__
self._build_graph(x, t, y_ , p_t, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/andrewcz/cfrnet/cfr/cfr_net.py", line 129, in _build_graph
h_rep_norm = h_rep / safe_sqrt(tf.reduce_sum(tf.square(h_rep), axis=1, keep_dims=True))
TypeError: reduce_sum() got an unexpected keyword argument 'axis'
Configuration used, skipping
Configuration used, skipping

Run 4 of 20:

p_alpha: 1
Training with hyperparameters: alpha=1, lambda=0.0001
Training data: data/ihdp_npci_1-100.train.npz
Test data: data/ihdp_npci_1-100.test.npz
Loaded data with shape [672,25]
Defining graph...

Traceback (most recent call last):
File "cfr_net_train.py", line 428, in <module>
tf.app.run()
File "/home/andrewcz/miniconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "cfr_net_train.py", line 421, in main
run(outdir)
File "cfr_net_train.py", line 277, in run
CFR = cfr.cfr_net(x, t, y_, p, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/andrewcz/cfrnet/cfr/cfr_net.py", line 25, in __init__
self._build_graph(x, t, y_ , p_t, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/andrewcz/cfrnet/cfr/cfr_net.py", line 129, in _build_graph
h_rep_norm = h_rep / safe_sqrt(tf.reduce_sum(tf.square(h_rep), axis=1, keep_dims=True))
TypeError: reduce_sum() got an unexpected keyword argument 'axis'
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping
Configuration used, skipping

Evaluating experiment results/example_ihdp...
Loading results from results/example_ihdp...
Found 0 experiment configurations.
Traceback (most recent call last):
File "evaluate.py", line 97, in <module>
evaluate(config_file, overwrite, filters=filters)
File "evaluate.py", line 59, in evaluate
binary=binary)
File "/home/andrewcz/cfrnet/cfr/evaluation.py", line 332, in evaluate
raise Exception('No finished results found.')
Exception: No finished results found.
(python27) andrewcz@andrewcz-PORTEGE-Z30t-B ~/cfrnet $

I'm not sure what's going on. All the errors occur when I try to run the shell example.
Many thanks,
Andrew
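
The TypeError in the log above says the installed reduce_sum does not accept an axis keyword. Early TensorFlow releases named this argument reduction_indices, so a likely cause is that the installed 0.10.0 build predates the axis spelling; installing the TensorFlow version the repository targets would be the more direct fix. As a sketch of a workaround, the hypothetical wrapper below tries axis first and falls back to the older name. The stand-in function exists only so the example runs without TensorFlow.

```python
def reduce_sum_compat(reduce_sum, values, axis, keep_dims=False):
    """Call reduce_sum with `axis`, falling back to `reduction_indices`."""
    try:
        return reduce_sum(values, axis=axis, keep_dims=keep_dims)
    except TypeError:
        # Older signature: same argument under its previous name.
        return reduce_sum(values, reduction_indices=axis, keep_dims=keep_dims)


def old_style_reduce_sum(rows, reduction_indices=None, keep_dims=False):
    """Stand-in for an old-API reduce_sum over a list of rows (axis 1 only)."""
    if reduction_indices != 1:
        raise ValueError("this stand-in only supports reduction over axis 1")
    sums = [sum(row) for row in rows]
    return [[s] for s in sums] if keep_dims else sums


if __name__ == "__main__":
    print(reduce_sum_compat(old_style_reduce_sum, [[1, 2], [3, 4]], axis=1,
                            keep_dims=True))
```

Catching TypeError this broadly can hide unrelated mistakes in the call, so a wrapper like this is better suited to diagnosing the version mismatch than to permanent use.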

Error when using wass

Thanks for sharing your code and excellent work!

I tried to run your code on other experimental data.
However, I ran into an error when using wass. The code works when another imb_fun (mmd2_rbf) is used.

The error log is below. Do you have any idea what might cause this error, so that I can fix it?

Traceback (most recent call last):
File "cfr_net_train.py", line 427, in main
run(outdir)
File "cfr_net_train.py", line 374, in run
D_exp_test, logfile, i_exp)
File "cfr_net_train.py", line 142, in train
CFR.r_alpha: FLAGS.p_alpha, CFR.r_lambda: FLAGS.p_lambda, CFR.p_t: p_treated})
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [0,10] vs. shape[1] = [1,1]
[[Node: concat_2 = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](concat_2/concat_dim, concat_1, concat)]]
[[Node: gradients/Cast_2/_111 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_588_gradients/Cast_2", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op u'concat_2', defined at:
File "cfr_net_train.py", line 434, in <module>
tf.app.run()
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "cfr_net_train.py", line 427, in main
run(outdir)
File "cfr_net_train.py", line 283, in run
CFR = cfr.cfr_net(x, t, y_, p, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/siyuan/git/cfrnet/cfr/cfr_net.py", line 25, in __init__
self._build_graph(x, t, y_ , p_t, FLAGS, r_alpha, r_lambda, do_in, do_out, dims)
File "/home/siyuan/git/cfrnet/cfr/cfr_net.py", line 185, in _build_graph
imb_dist, imb_mat = wasserstein(h_rep_norm,t,p_ipm,lam=FLAGS.wass_lambda,its=FLAGS.wass_iterations,sq=False,backpropT=FLAGS.wass_bpt)
File "/home/siyuan/git/cfrnet/cfr/util.py", line 230, in wasserstein
Mt = tf.concat(1,[Mt,col])
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1080, in concat
name=name)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 438, in _concat
values=values, name=name)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/siyuan/.virtualenvs/tf012/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [0,10] vs. shape[1] = [1,1]
[[Node: concat_2 = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](concat_2/concat_dim, concat_1, concat)]]
[[Node: gradients/Cast_2/_111 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_588_gradients/Cast_2", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]]
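
The message shape[0] = [0,10] says the first input to the concat has zero rows. One hypothesis, not confirmed in this thread, is that a batch contained no units from one treatment group, or that t is not coded as 0/1 in the new data, so a per-group matrix came out empty. A cheap check to run on the treatment vector before training is sketched below in plain Python; the t > 0 versus t < 1 split is an assumption about how the groups are separated, and the function names are illustrative.

```python
def group_sizes(t):
    """Count treated (t > 0) and control (t < 1) units in a batch."""
    n_treated = sum(1 for v in t if v > 0)
    n_control = sum(1 for v in t if v < 1)
    return n_treated, n_control


def has_both_groups(t):
    """True if the batch has at least one treated and one control unit."""
    n_treated, n_control = group_sizes(t)
    return n_treated > 0 and n_control > 0


if __name__ == "__main__":
    batch = [0] * 10                 # ten control units, no treated units
    print(group_sizes(batch))
    print(has_both_groups(batch))
```

If the check fails on some batches, larger batches or a treatment column recoded to 0/1 would be the first things to try.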

A Question about the Noiseless True Effect

Hi Fredrik,

Thanks for your interesting work. I have been reading your paper recently, and there is one point of confusion I cannot resolve. In the experiments section, you use the IHDP data set with setting "A" to generate the treatment response (a linear response surface, as described by Hill in her 2011 BART paper; the NPCI package also follows this convention).

As stated in your paper, "Following Hill (2011), we use the noiseless outcome to compute the true effect". Presumably, for one individual, the difference between the two noiseless outcomes, mu1 - mu0 (mu1 and mu0 for treated and control respectively), should be exactly 4, shouldn't it? But when I check the data provided, this difference is not a constant 4 across the samples. Is it possible that the provided mu1 and mu0 also contain noise?

I provided a snapshot of them as follows:

I understand that the treatment responses Y1 and Y0 (one of which is counterfactual for each individual) come with noise, so their difference is not necessarily 4. But why is mu1 - mu0 not exactly 4?
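
The expectation behind this question can be illustrated with a small simulation. Under a linear surface with a constant additive effect, mu1 - mu0 equals the offset for every unit, while the noisy y1 - y0 does not. This is a sketch with made-up coefficients and covariates; it does not reproduce the NPCI generator and cannot say which surface the provided files were drawn from.

```python
import random


def simulate_linear_surface(xs, beta, offset=4.0, noise_sd=1.0, seed=0):
    """Noiseless and noisy potential outcomes under a constant-effect linear model."""
    rng = random.Random(seed)
    rows = []
    for x in xs:
        mu0 = sum(b * v for b, v in zip(beta, x))   # noiseless control outcome
        mu1 = mu0 + offset                          # noiseless treated outcome
        y0 = mu0 + rng.gauss(0.0, noise_sd)         # observed outcomes add noise
        y1 = mu1 + rng.gauss(0.0, noise_sd)
        rows.append({"mu0": mu0, "mu1": mu1, "y0": y0, "y1": y1})
    return rows


if __name__ == "__main__":
    xs = [[1.0, 0.5], [0.2, -1.0], [-0.3, 2.0]]     # hypothetical covariates
    for row in simulate_linear_surface(xs, beta=[0.7, -0.4]):
        print(row["mu1"] - row["mu0"], row["y1"] - row["y0"])
```

If mu1 - mu0 varies across units in the provided data, that would suggest a response surface without a constant additive effect rather than noise in the mu values, but only the authors can confirm which setting was used.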

Thanks for any reply!

Can cfrnet be used in a multiple-treatment setting?

In my case, the treatment is not binary but takes multiple values, e.g. T=0,1,2,3. Can I use cfrnet?

the "ground truth" counterfactual control outcome for the treated in Jobs

Thank you for sharing the code.

It is not clear how to get the "ground truth" counterfactual outcome y_0 for the treated in the Jobs dataset.

It would be really helpful if you could upload the processed Jobs dataset so that we can compare against the current CFR method on the same benchmark.
