As discussed on issue […]

Verify model performance based on benchmark data (causalml issue, closed)

uber commented on May 10, 2024

Verify model performance based on benchmark data

from causalml.

Comments (4)

MaximilianFranz commented on May 10, 2024

Hey @t-tte, I've started working on the mentioned comparison using our JustCause framework and wanted to highlight a few ideas/notes/problems.

I've started using rpy2 to call the original rlearner implementation directly from Python. However, as I'm not familiar with the method in detail, it's hard to know how to configure the RLearner from causalml so that it imitates the R version as closely as possible. (My goal is to show how the R implementation fares against the Python implementation.)
I've reimplemented the benchmark data (setups A to D) from Nie and Wager inside the JustCause framework so that it is easy to use.

The first results are a little confusing: in one case (boost) the R implementation is significantly better, while in the other (lasso) the Python version is much better, by so much that there seems to be a mistake somewhere. (These results use only 100 replications of the IHDP dataset; results on the simulation data from Nie and Wager will follow.)

I'll publish the notebook soon and keep you posted on my progress. However, I'll limit myself to 1 or 2 benchmark datasets.

t-tte commented on May 10, 2024

Hi @MaximilianFranz

Thanks for the note! Great to hear you've started working on this, and that you've implemented some of the simulations in JustCause. I'll have a closer look and will keep you updated.

As to the differences between the methods, that's indeed a tricky one. My first two hypotheses would be:

The base-learners or some of their parameters are set differently
The propensity score is calculated differently

If you share your notebook, I'll be very happy to dig deeper to see where the root cause of the differences might be.
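To make the second hypothesis concrete, here is a minimal sketch of how the propensity estimate alone can shift an R-learner-style result. It is not the causalml or rlearner API: the helper names (`r_learner_constant_effect`, `simulate`) and the toy data-generating process are assumptions made for illustration, the treatment effect is taken to be constant so the R-loss minimizer reduces to a residual-on-residual ratio, and the outcome model is an oracle rather than a fitted base-learner.

```python
import random


def r_learner_constant_effect(y, w, m_hat, e_hat):
    """Residual-on-residual estimate of a constant treatment effect.

    For a constant tau, minimizing sum(((y - m) - tau * (w - e)) ** 2)
    gives tau = sum((w - e) * (y - m)) / sum((w - e) ** 2).
    """
    num = sum((wi - ei) * (yi - mi) for yi, wi, mi, ei in zip(y, w, m_hat, e_hat))
    den = sum((wi - ei) ** 2 for wi, ei in zip(w, e_hat))
    return num / den


def simulate(n=20000, tau=2.0, seed=0):
    """Toy data (hypothetical): one covariate, confounded treatment assignment."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    e_true = [0.1 + 0.8 * xi for xi in x]  # true propensity depends on x
    w = [1 if rng.random() < ei else 0 for ei in e_true]
    y = [xi + tau * wi + rng.gauss(0.0, 1.0) for xi, wi in zip(x, w)]
    m_true = [xi + tau * ei for xi, ei in zip(x, e_true)]  # oracle E[Y | X]
    return y, w, m_true, e_true


y, w, m_true, e_true = simulate()

# Same outcome model, two different propensity inputs.
tau_true_e = r_learner_constant_effect(y, w, m_true, e_true)
tau_const_e = r_learner_constant_effect(y, w, m_true, [0.5] * len(w))

print(f"tau with true propensity:     {tau_true_e:.3f}")
print(f"tau with constant propensity: {tau_const_e:.3f}")
```

With the true propensity the estimate should land near the simulated effect of 2.0, while the constant 0.5 propensity shrinks it toward zero, even though everything else is identical. Two implementations that differ only in how they fit or clip the propensity model could diverge in a similar way, so it seems worth checking that input first.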

MaximilianFranz commented on May 10, 2024

Hey again,

The notebook is on a separate branch here, since we're still changing stuff within the JustCause API. I haven't looked at it in detail yet, but will return to it once we've finished our fully documented version.

t-tte commented on May 10, 2024

This issue was partly addressed in PR #443. As I have no immediate plans to work on the remaining simulations, I'm closing the issue.
