
mld3 / rl-set-valued-policy


[ICML 2020] Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies. https://arxiv.org/abs/2007.12678, https://icml.cc/virtual/2020/poster/5797

Home Page: http://proceedings.mlr.press/v119/tang20c.html

Jupyter Notebook 94.41% MATLAB 3.13% Python 2.46%
reinforcement-learning icml-2020 healthcare-application human-in-the-loop

rl-set-valued-policy's Introduction

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Source code submission for ICML 2020

Notes

  • The implementation of the main algorithms can be found under synthetic/lib
  • Run notebooks in synthetic to recreate all synthetic experiments (Figures 1-5)
  • Follow instructions in mimic-sepsis to recreate the real data experiment (Figures 6-7)

rl-set-valued-policy's People

Contributors

shengpu-tang


rl-set-valued-policy's Issues

Clarification for qlearn_Q.npy

Hello! The evaluation notebooks in 2_learn/2-5_eval-* all load the Q-function with Q_star = np.load('qlearn_Q.npy'). However, the Q-learning code saves the learned Q-function with np.save('L_qlearn_Q.npy', Q)
(from https://github.com/shengpu1126/RL-Set-Valued-Policy/blob/1f76cc789af96686dc34011ff28502f96d1565dc/mimic-sepsis/mimic_sepsis_rl/2_learn/2-3L_qlearn.py#L56).

Should both qlearn_Q.npy and L_qlearn_Q.npy refer to the same object? I'm assuming not, because when I load Q_star = np.load('L_qlearn_Q.npy') in the eval notebook, the next step of the notebook throws an error due to the assert statement in the soften-policies code (https://github.com/shengpu1126/RL-Set-Valued-Policy/blob/1f76cc789af96686dc34011ff28502f96d1565dc/mimic-sepsis/mimic_sepsis_rl/2_learn/2-5_eval-DR-WDR-err-Qstar.ipynb#L94).

It looks like something is missing. Am I misunderstanding something?

Thanks for the help!
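
For the qlearn_Q.npy question above, one way to narrow down the mismatch is to inspect a saved array before handing it to the eval notebook. The following is only a minimal sketch: describe_q is a hypothetical helper, and the demo array and its shape are made up for illustration. Nothing here is taken from the repository's code, and the actual shape of the Q-table is not stated in the issue.

```python
import os
import tempfile

import numpy as np


def describe_q(path):
    """Summarize a saved Q-table so two .npy files can be compared at a glance."""
    Q = np.load(path)
    return {
        "shape": Q.shape,
        "dtype": str(Q.dtype),
        "has_nan": bool(np.isnan(Q).any()),
        "min": float(np.nanmin(Q)),
        "max": float(np.nanmax(Q)),
    }


# Demo with a made-up array (assumption: the real Q-table's shape is unknown here).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "L_qlearn_Q.npy")
    np.save(demo_path, np.zeros((4, 2)))
    summary = describe_q(demo_path)
    print(summary)
```

Running the same summary on the real L_qlearn_Q.npy, and on qlearn_Q.npy if it exists, would show whether the two files differ in shape or value range, which may help explain why the assert fails.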

Instructions unclear on how train/val/test trajectories are generated

There are two notebooks in 1_preprocess for generating trajD_tr.pkl, trajD_va.pkl, trajD_te.pkl. It is unclear which of the two, 1_Z_reformat_data.ipynb or 1_Z_reformat_data_discrete.ipynb, is used to generate those files.

It looks like 1_Z_reformat_data_discrete.ipynb is being used, but it would be great if you could clarify this.
