mld3 / rl-set-valued-policy

[ICML 2020] Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies. https://arxiv.org/abs/2007.12678, https://icml.cc/virtual/2020/poster/5797

Home Page: http://proceedings.mlr.press/v119/tang20c.html

Jupyter Notebook 94.41% MATLAB 3.13% Python 2.46%

reinforcement-learning icml-2020 healthcare-application human-in-the-loop

rl-set-valued-policy's Introduction

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Source code submission for ICML 2020

Notes

The implementation of the main algorithms can be found under synthetic/lib.
Run the notebooks in synthetic to recreate all synthetic experiments (Figures 1-5).
Follow the instructions in mimic-sepsis to recreate the real-data experiments (Figures 6-7).
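
For readers new to the idea in the title, here is a minimal sketch of what a near-optimal set-valued policy could look like for a single state. It is not taken from synthetic/lib. The function name `near_optimal_action_set`, the parameter `zeta`, and the multiplicative threshold on non-negative Q-values are all assumptions made for illustration; the repository's actual algorithms may define and learn the action sets differently.

```python
def near_optimal_action_set(q_values, zeta):
    """Return indices of actions whose value is within a fraction zeta of the best.

    Hypothetical helper: assumes non-negative Q-values, so that the
    multiplicative threshold (1 - zeta) * max(Q) is meaningful.
    """
    best = max(q_values)
    threshold = (1.0 - zeta) * best
    return [a for a, q in enumerate(q_values) if q >= threshold]


# Example Q-values for one state with three actions (made-up numbers).
q_state = [1.0, 0.95, 0.5]
print(near_optimal_action_set(q_state, zeta=0.1))  # [0, 1]
print(near_optimal_action_set(q_state, zeta=0.0))  # [0]
```

With `zeta=0.0` the set collapses to the greedy action(s); larger values of `zeta` admit more actions, which is what would leave a clinician a choice among near-equivalent options.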

rl-set-valued-policy's Issues

Clarification for qlearn_Q.npy

Hello! The evaluation notebooks in 2_learn/2-5_eval-* all load the learned Q-function with Q_star = np.load('qlearn_Q.npy'). However, the Q-learning code saves the learned Q-function with np.save('L_qlearn_Q.npy', Q).
(from https://github.com/shengpu1126/RL-Set-Valued-Policy/blob/1f76cc789af96686dc34011ff28502f96d1565dc/mimic-sepsis/mimic_sepsis_rl/2_learn/2-3L_qlearn.py#L56 )

Should both qlearn_Q.npy and L_qlearn_Q.npy refer to the same object? I'm assuming not, because when I load Q_star = np.load('L_qlearn_Q.npy') in the eval notebook, the next step of the notebook throws an error due to the assert statement in the soften-policies code (https://github.com/shengpu1126/RL-Set-Valued-Policy/blob/1f76cc789af96686dc34011ff28502f96d1565dc/mimic-sepsis/mimic_sepsis_rl/2_learn/2-5_eval-DR-WDR-err-Qstar.ipynb#L94)

It looks like something is missing. Am I misunderstanding something?

Thanks for the help!
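
For anyone hitting the same mismatch, a minimal sketch for checking which of the two file names is actually present in the working directory. The helper name `find_q_files` and the glob pattern are hypothetical and not part of the repository; this only lists candidates and does not resolve which file the eval notebooks are meant to use.

```python
from pathlib import Path


def find_q_files(directory, pattern="*qlearn_Q.npy"):
    """Return the sorted names of files in `directory` that match `pattern`.

    The default pattern matches both 'qlearn_Q.npy' and 'L_qlearn_Q.npy'.
    """
    return sorted(p.name for p in Path(directory).glob(pattern))


# List candidate Q-function files in the current directory.
print(find_q_files("."))
```

Each name it returns could then be passed to np.load and the resulting array's .shape compared against what the eval notebook expects, which may help narrow down why the assert fails.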

Instructions unclear on how train/val/test trajectories are generated

There are two notebooks in 1_preprocess for generating trajD_tr.pkl, trajD_va.pkl, and trajD_te.pkl. It is unclear which of the two, 1_Z_reformat_data.ipynb or 1_Z_reformat_data_discrete.ipynb, is used to generate those files.

It looks like 1_Z_reformat_data_discrete.ipynb is being used, but it would be great if you could clarify this.
