
OSRL (Optimal Representation Learning in Multi-Task Bandits) is an algorithm that addresses the sample complexity of fixed-confidence learning in multi-task bandit problems. Published at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23).


osrl-sc's Introduction

Code for "On the Sample Complexity of Representation Learning in Multi-Task Bandits with Global and Local Structure"


Author: Alessio Russo

In addition to the algorithm above, the code includes KL-UCB [1], D-Track and Stop, and D-Track and Stop with the challenger modification [2].
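To give a feel for one of the baselines, the following is a minimal sketch of the KL-UCB index from [1] for Bernoulli rewards, computed by bisection. It is not the implementation found in this repository: the function names `bernoulli_kl` and `klucb_index`, the tolerance, and the default `c = 0` are assumptions made for illustration.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def klucb_index(mean, pulls, t, c=0.0, tol=1e-6):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t) + c * log(log(t)).

    mean  : empirical mean reward of the arm (in [0, 1])
    pulls : number of times the arm has been pulled
    t     : current round (t >= 1)
    c     : exploration constant (illustrative default, an assumption)
    """
    if pulls == 0:
        return 1.0  # an arm that was never pulled gets the largest possible index
    level = math.log(t)
    if c > 0 and t > 1:
        level += c * math.log(math.log(t))
    level = max(level, 0.0) / pulls

    # KL(mean, q) is increasing in q on [mean, 1], so bisection applies.
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bernoulli_kl(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Hypothetical statistics for three arms at round t = 100.
    means, pulls = [0.5, 0.6, 0.4], [10, 50, 40]
    indices = [klucb_index(m, n, t=100) for m, n in zip(means, pulls)]
    print("indices:", indices, "-> pull arm", indices.index(max(indices)))
```

At each round, KL-UCB pulls the arm with the largest index, so rarely pulled arms receive an optimistic bonus that shrinks as they are sampled more often.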

All the code has been written in Python or C.

Hardware and Software setup

All experiments were run on a desktop computer with an Intel Xeon Silver 4110 CPU and 48 GB of RAM, running Ubuntu 18.04. Ubuntu is an open-source operating system based on Debian that uses the Linux kernel. For more information, see https://ubuntu.com/.

Code and libraries

We set up our experiments using the following software and libraries:

Python version 3.7.7
Cython version 0.29.15
NumPy version 1.18.1
SciPy version 1.4.1
PyTorch version 1.4.0
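To compare a local environment against the versions listed above, a small script along these lines may help. It is only a sketch and not part of the repository: the helper names `installed_version` and `version_report` are hypothetical, and the module names are assumed to be the usual import names (`Cython`, `numpy`, `scipy`, `torch`).

```python
import importlib
import platform

# Versions listed in this README, keyed by the (assumed) import name.
EXPECTED = {
    "Cython": "0.29.15",
    "numpy": "1.18.1",
    "scipy": "1.4.1",
    "torch": "1.4.0",
}
EXPECTED_PYTHON = "3.7.7"


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_report(expected, lookup=installed_version):
    """Map each module name to a tuple (expected, found, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Some builds append a local suffix such as "+cpu"; ignore it when comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    print(f"Python: expected {EXPECTED_PYTHON}, found {platform.python_version()}")
    for name, (wanted, found, ok) in version_report(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, found {found or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the code will fail; the listed versions are simply the ones used for the experiments.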

All the code can be found in the folder src.

Usage

You can run sample simulations by running the Jupyter notebooks located in the folder notebooks.

To run the notebooks, install Jupyter first. Then open a shell in the notebooks directory and run:

jupyter notebook

This opens the Jupyter interface, where you can select which notebook to run.
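As an alternative to the interactive interface, notebooks can also be executed from a script through the `jupyter nbconvert` command. The sketch below only builds and prints the commands; it is not part of the repository, the helper names `find_notebooks` and `nbconvert_command` are hypothetical, and it assumes the notebooks sit directly inside the `notebooks` folder and that `nbconvert` is installed alongside Jupyter.

```python
from pathlib import Path


def find_notebooks(folder):
    """Return the .ipynb files directly inside `folder`, sorted by name."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("*.ipynb"))


def nbconvert_command(notebook):
    """Build a command that executes `notebook` and saves an executed copy."""
    notebook = Path(notebook)
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", str(notebook),
        "--output", notebook.stem + "_executed",
    ]


if __name__ == "__main__":
    for nb in find_notebooks("notebooks"):
        cmd = nbconvert_command(nb)
        # To actually run it: import subprocess; subprocess.run(cmd, check=True)
        print(" ".join(cmd))
```

Printing the commands first makes it easy to check which notebooks would be executed before spending compute on the simulations.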

License

MIT license.

References

[1] Garivier, Aurélien, and Olivier Cappé. "The KL-UCB algorithm for bounded stochastic bandits and beyond." Proceedings of the 24th Annual Conference on Learning Theory. 2011.

[2] Garivier, Aurélien, and Emilie Kaufmann. "Optimal best arm identification with fixed confidence." Conference on Learning Theory. 2016.
