OSRL (Optimal Representation Learning in Multi-Task Bandits) comprises an algorithm that addresses the problem of sample complexity with fixed confidence in Multi-Task Bandit problems. Published at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI23)


Code for On the Sample Complexity of Representation Learning in Multi-Task Bandits with Global and Local Structure

Author: Alessio Russo

In addition to the algorithm mentioned above, the code contains implementations of KL-UCB [1], D-Track and Stop [2], and D-Track and Stop with the challenger modification [2].
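For readers unfamiliar with KL-UCB [1], the following is a minimal sketch of its index for Bernoulli rewards, computed by bisection. It is an illustration only and is not taken from this repository: the function names `kl_bernoulli` and `kl_ucb_index`, the constant `c`, and the clamping of the log-log term are assumptions made for the example.

```python
import math


def kl_bernoulli(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    # Clamp away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(mean, pulls, t, c=0.0, tol=1e-6):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t) + c*log(log(t)).

    `mean` is the empirical mean of the arm, `pulls` its number of draws,
    and `t` >= 1 the current round. The log-log term is clamped at zero
    for small t (an assumption of this sketch).
    """
    if pulls == 0:
        return 1.0  # unexplored arm: most optimistic value
    log_t = math.log(t)
    budget = (log_t + c * math.log(max(log_t, 1.0))) / pulls
    lo, hi = mean, 1.0
    # KL(mean, q) is increasing in q on [mean, 1], so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round, KL-UCB pulls the arm with the largest index. The index shrinks towards the empirical mean as an arm accumulates pulls.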

All the code has been written in Python or C.
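As a rough illustration of the D-tracking sampling rule used by D-Track and Stop [2], the sketch below chooses the next arm from the current pull counts and a vector of target proportions. It is not the code in this repository: the function name `d_tracking_arm` is hypothetical, and the target proportions are taken as an input here, whereas the actual algorithm recomputes them from the empirical means at every round.

```python
import math


def d_tracking_arm(counts, weights):
    """Return the index of the next arm to pull under D-tracking.

    counts[a]  : number of times arm a has been pulled so far
    weights[a] : target sampling proportion of arm a (sums to 1)
    """
    t = sum(counts)
    k = len(counts)
    # Forced exploration: arms pulled fewer than sqrt(t) - k/2 times.
    threshold = math.sqrt(t) - k / 2.0
    starved = [a for a in range(k) if counts[a] < threshold]
    if starved:
        return min(starved, key=lambda a: counts[a])
    # Otherwise track the target: pull the arm that lags its quota the most.
    return max(range(k), key=lambda a: t * weights[a] - counts[a])
```

For example, `d_tracking_arm([50, 50, 0], [0.4, 0.4, 0.2])` selects the third arm, because it has fallen below the forced-exploration threshold.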

Hardware and Software setup

All experiments were run on a desktop computer with an Intel Xeon Silver 4110 CPU and 48 GB of RAM, running Ubuntu 18.04. Ubuntu is an open-source operating system based on Debian that uses the Linux kernel. For more information, see https://ubuntu.com/.

Code and libraries

We set up our experiments using the following software and libraries:

  • Python 3.7.7
  • Cython version 0.29.15
  • NumPy version 1.18.1
  • SciPy version 1.4.1
  • PyTorch version 1.4.0
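To compare a local environment against the versions listed above, a small helper along the following lines may be useful. It is a sketch and not part of the repository: the names `EXPECTED`, `installed_versions` and `report` are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib
import platform

# Versions listed in this README (import name -> version).
EXPECTED = {
    "Cython": "0.29.15",
    "numpy": "1.18.1",
    "scipy": "1.4.1",
    "torch": "1.4.0",
}


def installed_versions(expected=EXPECTED):
    """Map each module name to its installed version, or None if unavailable."""
    found = {}
    for name in expected:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def report(expected=EXPECTED):
    """Build a human-readable comparison of installed and listed versions."""
    lines = ["Python {} (README lists 3.7.7)".format(platform.python_version())]
    for name, found in installed_versions(expected).items():
        status = "matches" if found == expected[name] else "differs"
        lines.append("{}: installed {}, README lists {} ({})".format(
            name, found or "not found", expected[name], status))
    return "\n".join(lines)
```

Calling `print(report())` prints one line per library. Newer versions may well work, but only the versions listed above are stated in this README.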

All the code can be found in the folder src.

Usage

You can run sample simulations by running the Jupyter notebooks located in the folder notebooks.

To run the notebooks you need to install Jupyter first. After that, you can open a shell in the notebooks directory and run

jupyter notebook

This opens the Jupyter interface in your browser, where you can select which notebook to run.

License

MIT license.

References

[1] Garivier, Aurélien, and Olivier Cappé. "The KL-UCB algorithm for bounded stochastic bandits and beyond." Proceedings of the 24th Annual Conference on Learning Theory. 2011.

[2] Garivier, Aurélien, and Emilie Kaufmann. "Optimal best arm identification with fixed confidence." Conference on Learning Theory. 2016.
