Giter Club home page Giter Club logo

eipo's Introduction

Overview

This official codebase for "Redeeming Intrinsic Rewards via Constrained Optimization" (NeurIPS'22).

Usage

  1. Clone this repo.

  2. If you plan on using mujoco, place your license key "mjkey.txt" in the base directory. This file will be copied in when you start docker using the Makefile command.

  3. Make sure you have docker installed to run the image. We recommend running the GPU image which will work even if you are only using CPUs (labeled version_gpu), but a CPU only image is provided as well.

  4. Edit global.json to customize any volume mount points, port forwarding, and docker image versions from the registry. Information from this file is read into the Makefile.

  5. The makefile contains some basic commands (we use node to read in information from global.json at the top - it's not used for anything else).

make start_docker # start the docker container and drop you in a shell
make start_docker_gpu # start the docker container if running on a machine with GPUs
make stop_docker # stop the docker container and clean up
make clean # clean all subdirectories of pycache files etc.
  1. Before running anything, make sure you create an empty directory titled "results" in the base directory.

  2. Run experiment via the following command

python experiments/diff_adapt_minmax/run.py --mode local --gpus 0

Notes

Our codebase is based on curiosity_baselines and rlpyt For more information on the original rlpyt codebase, please see this white paper on Arxiv.

Code Organization

The class types perform the following roles:

  • Runner - Connects the sampler, agent, and algorithm; manages the training loop and logging of diagnostics.
    • Sampler - Manages agent / environment interaction to collect training data, can initialize parallel workers.
      • Collector - Steps environments (and maybe operates agent) and records samples, attached to sampler.
        • Environment - The task to be learned.
          • Observation Space/Action Space - Interface specifications from environment to agent.
        • TrajectoryInfo - Diagnostics logged on a per-trajectory basis.
    • Agent - Chooses control action to the environment in sampler; trained by the algorithm. Interface to model.
      • Model - Torch neural network module, attached to the agent.
      • Curiosity Model - Torch neural network module, attached to the model which is attached to the agent.
      • Distribution - Samples actions for stochastic agents and defines related formulas for use in loss function, attached to the agent.
    • Algorithm - Uses gathered samples to train the agent (e.g. defines a loss function and performs gradient descent).
      • Optimizer - Training update rule (e.g. Adam), attached to the algorithm.
      • OptimizationInfo - Diagnostics logged on a per-training batch basis.

Sources and Acknowledgements

This codebase is currently funded by Amazon MLRA - we thank them for their support.

Parts of the following open source codebases were used to make this codebase possible. Thanks to all of them for their amazing work!

Thanks to Prof. Pulkit Agrawal and the members of the Improbable AI lab at MIT CSAIL for their continued guidance and support.

eipo's People

Contributors

williamd4112 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.