
weekend-deeprl's Introduction

Reinforcement Learning Tutorial

About

Weekend Deep Reinforcement Learning (DRL) is a self-study of DRL in my free time. DRL is very easy, especially when you already have a bit of background in Control and Deep Learning. Even without that background, the concepts are still simple, so why not study it and have fun?

My implementation aims to provide minimal code, along with short notes that summarize the theory.

  • Code: the modules and config system are built on the mmcv config and registry system, so it is easy to adopt or adjust components by changing the config files.
  • Lecture Notes: no lengthy math, just the motivating concept, the key equations needed for implementation, and a summary of the tricks that make the methods work. More importantly, I try to connect each method to the previous ones wherever possible.
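To make the registry idea concrete, here is a minimal, self-contained sketch of how a config dict with a `type` key can select and build a component. It is a stand-in written for illustration, not mmcv's implementation and not this repo's code; `Registry`, `AGENTS`, and `EpsilonGreedy` are hypothetical names.

```python
# Minimal stand-in for a registry plus config-driven builder.
# NOT mmcv's implementation; every name here is hypothetical.


class Registry:
    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        """Class decorator: store the class under its own name."""
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        """Instantiate the class named by cfg['type'], passing the other keys as kwargs."""
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop("type")]
        return cls(**args)


AGENTS = Registry("agents")


@AGENTS.register
class EpsilonGreedy:
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon


# A config file would hold plain dicts like this one.
agent_cfg = dict(type="EpsilonGreedy", epsilon=0.05)
agent = AGENTS.build(agent_cfg)
print(type(agent).__name__, agent.epsilon)
```

Swapping a component then only means changing the `type` string or its arguments in the config, without touching the training code.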

My learning strategy is to go straight to summarizing and implementing the papers, starting from the basic ones. I hate the fact that most RL books start with a very heavy theoretical background and ask us to remember many vague definitions, such as On-Line, Off-Line, Policy Gradient, etc. NO, NO, NO!!! Let's play with the basic building blocks first. Once we feel comfortable, we can recap and introduce these concepts later. It is absolutely fine if you don't remember these definitions at all.

The following are the great resources that I learned from:

1. Env Setup:

conda create -n RL python=3.8 -y
conda activate RL
conda install tqdm matplotlib scipy
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
pip install gym
pip install 'gym[all]'  # Install the dependencies for all environments
# or pip install cmake 'gym[atari]'
pip install pybullet

2. Try Gym environment

import gym

# Uses the classic Gym API (gym < 0.26): reset() returns the observation
# and step() returns four values.
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()  # Reset the environment before each episode
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()  # Replace with your agent's action
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()
  • Every environment comes with an env.action_space and an env.observation_space.
  • List all available environments: gym.envs.registry.all().
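The reset/step loop above can be wrapped into a reusable helper. The sketch below assumes the classic four-value `step` interface and uses a toy stand-in environment so that it runs without Gym installed; `DiscreteSpace`, `CountdownEnv`, and `run_random_episode` are hypothetical names, not part of Gym or this repo.

```python
import random


class DiscreteSpace:
    """Hypothetical stand-in for a discrete action space with n actions."""

    def __init__(self, n, seed=0):
        self.n = n
        self._rng = random.Random(seed)

    def sample(self):
        return self._rng.randrange(self.n)


class CountdownEnv:
    """Hypothetical toy env: reward 1 per step, episode ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.action_space = DiscreteSpace(2)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


def run_random_episode(env, max_steps=100):
    """Play one episode with random actions; return (total_reward, steps_taken)."""
    env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = env.action_space.sample()
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            return total_reward, t + 1
    return total_reward, max_steps


total, steps = run_random_episode(CountdownEnv(horizon=5))
print(total, steps)
```

Any object exposing `reset()`, `step(action)`, and `action_space.sample()` with this contract should work with the helper, which is what makes it easy to swap a random policy for a learned one later.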

3. Algorithms:

Paper ranking:

  • ๐Ÿ† Must known benchmark papers.
  • ๐Ÿš€ Improved version of benchmark papers. Come back after finishing the benchmark papers.
  1. Q-Learning: Introduction to RL with Q-Learning
  2. Deep Q-Learning:
  3. Actor-Critic methods:
  4. Recap and overview of RL methods:
  5. Policy Gradient:
  6. How to deal with Sparse Reward for Off-Line learning:
  7. On-Line Policy (TBD)
  8. Model-Based Learning (TBD)
  9. Multi-Agent Learning (TBD)
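As a taste of the first item in the list, here is a minimal tabular Q-learning sketch on a toy five-state chain. It is an illustration under stated assumptions, not the code from the Q-Learning tutorial: the chain environment, the hyperparameters, and all function names are hypothetical.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy chain dynamics: reward 1 only when the goal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # The Q-learning target bootstraps from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states 0..3.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The update line is the whole algorithm: move `Q(s, a)` a fraction `alpha` toward `r + gamma * max_a' Q(s', a')`. With these settings the greedy policy should pick action 1 (right) in every non-terminal state.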

4. Usage:

Except for the first Q-Learning tutorial, which serves as an introduction to RL, all other methods can be trained with:

python tools/train.py [path/to/config.py] [--extra_args]

For example, to train a Deep Q-Learning (DQN) agent on the mountain car environment, use:

python tools/train.py configs/DQN/dqn_mountain_car.py
