ME5406 Course Project for Part I

This repository contains simple implementations of reinforcement learning algorithms for ME5406 Deep Learning for Robotics.

Problem description

Consider an RL-based robot in a frozen lake grid environment, where the robot's goal is to pick up the target while avoiding the ice holes.

frozenlake
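To make the problem concrete, here is a minimal sketch of such a grid world. The layout, the action encoding, the reward values (+1 for the target, -1 for a hole, 0 otherwise), and the `step` function are all assumptions made for illustration; the environment in this repository may differ.

```python
from typing import Tuple

# Hypothetical 4x4 layout: S = start, F = frozen (safe), H = hole, G = target.
LAYOUT = [
    "SFFF",
    "FHFH",
    "FFFH",
    "HFFG",
]

# Assumed action encoding -> (row delta, column delta): up, down, left, right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def step(state: Tuple[int, int], action: int) -> Tuple[Tuple[int, int], float, bool]:
    """Apply one deterministic move and return (next_state, reward, done)."""
    size = len(LAYOUT)
    d_row, d_col = MOVES[action]
    # Clamp to the grid so that walking into a wall leaves the robot in place.
    row = min(max(state[0] + d_row, 0), size - 1)
    col = min(max(state[1] + d_col, 0), size - 1)
    cell = LAYOUT[row][col]
    if cell == "H":
        return (row, col), -1.0, True  # assumed penalty for falling into a hole
    if cell == "G":
        return (row, col), 1.0, True  # assumed reward for reaching the target
    return (row, col), 0.0, False
```

An episode ends as soon as `done` is true, that is, when the robot either reaches the target or falls into a hole.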

RL algorithms

  1. First-visit Monte Carlo
  2. SARSA
  3. Q-learning
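The three algorithms differ mainly in how they form the target for the value update. The sketch below shows the standard tabular versions; the function names, the NumPy Q-table indexed as `q[state, action]`, and the default `lr` and `gamma` values are illustrative assumptions, not necessarily how this repository implements them.

```python
import numpy as np


def sarsa_update(q, s, a, r, s_next, a_next, done, lr=0.01, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * q[s_next, a_next]
    q[s, a] += lr * (target - q[s, a])


def q_learning_update(q, s, a, r, s_next, done, lr=0.01, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    target = r if done else r + gamma * np.max(q[s_next])
    q[s, a] += lr * (target - q[s, a])


def first_visit_returns(episode, gamma=0.9):
    """Return {(state, action): G} using only the first visit of each pair.

    `episode` is a list of (state, action, reward) tuples in time order.
    """
    returns = {}
    g = 0.0
    # Walk backwards so g accumulates the discounted return from each step;
    # overwriting means the earliest (first) visit is the value that remains.
    for state, action, reward in reversed(episode):
        g = reward + gamma * g
        returns[(state, action)] = g
    return returns
```

SARSA and Q-learning update after every step, while first-visit Monte Carlo waits until the episode ends and then averages the returns collected for each state-action pair across episodes.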

Requirements

  • numpy
  • matplotlib
  • tkinter
  • PIL

Usage

Train

You need to specify the RL agent type ('mc'/'sarsa'/'ql'), the grid world size (4/10), and the number of training epochs.

For example, to train Q-learning on the 4x4 grid world for 10000 epochs, you could use

python train.py --agent 'ql' --grid_size 4 --num_epoch 10000
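For readers who want to see how flags like these are typically handled, here is a minimal `argparse` sketch. Only the flag names and allowed values come from this README; the `parse_args` helper, the defaults, and the help strings are assumptions and may not match the actual `train.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse the training flags described above (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Train an RL agent on the frozen lake grid world"
    )
    parser.add_argument("--agent", choices=["mc", "sarsa", "ql"], default="ql",
                        help="first-visit Monte Carlo, SARSA, or Q-learning")
    parser.add_argument("--grid_size", type=int, choices=[4, 10], default=4,
                        help="side length of the square grid world")
    parser.add_argument("--num_epoch", type=int, default=10000,
                        help="number of training epochs")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.agent, args.grid_size, args.num_epoch)
```

Restricting `--agent` and `--grid_size` with `choices` makes a mistyped value fail immediately with a usage message instead of surfacing later during training.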

Test

You need to specify the test type (job), as described below

  • Job 0: 4x4 frozen lake environment training, correctness test, and comparison test among algorithms
  • Job 1: 10x10 frozen lake environment training, correctness test, and comparison test among algorithms
  • Job 2: comparison test for different learning rate settings
  • Job 3: comparison test for different gamma settings
  • Job 4: comparison test for different epsilon settings

If you want to test with Job 0, you could use

python train.py --job 0 --num_epoch 10000
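Job 4 varies epsilon, which controls the trade-off between exploiting the current value estimates and exploring. A minimal sketch of epsilon-greedy action selection follows. It assumes epsilon is the probability of acting greedily (the 0.7 to 0.9 values compared in the results suggest this convention, but that is a guess), and `choose_action` is a hypothetical name.

```python
import numpy as np


def choose_action(q_row, greedy_prob, rng):
    """Pick the greedy action with probability `greedy_prob`, else a random one.

    `q_row` holds the action values of the current state; `rng` is a
    numpy.random.Generator.
    """
    if rng.random() < greedy_prob:
        # Break ties randomly so an all-zero row does not always pick action 0.
        best = np.flatnonzero(q_row == np.max(q_row))
        return int(rng.choice(best))
    return int(rng.integers(len(q_row)))


rng = np.random.default_rng(0)
action = choose_action(np.array([0.0, 1.0, 0.0, 0.0]), greedy_prob=0.9, rng=rng)
```

Under the opposite convention, where epsilon is the exploration probability, the same function applies with `greedy_prob = 1 - epsilon`.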

Training Results

  • First-visit Monte Carlo

    • 4x4 grid world
    • 10x10 grid world
  • SARSA

    • 4x4 grid world
    • 10x10 grid world
  • Q-learning

    • 4x4 grid world
    • 10x10 grid world
  • Final policy

    • 4x4 grid world (from left to right: Monte Carlo, SARSA, Q-learning)

    • 10x10 grid world (from left to right: Monte Carlo, SARSA, Q-learning)

Testing Results

  • Algorithm comparisons

    • 4x4 grid world
    • 10x10 grid world
  • Learning rate: Red/0.01, Green/0.001, Blue/0.0001

  • Gamma: Red/0.8, Green/0.9, Blue/0.99

  • Epsilon: Red/0.7, Green/0.8, Blue/0.9

Note: Not all plots are shown here; the full set is available under ./Results

Reference

Morvan Zhou

Acknowledgement

Don't forget to give me a star if you like this! 😊
