Giter Club home page Giter Club logo

cs294_homework's Introduction

CS 294-112 homework (offered in Fall of 2017)

This is my github repo for homework for CS294 (offered in Fall 2017). I covered this course remotely (using lecture notes and videos) and implemented the coding parts of the homework. Below are synopses for what I implemented for each homework assignment.

Disclaimer: this code is for educational purposes only. Students taking current iterations of this course should refrain from copying this code, as that would breach academic integrity and hamper their own education.

Dependencies

Note that some of these dependencies were not released at the time of this course. Furthermore, the starter code has been modified to reflect changes in OpenAI Gym's documentation.

Homework 1

The course, up to this point, has covered more basic supervised learning. I implemented BC (behavior cloning) and DAgger (Dataset Aggregation), which improved the results (slightly). I also experimented with various hyperparameters.

Homework 2

I implemented the policy gradient algorithm and ran some tests on various environments. I played with the hyperparameters and saw that my implementation caused the agent's reward to converge to the theoretical value. I also implemented GAE (generalized advantage estimation) and compared its results.

Homework 3

I implemented the DQN algorithm and ran it on the Atari Pong simulator. I experimented with different hyperparameters and saw that my model converged to the perfect value.

Homework 4

I implemented the MPC algorithm. However, I was unable to run the provided HalfCheetahEnvNew as it threw

'mujoco_py.cymj.PyMjModel' object has no attribute 'data'

Furthermore, when I attempted to work with the given 'HalfCheetah-v2' environment that (in terms of raw code) is isomorphic to the HalfCheetahEnvNew, the action dimensions representing

- rootx     slider      position (m)
- rootz     slider      position (m)
- rooty     hinge       angle (rad)
- bthigh    hinge       angle (rad)
- bshin     hinge       angle (rad)
- bfoot     hinge       angle (rad)
- fthigh    hinge       angle (rad)
- fshin     hinge       angle (rad)
- ffoot     hinge       angle (rad)
- rootx     slider      velocity (m/s)
- rootz     slider      velocity (m/s)
- rooty     hinge       angular velocity (rad/s)
- bthigh    hinge       angular velocity (rad/s)
- bshin     hinge       angular velocity (rad/s)
- bfoot     hinge       angular velocity (rad/s)
- fthigh    hinge       angular velocity (rad/s)
- fshin     hinge       angular velocity (rad/s)
- ffoot     hinge       angular velocity (rad/s)

Aren't correctly represented in the loss function (the comments about what each part represents don't match up). Furthermore, for some strange reason, all HalfCheetah environments load in 17 variables, not 18.

cs294_homework's People

Contributors

louaaron avatar joschu avatar cbfinn avatar abhishekunique avatar svlevine avatar carmezim avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.