
course_rl's Introduction

Homework description

I ran the Ant-v2 and Walker2d-v2 environments with both the DAgger and Behavior Cloning methods.

The TensorBoard files are located in the /data directory.

I plotted two figures comparing the performance of these two methods against the expert policy; they are also located in the /data directory.

Run the code

Tip: While debugging, you probably want to keep the flag --video_log_freq -1 which will disable video logging and speed up the experiment. However, feel free to remove it to save videos of your awesome policy!

If running on Colab, adjust the #@params in the Args class according to the command-line arguments shown in the commands below.

Section 1 (Behavior Cloning)

Command for problem 1:

python cs285/scripts/run_hw1.py \
    --expert_policy_file cs285/policies/experts/Ant.pkl \
    --env_name Ant-v2 --exp_name bc_ant --n_iter 1 \
    --expert_data cs285/expert_data/expert_data_Ant-v2.pkl \
    --video_log_freq -1

Make sure to also try another environment. See the homework PDF for more details on what else you need to run. To generate videos of the policy, remove the --video_log_freq -1 flag.

Section 2 (DAgger)

Command for problem 1: (Note the --do_dagger flag and the higher value for --n_iter.)

python cs285/scripts/run_hw1.py \
    --expert_policy_file cs285/policies/experts/Ant.pkl \
    --env_name Ant-v2 --exp_name dagger_ant --n_iter 10 \
    --do_dagger --expert_data cs285/expert_data/expert_data_Ant-v2.pkl \
    --video_log_freq -1
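
To illustrate why the DAgger run uses --do_dagger together with a larger --n_iter, here is a minimal toy sketch of the training loop. It is not the homework's implementation: the 1-D linear expert, the least-squares policy, and all function names are hypothetical stand-ins chosen so the example runs with the standard library only. The first iteration fits the policy to expert data (plain behavior cloning); each later DAgger iteration rolls out the current policy, relabels the visited states with expert actions, adds them to the dataset, and refits.

```python
import random


def expert_action(state):
    # Hypothetical expert: pushes the state back toward zero.
    return -0.5 * state


def fit_policy(dataset):
    # Least-squares fit of action = w * state (no bias term).
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0


def rollout(w, start, steps):
    # Run the learned policy and record the states it visits.
    states, s = [], start
    for _ in range(steps):
        states.append(s)
        s = s + w * s
    return states


def train(n_iter, do_dagger, seed=0):
    rng = random.Random(seed)
    # Iteration 0: train on expert demonstrations (behavior cloning).
    expert_states = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    dataset = [(s, expert_action(s)) for s in expert_states]
    w = fit_policy(dataset)
    for _ in range(1, n_iter):
        if not do_dagger:
            break  # plain behavior cloning never collects more data
        # DAgger: visit states under the learned policy, relabel with the expert.
        visited = rollout(w, rng.uniform(-1.0, 1.0), steps=20)
        dataset += [(s, expert_action(s)) for s in visited]
        w = fit_policy(dataset)
    return w, len(dataset)


if __name__ == "__main__":
    print("BC:    ", train(n_iter=1, do_dagger=False))
    print("DAgger:", train(n_iter=10, do_dagger=True))
```

Because the toy expert is linear, both variants recover it; the sketch only shows how the dataset grows across DAgger iterations, not a performance gap.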

Make sure to also try another environment. See the homework PDF for more details on what else you need to run.
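
For running the same commands on another environment, a small helper like the following can assemble the argument list. This is only a sketch: the function name build_hw1_command is hypothetical, and it assumes that the expert policy and expert data files for other environments follow the same naming pattern as the Ant files above (for example cs285/policies/experts/Walker2d.pkl), which should be checked against the repository. The default n_iter values mirror the two commands shown in this README.

```python
import shlex


def build_hw1_command(env_name, do_dagger=False, n_iter=None, video_log_freq=-1):
    # "Ant-v2" -> "Ant"; file names are ASSUMED to follow the Ant pattern.
    prefix = env_name.split("-")[0]
    method = "dagger" if do_dagger else "bc"
    if n_iter is None:
        # Defaults taken from the README commands: 1 for BC, 10 for DAgger.
        n_iter = 10 if do_dagger else 1
    cmd = [
        "python", "cs285/scripts/run_hw1.py",
        "--expert_policy_file", f"cs285/policies/experts/{prefix}.pkl",
        "--env_name", env_name,
        "--exp_name", f"{method}_{prefix.lower()}",
        "--n_iter", str(n_iter),
        "--expert_data", f"cs285/expert_data/expert_data_{env_name}.pkl",
    ]
    if do_dagger:
        cmd.append("--do_dagger")
    if video_log_freq is not None:
        # Pass None to omit the flag and let the script log videos.
        cmd += ["--video_log_freq", str(video_log_freq)]
    return cmd


if __name__ == "__main__":
    for env in ("Ant-v2", "Walker2d-v2"):
        for dagger in (False, True):
            print(shlex.join(build_hw1_command(env, do_dagger=dagger)))
```

The returned list can be printed with shlex.join for copy-pasting, or passed to subprocess.run from the repository root.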

Visualizing the saved TensorBoard event files:

You can visualize your runs using TensorBoard:

tensorboard --logdir data

You will see scalar summaries as well as videos of your trained policies (in the 'images' tab).

You can choose to visualize specific runs with a comma-separated list:

tensorboard --logdir data/run1,data/run2,data/run3...

If running on Colab, you will be using the %tensorboard line magic to do the same thing; see the notebook for more details.
