cyoon1729 / rlcycle


299.0 13.0 63.0 3.6 MB

A library for ready-made reinforcement learning agents and reusable components for neat prototyping

License: MIT License

Python 99.14% Makefile 0.05% Shell 0.82%

reinforcement-learning pytorch a2c ddpg hydra a3c dqn quantile-regression sac rainbow-dqn

rlcycle's Introduction

About me at chrisyoon.xyz


rlcycle's Issues

A2C loss calculations

https://github.com/cyoon1729/Reinforcement-learning/blob/d157d0d86c37734be4b430b7d311eb9bb0379d93/Policy-Gradient-Methods/a2c/a2c.py#L42

Could you please explain why value_targets = rewards + discounted_rewards?
Why do the rewards and the discounted rewards need to be added together?
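
For context, one common way to build critic targets in A2C is the discounted return G_t = r_t + gamma * G_{t+1}, which naturally splits into the immediate reward plus a discounted future term. If `discounted_rewards` in the linked file holds only that future term, adding `rewards` would complete the return, but that reading is an assumption. The sketch below is not the code from a2c.py; the function name `discounted_returns` and its `bootstrap_value` parameter are hypothetical.

```python
from typing import List


def discounted_returns(
    rewards: List[float], gamma: float, bootstrap_value: float = 0.0
) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory.

    `bootstrap_value` (hypothetical parameter) stands in for the critic's
    estimate of the state after the last step; use 0.0 for a terminal state.
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk backwards so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


rewards = [1.0, 1.0, 1.0]
value_targets = discounted_returns(rewards, gamma=0.5)
print(value_targets)  # [1.75, 1.5, 1.0]
```

Each target is the immediate reward plus the discounted return of the following step, so the first entry is 1.0 + 0.5 * 1.5 = 1.75.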

Why are there two q_net in the sac2018.py file?

Great work :)

I am confused about the sac2018.py implementation. Why are there two q_net instances in sac2018.py? (code link)
I did not see this described in the original paper, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor".

I would also like to verify one thing. If I understand correctly, sac2018.py implements "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor", and sac2019.py implements "Soft Actor-Critic Algorithms and Applications". Looking forward to your reply.
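
As background, many SAC implementations train two Q-networks and use the smaller of the two estimates when forming targets, a trick usually called clipped double-Q (popularized by TD3) that is meant to reduce overestimation. Whether sac2018.py keeps two q_net instances for exactly this reason is an assumption. The sketch below uses plain lists instead of tensors, and the names `clipped_double_q`, `soft_value_target`, and the temperature `alpha` are hypothetical.

```python
from typing import List, Sequence


def clipped_double_q(q1: Sequence[float], q2: Sequence[float]) -> List[float]:
    """Element-wise minimum of two Q estimates for the same (state, action) batch."""
    if len(q1) != len(q2):
        raise ValueError("q1 and q2 must have the same length")
    return [min(a, b) for a, b in zip(q1, q2)]


def soft_value_target(
    q1: Sequence[float],
    q2: Sequence[float],
    log_probs: Sequence[float],
    alpha: float,
) -> List[float]:
    """Soft value target: min(Q1, Q2) - alpha * log pi(a|s).

    `alpha` is an assumed entropy temperature; some formulations fold it
    into a reward scale instead.
    """
    q_min = clipped_double_q(q1, q2)
    return [q - alpha * lp for q, lp in zip(q_min, log_probs)]


q1_batch = [1.0, 3.0]
q2_batch = [2.0, 2.5]
print(clipped_double_q(q1_batch, q2_batch))  # [1.0, 2.5]
print(soft_value_target(q1_batch, q2_batch, [-1.0, -2.0], alpha=0.5))  # [1.5, 3.5]
```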

A2C save next_state to trajectory object

https://github.com/cyoon1729/Reinforcement-learning/blob/d157d0d86c37734be4b430b7d311eb9bb0379d93/Policy-Gradient-Methods/a2c/a2c_test.py#L25

There is no need to save next_state; it is never used.

Agent initialization only synchronizes the critic?

    # Copy critic target parameters
    for target_param, param in zip(self.critic_target.parameters(), self.critic.parameters()):
        target_param.data.copy_(param.data)

Does the actor need to be synced as well?
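
For illustration, in DDPG-style agents that keep both an actor target and a critic target, both are typically initialized as copies of their online networks. Whether the agent in question has an actor target at all is not stated here, so the sketch below is an assumption rather than the repository's code. The helper name `hard_update` and the small `nn.Linear` stand-ins for the networks are hypothetical.

```python
import torch
import torch.nn as nn


def hard_update(target: nn.Module, source: nn.Module) -> None:
    """Copy every parameter of `source` into `target` in place."""
    with torch.no_grad():
        for target_param, param in zip(target.parameters(), source.parameters()):
            target_param.copy_(param)


# Tiny stand-in networks; real actors and critics would be larger modules.
actor, actor_target = nn.Linear(4, 2), nn.Linear(4, 2)
critic, critic_target = nn.Linear(6, 1), nn.Linear(6, 1)

# Sync both target networks once at initialization.
hard_update(actor_target, actor)
hard_update(critic_target, critic)
```

An equivalent one-liner for modules with identical structure is `target.load_state_dict(source.state_dict())`, which also copies buffers such as batch-norm statistics.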

Weird memory leak behavior

DQN-based algorithms leak memory while training on Atari. The leak is inconsistent: it does not always occur, and sometimes it appears suddenly in the middle of training.

Documentation

Hi,
Could you add some documentation to your code, or a how-to? I ran your code but did not see where the learned model is saved. Is it saved anywhere? I would like to test it.
