Comments (7)
@Kaixhin, thank you for your input. I will look into the ChainerRL and OpenAI Baselines implementations to get more insight, and I will send a PR if needed.
Thanks
from acer.
The trust region function should be a direct translation of the one in ChainerRL, apart from the fact that the trust region here involves some parameters. If you think it should be the other way around, you should raise an issue there as well so that they know too.
Since the KL divergence has already been defined earlier, I do not think we really need the negative sign, and I am not sure why it is there.
Please let me know your views. Do you think it should be the other way round?
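For context on the sign question, here is a minimal sketch of one way a leading minus can legitimately appear: KL(ref || cur) can be written with the log difference in either order, and one of the two orders needs a negation to stay non-negative. This is an illustration only, not the repository's code; the function names and example distributions are hypothetical.

```python
import math

def kl_ref_minus_cur(ref, cur):
    """KL(ref || cur) written as sum_i ref_i * (log ref_i - log cur_i)."""
    return sum(r * (math.log(r) - math.log(c)) for r, c in zip(ref, cur))

def kl_negated(ref, cur):
    """Same quantity with the log difference reversed, so a leading minus is needed."""
    return -sum(r * (math.log(c) - math.log(r)) for r, c in zip(ref, cur))

# Hypothetical reference (average) policy and current policy over three actions.
ref = [0.7, 0.2, 0.1]
cur = [0.5, 0.3, 0.2]

print(kl_ref_minus_cur(ref, cur), kl_negated(ref, cur))
```

Both forms give the same non-negative value, so whether the minus belongs depends on which order the log terms are subtracted in.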
I'm not sure either; I just ported this part of the code from Chainer.
Let me know the differences between your implementation and OpenAI Baselines.
I've not compared the two at all, and this is very low priority for me at the moment.
I am closing this issue because the current implementation seems to be correct. However, I think we need to detach z_star_p for better results. Please refer to #13.
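As a rough illustration of the quantity being discussed, the sketch below computes the trust-region-adjusted direction from the ACER paper, z* = g - max(0, (k·g - delta) / ||k||^2) k, where g is the gradient of the policy loss and k is the gradient of the KL divergence, both taken with respect to the policy's output statistics. It is plain Python with no autograd graph, so it is not the repository's implementation; the function name, the zero-gradient guard, and the example numbers are assumptions made for illustration.

```python
def trust_region_direction(g, k, delta):
    """Return z* = g - max(0, (k.g - delta) / ||k||^2) * k.

    g: gradient of the policy loss w.r.t. the policy statistics
    k: gradient of the KL divergence w.r.t. the same statistics
    delta: trust region threshold
    """
    k_dot_g = sum(ki * gi for ki, gi in zip(k, g))
    k_dot_k = sum(ki * ki for ki in k)
    if k_dot_k == 0.0:
        # Assumption: with a zero KL gradient there is nothing to project against.
        return list(g)
    scale = max(0.0, (k_dot_g - delta) / k_dot_k)
    # In an autograd framework, the result would be treated as a constant
    # (detached) before being multiplied with the policy statistics, which is
    # what the comment above suggests for z_star_p.
    return [gi - scale * ki for gi, ki in zip(g, k)]

# Constraint active: k.g = 1.0 exceeds delta = 0.5, so g is pulled back along k.
z_active = trust_region_direction([1.0, 2.0], [1.0, 0.0], 0.5)
# Constraint inactive: k.g = 1.0 is below delta = 2.0, so g is returned unchanged.
z_inactive = trust_region_direction([1.0, 2.0], [1.0, 0.0], 2.0)
print(z_active, z_inactive)
```

When the constraint is active, the adjusted direction satisfies k·z* = delta exactly; otherwise it equals g.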