marcometer / nerorl
Deep Reinforcement Learning Framework done with PyTorch
License: MIT License
Hello,
Thank you for the nice implementation. This is more a question than a problem with the code. I am finding it hard to pin down the process of using LSTM cells in the policy. Assume you have a sequence length of 10 timesteps and you sample 128 sequences; the observation tensor then has shape (10, 128, state_dim). Your actions and log_probs will then be of shape (10, 128, 1). Am I right about this? Furthermore, do you leave those dimensions as they are when calculating the loss? Could you elaborate on how to construct the loss when using these dimensions?
If you could elaborate a bit on the process, I would greatly appreciate it!
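A minimal sketch of the shapes described above, assuming a time-major LSTM policy with a 1-D continuous action (the class and variable names here are illustrative, not the repo's actual code). The key point is that the loss math is elementwise, so the time and batch axes act as one big batch of seq_len * batch samples:

```python
import torch
import torch.nn as nn

# Illustrative shapes: 128 sequences of 10 timesteps each, time-major layout.
seq_len, batch, state_dim = 10, 128, 4   # state_dim chosen arbitrarily

class RecurrentPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, 64)   # expects (seq_len, batch, state_dim)
        self.mu = nn.Linear(64, 1)           # 1-D action mean per timestep

    def forward(self, obs, hidden=None):
        out, hidden = self.lstm(obs, hidden)                 # (seq_len, batch, 64)
        return torch.distributions.Normal(self.mu(out), 1.0), hidden

policy = RecurrentPolicy()
obs = torch.randn(seq_len, batch, state_dim)                 # (10, 128, state_dim)
dist, _ = policy(obs)
actions = dist.sample()                                      # (10, 128, 1)
log_probs = dist.log_prob(actions)                           # (10, 128, 1)

# PPO-style surrogate: everything is elementwise until the final mean, so the
# dimensions can be left as-is and the mean averages over all timesteps.
advantages = torch.randn(seq_len, batch, 1)                  # dummy values
ratio = torch.exp(log_probs - log_probs.detach())            # old log-probs stand in
surrogate = torch.min(ratio * advantages,
                      torch.clamp(ratio, 0.8, 1.2) * advantages)
loss = -surrogate.mean()                                     # scalar
```

So yes, under these assumptions both actions and log_probs keep the (10, 128, 1) layout, and the loss simply reduces over all of it at the end.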
Should the training data be arranged in sequences of fixed length?
Is it sufficient to arrange the data into their respective episodes? (Though that could cause inconsistent mini-batch sizes, or an episode might be incomplete.)
What if a sequence ends up being shorter due to episode termination?
If those sequences are padded to maintain the fixed sequence length, doesn’t that increase the overall batch size (i.e. number of training samples)?
Upon optimizing the model, is it smarter to recompute the hidden states, since they are affected by the optimization? (The original hidden states from the sampling phase might already be stale after processing the first mini-batch.)
What if a sequence/episode was not completed during the data sampling phase?
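On the padding questions above, one common approach (sketched here under my own assumptions, not necessarily what this repo does) is to zero-pad shorter sequences to the fixed length and carry a mask alongside them. The padded steps are excluded from the loss, so padding does not increase the effective number of training samples:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two hypothetical episodes of different length, 4-D observations.
ep1 = torch.randn(10, 4)   # full sequence of 10 steps
ep2 = torch.randn(6, 4)    # episode terminated after 6 steps

# pad_sequence stacks time-major by default: (max_len, num_seqs, feat).
padded = pad_sequence([ep1, ep2])                         # (10, 2, 4)
mask = pad_sequence([torch.ones(10), torch.ones(6)])      # (10, 2), 1 = real step

# Per-step losses (dummy values here). Masking zeroes out the padded steps and
# the division by mask.sum() averages only over the 16 real timesteps.
per_step_loss = torch.randn(10, 2)
loss = (per_step_loss * mask).sum() / mask.sum()
```

The same mask can also zero out padded steps in the advantage and entropy terms, so an incomplete episode contributes exactly its real timesteps and nothing more.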
How can you check in PyTorch how far the gradients are being backpropagated through the hidden states in the training data?
from torchviz import make_dot

# Render the autograd graph of any tensor (e.g. the loss) to see how far back
# through the hidden states it reaches:
graph = make_dot(final_tensor)
graph.view()  # writes and opens the rendered graph (requires graphviz)