Comments (3)
Hi,
The RL part is a common implementation of policy gradient with a baseline. The overall implementation is aligned with the formulation in the paper. Can you be more specific about your questions?
Thanks,
from arel.
Sorry, my previous statement was not clear. My questions are as follows.
- About the loss of the reward net. In your paper, the objective of the reward function is to minimize the expectation of the reward under the empirical distribution minus the reward under the policy network's distribution. But in your code, the sign of the loss (train_AREL.py, line 138) is the opposite. Why?
  loss = -torch.sum(gt_score) + torch.sum(gen_score)
- About the loss of the policy net. The variable opt.rl_weight is used in calculating the loss. What do the variables loss and tf_loss mean?
  loss = opt.rl_weight * loss + (1 - opt.rl_weight) * tf_loss
Looking forward to your reply! Thx.
- In the paper, we show the objective functions to be maximized (gradient ascent). In practice, we usually minimize loss functions with gradient descent instead, so the sign is flipped, but the two are equivalent.
- tf_loss is the cross-entropy loss, which helps stabilize training.
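To make the two answers above concrete, here is a minimal sketch in plain Python. It is not the code from train_AREL.py: the scores are made-up numbers standing in for the tensors gt_score and gen_score, and the helper names reward_objective, reward_loss, and mixed_policy_loss are hypothetical, introduced only for illustration. It shows that the loss being minimized is the exact negation of the quantity sum(gt_score) - sum(gen_score), and that rl_weight interpolates between the RL loss and tf_loss.

```python
# Made-up scores; in the thread these are tensors produced by the reward model.
gt_score = [0.9, 0.8, 0.7]   # scores assigned to ground-truth samples (assumed values)
gen_score = [0.2, 0.4, 0.1]  # scores assigned to generated samples (assumed values)


def reward_objective(gt, gen):
    """Quantity to maximize (gradient ascent view): sum(gt) - sum(gen)."""
    return sum(gt) - sum(gen)


def reward_loss(gt, gen):
    """Quantity to minimize (gradient descent view), same form as the quoted line:
    loss = -sum(gt_score) + sum(gen_score)."""
    return -sum(gt) + sum(gen)


def mixed_policy_loss(rl_loss, tf_loss, rl_weight):
    """Weighted mix of the RL loss and the cross-entropy loss, same form as
    loss = rl_weight * loss + (1 - rl_weight) * tf_loss."""
    return rl_weight * rl_loss + (1 - rl_weight) * tf_loss


# The loss is the negated objective, so minimizing one maximizes the other.
objective = reward_objective(gt_score, gen_score)
loss = reward_loss(gt_score, gen_score)
print(objective, loss)

# rl_weight = 1 uses only the RL loss; rl_weight = 0 uses only the cross-entropy loss.
print(mixed_policy_loss(2.0, 4.0, 1.0))
print(mixed_policy_loss(2.0, 4.0, 0.0))
```

With this reading, a value of rl_weight between 0 and 1 keeps some cross-entropy signal in every update, which matches the remark that tf_loss is there to stabilize training.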