Comments (1)
If you calculated the loss the same way as my code does, a loss of about 2.5 is normal (perplexity = exp(loss), so a loss of 2.5 corresponds to a perplexity of about 12.2). This means the model is well trained, so I think something went wrong during the evaluation phase. Below are some things to try.
- Add batch normalization to the top layer of the CNN.
  This makes training more stable. See here.
- Use 0.01 for the momentum in BN.
  The default BN momentum is 0.1, which weights the most recent mini-batches heavily, so the moving mean/variance may not represent the entire training dataset well.
- Change the CNN model to eval mode before sampling the caption.
  Suppose the model is in training mode and you give it 1 image to generate a caption. In training mode, BN normalizes with mini-batch statistics; with a single image the batch mean equals the activation itself, so all normalized activation values become 0. To generate the caption properly, give the model multiple images or change it to eval mode. See here for how to change the model to eval mode.
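To make the last two points concrete, here is a small dependency-free sketch of the mechanics. It is a toy model of one BN feature, not PyTorch's implementation: `ToyBatchNorm`, `train`, and the data are hypothetical names and values made up for illustration, and the learnable scale/shift is left out. In PyTorch itself, the corresponding knobs are the `momentum` argument of `nn.BatchNorm1d` and calling `.eval()` on the model.

```python
class ToyBatchNorm:
    """Toy batch norm for one scalar feature (no learnable scale/shift)."""

    def __init__(self, momentum, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True

    def __call__(self, batch):
        if self.training:
            # Training mode: normalize with the statistics of this mini-batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Exponential moving average; a larger momentum gives
            # more weight to the most recent mini-batch.
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Eval mode: normalize with the stored running statistics.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


def train(bn, steps=2000):
    """Feed alternating mini-batches whose overall mean is 5.0."""
    low, high = [3.0, 5.0], [5.0, 7.0]
    for step in range(steps):
        bn(low if step % 2 == 0 else high)
    return bn


bn_default = train(ToyBatchNorm(momentum=0.1))
bn_small = train(ToyBatchNorm(momentum=0.01))

# How far each running mean ends up from the dataset mean of 5.0.
err_default = abs(bn_default.running_mean - 5.0)
err_small = abs(bn_small.running_mean - 5.0)
print("running-mean error, momentum 0.1 :", err_default)
print("running-mean error, momentum 0.01:", err_small)

single_image = [7.0]

bn_small.training = False
eval_out = bn_small(single_image)   # uses running statistics
bn_small.training = True
train_out = bn_small(single_image)  # batch mean == the value itself, so 0

print("eval mode    :", eval_out)
print("training mode:", train_out)
```

In this toy setup the running mean with momentum 0.01 ends up closer to the dataset mean than with 0.1, and the single input is normalized to exactly 0 in training mode but to a non-zero value in eval mode.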
I hope this helps.
from pytorch-tutorial.