Comments (10)
Thanks for the update. I have some naive questions regarding your solver.prototxt settings:
- Why did you set base_lr: 0.0001? Have you tried a higher learning rate?
- You keep this learning rate fixed through the whole experiment, is that correct?
from caffe-vdsr.
Another concern is the training time. In the original paper, the authors said it took them 4 hours to finish the experiment on a Titan Z.
I ran your code for over 1 hour and it only reached iteration 10000, which means it needs about 93 more hours to reach your max_iter: 935840? My GPU is also a Titan Z.
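The estimate above is a plain linear extrapolation. A minimal sketch of that arithmetic, assuming throughput stays constant for the whole run (the function name `remaining_hours` is hypothetical, not part of caffe-vdsr):

```python
def remaining_hours(done_iters, elapsed_hours, max_iter):
    """Linearly extrapolate the remaining training time in hours.

    Assumes the iterations-per-hour rate observed so far stays constant.
    """
    iters_per_hour = done_iters / elapsed_hours
    return (max_iter - done_iters) / iters_per_hour


# Numbers from the comment: 10000 iterations in roughly 1 hour, max_iter 935840.
print(round(remaining_hours(10000, 1.0, 935840), 1))  # 92.6
```

The "over 1 hour" figure is approximate, so the result is only a rough bound.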
from caffe-vdsr.
(1) I use Adam instead of SGD. 0.0001 or 0.001 is a common choice, and you don't need to decrease the learning rate.
(2) About the training time, there are some differences between the original paper and this implementation.
First, the authors used MatConvNet and this implementation uses Caffe. There may be some speed differences.
Second, the authors didn't mention how many GPUs they used for training.
Third, it is impossible to train 80 epochs (9960 iterations with batch size 64) in 4 hours with a single Titan Z. You can try with SGD and measure the time.
Thanks.
from caffe-vdsr.
Why do you use Adam instead of SGD here? Does it achieve better performance, or faster training?
from caffe-vdsr.
(1) First, if you use SGD with a high learning rate like the original paper, you need to set clip_gradients, and I could not achieve good performance with a simple setting of its value. Tuning the value of clip_gradients is time-consuming and not worthwhile.
(2) Second, Adam converges faster than SGD at the beginning.
Thanks.
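For readers unfamiliar with the two clipping schemes being compared, here is a minimal sketch in plain Python. It assumes that Caffe's clipping rescales all gradients by their joint L2 norm, and that the paper's adjustable clipping clamps each gradient to [-theta/lr, theta/lr]; both function names are hypothetical, and neither is code from this repository.

```python
import math


def clip_by_global_norm(grads, threshold):
    """Rescale all gradients when their joint L2 norm exceeds threshold.

    Assumed to mirror the behaviour of Caffe's solver-level clipping.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        return list(grads)
    scale = threshold / norm
    return [g * scale for g in grads]


def clip_adjustable(grads, theta, lr):
    """Clamp each gradient to [-theta/lr, theta/lr].

    Assumed form of the learning-rate-dependent clipping used with SGD
    in the original paper; theta is the value that has to be tuned.
    """
    bound = theta / lr
    return [max(-bound, min(bound, g)) for g in grads]


print(clip_by_global_norm([3.0, 4.0], 10.0))     # norm 5 is under 10: unchanged
print(clip_adjustable([5.0, -5.0, 0.5], 0.1, 0.1))  # bound is 1.0
```

With Adam at a small fixed learning rate, neither step is needed, which is the point made in (1).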
from caffe-vdsr.
got it, thanks
from caffe-vdsr.
My experiment has reached iteration 85000 now. In my experiment, the model saturates at around iteration 20000 (I only experimented with factor 4). In your experiment, when does the model saturate? Did you run all the way to max_iter: 935840?
from caffe-vdsr.
Hi,
(1) Since the number of iterations depends on the number of samples, I recommend converting the iteration number to an epoch number: epoch = iteration * batch_size / sample_number.
In my multi-scale experiments, the total sample number is about 748000. So I set max_iter = 80 (epochs) * 748000 / 64 (batch_size) = 935000.
If you only experiment with factor 4, the sample number should be about 748000 / 3 = 250000, and you should set max_iter = 80 (epochs) * 250000 / 64 (batch_size) = 312500.
(2) In my experiments, the model saturates at about 20-30 epochs (you can check the training log for details). The final model I uploaded is from about epoch 50. But I didn't do a single-scale experiment with Adam. I also recommend testing the PSNR (not the test loss) of the trained model to check whether the model has saturated.
(3) The total time in my experiment (80 epochs) is about 31 hours with a single Titan X (Old).
My English is not very good; I hope this helps. Thanks.
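The conversion in (1) can be checked with a few lines of Python. This is only a sketch using the approximate sample counts quoted above; the helper names are hypothetical.

```python
def iters_for_epochs(epochs, num_samples, batch_size):
    """Number of solver iterations needed to cover the given epochs."""
    return epochs * num_samples // batch_size


def epochs_done(iteration, num_samples, batch_size):
    """Epochs completed after a given number of iterations."""
    return iteration * batch_size / num_samples


# Multi-scale training set (about 748000 samples), batch size 64.
print(iters_for_epochs(80, 748000, 64))  # 935000
# Single scale (about 250000 samples), batch size 64.
print(iters_for_epochs(80, 250000, 64))  # 312500
# Iteration 20000 in the single-scale setting, expressed in epochs.
print(round(epochs_done(20000, 250000, 64), 2))  # 5.12
```

Under these assumed sample counts, iteration 20000 of a single-scale run corresponds to roughly 5 epochs, so iteration counts from the two setups are not directly comparable.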
from caffe-vdsr.
that's really helpful, thanks
from caffe-vdsr.
I have uploaded the log of my experiments. You can check it for more details. @mrxue1993
from caffe-vdsr.