Comments (16)
If you want stable BN training, you'd better set the batch size to 16 or even larger. But for detection tasks, the batch size is usually set to 1 or 2 due to memory constraints.
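The effect of batch size on BN stability can be sketched with a quick NumPy simulation (illustrative only; it assumes activations drawn from a fixed unit-Gaussian distribution and measures how noisy the per-batch mean estimate is):

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_mean_noise(batch_size, trials=2000):
    """Std-dev of the per-batch mean estimate across many batches.

    BatchNorm estimates mean/var from each mini-batch; the smaller the
    batch, the noisier those estimates are (roughly 1/sqrt(batch_size)).
    """
    batches = rng.normal(0.0, 1.0, size=(trials, batch_size))
    return batches.mean(axis=1).std()

noise_small = batch_mean_noise(2)    # typical detection batch size
noise_large = batch_mean_noise(16)   # size suggested for stable BN
print(noise_small, noise_large)
```

With batch size 2 the per-batch mean fluctuates several times more than with batch size 16, which is why the running statistics drift during detection training.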
from mobilenet-caffe.
I suggest not training this model from scratch in Caffe, since Caffe uses the group parameter
to implement channel-wise (depthwise) convolution, which is very slow and inefficient.
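For reference, Caffe expresses a depthwise (channel-wise) convolution by setting group equal to the number of input channels, so each channel is convolved by its own filter; the naive implementation iterates over the groups, which is why it is slow. A sketch (layer and blob names here are illustrative, not from the repo):

```
layer {
  name: "conv2_1/dw"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2_1/dw"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 1
    group: 32       # group == input channels -> depthwise convolution
    bias_term: false
  }
}
```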
If possible, use lr=1e-3 and wd=1e-4 to fine-tune the pretrained model for your own task.
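In a Caffe solver prototxt, those hyperparameters map to something like the following (file names and learning-rate schedule are placeholders, not taken from the repo):

```
net: "train_val.prototxt"
base_lr: 0.001         # lr = 1e-3
weight_decay: 0.0001   # wd = 1e-4
lr_policy: "step"
gamma: 0.1
stepsize: 10000
momentum: 0.9
max_iter: 50000
snapshot_prefix: "finetune_mobilenet"
solver_mode: GPU
```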
If you use the pretrained weights for detection, I suggest fixing all the BN parameters by setting lr_mult: 0 and decay_mult: 0.
@ryusaeba btw, to fix all the BN parameters, you should also set use_global_stats: true
in batch_norm_param, so as to keep the BN mean/variance unchanged during the fine-tuning stage.
Thanks for your advice!
Hi @shicai
If I would like to fine-tune the pretrained model, what values would you suggest for the Convolution, BatchNorm, and Scale layers? Based on your suggestion above, I guess that would be
lr=1e-3 and wd=1e-4 for Convolution.
For BatchNorm, it would be as shown below:
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 0      # keep zero, correct?
    decay_mult: 0   # keep zero, correct?
  }
  param {
    lr_mult: 0      # keep zero, correct?
    decay_mult: 0   # keep zero, correct?
  }
  param {
    lr_mult: 0      # keep zero, correct?
    decay_mult: 0   # keep zero, correct?
  }
}
The Scale layer would be:
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1      # is this correct?
    decay_mult: 0
  }
  param {
    lr_mult: 1      # is this correct?
    decay_mult: 0
  }
  scale_param {
    filler {
      value: 1
    }
    bias_term: true
    bias_filler {
      value: 0
    }
  }
}
Please help me check the lr_mult and decay_mult params. Thanks :)
Thanks for your suggestion. I will fine-tune the Convolution layers only and fix all the BN parameters 👍
Wow, that is a really helpful reminder. Many thanks :)
@shicai
If we fix all the BN parameters (lr_mult=decay_mult=0 and use_global_stats: true), shouldn't we also avoid fine-tuning the convolution layers in the base network? I ask because the mean/variance may become different when we fine-tune those convolutions.
Please correct me if my understanding is wrong. Much appreciated.
It's OK to fine-tune conv layers while fixing the BN parameters, since the BN mean/var estimates are not stable during the detection training stage.
@shicai
So if our target is classification, the fine-tuning setting for the BN parameters would be like
what I posted before, but with use_global_stats: false? (#2 (comment))
yes.
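To summarize the confirmed setup for classification fine-tuning (layer name illustrative): the three BatchNorm param blobs are Caffe's running mean, variance, and moving-average factor, which are updated by moving averages rather than gradients, so lr_mult stays 0 either way; the only change from the detection setup is use_global_stats:

```
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  batch_norm_param {
    use_global_stats: false   # keep updating running mean/var while fine-tuning
  }
}
```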
@shicai
Great, thanks! Your experience really helps me a lot 👍 I am glad to have this discussion with you.
@shicai
I have one more question about the mean/var parameters. Why are they not stable during the detection training stage? Please share your experience with me. Many thanks :)
Originally, I thought it was because detection networks use negative samples during training, but I am not sure of the real reason.
I think it is mainly because the batch size used for training detection models is very small.
Could you give me a rough idea of batch size? What counts as small, and what counts as large?