Train network with no bias in convolution layer about mxnet-ssd HOT 10 OPEN

ndcuong91 commented on August 19, 2024

Train network with no bias in convolution layer

from mxnet-ssd.

Comments (10)

zhreshold commented on August 19, 2024 1

You can set lr_mult of the batchnorm beta term to 0 to fix beta, which is initialized to 0.
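A minimal sketch of what a zero learning-rate multiplier does, written in plain NumPy rather than MXNet. The `sgd_step` helper and the parameter names `bn_gamma` and `bn_beta` are hypothetical; a real optimizer also handles momentum and weight decay, which are left out here.

```python
import numpy as np

def sgd_step(params, grads, lr, lr_mult):
    """Plain SGD where each parameter's step is scaled by its own multiplier."""
    updated = {}
    for name, value in params.items():
        mult = lr_mult.get(name, 1.0)  # parameters without an entry train normally
        updated[name] = value - lr * mult * grads[name]
    return updated

# Hypothetical batchnorm parameters for a 3-channel layer.
params = {"bn_gamma": np.ones(3), "bn_beta": np.zeros(3)}
grads = {"bn_gamma": np.full(3, 0.5), "bn_beta": np.full(3, 0.5)}

new_params = sgd_step(params, grads, lr=0.004, lr_mult={"bn_beta": 0.0})
print(new_params["bn_beta"])   # beta keeps its initial value
print(new_params["bn_gamma"])  # gamma still receives an update
```

With the multiplier at 0, beta receives a gradient but its update is always scaled to nothing, so it stays at whatever it was initialized to.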


zhreshold commented on August 19, 2024 1

@titikid You can fix gamma or leave it free, depending on your results, but I would prefer to leave it free.


ndcuong91 commented on August 19, 2024

Thanks @zhreshold, it worked!


ndcuong91 commented on August 19, 2024

@zhreshold Does this also fix the 'gamma' term?
Should I remove 'fix_gamma=True' in the bn layer?


ndcuong91 commented on August 19, 2024

I have already trained 2 models from scratch, with all parameters set to their defaults (lr=0.004, batch=48, single GPU):

Model with fixed beta (only in the base mobilenet network) and fixed gamma: ~41.5% mAP after 220 epochs. Train log here
Model with fixed beta (only in the base mobilenet network): ~42% mAP after 220 epochs. Train log here
@zhreshold Can you take a look and give me some tips for better mAP? Should I train on a bigger dataset first and then fine-tune on voc2007/2012?


zhreshold commented on August 19, 2024

You have to use ImageNet pre-trained weights; otherwise you need a DSSD variant.


ndcuong91 commented on August 19, 2024

Hi @zhreshold
I found that if I remove only the "beta" term, the convolution still has a small shift factor because of the "running_mean" term. I set "lr_mult" of the "running_mean" term to 0, but I still see it being updated during training. So how can I remove it completely?
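A minimal NumPy sketch of where that residual shift comes from, assuming the standard inference-time batchnorm formula y = gamma * (x - running_mean) / sqrt(running_var + eps) + beta. The helper name and the numbers are hypothetical. Note that the running statistics are typically refreshed as moving averages in the forward pass rather than by the optimizer, which would explain why a learning-rate multiplier does not freeze them.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Apply inference-mode batchnorm to an NCHW tensor as one scale and one shift."""
    scale = gamma / np.sqrt(running_var + eps)
    shift = beta - running_mean * scale  # nonzero whenever running_mean is nonzero
    out = x * scale.reshape(1, -1, 1, 1) + shift.reshape(1, -1, 1, 1)
    return out, shift

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))

gamma = np.ones(3)
beta = np.zeros(3)                          # beta fixed at 0, as in the thread
running_mean = np.array([0.5, -0.2, 0.0])   # hypothetical accumulated statistics
running_var = np.ones(3)

y, shift = batchnorm_inference(x, gamma, beta, running_mean, running_var)
print(shift)  # per-channel shift that remains even though beta is 0
```

Only the channel whose running mean is exactly 0 ends up with no shift; the others keep an effective bias of -gamma * running_mean / sqrt(running_var + eps).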


ndcuong91 commented on August 19, 2024

@zhreshold Can you give me some suggestions?


zhreshold commented on August 19, 2024

@titikid For maximum flexibility I suggest using broadcast multiply instead of batchnorm. You then have full control over the behavior without hacking batchnorm itself.
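One way to read this suggestion, sketched in NumPy: replace the batchnorm with a learnable per-channel scale that is broadcast over the feature map, so no mean is subtracted and no beta is added. The `channel_scale` helper and the scale values are hypothetical; in MXNet the equivalent operation would presumably be built with `broadcast_mul` on a learnable variable of shape (1, C, 1, 1).

```python
import numpy as np

def channel_scale(x, scale):
    """Multiply each channel of an NCHW tensor by its own factor; nothing is added."""
    return x * scale.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))
scale = np.array([0.5, 1.0, 2.0])  # hypothetical learned per-channel factors

y = channel_scale(x, scale)

# A zero input maps to a zero output, so the layer contributes no shift at all.
zeros_out = channel_scale(np.zeros_like(x), scale)
print(y.shape, zeros_out.any())
```

The trade-off is that the normalization itself is gone: activations are rescaled but no longer standardized per batch, which may affect how easily the network trains.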


ndcuong91 commented on August 19, 2024

@zhreshold I'm not really clear on what you mean yet, but I will investigate. Thanks!

