d-li14 / mobilenetv2.pytorch

72.8% MobileNetV2 1.0 model on ImageNet and a spectrum of pre-trained MobileNetV2 models

Home Page: https://arxiv.org/abs/1801.04381

License: Apache License 2.0

Python 100.00%

deep-neural-networks pytorch pretrained-models mobilenetv2 imagenet cvpr2018

mobilenetv2.pytorch's Introduction

PyTorch Implementation of MobileNet V2

+ Release of next generation of MobileNet in my repo *mobilenetv3.pytorch*
+ Release of advanced design of MobileNetV2 in my repo *HBONet* [ICCV 2019]
+ Release of better pre-trained model. See below for details.

Reproduction of the MobileNet V2 architecture, as described in MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov and Liang-Chieh Chen, on the ILSVRC2012 benchmark with the PyTorch framework.

This implementation provides an example procedure of training and validating any prevalent deep neural network architecture, with modular data processing, training, logging and visualization integrated.

Requirements

Dependencies

PyTorch 1.0+
NVIDIA-DALI (in development, not recommended)

Dataset

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
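The linked shell script creates one folder per class and moves each validation image into it. As a rough Python equivalent, the sketch below assumes a hypothetical mapping from validation filename to class folder name. That mapping is not part of this README and has to come from the ImageNet devkit or from the script above. The name sort_val_images is illustrative, not a function of this repo.

```python
import shutil
import tempfile
from pathlib import Path


def sort_val_images(val_dir, filename_to_class):
    """Move each listed image in val_dir into a subfolder named after its class.

    filename_to_class is a hypothetical dict {image filename: class folder name};
    building it is left to the caller. Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for filename, class_name in filename_to_class.items():
        src = val_dir / filename
        if not src.is_file():
            continue  # skip mapping entries whose image is missing
        class_dir = val_dir / class_name
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / filename))
        moved += 1
    return moved


# Small self-check on a temporary directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.JPEG", "b.JPEG"):
        (Path(tmp) / name).write_bytes(b"")
    demo_mapping = {"a.JPEG": "class_x", "b.JPEG": "class_y", "c.JPEG": "class_x"}
    demo_moved = sort_val_images(tmp, demo_mapping)
    demo_layout = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.JPEG"))

print(demo_moved, demo_layout)
```

The resulting layout (one subfolder per class) is the one that folder-based image loaders generally expect.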

Pretrained models

The pretrained MobileNetV2 1.0 achieves 72.834% top-1 accuracy and 91.060% top-5 accuracy on the ImageNet validation set, which is higher than the figures reported in the original paper and by the official TensorFlow implementation.
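For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how top-k accuracy is usually computed. The scores and labels are toy values made up for illustration, and topk_accuracy is not a function from this repo.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example: 4 samples, 3 classes (values are illustrative only)
toy_scores = [
    [0.1, 0.7, 0.2],  # true label 1: ranked first
    [0.5, 0.3, 0.2],  # true label 1: ranked second
    [0.2, 0.3, 0.5],  # true label 0: ranked third
    [0.6, 0.1, 0.3],  # true label 0: ranked first
]
toy_labels = [1, 1, 0, 0]

top1 = topk_accuracy(toy_scores, toy_labels, k=1)
top2 = topk_accuracy(toy_scores, toy_labels, k=2)
print(top1, top2)
```

On ImageNet the same idea is applied with k = 1 and k = 5 over the 1000 class scores of each validation image.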

MobileNetV2 with a spectrum of width multipliers

Architecture	# Parameters	MFLOPs	Top-1 / Top-5 Accuracy (%)
MobileNetV2 1.0	3.504M	300.79	72.192 / 90.534
MobileNetV2 0.75	2.636M	209.08	69.952 / 88.986
MobileNetV2 0.5	1.968M	97.14	64.592 / 85.392
MobileNetV2 0.35	1.677M	59.29	60.092 / 82.172
MobileNetV2 0.25	1.519M	37.21	52.352 / 75.932
MobileNetV2 0.1	1.356M	12.92	34.896 / 56.564

MobileNetV2 1.0 with a spectrum of input resolutions

Architecture	# Parameters	MFLOPs	Top-1 / Top-5 Accuracy (%)
MobileNetV2 224x224	3.504M	300.79	72.192 / 90.534
MobileNetV2 192x192	3.504M	221.33	71.076 / 89.760
MobileNetV2 160x160	3.504M	154.10	69.504 / 88.848
MobileNetV2 128x128	3.504M	99.09	66.740 / 86.952
MobileNetV2 96x96	3.504M	56.31	62.696 / 84.046
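As a sanity check on the table above, the cost of convolutional layers grows roughly with the number of spatial positions, so MFLOPs should scale approximately with the square of the input resolution. The sketch below compares that estimate with the reported values. The quadratic rule is an approximation introduced here, not a formula from this repo. The estimates come out slightly low, which would be consistent with some cost (for example the final classifier) not depending on resolution.

```python
BASE_RES = 224
BASE_MFLOPS = 300.79  # MobileNetV2 1.0 at 224x224, from the table

# Reported MFLOPs per input resolution, from the table
reported = {192: 221.33, 160: 154.10, 128: 99.09, 96: 56.31}


def estimate_mflops(res):
    """Quadratic-in-resolution estimate (an approximation, see lead-in)."""
    return BASE_MFLOPS * (res / BASE_RES) ** 2


estimates = {res: estimate_mflops(res) for res in reported}
rel_err = {res: abs(estimates[res] - reported[res]) / reported[res]
           for res in reported}

for res in reported:
    print(f"{res}x{res}: estimated {estimates[res]:.2f}, "
          f"reported {reported[res]:.2f}, relative gap {rel_err[res]:.3%}")
```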

Taking MobileNetV2 1.0 as an example, pretrained models can be easily imported using the following lines and then finetuned for other vision tasks or utilized in resource-aware platforms.

import torch

from models.imagenet import mobilenetv2

net = mobilenetv2()
# Load the released weights for the 1.0 width multiplier
net.load_state_dict(torch.load('pretrained/mobilenetv2-c5e733a8.pth'))

Usage

Training

The following configuration reproduces our strong results efficiently, taking around 2 days on 4x Titan Xp GPUs with non-distributed DataParallel and the PyTorch dataloader.

batch size 256
epoch 150
learning rate 0.05
LR decay strategy cosine
weight decay 0.00004

The newly released model achieves even higher accuracy, using a larger batch size (1024) on 8 GPUs, a higher initial learning rate (0.4) and more training epochs (250). In addition, a dropout layer with a dropout rate of 0.2 is inserted before the final FC layer, no weight decay is imposed on biases and BN layers, and the learning rate ramps up from 0.1 to 0.4 over the first five training epochs.

python imagenet.py \
    -a mobilenetv2 \
    -d <path-to-ILSVRC2012-data> \
    --epochs 150 \
    --lr-decay cos \
    --lr 0.05 \
    --wd 4e-5 \
    -c <path-to-save-checkpoints> \
    --width-mult <width-multiplier> \
    --input-size <input-resolution> \
    -j <num-workers>
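The cosine decay and the warm-up described above can be sketched as a small epoch-level schedule. This is a common formulation and only an approximation of what the training script does: whether the repo updates the rate per epoch or per iteration, and whether the cosine runs over all epochs or only those after warm-up, are assumptions made here. The function name learning_rate is illustrative.

```python
import math


def learning_rate(epoch, total_epochs, base_lr, warmup_epochs=0, warmup_start_lr=0.0):
    """Linear warm-up to base_lr, then cosine decay towards zero."""
    if epoch < warmup_epochs:
        frac = epoch / warmup_epochs
        return warmup_start_lr + frac * (base_lr - warmup_start_lr)
    # Assumption: the cosine covers the epochs remaining after warm-up
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Basic recipe: 150 epochs, initial learning rate 0.05, no warm-up
basic = [learning_rate(e, 150, 0.05) for e in range(150)]

# Newer recipe: 250 epochs, ramp from 0.1 to 0.4 over the first five epochs
strong = [learning_rate(e, 250, 0.4, warmup_epochs=5, warmup_start_lr=0.1)
          for e in range(250)]

print(basic[0], basic[75], strong[0], strong[5])
```

Halfway through the basic recipe the rate has dropped to half of its initial value, and it approaches zero in the final epochs.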

Test

python imagenet.py \
    -a mobilenetv2 \
    -d <path-to-ILSVRC2012-data> \
    --weight <pretrained-pth-file> \
    --width-mult <width-multiplier> \
    --input-size <input-resolution> \
    -e

Citations

The following is a BibTeX entry for the MobileNet V2 paper that you should cite if you use this model.

@InProceedings{Sandler_2018_CVPR,
author = {Sandler, Mark and Howard, Andrew and Zhu, Menglong and Zhmoginov, Andrey and Chen, Liang-Chieh},
title = {MobileNetV2: Inverted Residuals and Linear Bottlenecks},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

If you find this implementation helpful in your research, please also consider citing:

@InProceedings{Li_2019_ICCV,
author = {Li, Duo and Zhou, Aojun and Yao, Anbang},
title = {HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2019}
}

License

This repository is licensed under the Apache License 2.0.

mobilenetv2.pytorch's Issues

Question about normalization in data preprocessing

Hi,
Thanks for your wonderful work!
I noticed that the normalization in your code does not use the "transforms.ToTensor(), transforms.Normalize()" pair provided by PyTorch. Instead, you subtract and divide directly in the [0, 255] space.
I wonder why you use this normalization. Does it make a big difference to the results?
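For reference, the two forms are algebraically the same when the statistics are scaled consistently: (x / 255 - mean) / std equals (x - 255 * mean) / (255 * std). The sketch below checks this numerically. The mean and std values are the commonly used ImageNet statistics and are an assumption here; this excerpt does not state which values the repo uses.

```python
# Assumed channel statistics on the [0, 1] scale (common ImageNet values)
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_unit_scale(pixel, channel):
    """ToTensor-style scaling to [0, 1], followed by (x - mean) / std."""
    return (pixel / 255.0 - MEAN[channel]) / STD[channel]


def normalize_byte_scale(pixel, channel):
    """Subtract and divide directly on [0, 255] with statistics scaled by 255."""
    return (pixel - 255.0 * MEAN[channel]) / (255.0 * STD[channel])


# Largest difference over every possible 8-bit pixel value and channel
max_gap = max(
    abs(normalize_unit_scale(p, c) - normalize_byte_scale(p, c))
    for p in range(256)
    for c in range(3)
)
print(max_gap)
```

Under that assumption any difference is at the level of floating-point rounding, so the choice would mainly affect where the arithmetic runs rather than the results.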

DALI-dataloader

Have you updated the DALI dataloader since the last push? If not, I'll send a PR that fixes the DALI dataloader for newer versions of DALI and the related NVIDIA packages.

Number of epochs ?

For how many epochs did you train the network to reach 72% accuracy?

mobilenetv2_0.1-7d1d638a.pth can not be loaded.

Thank you for your great work!
However, some problems occur when I load mobilenetv2_0.1-7d1d638a.pth into your MobileNet with width_mult = 0.1.
Could you tell me the width_mult I should set?

training log

Hi, can you share your training log please?

Embedding size

Hello, I found one interesting thing: The last layers in your state_dicts have shape 1280. I think they should change shape according to width_mult, but all checkpoints have shape 1280.

name_of_layer, shape
conv.0.weight torch.Size([1280, 160, 1, 1])
conv.1.weight torch.Size([1280])
conv.1.bias torch.Size([1280])
conv.1.running_mean torch.Size([1280])
conv.1.running_var torch.Size([1280])
conv.1.num_batches_tracked torch.Size([])
classifier.weight torch.Size([1000, 1280])
classifier.bias torch.Size([1000])

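Training parameters

Hello, could you share your training parameters? The default parameters given in the script, which are meant for training ResNet, do not train mobilenetv2 well, so I would like to use your settings as a reference.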

MobileNet v2 training options

Could you please kindly share your training options for MobileNet v2 on ImageNet when top1 accuracy finally reaches 72.0%? Thanks a lot!

bias present in pretrained network

Hi,
I don't understand why there are biases in your models. In the model definition, Conv2d has bias set to False:
nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False)
Is this normal?
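One likely reading, not confirmed by the author in this excerpt: the biases belong to batch-norm and classifier layers rather than to the convolutions. The sketch below applies a simple heuristic to the key names quoted in the "Embedding size" issue: a module that owns a running_mean entry is treated as a batch-norm layer. The heuristic and the helper name owner are illustrative.

```python
# Key names as listed in the "Embedding size" issue
keys = [
    "conv.0.weight",
    "conv.1.weight",
    "conv.1.bias",
    "conv.1.running_mean",
    "conv.1.running_var",
    "conv.1.num_batches_tracked",
    "classifier.weight",
    "classifier.bias",
]


def owner(key):
    """Module path that a parameter or buffer belongs to."""
    return key.rsplit(".", 1)[0]


# Heuristic: only batch-norm layers store running statistics
bn_modules = {owner(k) for k in keys if k.endswith(".running_mean")}

bias_owners = {
    owner(k): ("batch norm" if owner(k) in bn_modules else "other (e.g. linear)")
    for k in keys
    if k.endswith(".bias")
}
print(bias_owners)
```

In this listing conv.0 (the convolution) has no bias entry, while conv.1 carries one alongside its running statistics.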

Validation Input size

Would you please clarify what is the input size during the validation? From the code, as it is defined in function get_pytorch_val_loader in the file utils/dataloaders.py, it seems validation input size is 224/0.875 = 256:
transforms.Resize(int(input_size / 0.875))
I believe the results in the paper are reported for a central crop of size 224x224, so the comparison is not straightforward. Could you please add the results on 224x224 crops?
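To make the arithmetic concrete, the sketch below computes the geometry of a resize to int(input_size / 0.875) followed by a center crop of input_size. The center-crop step is an assumption based on the common torchvision ImageNet recipe; the source quoted above shows only the Resize call. The rounding may also differ from torchvision by a pixel, and val_geometry is an illustrative name.

```python
def val_geometry(width, height, input_size=224, crop_ratio=0.875):
    """Return the resized (w, h) and the center-crop box (left, top, right, bottom)."""
    target = int(input_size / crop_ratio)  # 224 / 0.875 = 256
    # Resize with a single int scales the shorter side to `target`
    if width <= height:
        new_w, new_h = target, int(target * height / width)
    else:
        new_w, new_h = int(target * width / height), target
    # Assumed follow-up step: crop input_size x input_size from the center
    left = (new_w - input_size) // 2
    top = (new_h - input_size) // 2
    return (new_w, new_h), (left, top, left + input_size, top + input_size)


resized, box = val_geometry(500, 375)
print(resized, box)
```

If a center crop does follow the resize, the network still sees 224x224 inputs, and 256 is only the intermediate size of the shorter side.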

Load the trained model and report an error when testing separately

When I use the following code from your repo to load the model:
source_state = torch.load(args.weight)
target_state = OrderedDict()
for k, v in source_state.items():
    if k[:7] != 'module.':
        k = 'module.' + k
    target_state[k] = v
model.load_state_dict(target_state)

it reports an error:
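The error text itself is not included above, so its cause cannot be determined from this excerpt. Independently of that, the prefix logic in the quoted code is worth spelling out: a model wrapped in nn.DataParallel expects keys that start with "module.", while an unwrapped model expects keys without it. The sketch below shows both directions on a plain dictionary. The helper names are illustrative and the integer values stand in for tensors.

```python
from collections import OrderedDict


def add_prefix(state, prefix="module."):
    """Keys for a DataParallel-wrapped model: make sure each starts with prefix."""
    out = OrderedDict()
    for key, value in state.items():
        out[key if key.startswith(prefix) else prefix + key] = value
    return out


def strip_prefix(state, prefix="module."):
    """Keys for an unwrapped model: drop a leading prefix if present."""
    out = OrderedDict()
    for key, value in state.items():
        out[key[len(prefix):] if key.startswith(prefix) else key] = value
    return out


# Placeholder state dict (integers stand in for tensors)
plain = OrderedDict([("classifier.weight", 1), ("classifier.bias", 2)])
wrapped = add_prefix(plain)
print(list(wrapped), list(strip_prefix(wrapped)))
```

When testing a single unwrapped model, strip_prefix would be the direction to try; when the model is wrapped as in the quoted snippet, add_prefix matches what that snippet does.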

Does not run in Pytorch 1.3.1

I just cloned your repo, and when I launch the command:

CUDA_VISIBLE_DEVICES=2,3,4,5 python imagenet.py -a mobilenetv2 -d /path/to/dataset/ImageNet2012/ --epochs 150 --lr-decay cos --lr 0.05 --wd 4e-5 -c checkpoints --width-mult 1 --input-size 224 -j 12

It gets stuck at this point:

=> creating model 'mobilenetv2'

Epoch: [1 | 150]
Processing

<Ctrl+C pressed after 10 min of nothing happening:>

^CTraceback (most recent call last):
  File "imagenet.py", line 403, in <module>
    main()
  File "imagenet.py", line 224, in main
    train_loss, train_acc = train(train_loader, train_loader_len, model, criterion, optimizer, epoch)
  File "imagenet.py", line 271, in train
    for i, (input, target) in enumerate(train_loader):
  File "/home/michael/mobilenetv2.pytorch/utils/dataloaders.py", line 190, in prefetched_loader
    for next_input, next_target in loader:
  File "/home/michael/miniconda2/envs/pt/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "/home/michael/miniconda2/envs/pt/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 761, in _get_data
    success, data = self._try_get_data()
  File "/home/michael/miniconda2/envs/pt/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/michael/miniconda2/envs/pt/lib/python3.7/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/home/michael/miniconda2/envs/pt/lib/python3.7/threading.py", line 300, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt

Nothing is happening at this point. nvidia-smi shows that a single GPU consumes ~500M of memory, and the CPU cores are ~60% busy, but it's not clear what they are doing. I waited for 10 minutes before aborting. I also tried it on a single GPU and hit the same issue.
If I switch to --data-backend dali-cpu (using nvidia-dali version 0.16) it fails with the following error:

=> creating model 'mobilenetv2'
Traceback (most recent call last):
  File "imagenet.py", line 403, in <module>
    main()
  File "imagenet.py", line 194, in main
    train_loader, train_loader_len = get_train_loader(args.data, args.batch_size, workers=args.workers, input_size=args.input_size)
TypeError: gdtl() got an unexpected keyword argument 'input_size'

I'm using PyTorch 1.3.1 with 4x Titan Xp cards. The only thing I had to change in your code was to replace cuda(async=True) with cuda(non_blocking=True). Changing to non_blocking=False does not help.

Can you please try cloning your repo into a clean PyTorch 1.3.1 environment and see if you can run it? Any idea what's going on?

how to use the pretrained 0.75 model

hi, here!

first of all, many thanks for your work! It helps a lot for me!

My question is how to use your pretrained mobilenetv2_0.75 model. I looked into the model, and the channels change from [32 16 24 32 64 96 160 320] to [24 16 24 24 48 72 120 240]. Does this mean that, as long as I change input_channel to 24
and change
self.cfgs = [
    # t, c, n, s
    [1, 16, 1, 1],
    [6, 24, 2, 2],
    [6, 32, 3, 2],
    [6, 64, 4, 2],
    [6, 96, 3, 1],
    [6, 160, 3, 2],
    [6, 320, 1, 1],
]
to
self.cfgs = [
    # t, c, n, s
    [1, 16, 1, 1],
    [6, 24, 2, 2],
    [6, 24, 3, 2],
    [6, 48, 4, 2],
    [6, 72, 3, 1],
    [6, 120, 3, 2],
    [6, 240, 1, 1],
]
everything is OK?
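The channel list observed in the checkpoint matches what the widely used "make divisible by 8" rounding produces for a multiplier of 0.75. The sketch below assumes that rounding rule, which comes from the TensorFlow reference code; whether this repo uses exactly this helper is an assumption. If it does, setting the width multiplier (compare the --width-mult option in the Usage section) may be enough, without editing cfgs by hand.

```python
def make_divisible(value, divisor=8, min_value=None):
    """Round value to the nearest multiple of divisor, never dropping more than 10%."""
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if new_value < 0.9 * value:
        new_value += divisor
    return new_value


# Base channels from the issue: stem width followed by the c column of cfgs
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]
WIDTH_MULT = 0.75

scaled = [make_divisible(c * WIDTH_MULT) for c in BASE_CHANNELS]
print(scaled)
```

Note the two 24s in the result: 24 * 0.75 = 18 first rounds down to 16, which is more than 10% below 18, so the rule bumps it back up to 24.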

want to get some advice for training.

mobilenetv2 1.0 224
epochs: 200
batch size: 512
lr-decay: cos
lr: 0.2
wd: 4e-5
warm-up: 0-0.4 over the first 5 epochs
dropout: 0.2

top1: 0.685
But in your md file it is 0.722.
Is something wrong in my training?

I didn't find the implementation of the 'linear2exp' you mentioned before

Several months ago you answered a question with your training command. But I didn't find the implementation of the '--lr-decay="linear2exp"' in the function 'adjust_learning_rate'.

I trained MobileNet V2 from scratch by calling
python3 imagenet.py -d /path/to/your/ImageNet/root/ -j16 --epochs=300 --arch="mobilenetv2" --gpu-id="0,1,2,3" --lr=0.045 --lr-decay="linear2exp" --gamma=0.98 --weight-decay=0.00004

Now, the best model achieves 71.79% top-1 accuracy at Epoch 249. The training is expected to finish in a few days. Hardware environment: 4-way 2080ti, Software environment: Pytorch 1.0.

Originally posted by @lld533 in #2 (comment)

Transformations on the input tensor

Hi, thank you for sharing the models. Can you please share transformations that are needed to be applied to the input tensor to get the correct results?

Question: Is it possible to use this architecture as a decoder?

Hi,

I am wondering how this architecture could be used as a generator/decoder of images, for example to transform a vector [1028,1,1] into an image [3,64,64].

Thanks a lot in advance!