
gen-efficientnet-pytorch's Introduction

(Generic) EfficientNets for PyTorch

A 'generic' implementation of EfficientNet, MixNet, MobileNetV3, etc. that covers most of the compute/parameter efficient architectures derived from the MobileNet V1/V2 block sequence, including those found via automated neural architecture search.

All models are implemented by the GenEfficientNet or MobileNetV3 classes, with string-based architecture definitions to configure the block layouts (idea from here); an illustrative sketch of the format follows.
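A rough sketch of what these string-based block definitions look like (the strings below are illustrative, adapted from the EfficientNet-B0 layout; the exact encoding is defined in the repo, where e.g. ds/ir denote depthwise-separable and inverted-residual blocks):

arch_def = [
    # block type _ repeats _ kernel _ stride _ expansion _ channels _ se-ratio
    ['ds_r1_k3_s1_e1_c16_se0.25'],   # stage 1: depthwise-separable block, 16 channels
    ['ir_r2_k3_s2_e6_c24_se0.25'],   # stage 2: inverted residual, 6x expansion
    ['ir_r2_k5_s2_e6_c40_se0.25'],   # stage 3: same pattern with 5x5 kernels
]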

What's New

Aug 19, 2020

  • Add updated EfficientNet-B3 PyTorch weights, trained by myself with timm (82.1 top-1)
  • Add PyTorch trained EfficientNet-Lite0 contributed by @hal-314 (75.5 top-1)
  • Update ONNX and Caffe2 export / utility scripts to work with latest PyTorch / ONNX
  • ONNX runtime based validation script added
  • Activations (mostly) brought in sync with timm equivalents

April 5, 2020

  • Add some newly trained MobileNet-V2 models trained with the latest hyper-params and RandAugment. They compare quite favourably to EfficientNet-Lite:
    • 3.5M param MobileNet-V2 100 @ 73%
    • 4.5M param MobileNet-V2 110d @ 75%
    • 6.1M param MobileNet-V2 140 @ 76.5%
    • 5.8M param MobileNet-V2 120d @ 77.3%

March 23, 2020

  • Add EfficientNet-Lite models w/ weights ported from Tensorflow TPU
  • Add PyTorch trained MobileNet-V3 Large weights with 75.77% top-1
  • IMPORTANT CHANGE (if training from scratch): weight init changed to better match the Tensorflow impl; set fix_group_fanout=False in initialize_weight_goog for the old behavior (a sketch follows below)
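A minimal sketch of restoring the old init behavior per the note above (the module path and usage pattern are my assumptions; only the fix_group_fanout argument is confirmed by the note):

import geffnet
from geffnet.efficientnet_builder import initialize_weight_goog  # assumed module path

model = geffnet.create_model('efficientnet_b0', pretrained=False)
# Re-apply Google-style init with the pre-change group fanout handling.
for name, module in model.named_modules():
    initialize_weight_goog(module, name, fix_group_fanout=False)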

Feb 12, 2020

  • Add EfficientNet-L2 and B0-B7 NoisyStudent weights ported from Tensorflow TPU
  • Port new EfficientNet-B8 (RandAugment) weights from TF TPU; these differ from the B8 AdvProp weights and use different input normalization.
  • Add RandAugment PyTorch trained EfficientNet-ES (EdgeTPU-Small) weights with 78.1 top-1. Trained by Andrew Lavin

Jan 22, 2020

  • Update weights for EfficientNet B0, B2, B3 and MixNet-XL with the latest RandAugment trained weights. Trained with https://github.com/rwightman/pytorch-image-models
  • Fix torchscript compatibility for PyTorch 1.4, add torchscript support for MixedConv2d using ModuleDict
  • Test models, torchscript, onnx export with PyTorch 1.4 -- no issues

Nov 22, 2019

  • New top-1 high! Ported official TF EfficientNet AdvProp (https://arxiv.org/abs/1911.09665) weights and B8 model spec. Created a new set of ap models since they use a different preprocessing (Inception mean/std) from the original EfficientNet base/AA/RA weights.

Nov 15, 2019

  • Ported official TF MobileNet-V3 float32 large/small/minimalistic weights
  • Modifications to the MobileNet-V3 model and components to support additional config needed to cover differences between the TF MobileNet-V3 and my implementation

Oct 30, 2019

  • Many of the models will now work with torch.jit.script, MixNet being the biggest exception
  • Improved interface for enabling torchscript or ONNX export compatible modes (via config)
  • Add JIT optimized mem-efficient Swish/Mish autograd.fn in addition to memory-efficient autograd.fn
  • Activation factory to select best version of activation by name or override one globally
  • Add pretrained checkpoint load helper that handles input conv and classifier changes

Oct 27, 2019

Models

Implemented models include:

  • EfficientNet (B0-B8, L2) w/ AdvProp and NoisyStudent weight ports
  • EfficientNet-EdgeTPU (ES, EM, EL)
  • EfficientNet-CondConv
  • EfficientNet-Lite
  • MixNet
  • MobileNet-V3
  • MobileNet-V2
  • MNASNet B1 and A1 (w/ Squeeze-Excite)
  • FBNet-C
  • Single-Path NAS

I originally implemented and trained some of these models with code here; this repository contains just the GenEfficientNet models, validation, and associated ONNX/Caffe2 export code.

Pretrained

I've managed to train several of the models to accuracies close to or above the originating papers and official impl. My training code is here: https://github.com/rwightman/pytorch-image-models

Model Prec@1 (Err) Prec@5 (Err) Param#(M) MAdds(M) Image Scaling Resolution Crop
efficientnet_b3 82.240 (17.760) 96.116 (3.884) 12.23 TBD bicubic 320 1.0
efficientnet_b3 82.076 (17.924) 96.020 (3.980) 12.23 TBD bicubic 300 0.904
mixnet_xl 81.074 (18.926) 95.282 (4.718) 11.90 TBD bicubic 256 1.0
efficientnet_b2 80.612 (19.388) 95.318 (4.682) 9.1 TBD bicubic 288 1.0
mixnet_xl 80.476 (19.524) 94.936 (5.064) 11.90 TBD bicubic 224 0.875
efficientnet_b2 80.288 (19.712) 95.166 (4.834) 9.1 1003 bicubic 260 0.890
mixnet_l 78.976 (21.024) 94.184 (5.816) 7.33 TBD bicubic 224 0.875
efficientnet_b1 78.692 (21.308) 94.086 (5.914) 7.8 694 bicubic 240 0.882
efficientnet_es 78.066 (21.934) 93.926 (6.074) 5.44 TBD bicubic 224 0.875
efficientnet_b0 77.698 (22.302) 93.532 (6.468) 5.3 390 bicubic 224 0.875
mobilenetv2_120d 77.294 (22.706) 93.502 (6.498) 5.8 TBD bicubic 224 0.875
mixnet_m 77.256 (22.744) 93.418 (6.582) 5.01 353 bicubic 224 0.875
mobilenetv2_140 76.524 (23.476) 92.990 (7.010) 6.1 TBD bicubic 224 0.875
mixnet_s 75.988 (24.012) 92.794 (7.206) 4.13 TBD bicubic 224 0.875
mobilenetv3_large_100 75.766 (24.234) 92.542 (7.458) 5.5 TBD bicubic 224 0.875
mobilenetv3_rw 75.634 (24.366) 92.708 (7.292) 5.5 219 bicubic 224 0.875
efficientnet_lite0 75.472 (24.528) 92.520 (7.480) 4.65 TBD bicubic 224 0.875
mnasnet_a1 75.448 (24.552) 92.604 (7.396) 3.9 312 bicubic 224 0.875
fbnetc_100 75.124 (24.876) 92.386 (7.614) 5.6 385 bilinear 224 0.875
mobilenetv2_110d 75.052 (24.948) 92.180 (7.820) 4.5 TBD bicubic 224 0.875
mnasnet_b1 74.658 (25.342) 92.114 (7.886) 4.4 315 bicubic 224 0.875
spnasnet_100 74.084 (25.916) 91.818 (8.182) 4.4 TBD bilinear 224 0.875
mobilenetv2_100 72.978 (27.022) 91.016 (8.984) 3.5 TBD bicubic 224 0.875

More pretrained models to come...

Ported Weights

The weights ported from Tensorflow checkpoints for the EfficientNet models closely match the accuracy in Tensorflow once a SAME convolution padding equivalent is added and the same crop factors, image scaling, etc. (see table) are used via cmd line args. A sketch of the SAME padding computation follows.
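For reference, a minimal sketch of the TF 'SAME' padding emulation (the pad formula is the one from geffnet/conv2d_layers.py quoted in the issues below; the surrounding wrapper is my illustrative framing, not the repo's exact code):

import math
import torch.nn.functional as F

def get_same_padding(i, k, s, d):
    # Total padding along one dim for TF 'SAME' output size (geffnet/conv2d_layers.py).
    return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)

def conv2d_same(x, weight, bias=None, stride=(1, 1), dilation=(1, 1), groups=1):
    ih, iw = x.shape[-2:]
    kh, kw = weight.shape[-2:]
    pad_h = get_same_padding(ih, kh, stride[0], dilation[0])
    pad_w = get_same_padding(iw, kw, stride[1], dilation[1])
    if pad_h > 0 or pad_w > 0:
        # TF pads asymmetrically: any odd pixel goes on the bottom/right.
        x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])
    return F.conv2d(x, weight, bias, stride, (0, 0), dilation, groups)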

IMPORTANT:

  • Tensorflow ported weights for EfficientNet AdvProp (AP), EfficientNet EdgeTPU, EfficientNet-CondConv, EfficientNet-Lite, and MobileNet-V3 models use Inception style (0.5, 0.5, 0.5) for mean and std.
  • Enabling the Tensorflow preprocessing pipeline with --tf-preprocessing at validation time will improve scores by 0.1-0.5%, bringing them very close to the original TF impl (the two normalization schemes are sketched below).
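A minimal sketch of the two normalization schemes (the Inception constants come from the note above; the ImageNet values are the standard torchvision defaults, which I assume the remaining models use):

from torchvision import transforms

# Inception-style norm: AdvProp (AP), EdgeTPU, CondConv, Lite, MobileNet-V3 ports
inception_norm = transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))

# Standard ImageNet norm (assumed for the other EfficientNet weights)
imagenet_norm = transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))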

To run validation for tf_efficientnet_b5: python validate.py /path/to/imagenet/validation/ --model tf_efficientnet_b5 -b 64 --img-size 456 --crop-pct 0.934 --interpolation bicubic

To run validation w/ TF preprocessing for tf_efficientnet_b5: python validate.py /path/to/imagenet/validation/ --model tf_efficientnet_b5 -b 64 --img-size 456 --tf-preprocessing

To run validation for a model with Inception preprocessing, e.g. EfficientNet-B8 AdvProp: python validate.py /path/to/imagenet/validation/ --model tf_efficientnet_b8_ap -b 48 --num-gpu 2 --img-size 672 --crop-pct 0.954 --mean 0.5 --std 0.5

Model Prec@1 (Err) Prec@5 (Err) Param # Image Scaling Image Size Crop
tf_efficientnet_l2_ns *tfp 88.352 (11.648) 98.652 (1.348) 480 bicubic 800 N/A
tf_efficientnet_l2_ns TBD TBD 480 bicubic 800 0.961
tf_efficientnet_l2_ns_475 88.234 (11.766) 98.546 (1.454) 480 bicubic 475 0.936
tf_efficientnet_l2_ns_475 *tfp 88.172 (11.828) 98.566 (1.434) 480 bicubic 475 N/A
tf_efficientnet_b7_ns *tfp 86.844 (13.156) 98.084 (1.916) 66.35 bicubic 600 N/A
tf_efficientnet_b7_ns 86.840 (13.160) 98.094 (1.906) 66.35 bicubic 600 N/A
tf_efficientnet_b6_ns 86.452 (13.548) 97.882 (2.118) 43.04 bicubic 528 N/A
tf_efficientnet_b6_ns *tfp 86.444 (13.556) 97.880 (2.120) 43.04 bicubic 528 N/A
tf_efficientnet_b5_ns *tfp 86.064 (13.936) 97.746 (2.254) 30.39 bicubic 456 N/A
tf_efficientnet_b5_ns 86.088 (13.912) 97.752 (2.248) 30.39 bicubic 456 N/A
tf_efficientnet_b8_ap *tfp 85.436 (14.564) 97.272 (2.728) 87.4 bicubic 672 N/A
tf_efficientnet_b8 *tfp 85.384 (14.616) 97.394 (2.606) 87.4 bicubic 672 N/A
tf_efficientnet_b8 85.370 (14.630) 97.390 (2.610) 87.4 bicubic 672 0.954
tf_efficientnet_b8_ap 85.368 (14.632) 97.294 (2.706) 87.4 bicubic 672 0.954
tf_efficientnet_b4_ns *tfp 85.298 (14.702) 97.504 (2.496) 19.34 bicubic 380 N/A
tf_efficientnet_b4_ns 85.162 (14.838) 97.470 (2.530) 19.34 bicubic 380 0.922
tf_efficientnet_b7_ap *tfp 85.154 (14.846) 97.244 (2.756) 66.35 bicubic 600 N/A
tf_efficientnet_b7_ap 85.118 (14.882) 97.252 (2.748) 66.35 bicubic 600 0.949
tf_efficientnet_b7 *tfp 84.940 (15.060) 97.214 (2.786) 66.35 bicubic 600 N/A
tf_efficientnet_b7 84.932 (15.068) 97.208 (2.792) 66.35 bicubic 600 0.949
tf_efficientnet_b6_ap 84.786 (15.214) 97.138 (2.862) 43.04 bicubic 528 0.942
tf_efficientnet_b6_ap *tfp 84.760 (15.240) 97.124 (2.876) 43.04 bicubic 528 N/A
tf_efficientnet_b5_ap *tfp 84.276 (15.724) 96.932 (3.068) 30.39 bicubic 456 N/A
tf_efficientnet_b5_ap 84.254 (15.746) 96.976 (3.024) 30.39 bicubic 456 0.934
tf_efficientnet_b6 *tfp 84.140 (15.860) 96.852 (3.148) 43.04 bicubic 528 N/A
tf_efficientnet_b6 84.110 (15.890) 96.886 (3.114) 43.04 bicubic 528 0.942
tf_efficientnet_b3_ns *tfp 84.054 (15.946) 96.918 (3.082) 12.23 bicubic 300 N/A
tf_efficientnet_b3_ns 84.048 (15.952) 96.910 (3.090) 12.23 bicubic 300 0.904
tf_efficientnet_b5 *tfp 83.822 (16.178) 96.756 (3.244) 30.39 bicubic 456 N/A
tf_efficientnet_b5 83.812 (16.188) 96.748 (3.252) 30.39 bicubic 456 0.934
tf_efficientnet_b4_ap *tfp 83.278 (16.722) 96.376 (3.624) 19.34 bicubic 380 N/A
tf_efficientnet_b4_ap 83.248 (16.752) 96.388 (3.612) 19.34 bicubic 380 0.922
tf_efficientnet_b4 83.022 (16.978) 96.300 (3.700) 19.34 bicubic 380 0.922
tf_efficientnet_b4 *tfp 82.948 (17.052) 96.308 (3.692) 19.34 bicubic 380 N/A
tf_efficientnet_b2_ns *tfp 82.436 (17.564) 96.268 (3.732) 9.11 bicubic 260 N/A
tf_efficientnet_b2_ns 82.380 (17.620) 96.248 (3.752) 9.11 bicubic 260 0.89
tf_efficientnet_b3_ap *tfp 81.882 (18.118) 95.662 (4.338) 12.23 bicubic 300 N/A
tf_efficientnet_b3_ap 81.828 (18.172) 95.624 (4.376) 12.23 bicubic 300 0.904
tf_efficientnet_b3 81.636 (18.364) 95.718 (4.282) 12.23 bicubic 300 0.904
tf_efficientnet_b3 *tfp 81.576 (18.424) 95.662 (4.338) 12.23 bicubic 300 N/A
tf_efficientnet_lite4 81.528 (18.472) 95.668 (4.332) 13.00 bilinear 380 0.92
tf_efficientnet_b1_ns *tfp 81.514 (18.486) 95.776 (4.224) 7.79 bicubic 240 N/A
tf_efficientnet_lite4 *tfp 81.502 (18.498) 95.676 (4.324) 13.00 bilinear 380 N/A
tf_efficientnet_b1_ns 81.388 (18.612) 95.738 (4.262) 7.79 bicubic 240 0.88
tf_efficientnet_el 80.534 (19.466) 95.190 (4.810) 10.59 bicubic 300 0.904
tf_efficientnet_el *tfp 80.476 (19.524) 95.200 (4.800) 10.59 bicubic 300 N/A
tf_efficientnet_b2_ap *tfp 80.420 (19.580) 95.040 (4.960) 9.11 bicubic 260 N/A
tf_efficientnet_b2_ap 80.306 (19.694) 95.028 (4.972) 9.11 bicubic 260 0.890
tf_efficientnet_b2 *tfp 80.188 (19.812) 94.974 (5.026) 9.11 bicubic 260 N/A
tf_efficientnet_b2 80.086 (19.914) 94.908 (5.092) 9.11 bicubic 260 0.890
tf_efficientnet_lite3 79.812 (20.188) 94.914 (5.086) 8.20 bilinear 300 0.904
tf_efficientnet_lite3 *tfp 79.734 (20.266) 94.838 (5.162) 8.20 bilinear 300 N/A
tf_efficientnet_b1_ap *tfp 79.532 (20.468) 94.378 (5.622) 7.79 bicubic 240 N/A
tf_efficientnet_cc_b1_8e *tfp 79.464 (20.536) 94.492 (5.508) 39.7 bicubic 240 0.88
tf_efficientnet_cc_b1_8e 79.298 (20.702) 94.364 (5.636) 39.7 bicubic 240 0.88
tf_efficientnet_b1_ap 79.278 (20.722) 94.308 (5.692) 7.79 bicubic 240 0.88
tf_efficientnet_b1 *tfp 79.172 (20.828) 94.450 (5.550) 7.79 bicubic 240 N/A
tf_efficientnet_em *tfp 78.958 (21.042) 94.458 (5.542) 6.90 bicubic 240 N/A
tf_efficientnet_b0_ns *tfp 78.806 (21.194) 94.496 (5.504) 5.29 bicubic 224 N/A
tf_mixnet_l *tfp 78.846 (21.154) 94.212 (5.788) 7.33 bilinear 224 N/A
tf_efficientnet_b1 78.826 (21.174) 94.198 (5.802) 7.79 bicubic 240 0.88
tf_mixnet_l 78.770 (21.230) 94.004 (5.996) 7.33 bicubic 224 0.875
tf_efficientnet_em 78.742 (21.258) 94.332 (5.668) 6.90 bicubic 240 0.875
tf_efficientnet_b0_ns 78.658 (21.342) 94.376 (5.624) 5.29 bicubic 224 0.875
tf_efficientnet_cc_b0_8e *tfp 78.314 (21.686) 93.790 (6.210) 24.0 bicubic 224 0.875
tf_efficientnet_cc_b0_8e 77.908 (22.092) 93.656 (6.344) 24.0 bicubic 224 0.875
tf_efficientnet_cc_b0_4e *tfp 77.746 (22.254) 93.552 (6.448) 13.3 bicubic 224 0.875
tf_efficientnet_cc_b0_4e 77.304 (22.696) 93.332 (6.668) 13.3 bicubic 224 0.875
tf_efficientnet_es *tfp 77.616 (22.384) 93.750 (6.250) 5.44 bicubic 224 N/A
tf_efficientnet_lite2 *tfp 77.544 (22.456) 93.800 (6.200) 6.09 bilinear 260 N/A
tf_efficientnet_lite2 77.460 (22.540) 93.746 (6.254) 6.09 bicubic 260 0.89
tf_efficientnet_b0_ap *tfp 77.514 (22.486) 93.576 (6.424) 5.29 bicubic 224 N/A
tf_efficientnet_es 77.264 (22.736) 93.600 (6.400) 5.44 bicubic 224 N/A
tf_efficientnet_b0 *tfp 77.258 (22.742) 93.478 (6.522) 5.29 bicubic 224 N/A
tf_efficientnet_b0_ap 77.084 (22.916) 93.254 (6.746) 5.29 bicubic 224 0.875
tf_mixnet_m *tfp 77.072 (22.928) 93.368 (6.632) 5.01 bilinear 224 N/A
tf_mixnet_m 76.950 (23.050) 93.156 (6.844) 5.01 bicubic 224 0.875
tf_efficientnet_b0 76.848 (23.152) 93.228 (6.772) 5.29 bicubic 224 0.875
tf_efficientnet_lite1 *tfp 76.764 (23.236) 93.326 (6.674) 5.42 bilinear 240 N/A
tf_efficientnet_lite1 76.638 (23.362) 93.232 (6.768) 5.42 bicubic 240 0.882
tf_mixnet_s *tfp 75.800 (24.200) 92.788 (7.212) 4.13 bilinear 224 N/A
tf_mobilenetv3_large_100 *tfp 75.768 (24.232) 92.710 (7.290) 5.48 bilinear 224 N/A
tf_mixnet_s 75.648 (24.352) 92.636 (7.364) 4.13 bicubic 224 0.875
tf_mobilenetv3_large_100 75.516 (24.484) 92.600 (7.400) 5.48 bilinear 224 0.875
tf_efficientnet_lite0 *tfp 75.074 (24.926) 92.314 (7.686) 4.65 bilinear 224 N/A
tf_efficientnet_lite0 74.842 (25.158) 92.170 (7.830) 4.65 bicubic 224 0.875
tf_mobilenetv3_large_075 *tfp 73.730 (26.270) 91.616 (8.384) 3.99 bilinear 224 N/A
tf_mobilenetv3_large_075 73.442 (26.558) 91.352 (8.648) 3.99 bilinear 224 0.875
tf_mobilenetv3_large_minimal_100 *tfp 72.678 (27.322) 90.860 (9.140) 3.92 bilinear 224 N/A
tf_mobilenetv3_large_minimal_100 72.244 (27.756) 90.636 (9.364) 3.92 bilinear 224 0.875
tf_mobilenetv3_small_100 *tfp 67.918 (32.082) 87.958 (12.042) 2.54 bilinear 224 N/A
tf_mobilenetv3_small_100 67.918 (32.082) 87.662 (12.338) 2.54 bilinear 224 0.875
tf_mobilenetv3_small_075 *tfp 66.142 (33.858) 86.498 (13.502) 2.04 bilinear 224 N/A
tf_mobilenetv3_small_075 65.718 (34.282) 86.136 (13.864) 2.04 bilinear 224 0.875
tf_mobilenetv3_small_minimal_100 *tfp 63.378 (36.622) 84.802 (15.198) 2.04 bilinear 224 N/A
tf_mobilenetv3_small_minimal_100 62.898 (37.102) 84.230 (15.770) 2.04 bilinear 224 0.875

*tfp models validated with tf-preprocessing pipeline

Google tf and tflite weights ported from official Tensorflow repositories

Usage

Environment

All development and testing has been done in Conda Python 3 environments on Linux x86-64 systems, specifically Python 3.6.x, 3.7.x, 3.8.x.

Users have reported that a Python 3 Anaconda install in Windows works. I have not verified this myself.

PyTorch versions 1.4, 1.5, 1.6 have been tested with this code.

I've tried to keep the dependencies minimal; the setup is as per the PyTorch default install instructions for Conda:

conda create -n torch-env
conda activate torch-env
conda install -c pytorch pytorch torchvision cudatoolkit=10.2

PyTorch Hub

Models can be accessed via the PyTorch Hub API

>>> torch.hub.list('rwightman/gen-efficientnet-pytorch')
['efficientnet_b0', ...]
>>> model = torch.hub.load('rwightman/gen-efficientnet-pytorch', 'efficientnet_b0', pretrained=True)
>>> model.eval()
>>> output = model(torch.randn(1,3,224,224))
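As a follow-up usage note (standard PyTorch, not specific to this repo), the raw output above is unnormalized logits:

>>> probs = torch.softmax(output, dim=1)        # convert logits to class probabilities
>>> top5_prob, top5_idx = probs.topk(5, dim=1)  # top-5 predicted class indices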

Pip

This package can be installed via pip.

Install (after conda env/install):

pip install geffnet

Eval use:

>>> import geffnet
>>> m = geffnet.create_model('mobilenetv3_large_100', pretrained=True)
>>> m.eval()

Train use:

>>> import geffnet
>>> # models can also be created by using the entrypoint directly
>>> m = geffnet.efficientnet_b2(pretrained=True, drop_rate=0.25, drop_connect_rate=0.2)
>>> m.train()

Create in a nn.Sequential container, for fast.ai, etc:

>>> import geffnet
>>> m = geffnet.mixnet_l(pretrained=True, drop_rate=0.25, drop_connect_rate=0.2, as_sequential=True)
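Since as_sequential=True returns a plain nn.Sequential, it can be sliced for feature extraction; a hedged sketch (the slice index is illustrative and depends on the emitted layer ordering):

>>> import torch
>>> import torch.nn as nn
>>> import geffnet
>>> m = geffnet.mixnet_l(pretrained=True, as_sequential=True)
>>> # Drop the trailing pool/head modules to keep convolutional features only;
>>> # -6 is illustrative, check the actual children of your model.
>>> features = nn.Sequential(*list(m.children())[:-6])
>>> out = features(torch.randn(1, 3, 224, 224))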

Exporting

Scripts are included to:

  • export models to ONNX (onnx_export.py)
  • optimize an ONNX graph (onnx_optimize.py, or onnx_validate.py w/ the --onnx-output-opt arg)
  • validate with ONNX runtime (onnx_validate.py)
  • convert ONNX model to Caffe2 (onnx_to_caffe.py)
  • validate in Caffe2 (caffe2_validate.py)
  • benchmark in Caffe2 w/ FLOPs, parameters output (caffe2_benchmark.py)

As an example, to export the MobileNet-V3 pretrained model and then run an Imagenet validation:

python onnx_export.py --model mobilenetv3_large_100 ./mobilenetv3_100.onnx
python onnx_validate.py /imagenet/validation/ --onnx-input ./mobilenetv3_100.onnx 

These scripts were tested to be working as of PyTorch 1.6 and ONNX 1.7 w/ ONNX runtime 1.4. Caffe2 compatible export now requires additional args mentioned in the export script (not needed in earlier versions).

Export Notes

  1. The TF ported weights with the 'SAME' conv padding activated cannot be exported to ONNX unless the _EXPORTABLE flag in config.py is set to True. Use config.set_exportable(True) as in the onnx_export.py script (a minimal sketch follows this list).
  2. TF ported models with 'SAME' padding will have the padding fixed at export time to the resolution used for export. Even though dynamic padding is supported in opset >= 11, I can't get it working.
  3. ONNX optimize facility doesn't work reliably in PyTorch 1.6 / ONNX 1.7. Fortunately, the onnxruntime based inference is working very well now and includes on the fly optimization.
  4. ONNX / Caffe2 export/import frequently breaks with different PyTorch and ONNX version releases. Please check their respective issue trackers before filing issues here.
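A minimal sketch of an exportable model setup per note 1 above (model name, input size, and opset are illustrative; see onnx_export.py for the script's actual arguments):

import torch
import geffnet

geffnet.config.set_exportable(True)  # enable the ONNX-exportable 'SAME' padding path
model = geffnet.create_model('tf_efficientnet_b0', pretrained=True)
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # padding gets fixed to this resolution (note 2)
torch.onnx.export(model, dummy, 'tf_efficientnet_b0.onnx', opset_version=10)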


gen-efficientnet-pytorch's Issues

Cannot export an ONNX file with PyTorch 1.7.0

I downloaded models via the PyTorch Hub API and exported ONNX files without issue a few weeks ago.
But now I cannot do that.

The geffnet 3edcd4d, which was committed a week ago, uses the SiLU activation with PyTorch 1.7.0, but ONNX opset 12 does not yet support SiLU.
Therefore torch.onnx.export() in PyTorch 1.7.0 fails with the following error.

import torch

model = torch.hub.load(
    'rwightman/gen-efficientnet-pytorch', 'tf_efficientnet_b0',
    pretrained=True, exportable=True, verbose=False)
model.eval()
dummy_input = torch.randn(1,3,224,224)
model(dummy_input)
traced_model = torch.jit.trace(model, dummy_input)
torch_out = torch.onnx.export(
    model, dummy_input, 'tf_efficientnet_b0.onnx',
    export_params=True, verbose=False,opset_version=12)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 230, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 91, in export
    use_external_data_format=use_external_data_format)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 639, in _export
    dynamic_axes=dynamic_axes)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 421, in _model_to_graph
    dynamic_axes=dynamic_axes, input_names=input_names)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 203, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 263, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 930, in _run_symbolic_function
    symbolic_fn = _find_symbolic_in_registry(domain, op_name, opset_version, operator_export_type)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 888, in _find_symbolic_in_registry
    return sym_registry.get_registered_op(op_name, domain, opset_version)
  File "/home/k-tanabe/.local/lib/python3.6/site-packages/torch/onnx/symbolic_registry.py", line 111, in get_registered_op
    raise RuntimeError(msg)
RuntimeError: Exporting the operator silu to ONNX opset version 12 is not supported. Please open a bug to request ONNX export support for the missing operator

You have also updated geffnet/version.py from 1.0.0 to 1.0.1 in 3edcd4d.
Could you tag v1.0.0 on the previous revision e84f554, which supports ONNX file export with PyTorch 1.7.0?
Then I could download v1.0.0 with the following command.

model = torch.hub.load(
    'rwightman/gen-efficientnet-pytorch:v1.0.0', 'tf_efficientnet_b0',
    pretrained=True, exportable=True, verbose=False)
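As a possible interim workaround (my suggestion, not an official fix): swap any nn.SiLU modules for a manually decomposed x * sigmoid(x) before export, which uses only ops that opset 12 supports. This assumes the model's activations are nn.SiLU module instances:

import torch
import torch.nn as nn

class ExportableSiLU(nn.Module):
    """SiLU decomposed into ONNX-supported ops."""
    def forward(self, x):
        return x * torch.sigmoid(x)

def replace_silu(module):
    # Recursively swap nn.SiLU children for the exportable version.
    for name, child in module.named_children():
        if isinstance(child, nn.SiLU):
            setattr(module, name, ExportableSiLU())
        else:
            replace_silu(child)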

customize efficientnet

Hi, I want to use the efficientnet_pytorch.from_name(...) function in gen-efficientnet-pytorch.

from typing import Dict

from efficientnet_pytorch import EfficientNet
import torch
import torch.nn as nn

class LyftModel(nn.Module):
    def __init__(self, cfg: Dict):
        super(LyftModel, self).__init__()

        self.backbone = EfficientNet.from_name("efficientnet-b1")
        num_history_channels = (cfg["model_params"]["history_num_frames"] + 1) * 2
        num_in_channels = 3 + num_history_channels
        num_targets = 2 * cfg["model_params"]["future_num_frames"]
        self.backbone._conv_stem = nn.Conv2d(
            num_in_channels,
            self.backbone._conv_stem.out_channels,
            kernel_size=self.backbone._conv_stem.kernel_size,
            stride=self.backbone._conv_stem.stride,
            padding=self.backbone._conv_stem.padding,
            bias=False,
        )
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fea_bn = nn.BatchNorm1d(1280)
        self.fea_bn.bias.requires_grad_(False)
        self.binary_head = nn.Linear(1280, num_targets)
        self.dropout = nn.Dropout(p=0.2)
 

    def forward(self, x):

        img_feature = self.backbone.extract_features(x)
        img_feature = self.avg_pool(img_feature)
        img_feature = img_feature.view(img_feature.size(0), -1)
        fea = self.fea_bn(img_feature)
        fea = self.dropout(fea)
        output = self.binary_head(fea)

        return output

Thanks in advance.

Exporting trained model to ONNX without geffnet.config.set_exportable(True)?

Hi Ross,

First of all, thank you for the awesome repo, it has been wonderful to use this and pytorch-image-models so far. I just have a couple of questions about ONNX export.

  1. I (mistakenly) trained my EfficientNet model on the models found in pytorch-image-models and then realised that to export it to ONNX I needed to have
    a) used gen-efficientnet-pytorch and
    b) set the geffnet.config.set_exportable(True) beforehand as you have stated in your export script.

Is there any way I could perhaps still transfer the weights from a pytorch-image-models version of EfficientNet to an equivalent gen-efficientnet-pytorch version AND at the same time make it compatible with ONNX?

  2. Is there any chance you could also add ONNX export somehow to the pytorch-image-models versions of EfficientNet? My code base works much more easily with the attributes and methods available for models in that repository compared with the models in geffnet.

  3. Seconding the suggestion in issue #32 and would love to have that option for pytorch-image-models as well.

Thank you

Compound scaling differs from paper

Bringing this issue over from: tensorflow/tpu#738

I am also confused. The starting value for the gamma coefficient (resolution scaling) is 1.15 and the baseline image resolution is 224x224. So it makes sense that 1.15^0 = 1 would be the scaling factor for B0, as 224 x 1 = 224. EfficientNet-B1 would therefore be 1.15^1 = 1.15, making the resolution for B1 224 x 1.15 = 257.6, but in the code the B1 resolution is 240, and B2 is 260. I am also confused by the other coefficients, as the code has:

params_dict = {
    # (width_coefficient, depth_coefficient, resolution, dropout_rate)
    'efficientnet-b0': (1.0, 1.0, 224, 0.2),
    'efficientnet-b1': (1.0, 1.1, 240, 0.2),
    'efficientnet-b2': (1.1, 1.2, 260, 0.3),
    'efficientnet-b3': (1.2, 1.4, 300, 0.3),
    'efficientnet-b4': (1.4, 1.8, 380, 0.4),
    'efficientnet-b5': (1.6, 2.2, 456, 0.4),
    'efficientnet-b6': (1.8, 2.6, 528, 0.5),
    'efficientnet-b7': (2.0, 3.1, 600, 0.5),
    'efficientnet-b8': (2.2, 3.6, 672, 0.5),
    'efficientnet-l2': (4.3, 5.3, 800, 0.5),
}

Why is the β (width) coefficient for B1 = 1.0? Shouldn't it be 1.1^1 = 1.1? And the α (depth) coefficient is 1.1, when the paper has α = 1.2.

Just hoping someone in this community could enlighten me
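For what it's worth, here is the arithmetic the paper's compound scaling rule would give next to the table values; my understanding (not confirmed here) is that the released models use hand-tuned per-variant values rather than the exact formula:

# Paper base coefficients: alpha=1.2 (depth), beta=1.1 (width), gamma=1.15 (resolution),
# each raised to the compound coefficient phi for B-phi.
alpha, beta, gamma = 1.2, 1.1, 1.15
for phi in range(3):  # B0, B1, B2
    print(phi, round(alpha ** phi, 3), round(beta ** phi, 3), round(224 * gamma ** phi, 1))
# phi=1 predicts depth 1.2, width 1.1, resolution ~257.6, while the table ships
# (width=1.0, depth=1.1, resolution=240) for B1 -- hence the confusion above.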

onnx error with different input size

I exported efficientnet_b0 to ONNX and set a 640x640 input size; the following error occurs:
'''
import onnx
import onnx_tensorrt.backend as backend
import numpy as np
model = onnx.load("efficientnet_b0.onnx")
engine = backend.prepare(model, device='CUDA:0')
input_data = np.random.random(size=(1, 3, 640, 640)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)
'''

[Error]
[TensorRT] ERROR: Parameter check failed at: ../builder/Network.cpp::addPoolingNd::500, condition: allDimsGtEq(windowSize, 1) && volume(windowSize) < MAX_KERNEL_DIMS_PRODUCT
Traceback (most recent call last):
File "test_onnx.py", line 7, in
engine = backend.prepare(model, device='CUDA:0')
File "/opt/conda/lib/python3.6/site-packages/onnx_tensorrt-0.1.0-py3.6-linux-x86_64.egg/onnx_tensorrt/backend.py", line 218, in prepare
return TensorRTBackendRep(model, device, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/onnx_tensorrt-0.1.0-py3.6-linux-x86_64.egg/onnx_tensorrt/backend.py", line 94, in init
raise RuntimeError(msg)
RuntimeError: While parsing node number 8:
builtin_op_importers.cpp:1175 In function importGlobalAveragePool:
[8] Assertion failed: layer_ptr

pretrained models for efficientnet_lite0

@rwightman hi, thanks for your work firstly, it's a great repo.
I want to get the pretrained model for gen-efficientnet_lite0, but the url is 'none'. I found a link to 'tf_efficientnet_lite0'; is it converted from a Tensorflow checkpoint? And what does 'v0.1-weights' mean? Can I download it directly and use it normally?

Using pretrained EfficientNet can't achieve the accuracy of original paper?

Thanks for your sharing of so many pretrained models.

I am using your EfficientNet-b0 to train on CIFAR-100 with PyTorch. My best result was only about 60% accuracy. I have tried many hyperparameters and tricks, but it still fails to work well, so I have to ask for your kind help.

Here is my settings:
I used torch.hub to load your pretrained model and changed the out features of the last classifier layer to 100. My initial lr was 0.001, using SGD to optimize and a ReduceLROnPlateau lr scheduler.

It would be great to receive your reply. Have a good day!
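A minimal fine-tuning sketch for this situation, assuming geffnet.create_model accepts num_classes as shown elsewhere on this page (upscaling CIFAR-100 images toward the model's native 224 resolution usually helps; the transform values are illustrative):

import geffnet
from PIL import Image
from torchvision import transforms

# Pretrained backbone with a fresh 100-class head (the 1000-class classifier
# is discarded when num_classes != 1000, per the issue below).
model = geffnet.create_model('efficientnet_b0', pretrained=True, num_classes=100)

train_tf = transforms.Compose([
    transforms.Resize(224, interpolation=Image.BICUBIC),  # upsample 32x32 inputs
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])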

Importing Tensorflow checkpoints

Can you complement the README.md with exact instructions on how to import Tensorflow checkpoints of EfficientNet models? I train EfficientNetB0-B4 models on TPUs now and it would be nice to import into your framework for evaluation. I love pytorch-image-models, it is very handy and I would like to compare the shipped tf_efficientnet_bX models with my new models.

Thanks!

only want to use mobilenetv3 model

I only want to use the MobileNetV3 model. mobilenetv3.py contains only the model definition; there is no model creation or pretrained weight loading. Could you give me some advice?

Pretrained weights failing to load (EffNet-B5)?

Hi @rwightman,
Thanks first for this awesome repo. I'm trying to use your impl to get the AP pre-trained B5, but it's quite clearly failing to load the pretrained weights, with neither an error nor a confirmation that the weights were loaded. Is this a known issue or am I doing something wrong? (edit - ok, I re-read the readme and think I misunderstood: AP being implemented doesn't mean pretrained weights are available... if so, then passing pretrained=True where no weights exist should ideally print a warning or error?)

1 - Installed via pip install geffnet
2 - Import geffnet
3 - model = geffnet.create_model('efficientnet_b5',num_classes = data.c,pretrained=True, drop_rate=0.2, drop_connect_rate=0.2)#, as_sequential=True)
Normally I'm used to seeing a "loading .pth and the progress bar here on a new instance, or a confirmation of weights loaded. I did not see either but no error either.
4 - When you go to train, it becomes abundantly clear that it's working with a newly initialized network (i.e. the first epoch is close to random, then verrry slow training progress). By contrast, a pre-trained one digs right in.

If possible, it would be great to get a confirmation message like in Melas impl once weights are loaded: "Loaded pretrained weights for efficientnet-b5" or if not a warning that it's a new network if pretrained=True was passed in?

Thanks much!
Less

Number of classes changed discards pretrained classifier

Hey @rwightman ,

I tried to change the number of classes for my finetuning project with 'tf_efficientnet_b4_ap' and for some reason, when I change the number of classes, i get the following output:
=> Discarding pretrained classifier since num_classes != 1000
but I think it shouldn't be discarded for num_classes != 1000, because it should be solvable by something as simple as
model = geffnet.create_model(modelname, pretrained=True)
model.classifier = nn.Sequential(nn.Linear(1792, num_classes))
If this cannot be done, could you please explain why?
I would honestly like to run tf_efficientnet instead of regular ones as they do provide better results.

Also, the adv_prop normalization could adapt to other normalizations, right?

Thanks
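For reference, a common pattern to keep all pretrained weights and swap the head afterwards (my sketch; it assumes the geffnet models expose a Linear classifier attribute, as the as_sequential code later on this page suggests):

import geffnet
import torch.nn as nn

model = geffnet.create_model('tf_efficientnet_b4_ap', pretrained=True)  # full 1000-class head loads
num_classes = 10  # illustrative
in_features = model.classifier.in_features  # assumed Linear classifier attribute
model.classifier = nn.Linear(in_features, num_classes)  # replace the head after loading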

Memory required for training EfficientNet b0-l2

Hi there

Is there a recommended amount of GPU memory needed to train EfficientNets b0 to l2? Currently have another implementation of EfficientNet-b8 but running into memory issues on a 24GB GPU. Will I have the same issues with this implementation?

TIA

I think tf_efficientnet was not trained.

I tested tf_efficientnet accuracy on the ImageNet 2012 validation set.

And this is the result:
b0 = 0.699
b1 = 0.717
b2 = 0.721
b3 = 0.724
b4 = 0.717
b5 = 0.619
b6 = 0.694
b7 = 0.523

Why do b4-b7 get lower accuracy than b0-b3?
I think the weights have a problem.
Can you check those weights?

efficientnet pretrained model URL is empty

When I use this code

model = geffnet.create_model(
    'efficientnet_b5',
    num_classes=1000,
    in_chans=3,
    pretrained=True,
    exportable=True)


But an error occurred:

efficientnet b0~b3 => okay

efficientnet b4~b7 => URL is empty.

How can I solve this problem?

Unable to access to _SCRIPTABLE and _EXPORTABLE from torch.hub

Hi @rwightman!

Thank you for your great implementation for efficientnet and its relatives.

I found the global params _SCRIPTABLE and _EXPORTABLE in the config file. However, when I use torch hub to access the model, it is hard for me to access these two variables and successfully export the model. In this case, I'm wondering whether it is possible to set _SCRIPTABLE and _EXPORTABLE as parameters of EfficientNetBuilder?

Best,

error with pytorch 1.3.1

After updating pytorch 1.3.0 -> 1.3.1, I got the following error:

torch.hub.list('rwightman/gen-efficientnet-pytorch')

ImportError: cannot import name 'mobilenetv3_100' from 'geffnet' (/home/mario/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master/geffnet/__init__.py)

Converting tf_efficientnet_lite1 to ONNX

Is there a way around or a solution?

C:\Users\Kedar/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master\geffnet\conv2d_layers.py:39: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)
C:\Users\Kedar/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master\geffnet\conv2d_layers.py:39: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)
C:\Users\Kedar/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master\geffnet\conv2d_layers.py:63: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_h > 0 or pad_w > 0:
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\__init__.py", line 148, in export
    strip_doc_string, dynamic_axes, keep_initializers_as_inputs)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\utils.py", line 66, in export
    dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\utils.py", line 416, in _export
    fixed_batch_size=fixed_batch_size)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\utils.py", line 296, in _model_to_graph
    fixed_batch_size=fixed_batch_size, params_dict=params_dict)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\utils.py", line 135, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\__init__.py", line 179, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\utils.py", line 657, in _run_symbolic_function
    return op_fn(g, *inputs, **attrs)
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\symbolic_helper.py", line 128, in wrapper
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\symbolic_helper.py", line 128, in <listcomp>
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "C:\Users\Kedar\.conda\envs\conv\lib\site-packages\torch\onnx\symbolic_helper.py", line 81, in _parse_arg
    "', since it's not constant, please try to make "
RuntimeError: Failed to export an ONNX attribute 'onnx::Div', since it's not constant, please try to make things (e.g., kernel size) static if possible

Got error when import geffnet using Google Colab

When I import geffnet in Google Colab, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-1dbdac5fd1e4> in <module>()
----> 1 import geffnet

3 frames
/usr/local/lib/python3.6/dist-packages/geffnet/layers.py in versiontuple(v)
      5 
      6 def versiontuple(v):
----> 7     return tuple(map(int, (v.split("."))))[:3]
      8 
      9 

ValueError: invalid literal for int() with base 10: '0+cu100'

I think it is because the torch.__version__ string is '1.3.0+cu100' in Google Colab.
Maybe we can use a regular expression to parse this?

> /usr/local/lib/python3.6/dist-packages/geffnet/layers.py(7)versiontuple()
      5 
      6 def versiontuple(v):
----> 7     return tuple(map(int, (v.split("."))))[:3]
      8 
      9 

ipdb> v
'1.3.0+cu100'
ipdb> v.split(".")
['1', '3', '0+cu100']
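A regex-based fix along the lines suggested above might look like this (my sketch, not the maintainer's actual fix):

import re

def versiontuple(v):
    # Keep only the leading numeric components, ignoring local version
    # suffixes like '+cu100': '1.3.0+cu100' -> (1, 3, 0).
    return tuple(int(x) for x in re.findall(r'\d+', v)[:3])

assert versiontuple('1.3.0+cu100') == (1, 3, 0)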

When I use mmdetection to load mixnet_s I get an error

Traceback (most recent call last):
File "train.py", line 138, in
main()
File "train.py", line 116, in main
cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/builder.py", line 43, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/builder.py", line 15, in build
return build_from_cfg(cfg, registry, default_args)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/utils/registry.py", line 76, in build_from_cfg
return obj_cls(**args)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/detectors/retinanet.py", line 16, in init
test_cfg, pretrained)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/detectors/single_stage.py", line 31, in init
self.init_weights(pretrained=pretrained)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/detectors/single_stage.py", line 35, in init_weights
self.backbone.init_weights(pretrained=pretrained)
File "/home/hs/hao/contextnet/mmdetection-master/mmdet/models/backbones/mixnet.py", line 837, in init_weights
load_checkpoint(self, pretrained, strict=False, logger=logger)
File "/home/hs/anaconda3/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 181, in load_checkpoint
'No state_dict found in checkpoint file {}'.format(filename))
RuntimeError: No state_dict found in checkpoint file /home/hs/hao/contextnet/mmdetection-master/weights/mixnet/mixnet_m-4647fc68.pth

Export to ONNX Error

Hi, thanks for the great work. Following my question here, I tried to convert to ONNX using this repo, but I got several errors.

Running this command just like your example, I got a segmentation fault error:

sudo python3 onnx_export.py --model mobilenetv3_100 ./mobilenetv3_100.onnx
==> Creating PyTorch mobilenetv3_100 model
==> Exporting model to ONNX format at './mobilenetv3_100.onnx'
==> Loading and checking exported model from './mobilenetv3_100.onnx'
Segmentation fault

When I tried with efficientnet_b0, with and without a checkpoint:
sudo python3 onnx_export.py --model efficientnet_b0 ./efficientnet.onnx
or
sudo python3 onnx_export.py --model efficientnet_b0 --checkpoint ../train/model_best.pth.tar --num-classes 30 ./efficientnet.onnx

I got a "Couldn't export Python operator SwishAutoFn" error.

==> Creating PyTorch efficientnet_b0 model
=> Loading checkpoint '../train/20191015-Deepeye36k-efficientnet_b0-224/model_best.pth.tar'
=> Loaded checkpoint '../train/20191015-Deepeye36k-efficientnet_b0-224/model_best.pth.tar'
==> Exporting model to ONNX format at './efficientnet.onnx'
Traceback (most recent call last):
  File "onnx_export.py", line 75, in <module>
    main()
  File "onnx_export.py", line 59, in main
    input_names=input_names, output_names=output_names)
  File "/home/ivan/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 26, in _export
    result = utils._export(*args, **kwargs)
  File "/home/ivan/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 394, in _export
    operator_export_type, strip_doc_string, val_keep_init_as_ip)
RuntimeError: ONNX export failed: Couldn't export Python operator SwishAutoFn

Any help would be appreciated, thanks

Segmentation fault (core dumped)

When running python3 onnx_export.py --model tf_efficientnet_b6_ns ./tf_efficientnet_b6_ns.onnx, I got this error:
==> Loading and checking exported model from './tf_efficientnet_b6_ns.onnx'
Segmentation fault (core dumped)
How can I fix this?

torch.onnx.export failed

Hi @rwightman

Good morning. I am impressed by your work.

I tried to export your model to ONNX as follows:

x = torch.rand(1, 3, 352, 1216)
torch.onnx.export(net, x, "./sqnet_3566.onnx")
print("export onnx successfully!")

But I encountered the following error message:

Using cache found in /home/paul/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master
/home/paul/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master/geffnet/conv2d_layers.py:47: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return max((-(i // -s) - 1) * s + (k - 1) * d + 1 - i, 0)
/home/paul/pytorch/AdaBins/models/layers.py:19: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  embeddings = embeddings + self.positional_encodings[:embeddings.shape[2], :].T.unsqueeze(0)
Traceback (most recent call last):
  File "rk3566Test.py", line 273, in <module>
    export_pytorch_model()
  File "rk3566Test.py", line 55, in export_pytorch_model
    torch.onnx.export(net, x, "./sqnet_3566.onnx")
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/__init__.py", line 208, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/utils.py", line 92, in export
    use_external_data_format=use_external_data_format)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/utils.py", line 530, in _export
    fixed_batch_size=fixed_batch_size)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/utils.py", line 384, in _model_to_graph
    fixed_batch_size=fixed_batch_size, params_dict=params_dict)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/utils.py", line 188, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/__init__.py", line 241, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/utils.py", line 791, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 128, in wrapper
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 128, in <listcomp>
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "/home/paul/rknn2/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 81, in _parse_arg
    "', since it's not constant, please try to make "
RuntimeError: Failed to export an ONNX attribute 'onnx::Cast', since it's not constant, please try to make things (e.g., kernel size) static if possible

I did change the _EXPORTABLE flag to True in config.py as follows:

# Set to True if exporting a model with Same padding via ONNX
_EXPORTABLE = True

Is this the right way to do it, or did I miss something else? Could you please help? Thanks a lot.

some questions about transfer learning with this method

from geffnet.mobilenetv3 import mobilenetv3_rw, mobilenetv3_small_075
net = mobilenetv3_small_075()

######################## modify ###################
# self.global_pool = nn.AdaptiveAvgPool2d(1)
# self.conv_head = select_conv2d(in_chs, num_features, 1, padding=pad_type, bias=head_bias)
# self.act2 = act_layer(inplace=True)
# self.classifier = nn.Linear(num_features, num_classes)

def as_sequential(self):
    layers = [self.conv_stem, self.bn1, self.act1]
    layers.extend(self.blocks)
    # layers.extend([
    #     self.global_pool, self.conv_head, self.act2,
    #     nn.Flatten(), nn.Dropout(self.drop_rate), self.classifier])
    return nn.Sequential(*layers)

def features(self, x):
    x = self.conv_stem(x)
    x = self.bn1(x)
    x = self.act1(x)
    x = self.blocks(x)
    # x = self.global_pool(x)
    # x = self.conv_head(x)
    # x = self.act2(x)
    return x

def forward(self, x):
    x = self.features(x)
    # x = x.flatten(1)
    if self.drop_rate > 0.:
        x = F.dropout(x, p=self.drop_rate, training=self.training)
    return x

I'm not sure if this model modification method is correct.
For complex models, I'm falling into a quagmire, and I hope to get some suggestions.

Apply correct transforms to data

Hi Ross,

Thanks for making this easily accessible repository !

I apologize in advance for opening an issue to ask about usage, but it'll be a while before geffnet has its own tag on Stack Overflow 😊

I am interested in using pretrained models with a new dataset. The problem is that this new dataset is not yet in a good shape: images have different sizes, so when I try to load an ImageDataBunch I get warnings about this. I could apply size parameter, but I wanted to ensure that all the transformations are consistent with the model.

Is there a way to get the transforms needed for a model? I have seen the data folder, together with create_loader, but it is not clear how to use this with ImageDataBunch.

Any advice would be much appreciated !

MobileNetV3 -- Pretrained Model

Hey @rwightman, I need to use the MobileNetV3-large pretrained model for my backbone but couldn't find the pretrained model in your repo! Is tf_mobilenetv3_large_100 still available for download?

Thanks!

What is the simplest way to convert trained Pytorch-weights of EfficientNet-Lite model back to Tensorflow?

@rwightman Hi,

What is the simplest way to convert trained Pytorch-weights of EfficientNet-Lite model back to Tensorflow?
If I want to train EfficientNet-Lite model in Pytorch and then use it in TensorFlow (pb/tflite)?

I have done it by using https://github.com/onnx/onnx-tensorflow but it adds extra transpose layers and use multiple conv2d-layers instead of one GroupedConv2D, so Pytorch->ONNX->TensorFlow-model is 10x times slower than the native TensorFlow model: onnx/onnx-tensorflow#782

ncnn int8

Hi, I successfully got an ncnn int8 model of efficientnet_b0 by following your code:
pytorch --> onnx --> ncnn --> ncnn int8

However, when I do it with the tf_mobilenetv3_small_075 model, I get an error in the last step:
./ncnn2table --param tf_mobilenetv3_small_075-sim.param --bin tf_mobilenetv3_small_075-sim.bin --images imgs/ --output tf_mobilenetv3_small_075-sim.table --mean 127.5,127.5,127.5 --norm 0.0078125,0.0078125,0.0078125 --size 224,224 --thread 2 (which succeed in efficientnet_b0 model)
--- ncnn post training quantization tool --- 07:49:29 Nov 7 2019
param = 'tf_mobilenetv3_small_075-sim.param'
bin = 'tf_mobilenetv3_small_075-sim.bin'
images = '/nfs/project/lijian_i/onnx-simplifier-master/imgs/'
output = 'tf_mobilenetv3_small_075-sim.table'
mean = '127.5,127.5,127.5'
norm = '0.0078125,0.0078125,0.0078125'
size = '224,224'
thread = '2'
====> Quantize the parameters.
====> Quantize the activation.
====> step 1 : find the max value.
[1] 2429 segmentation fault (core dumped) /home/luban/ncnn-master/build/tools/quantize/ncnn2table --param --bin

I was wondering whether the input size changed in the TF ported weights, or if there is another issue?
Thanks in advance!

efficientnet_b0 acc1 = 0.65032

I load efficientnet_b0 with model = torch.hub.load('rwightman/gen-efficientnet-pytorch', 'efficientnet_b0', pretrained=True),
then convert the model to ONNX. The eval result is top1_acc=0.65032, top5_acc=0.8718.
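A drop like this is often a preprocessing mismatch; per the pretrained table above, efficientnet_b0 was validated with bicubic interpolation, 224 images, and a 0.875 crop. A sketch of matching eval preprocessing (my assumption about the cause, not a confirmed diagnosis):

from PIL import Image
from torchvision import transforms

img_size, crop_pct = 224, 0.875
eval_tf = transforms.Compose([
    transforms.Resize(int(img_size / crop_pct), interpolation=Image.BICUBIC),  # 256
    transforms.CenterCrop(img_size),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])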

tf_efficientnet_lite ONNX export

Hi again,

As per my previous issue, efficientnet_b0 weights exported to ONNX perfectly. But I went to try tf_efficientnet_lite2 today and encountered a similar issue to #34. However, the thing confusing me is that it works on Colab and not on my local machine, in a virtual environment with the exact same torch 1.5.0+cu101 package installed via pip (as opposed to conda, for the sake of comparison) and the exact same code. Could you please point me in the right direction to fix this problem? Thank you!

Here is the Colab notebook for reference: https://colab.research.google.com/drive/1jG7mSPitb-acA7EgMFaMsG3u57c51Bbe

And this what my local machine is telling me each time I run the exact same code.

(venv) andrewl@albona:/albona/nobackup/andrewl/test$ python3 test.py 
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/geffnet/conv2d_layers.py:39: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)
/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/geffnet/conv2d_layers.py:39: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)
Traceback (most recent call last):
  File "test.py", line 23, in <module>
    output_names=['output'],
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/__init__.py", line 168, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/utils.py", line 69, in export
    use_external_data_format=use_external_data_format)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/utils.py", line 488, in _export
    fixed_batch_size=fixed_batch_size)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/utils.py", line 351, in _model_to_graph
    fixed_batch_size=fixed_batch_size, params_dict=params_dict)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/utils.py", line 154, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/__init__.py", line 199, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/utils.py", line 740, in _run_symbolic_function
    return op_fn(g, *inputs, **attrs)
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 128, in wrapper
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 128, in <listcomp>
    args = [_parse_arg(arg, arg_desc) for arg, arg_desc in zip(args, arg_descriptors)]
  File "/nb/andrewl/anaconda3/envs/venv/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 81, in _parse_arg
    "', since it's not constant, please try to make "
RuntimeError: Failed to export an ONNX attribute 'onnx::Cast', since it's not constant, please try to make things (e.g., kernel size) static if possible

I've also collected my system information for you here:

Collecting environment information...
PyTorch version: 1.5.0+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Debian GNU/Linux 10 (buster)
GCC version: (Debian 8.3.0-6) 8.3.0
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.2.148
GPU models and configuration: 
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 418.74
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.16.2
[conda] numpy                     1.18.3                   pypi_0    pypi
[conda] torch                     1.5.0+cu101              pypi_0    pypi
[conda] torchvision               0.6.0+cu101              pypi_0    pypi

And the information from the Colab notebook:

Collecting environment information...
PyTorch version: 1.5.0+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.12.0

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: Tesla P100-PCIE-16GB
Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.18.3
[pip3] torch==1.5.0+cu101
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.3.1
[pip3] torchvision==0.6.0+cu101
[conda] Could not collect

Unable to create models with current pip release (0.9.8)

Model creation fails with the current PyPI release of this library (0.9.8).

When installed from source:

pip install git+https://github.com/rwightman/gen-efficientnet-pytorch.git

There isn't any error and the model is instantiated properly. It seems some fixes have not yet been published to PyPI.

Mish inplace question

Hi, thanks for the great repo! What's the reason that the x.mul_(inner) line is commented out? Does the commented line return different results? Thanks!

def mish(x, inplace: bool = False):
    """Mish: A Self Regularized Non-Monotonic Neural Activation Function - https://arxiv.org/abs/1908.08681
    """
    return x.mul(F.softplus(x).tanh())
    # In-place mul_ would overwrite x, which autograd still needs for the
    # softplus/tanh backward pass:
    # return x.mul_(inner) if inplace else x.mul(inner)  # unexpected inplace issue with this

torch2onnx Segmentation fault (core dumped)

Hi,
I want to convert efficientnet_b0.pth to efficientnet_b0.onnx. When I run python onnx_export.py --model=efficientnet_b0 res.onnx --checkpoint=../efficientnet_b0.pth:

==> Creating PyTorch efficientnet_b0 model
=> Loading checkpoint '../efficientnet_b0.pth'
=> Loaded checkpoint '../efficientnet_b0.pth'
==> Exporting model to ONNX format at 'res.onnx'
==> Loading and checking exported model from 'res.onnx'
Segmentation fault (core dumped)

What should I do?

[bug] cannot set num_features

When I try:
model = torch.hub.load('rwightman/gen-efficientnet-pytorch', 'efficientnet_b0', in_chans=1, num_features=16)

or

model = torch.hub.load('rwightman/gen-efficientnet-pytorch', 'efficientnet_b0', in_chans=1, num_features=16, pretrained=False)
I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\t-maserr\AppData\Local\Continuum\anaconda3\envs\python37\lib\site-packages\torch\hub.py", line 359, in load
    model = entry(*args, **kwargs)
  File "C:\Users\t-maserr/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master\geffnet\gen_efficientnet.py", line 636, in efficientnet_b0
    'efficientnet_b0', channel_multiplier=1.0, depth_multiplier=1.0, pretrained=pretrained, **kwargs)
  File "C:\Users\t-maserr/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master\geffnet\gen_efficientnet.py", line 430, in _gen_efficientnet
    **kwargs,
TypeError: type object got multiple values for keyword argument 'num_features'

torch.jit.script(model) crashed

torch.hub.list('rwightman/gen-efficientnet-pytorch')
model = torch.hub.load('rwightman/gen-efficientnet-pytorch', 'efficientnet_b0', pretrained=True, num_classes=3)
scriptedmodel = torch.jit.script(model)

got:

=> Discarding pretrained classifier since num_classes != 1000
Traceback (most recent call last):
  File "./scripts/update_model.py", line 122, in <module>
    scriptedmodel = torch.jit.script(model)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/__init__.py", line 1516, in script
    return torch.jit._recursive.create_script_module(obj, torch.jit._recursive.infer_methods_to_compile)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/_recursive.py", line 318, in create_script_module
    return create_script_module_impl(nn_module, concrete_type, stubs_fn)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/_recursive.py", line 372, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/__init__.py", line 1900, in _construct
    init_fn(script_module)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/_recursive.py", line 353, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, infer_methods_to_compile)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/_recursive.py", line 376, in create_script_module_impl
    create_methods_from_stubs(concrete_type, stubs)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/_recursive.py", line 292, in create_methods_from_stubs
    concrete_type._create_methods(defs, rcbs, defaults)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/annotations.py", line 290, in try_ann_to_type
    torch.jit._recursive_compile_class(ann, loc)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/__init__.py", line 1359, in _recursive_compile_class
    _compile_and_register_class(obj, rcb, _qual_name)
  File "/home/security/.local/lib/python3.8/site-packages/torch/jit/__init__.py", line 1363, in _compile_and_register_class
    _jit_script_class_compile(qualified_name, ast, rcb)
RuntimeError: 
Tried to access nonexistent attribute or method 'saved_tensors' of type '__torch__.geffnet.activations.activations_me.SwishJitAutoFn'. Did you forget to initialize an attribute in __init__()?:
  File "/home/security/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master/geffnet/activations/activations_me.py", line 50
    @staticmethod
    def backward(ctx, grad_output):
        x = ctx.saved_tensors[0]
            ~~~~~~~~~~~~~~~~~ <--- HERE
        return swish_jit_bwd(x, grad_output)
'SwishJitAutoFn.backward' is being compiled since it was called from '__torch__.geffnet.activations.activations_me.SwishJitAutoFn'
  File "/home/security/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master/geffnet/activations/activations_me.py", line 63
    def forward(self, x):
        return SwishJitAutoFn.apply(x)
               ~~~~~~~~~~~~~~ <--- HERE
'__torch__.geffnet.activations.activations_me.SwishJitAutoFn' is being compiled since it was called from 'SwishMe.forward'
  File "/home/security/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master/geffnet/activations/activations_me.py", line 63
    def forward(self, x):
        return SwishJitAutoFn.apply(x)
               ~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
