Large Margin In Softmax Cross-Entropy Loss

This is the PyTorch implementation of the BMVC 2019 paper "Large Margin In Softmax Cross-Entropy Loss" by Takumi Kobayashi.

Citation

If you find our project useful in your research, please cite it as follows:

@inproceedings{kobayashi2019bmvc,
  title={Large Margin In Softmax Cross-Entropy Loss},
  author={Takumi Kobayashi},
  booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
  year={2019}
}

Contents

  1. Introduction
  2. Usage
  3. Results

Introduction

The proposed method acts as a regularizer on the standard softmax cross-entropy loss to promote large-margin networks. The large margin can therefore be embedded into neural networks, such as CNNs, simply by adding the proposed regularization without touching other components; the same training procedure (optimizer, learning rate and training schedule) can be used. For more details, please refer to our paper.

Figure: Comparison of large-margin losses

Usage

Dependencies

Training

The softmax loss with the large-margin regularization can be incorporated simply by

from models.modules.myloss import LargeMarginInSoftmaxLoss
criterion = LargeMarginInSoftmaxLoss(reg_lambda=0.3)

where reg_lambda indicates the regularization parameter.
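How a regularization weight such as reg_lambda enters the total loss can be sketched in plain Python, without PyTorch. This is only an illustration under stated assumptions: the function names below are hypothetical, and the penalty (half the symmetric KL divergence between the softmax over the non-target logits and the uniform distribution) is an assumed stand-in. The actual regularizer is the one defined in the paper and in models/modules/myloss.py.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def cross_entropy(logits, target):
    """Standard softmax cross-entropy for a single sample."""
    return -log_softmax(logits)[target]


def nontarget_uniformity_penalty(logits, target):
    """Hypothetical regularizer (an assumption, not taken from this README).

    Half the symmetric KL divergence between the softmax over the
    non-target logits and the uniform distribution. It is zero when
    all non-target logits are equal and positive otherwise.
    """
    others = [v for i, v in enumerate(logits) if i != target]
    log_p = log_softmax(others)
    u = 1.0 / len(others)
    return 0.5 * sum((math.exp(lp) - u) * lp for lp in log_p)


def regularized_loss(logits, target, reg_lambda=0.3):
    """Cross-entropy plus reg_lambda times the (assumed) penalty."""
    penalty = nontarget_uniformity_penalty(logits, target)
    return cross_entropy(logits, target) + reg_lambda * penalty


if __name__ == "__main__":
    logits = [2.0, 1.0, -1.0]  # made-up logits for three classes
    print(cross_entropy(logits, 0))
    print(regularized_loss(logits, 0, reg_lambda=0.3))
```

With reg_lambda=0 the sketch reduces to the plain cross-entropy, which mirrors the README's point that the regularization is added on top of the standard loss while everything else in training stays the same.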

For example, the 13-layer network is trained on Cifar-10 with the following command:

CUDA_VISIBLE_DEVICES=0 python cifar_train.py  --dataset cifar10  --data ./datasets/ --arch layer13  --config-name layer13_largemargin  --out-dir ./result/cifar10/layer13/LargeMarginInSoftmax/

The VGG-16 mod network [1] is trained on ImageNet with

CUDA_VISIBLE_DEVICES=0,1,2,3 python imagenet_train.py  --dataset imagenet  --data ./datasets/imagenet12/images/  --arch vgg16bow_bn  --config-name imagenet_largemargin  --out-dir ./result/imagenet/vgg16bow_bn/LargeMarginInSoftmax/  --dist-url 'tcp://127.0.0.1:8080'  --dist-backend 'nccl'  --multiprocessing-distributed  --world-size 1  --rank 0

Note that the ImageNet dataset must be downloaded to ./datasets/imagenet12/ before training.

Results

These results differ from those reported in the paper because the methods in the paper were implemented in MatConvNet and trained with a (slightly) different training procedure.

Cifar-10

| Network | Loss | Top-1 Err. (%) |
|---|---|---|
| 13-Layer | SoftMax | 8.45 (±0.27) |
| 13-Layer | SoftMax with Large-Margin | 7.81 (±0.20) |

Cifar-100

| Network | Loss | Top-1 Err. (%) |
|---|---|---|
| 13-Layer | SoftMax | 29.42 (±0.19) |
| 13-Layer | SoftMax with Large-Margin | 27.61 (±0.09) |

ImageNet

| Network | Loss | Top-1 Err. (%) |
|---|---|---|
| VGG-16 mod [1] | SoftMax | 22.99 |
| VGG-16 mod [1] | SoftMax with Large-Margin | 22.09 |
| VGG-16 [2] | SoftMax | 25.04 |
| VGG-16 [2] | SoftMax with Large-Margin | 24.08 |
| ResNet-50 [3] | SoftMax | 23.45 |
| ResNet-50 [3] | SoftMax with Large-Margin | 23.28 |
| ResNeXt-50 [4] | SoftMax | 22.42 |
| ResNeXt-50 [4] | SoftMax with Large-Margin | 22.27 |
| DenseNet-169 [5] | SoftMax | 23.03 |
| DenseNet-169 [5] | SoftMax with Large-Margin | 22.70 |

References

[1] T. Kobayashi. "Analyzing Filters Toward Efficient ConvNets." In CVPR, pages 5619-5628, 2018.

[2] K. Simonyan and A. Zisserman. "Very Deep Convolutional Networks For Large-Scale Image Recognition." CoRR, abs/1409.1556, 2014.

[3] K. He, X. Zhang, S. Ren, and J. Sun. "Deep Residual Learning For Image Recognition." In CVPR, pages 770-778, 2016.

[4] S. Xie, R. Girshick, P. Dollar, Z. Tu, and K. He. "Aggregated Residual Transformations For Deep Neural Networks." In CVPR, pages 5987-5995, 2017.

[5] G. Huang, Z. Liu, L. van der Maaten, and K.Q. Weinberger. "Densely Connected Convolutional Networks." In CVPR, pages 2261-2269, 2017.
