deep_learning-sgn's Introduction

Non-Gaussian Behaviour of Stochastic Gradient Noise in Deep Learning

Candidate Number: XXXXXXX

In recent years, there has been growing interest in Stochastic Gradient Descent (SGD) and its variants (such as AdaDelta and Adam) in machine learning, mainly due to their computational efficiency. It is often assumed, by invoking the classical Central Limit Theorem, that gradient noise follows a Gaussian distribution on large datasets. However, the results in my report (which will be published here shortly) show that this is far from true: stochastic gradient noise (SGN) follows an alpha-stable distribution, a family of heavy-tailed distributions in which alpha is the tail index. To validate this, we build two models from scratch using only numpy for vector operations; keras is used only to import the MNIST and Fashion-MNIST datasets. The models address two questions:

  • Does the choice of activation function have a big effect on the distribution of SGN? For this we run the tests using relu and sigmoid; the implementation can be found in model_epoch_vs_alpha.py. I have run a test in the file test_epoch_alpha_relu.py, and the resulting graphs can be seen in the folder mnist_activation. Please adjust this file to change the activation function and dataset. All the documentation is provided in the docstrings in the models. I have also attached the Jupyter notebook epoch_alpha.ipynb which, although outdated and not well documented, shows my progress and also contains plots.

  • Does the choice of learning rate affect the distribution of SGN? For this we run the tests using relu and sigmoid and vary the learning rate from 0.001 to 0.1 in user-defined increments. The implementation can be found in model_lr_vs_alpha.py. I have run a test in the file test_lr_alpha.py, and the resulting graphs can be seen in the folder mnist_lr.

My report is heavily based on the research paper http://proceedings.mlr.press/v97/simsekli19a/simsekli19a.pdf, which you may find useful for understanding what I have done in this project and why.
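To give a feel for what "estimating the tail index alpha" means in practice, here is a minimal numpy sketch of a block-sum, log-moment estimator of the kind described in the paper linked above. It is not taken from model_epoch_vs_alpha.py or model_lr_vs_alpha.py: the function name `estimate_alpha` and the parameter `block_size` are hypothetical, and the sketch assumes the samples are centred, symmetric and contain no exact zeros. The idea is that summing `block_size` alpha-stable samples scales them by `block_size ** (1 / alpha)`, so comparing the average log-magnitude before and after summing recovers `1 / alpha`.

```python
import numpy as np


def estimate_alpha(samples, block_size):
    """Estimate the tail index alpha from centred, symmetric samples.

    Hypothetical helper for illustration; assumes no sample is exactly zero,
    since log(0) would make the estimate meaningless.
    """
    x = np.asarray(samples, dtype=float).ravel()
    num_blocks = x.size // block_size
    if block_size < 2 or num_blocks < 1:
        raise ValueError("need block_size >= 2 and at least one full block")

    # Drop the incomplete trailing block so the samples split evenly.
    x = x[: num_blocks * block_size]

    # Sum the samples inside each block of length block_size.
    block_sums = x.reshape(num_blocks, block_size).sum(axis=1)

    # For alpha-stable data the block sums are block_size**(1/alpha) times
    # larger in scale, so the gap between the mean log-magnitudes,
    # divided by log(block_size), estimates 1/alpha.
    inv_alpha = (
        np.mean(np.log(np.abs(block_sums))) - np.mean(np.log(np.abs(x)))
    ) / np.log(block_size)
    return 1.0 / inv_alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gaussian = rng.standard_normal(100_000)  # Gaussian noise: alpha = 2
    cauchy = rng.standard_cauchy(100_000)    # Cauchy noise: alpha = 1
    print("Gaussian estimate:", estimate_alpha(gaussian, block_size=100))
    print("Cauchy estimate:  ", estimate_alpha(cauchy, block_size=100))
```

On the synthetic data above the two estimates should come out close to 2 and 1 respectively. In the setting of this project, `samples` would be the flattened difference between a mini-batch gradient and the full-batch gradient; an estimate well below 2 would point to heavy tails.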

Installation:

Please feel free to clone this repository and play around with the code. I have tried to keep the documentation in the docstrings of each function as understandable as possible.

Hope you like it. Enjoy!!!!!!

deep_learning-sgn's People

Contributors

anmolaggarwal98
