ml-glossary's Introduction

Machine Learning Glossary

Looking for fellow maintainers!

Apologies for my non-responsiveness. :( I've been heads down at Cruise, building ML infra for self-driving cars, and haven't reviewed this repo in forever. Looks like we're getting 54k monthly active users now and I think the repo deserves more attention. Let me know if you would be interested in joining as a maintainer with privileges to merge PRs.

View The Glossary

How To Contribute

  1. Clone Repo
git clone https://github.com/bfortuner/ml-glossary.git
  2. Install Dependencies
# Assumes you have the usual suspects installed: numpy, scipy, etc.
pip install sphinx sphinx-autobuild
pip install sphinx_rtd_theme
pip install recommonmark

For Python 3.x installations, use:

pip3 install sphinx sphinx-autobuild
pip3 install sphinx_rtd_theme
pip3 install recommonmark
  3. Preview Changes

If you are using make:

cd ml-glossary
cd docs
make html

On Windows:

cd ml-glossary
cd docs
build.bat html
  4. Verify your changes by opening the index.html file in _build/

  5. Submit Pull Request

Short for time?

Feel free to raise an issue to correct errors or contribute content without a pull request.

Style Guide

Each entry in the glossary MUST include the following at a minimum:

  1. Concise explanation - as short as possible, but no shorter
  2. Citations - Papers, Tutorials, etc.

Excellent entries will also include:

  1. Visuals - diagrams, charts, animations, images
  2. Code - python/numpy snippets, classes, or functions
  3. Equations - Formatted with Latex

The goal of the glossary is to present content in the most accessible way possible, with a heavy emphasis on visuals and interactive diagrams. That said, in the spirit of rapid prototyping, it's okay to submit a "rough draft" without visuals or code. We expect other readers will enhance your submission over time.

Why RST and not Markdown?

RST has more features. For large and complex documentation projects, it's the logical choice.

Top Contributors

We're big fans of Distill and we like their idea of offering prizes for high-quality submissions. We don't have as much money as they do, but we'd still like to reward contributors in some way for contributing to the glossary. For instance, a cheatsheet cryptocurrency where tokens equal commits ;). Let us know if you have better ideas. In the end, this is an open-source project and we hope contributing to a repository of concise, accessible machine learning knowledge is enough incentive on its own!

Tips and Tricks

Resources

ml-glossary's People

Contributors

aakashjhawar, adrienlemaire, andrewdalpino, andybergon, ashok93, ayushsenapati, backtrack-5, baelfirenightshd, bfortuner, bkowshik, chemformalixer, computervisionpro, davneet4u, devksingh4, girishkuniyal, gwydionjon, imavijit, ivanistheone, joelgenter, ketulgupta1995, meldo8, nwanna-joseph, pablovela5620, parthvadhadiya, prashplus, rsettlage, sbalan7, theandyb, xiaoye-hua, yelircaasi

ml-glossary's Issues

Optimizers should be in reverse order

Currently the most complex optimizer is at the top of the page and the simplest is at the bottom. Given that the more complex optimizers are evolutions of the simple ones, it would make more sense to have the simplest first.

MSE loss

Given that the loss functions calculate the loss of a batch (which is common),

def MSE(yHat, y):
    return np.sum((yHat - y)**2) / 2.0

should be:

def MSE(yHat, y):
    return np.sum((yHat - y)**2) / y.size 

and for float values:

def MSE(yHat, y):
    return np.sum((yHat.astype(float) - y)**2) / y.size
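The proposed fix can also be written with np.mean, which divides by the number of elements for you. This is only a sketch of the suggestion above, not the glossary's current code; the name mse and the up-front conversion of both arguments to float arrays are assumptions made for illustration.

```python
import numpy as np

def mse(y_hat, y):
    """Mean squared error averaged over every element of the batch."""
    # Assumption: cast both inputs to float so integer inputs behave as expected.
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((y_hat - y) ** 2)

# Squared errors are 0, 0 and 4, averaged over 3 elements.
batch_loss = mse([1, 2, 3], [1, 2, 5])
```

Dividing by the element count keeps the loss comparable across batch sizes, whereas dividing by 2.0 makes it grow with the batch.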

The cost function code returns opposite signs

#Take the error when label=1
class1_cost = -labels*np.log(predictions)

#Take the error when label=0
class2_cost = (1-labels)*np.log(1-predictions)

#Take the sum of both costs
cost = class1_cost + class2_cost

In this code, it seems like class1 returns a positive cost and class2 returns a negative cost. Wouldn't they cancel when added?
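A small numeric check makes the sign question concrete. The labels and predictions below are made-up values, and the check looks only at the three lines quoted above; whether the surrounding function applies another sign later is not shown in this issue. In the conventional binary cross-entropy, both terms carry a minus sign.

```python
import numpy as np

# Hypothetical data: one positive example and one negative example.
labels = np.array([1.0, 0.0])
predictions = np.array([0.9, 0.2])

class1_cost = -labels * np.log(predictions)           # non-negative
class2_cost = (1 - labels) * np.log(1 - predictions)  # non-positive as quoted

as_quoted = class1_cost + class2_cost     # mixes a positive and a negative cost
conventional = class1_cost - class2_cost  # both terms negated: every entry >= 0
```

With these values the second entry of as_quoted is negative, while every entry of conventional is non-negative.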

Example for epsilon incorrect?

On the page ml-glossary/docs/math_notation.rst, the example for $$\epsilon$$ is given as the learning rate. But the "e" in a learning rate denotes the exponent (10^-4).
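For readers unfamiliar with the notation, here is a quick check that the "e" in a literal such as 1e-4 is scientific notation for a power of ten. The variable name learning_rate is only illustrative.

```python
import math

learning_rate = 1e-4  # scientific notation: 1 x 10^-4

# The literal and the explicit power of ten agree up to floating-point rounding.
same_value = math.isclose(learning_rate, 10 ** -4)
```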

Tools for generating figures

Hey, I am currently working on the "Activation Functions" section. Since I want to keep the figures of the functions consistent, could you note the tools that you use to generate them? 😄 @bfortuner

Cross Entropy "code" example seems like the arguments are reversed

In the Loss-Functions > Cross Entropy "code" example the arguments seem reversed.

Location: http://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html#cross-entropy

It currently reads:

import numpy as np

def CrossEntropy(yHat, y):
    if yHat == 1:
        return -np.log(y)
    else:
        return -np.log(1 - y)
    
print 'true: ', CrossEntropy(.1, 1)
print 'false:', CrossEntropy(.8, 1)

# true:  inf
# false: inf

It should read?

def CrossEntropy2(yHat, y):
    if y == 1:
        return -np.log(yHat)
    else:
        return -np.log(1 - yHat)
    
print 'true: ', CrossEntropy2(.1, 1)
print 'false:', CrossEntropy2(.8, 1)

# true:  2.3025850929940455
# false: 0.2231435513142097
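As a complement to the corrected scalar version, a vectorized sketch can handle a whole batch at once and avoid the inf values by clipping predictions away from 0 and 1. The function name binary_cross_entropy and the eps parameter are assumptions for illustration, not part of the glossary.

```python
import numpy as np

def binary_cross_entropy(y_hat, y, eps=1e-12):
    """Element-wise binary cross-entropy; y_hat is the prediction, y the label."""
    # Assumption: clip predictions so that log(0) never occurs.
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)
    y = np.asarray(y, dtype=float)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Same inputs as the example above: predictions 0.1 and 0.8 with label 1.
losses = binary_cross_entropy([0.1, 0.8], [1, 1])
```

For these inputs the two losses match the values printed by CrossEntropy2 above.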

Feature request: add data requirements section for algorithms

It would be really useful if the descriptions for linear regression, logistic regression and the other algorithms had a section that described data requirements/expectations. For example, from what I understand, least squares estimates for regression models are highly sensitive to (not robust against) outliers.
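The outlier sensitivity mentioned above is easy to demonstrate. This is a minimal sketch on synthetic data (a noise-free line y = 2x + 1 with one corrupted point), using np.polyfit for the least squares fit; the data and variable names are made up for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0

# Corrupt a single observation to simulate an outlier.
y_outlier = y_clean.copy()
y_outlier[-1] = 100.0

# np.polyfit with degree 1 returns [slope, intercept].
clean_slope, clean_intercept = np.polyfit(x, y_clean, 1)
outlier_slope, outlier_intercept = np.polyfit(x, y_outlier, 1)
```

One corrupted point out of ten is enough to pull the fitted slope far from the true value of 2, which is the kind of expectation a data requirements section could spell out.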

Linear regression bias clarification

I just want to clarify my understanding before making any clarifying changes. In the Linear Regression article under 'Bias Term', it reads:

Below we add a constant 1 to our features matrix. By setting this value to 1, it turns our bias term into a constant.

bias = np.ones(shape=(len(features),1))
features = np.append(bias, features, axis=1)

So the purpose of adding the 1 along with the other features in each example is so that the 1 will be multiplied by the 'bias weight' when the dot product of the features and weights is performed in the predict() function. Is that accurate?
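One way to check this reading is to compare the two formulations numerically. The feature values, weights and bias_weight below are made up; only the two lines that build the column of ones come from the article.

```python
import numpy as np

# Hypothetical data: 2 examples with 2 features each.
features = np.array([[2.0, 3.0], [4.0, 5.0]])
weights = np.array([0.5, -1.0])
bias_weight = 0.25

# From the article: prepend a column of ones to the features matrix.
bias = np.ones(shape=(len(features), 1))
augmented = np.append(bias, features, axis=1)

# The bias weight becomes the first entry of the weight vector.
augmented_weights = np.append(bias_weight, weights)

with_ones_column = augmented @ augmented_weights
explicit_bias = features @ weights + bias_weight
```

Both expressions give the same predictions, since each 1 is multiplied by the bias weight inside the dot product.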

Zh shape

X Input (3, 1) Includes 3 rows of training data, and each row has 1 attribute (height, price, etc.)
Zh Hidden weighted input (1, 2) Computed by taking the dot product of X and Wh. The dimensions (1,2) are required by the rules of matrix multiplication. Zh takes the rows of the inputs matrix and the columns of the weights matrix. We then add the hidden layer bias matrix Bh.

https://github.com/bfortuner/ml-glossary/blob/master/docs/forwardpropagation.rst#id15

Should the Zh shape not be (3,2)?
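The shape can be checked directly with NumPy. The shapes of Wh and Bh below, (1, 2), are assumptions inferred from the quoted description (1 input attribute, 2 hidden units), and the values are placeholders.

```python
import numpy as np

X = np.ones((3, 1))    # 3 rows of training data, 1 attribute each
Wh = np.ones((1, 2))   # assumed: 1 input attribute, 2 hidden units
Bh = np.zeros((1, 2))  # assumed: one bias per hidden unit, broadcast over rows

Zh = X @ Wh + Bh
zh_shape = Zh.shape
```

Under these assumptions Zh takes its row count from X and its column count from Wh, giving (3, 2).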

Adding Vietnamese Translation

I would like to start translating your ml-glossary repo into Vietnamese and possibly contribute to both the existing English version and the future Vietnamese version.

Can I start my own repo and start translating like the mlbvn/d2l-vn translation of d2l-ai/d2l-en?

Also, is this something already being pursued by someone else?

Adding softmax activation function.

I like how the ml-cheatsheets give a quick but concise overview of each concept. I would like to add explanations for softmax and the dying ReLU problem. Is there any template I should follow?
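For context, a softmax entry following the style guide's "python/numpy snippets" suggestion might include something like the sketch below. It is not taken from the glossary; subtracting the row maximum before exponentiating is a common numerical stability choice and is an assumption here.

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis, shifted by the maximum for stability."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

probabilities = softmax([1.0, 2.0, 3.0])
```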

Epub inline equations aren't rendered

In the Epub download, inline equations aren't rendered/formatted. They are just raw Sphinx syntax e.g.:

\[\begin{split}\begin{align} f'(W_1) = -x_1(y - (W_1 x_1 + W_2 x_2 + W_3 x_3)) \\ f'(W_2) = -x_2(y - (W_1 x_1 + W_2 x_2 + W_3 x_3)) \\ f'(W_3) = -x_3(y - (W_1 x_1 + W_2 x_2 + W_3 x_3)) \end{align}\end{split}\]

This makes the Epub, which would otherwise be super useful, unreadable :(

Confusing expression in description of Simple Network...

Hi,

I find the following description in the Simple Network to be a bit confusing:

Prediction = A(\;A(\;X W_h\;)W_o\;)
Where A is an activation function like :ref:`activation_relu`, X is the input and W_h and W_o are weights.

I think I get your point, namely that the expression on the right-hand side of Prediction is an approximate, pseudo-mathematical expression of the simple network, where A represents an arbitrary mathematical function that takes a matrix as an argument and returns a matrix (there are a lot of capital letters in that expression, not all of which are matrices). Unfortunately, for me at least, the use of the capital A is confusing, and it takes a few moments to figure out that A is itself not a matrix. It might be clearer to replace A with f or some other lowercase letter. Just a suggestion.
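To illustrate the suggested notation, the expression can be written as code with the activation as a lowercase function f. This is a sketch only: the choice of ReLU for f, the function name predict and the weight values are assumptions.

```python
import numpy as np

def relu(z):
    """Element-wise activation: takes a matrix and returns a matrix."""
    return np.maximum(0, z)

def predict(X, W_h, W_o, f=relu):
    # Prediction = f(f(X W_h) W_o)
    return f(f(X @ W_h) @ W_o)

# Hypothetical values: 2 examples, 1 input, 2 hidden units, 1 output.
X = np.array([[1.0], [-2.0]])
W_h = np.array([[1.0, -1.0]])
W_o = np.array([[1.0], [1.0]])
prediction = predict(X, W_h, W_o)
```

Written this way, X, W_h and W_o are the only matrices, and f is visibly a function applied to them.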

Missing dependency recommonmark

@bfortuner, thanks for getting us organized to put this together.
I cloned and installed the dependencies and found one to be missing: recommonmark
I am running the usual anaconda3 stack used in fast.ai/part2.
Perhaps you could add this dependency to the install guide.
pip install recommonmark

Open a dev branch for committing

Currently we have only a master branch. We should have a dev branch for committing changes so that master stays clean as the production version.

Updating the glossary of activation functions

Currently the glossary of activation functions is limited to only a few functions. There are many newer functions such as Swish, Mish, Phish, Softplus, GELU, etc. that are missing from the glossary.
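As a starting point for such entries, here are NumPy sketches of four of the functions named above, using their commonly cited definitions. They are not taken from the glossary, Phish is omitted, and the GELU shown is the tanh approximation, not the exact form.

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), computed stably via logaddexp.
    return np.logaddexp(0.0, x)

def swish(x):
    # x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def mish(x):
    # x * tanh(softplus(x))
    return x * np.tanh(softplus(x))

def gelu_tanh(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

x = np.array([-2.0, 0.0, 2.0])
outputs = {f.__name__: f(x) for f in (softplus, swish, mish, gelu_tanh)}
```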
