
anny's People

Contributors

ckcollab, levithomason


anny's Issues

Create Neuron.Activation class or factory

Currently there is an activation namespace. Its members are activation function definitions. These are objects with func, prime, rangeMin, and rangeMax properties.

These objects should be created by a class or factory so they can be validated, extended, and created consistently. For better cohesion, that class or factory should live as a static on Neuron. The same goes for the namespace; it should likely live on Neuron as well.

We should likely end up with something along the lines of:

// current namespace with its activations
Neuron.ACTIVATION.tanh

// create new activation
Neuron.ACTIVATION.create({
  name: 'newActivation',
  func: x => x,
  prime: x => x,
  rangeMin: 0,
  rangeMax: 1,
})

// => Neuron.ACTIVATION.newActivation
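
A hedged sketch of what the factory itself might look like (the validation details here are assumptions, not a spec):

// minimal factory: validate the definition, then register it on the namespace
Neuron.ACTIVATION.create = (def) => {
  ['name', 'func', 'prime', 'rangeMin', 'rangeMax'].forEach((key) => {
    if (!(key in def)) throw new Error(`Activation is missing "${key}".`)
  })
  Neuron.ACTIVATION[def.name] = def
  return def
}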

EDIT
Also, finish the docs in the Neuron class while we're at it. There are missing and incomplete doc strings there.

Remove mathjs dependency

This is a large library of which we are using only one function, sech, and it is used in only one place: ACTIVATION.optimalTanh. That is overkill. We can easily add a sech method to the utils, or start our own minimal math util. It might even make sense to simply remove optimalTanh entirely for now.

EDIT

Turns out the hyperbolic secant function is simply 1 / Math.cosh(x). http://www.mathworks.com/help/matlab/ref/sech.html
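
Given that identity, the replacement can be a one-line util (a sketch; the util's name and home are assumptions):

// hyperbolic secant, replacing the single mathjs function we use
const sech = (x) => 1 / Math.cosh(x)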

Support normal and derivative network error functions

Currently the Neuron.train() method uses the Neuron's error to calculate the delta (used to calculate the gradient and update the weights). This works OK for toy networks with a single output Neuron. However, the back-propagated delta may in fact need to be the derivative of the total network error for that training sample with respect to the Neuron's input, not the derivative of the Neuron's particular error. This will represent the weight change needed to affect the total network error, not just that Neuron's error.

The bottom of this page explains it well (now that I know how to work with derivatives, thanks Khan Academy!).
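
A hedged sketch of the output-Neuron delta under squared error; the property names here are assumptions, not anny's current API:

// derivative of the total network error w.r.t. the Neuron's input,
// via the chain rule: dE/dinput = dE/doutput * doutput/dinput
const outputDelta = (neuron, target) => {
  const errorPrime = neuron.output - target  // dE/doutput for squared error
  return errorPrime * neuron.activation.prime(neuron.input)
}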

Create Network.Error class or factory

Just as with Neuron activations in #88, there are Network error functions in an ERROR namespace. We should do the same thing and pull these into a class or factory. For better cohesion, it should live as a static on the Network class. You should be able to create new error functions, with validation. When created, they should be added to the Network.ERROR namespace.
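
Mirroring the activation factory sketch above, usage might look like this (the (targets, outputs) signature is an assumption):

Network.ERROR.create({
  name: 'meanAbsoluteError',
  func: (targets, outputs) =>
    targets.reduce((sum, t, i) => sum + Math.abs(t - outputs[i]), 0) / targets.length,
})

// => Network.ERROR.meanAbsoluteError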

No anonymous functions

Activation, Error, and Utils should be tested for function names. Ensure that object property names match function names.
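
A hedged sketch of such a test, assuming a mocha/chai setup and a plain object of named functions:

// fails if a member was defined anonymously or keyed under the wrong name
Object.keys(utils).forEach((key) => {
  it(`utils.${key} has a function name matching its key`, () => {
    expect(utils[key].name).to.equal(key)
  })
})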

Create Trainer class

The network train method is getting large. It also breaks the pattern of the Neuron train and Layer train methods. Neuron.train() updates the weights, which makes sense. Layer.train() invokes the Neuron.train() methods, which makes sense. Network.train() should simply invoke the Layer.train() methods.

Because Network.train is already taken by that overly complex method, the method that actually invokes the Layer.train() methods is awkwardly named correct().

The Network.train() method should be pulled into a class, Trainer. It should return a function that takes in a Network and trains it. It would also house the training options, default callbacks, and any other settings or config related to training (like batch and online training). Then, the Network.correct() method can be renamed, more appropriately, to Network.train(). In the future, it may also support training different types of networks (convolutional, etc.).

The Trainer API may end up looking something like this; it would allow us to have various training strategies:

// stops training after a fixed number of epochs
const shortTrain = new Trainer({maxEpochs: 100})

// stops training once error drops below the threshold
const accurateTrain = new Trainer({errorThreshold: 0.000001})

// stops training if error is not going down
let lastError = Infinity
const improvingTrain = new Trainer({
  onProgress: (error, epoch) => {
    if (error > lastError) return false
    lastError = error
  }
})

// assume we have a net already made

shortTrain(someNetwork)
accurateTrain(someNetwork)
improvingTrain(someNetwork)

These are off-the-cuff toy examples to demonstrate the pattern.

Something like shortTrain could be used to quickly test whether a Network can train within a defined amount of time. accurateTrain could be used to see whether a network can reach a certain level of accuracy. improvingTrain could be used to test how long a network improves before regressing. All of these trainers could take in a single network config and generate performance stats for a given network in a clean and reusable way.

The trainers could even be shared as part of challenges. See if your network can "beat the xyz trainer".

Trainer shuffle option

Simply _.shuffle the training data samples before training. Especially with sorted training data, this will produce much better results.
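
A minimal sketch, assuming the Trainer holds the samples and a shuffle option (lodash is already in use, per _.shuffle above):

// randomize sample order before training so sorted data cannot bias updates
const trainingSamples = options.shuffle ? _.shuffle(samples) : samples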

API Docs

Generate API docs from the existing doc strings.

Better way to add bias neuron

Creating a Layer currently takes a bool indicating whether or not to include a bias Neuron. This should be more declarative and handled at Network creation: every Layer gets a bias except the output Layer.
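
A hedged sketch of the declarative version; addBiasNeuron is a hypothetical helper:

// the Network, not the caller, decides which Layers get a bias Neuron
layers.forEach((layer, i) => {
  const isOutputLayer = i === layers.length - 1
  if (!isOutputLayer) layer.addBiasNeuron()
})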

Normalize training data before training

As of #83, activation ranges are now specified for all activation functions. Using this we can now scale and translate the training data to the optimal range for the activation functions being used in the network.
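
A hedged sketch of the rescaling itself (a plain min-max transform; the names are assumptions):

// map x from the data's [min, max] into the activation's [rangeMin, rangeMax]
const normalize = (x, min, max, rangeMin, rangeMax) =>
  rangeMin + ((x - min) / (max - min)) * (rangeMax - rangeMin)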

Setup docs and demo hosting

Currently, the repo itself is used for gh-pages hosting. This causes pain: dist files have to be ignored, then force-added during CI. The master branch then can't be branched for features because it includes ignored files in the git cache, so we have to maintain an integration branch.

If we set up S3 hosting, for instance, we can simply push deploys there instead and ignore them in the repo.

Better weight initialization

Currently, when a Neuron connect()s to another, the connection weight can be set. If there is no connection weight specified, random weight initialization is performed based on the number of inputs to the Neuron.

Since the algorithm depends on the number of connections, later connections make previously initialized weights incorrect, as those weights were based on a smaller number of connections.

The weights must be initialized at a later point in time, prior to training. Ideas for solving this:

Re-init on Neuron connect

When Neuron A connects to Neuron B, B's incoming weights should be re-initialized. A's outgoing weights do not need to be re-initialized, since initialization depends only on the number of incoming connections.

Pros

  1. Weights are always initialized with the correct random values no matter the usage of Anny.

Cons

  1. It is implicit and magical, as opposed to an explicit method or setting.
  2. When Layers connect, they loop through their Neurons, connecting each to every Neuron in the next Layer. A Layer connect would therefore trigger numerous duplicate, unnecessary weight initializations; only the final connection to each Neuron matters. Neurons only connect once (currently), so the immediate performance impact is little to none.
  3. If Neurons were ever added during or after training, the weight values would be randomized and the training progress lost. This is a big concern.

Init after Layer connect
This would solve many of the cons of the above option. After looping through all the Neurons and making the connections, a final pass through the weights could be made for initialization.

Pros

  1. No wasted cycles on duplicate initialization
  2. Any trained weights would be preserved when connecting new Neurons to a trained or training Network.

Cons

  1. Weight initialization would only happen on Layer connect. This is obscure; nothing else could or would take advantage of the initialization.
  2. It is still magical.

initializeWeights() method
This is the best option so far. The Network could have a method to initialize or randomize its weights. It could make a single pass through all weights and set them based on incoming connection counts (or any other heuristic). A sketch follows the pros and cons below.

Pros

  1. Explicit, not magical
  2. Can be easily extended to a Trainer() option
  3. Would follow the same pattern as activate() and backprop() where the Network would call a method on all its Layers which would call a method on each Neuron. The Neuron would know how to init its own incoming weights.
  4. Would allow re-training a network by simply calling train() again with the init weight option set. This would first randomize all the weights, clearing the previous learned values, then train new values.

Cons

  1. Users have to learn one more feature. Though this could reasonably be the default training option, so you get the benefits without having to enable it. Or the method could simply be called after making a new Network; then it would not have to be the default and you'd still get the benefits. Sold.
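
A hedged sketch of the method, assuming each Neuron exposes its incoming connections; every name except initializeWeights() is an assumption:

// single pass: each Neuron re-randomizes its own incoming weights
Network.prototype.initializeWeights = function () {
  this.allLayers.forEach((layer) => {
    layer.neurons.forEach((neuron) => {
      const n = neuron.incoming.length
      neuron.incoming.forEach((connection) => {
        // a common heuristic: uniform in [-1/sqrt(n), 1/sqrt(n)]
        connection.weight = (Math.random() * 2 - 1) / Math.sqrt(n)
      })
    })
  })
}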

Input neurons should not use activationFn

Neurons can take an arbitrary input; however, input Neurons should pass this value along without modification. Currently, after taking an input, the Neuron applies its activation function to the value before passing it on.

The input value, if present and if this is an input neuron, should be applied directly to the output.
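
A hedged sketch of the change (isInput and the other property names are assumptions):

Neuron.prototype.activate = function (input) {
  if (input !== undefined) this.input = input
  // input Neurons pass the value through untouched; others activate as usual
  this.output = this.isInput ? this.input : this.activationFn(this.input)
  return this.output
}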

Fail if no training progress after N epochs

Currently, we fail under one condition: max epochs have been reached without the error falling below the error threshold. We should also fail if training results in flat or increasing error after N epochs.
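
This could build on the onProgress hook sketched in the Trainer issue above (N and the return-false-to-stop semantics are assumptions):

// fail when the error has not improved for N consecutive epochs
const N = 50  // example value
let bestError = Infinity
let staleEpochs = 0
const onProgress = (error) => {
  if (error < bestError) {
    bestError = error
    staleEpochs = 0
  } else if (++staleEpochs >= N) {
    return false  // stop training
  }
}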

Setup test mocks

Currently, tests are embarrassingly run with no mocks. All unit tests are actually system tests. This means most of the coverage info is exaggerated.

  • Add mocks for each class
  • Update tests to use mocks
  • Update all should.throw() tests to assert error messages (false positives otherwise; see the sketch below this list)
  • Get 100% coverage
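
A hedged sketch of asserting the message instead of just the throw (chai style; the call and message are illustrative):

// passes only when the expected error is thrown, not just any error
expect(() => network.activate('not an array'))
  .to.throw(/invalid input/)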
