
neural-networks-demystified's People

Contributors

aaronjohnson, const-bon, forcebru, harshit-sh, jasonhamilton, kavikick, raviriley, stephencwelch


neural-networks-demystified's Issues

import statements missing?

I had to add an import for numpy as np, ... and now I see plot() used without a corresponding import. Am I missing something obvious? E.g. in part 2: plot(testInput, sigmoid(testInput), linewidth=2)
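For anyone hitting the same thing: the notebooks assume a pylab-style environment where NumPy and matplotlib names are already in scope. A minimal sketch of what makes the part 2 snippet run as a plain script (the import lines are my assumption about what the notebook environment provided):

    import numpy as np
    from matplotlib.pyplot import plot, show  # provides the bare plot() the notebooks use

    def sigmoid(z):
        #Apply sigmoid activation function elementwise (as defined in the notebooks)
        return 1/(1 + np.exp(-z))

    testInput = np.arange(-6, 6, 0.01)
    plot(testInput, sigmoid(testInput), linewidth=2)
    show()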

Regularization parameter typo?

Hey Stephen,
Thanks a LOT for this awesome series! :D I can't imagine the amount of time it must have taken to prepare all this content!

So, I think I noticed a typo in In [20] of part 7 ( https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/Part%207%20Overfitting%2C%20Testing%2C%20and%20Regularization.ipynb )

delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
#Add gradient of regularization term:
dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.lambd*self.W2

delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
#Add gradient of regularization term:
dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.lambd*self.W1

The regularization parameter is supposed to be self.Lambda but in the above snippet, you use self.lambd - I'm guessing this is a typo?

Also, I noticed that video 7 does not mention the '/X.shape[0]' term inside the costFunctionPrime function! (https://youtu.be/S4ZUwgesjS8?t=4m58s) Maybe you could add an annotation about this missing term there? BTW, in the video the regularization parameter is referred to as self.Lambda.
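For reference, a version of the snippet with the name made consistent (a sketch, assuming self.Lambda is the intended attribute, matching the video and the rest of the class):

    delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
    #Add gradient of regularization term (consistent parameter name):
    dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.Lambda*self.W2

    delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
    #Add gradient of regularization term:
    dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.Lambda*self.W1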

Cheers!
Jayanth Krishnan

basic linear algebra question

thank you for a legendary tutorial on the basics of neural nets and gradient descent. I understood the derivation of the gradient (3 applications of the chain rule! whoa!!), but why did you transpose the matrix ∂z3/∂W? At about 4:45 in part 4 you had to multiply the back-propagating error (delta3), which is a 3x1 matrix, by a3 (a 3x3 matrix). You commuted delta3 and a3, but matrix multiplication is not commutative. And you transposed a3 to boot!

These two operations seem arbitrary. Why are they valid? Why did you not simply take the dot product of delta3 and a3? That way you would get a 3x1 matrix:

    delta3 = [ S1 ]        a3 = [ a1-1  a1-2  a1-3 ]
             [ S2 ]             [ a2-1  a2-2  a2-3 ]
             [ S3 ]             [ a3-1  a3-2  a3-3 ]

    result = [ S1 * (a1-1 + a1-2 + a1-3) ]
             [ S2 * (a2-1 + a2-2 + a2-3) ]
             [ S3 * (a3-1 + a3-2 + a3-3) ]
And the result was a 3 x 1 matrix - I imagine this is the new gradient?
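A quick NumPy shape check (a sketch with made-up numbers; a stands in for the 3x3 activity matrix) shows why the transposed, commuted form is the one that works:

    import numpy as np

    delta3 = np.array([[0.1], [0.2], [0.3]])   # 3x1 back-propagating error
    a = np.random.rand(3, 3)                   # 3x3 activity matrix (3 examples x 3 hidden units)

    dJdW2 = np.dot(a.T, delta3)                # (3x3) dot (3x1) -> 3x1, same shape as the weights
    print(dJdW2.shape)                         # (3, 1)

np.dot(delta3, a) is not defined at all here: a 3x1 times a 3x3 has mismatched inner dimensions. So the transpose-and-commute is not arbitrary; it is the ordering whose inner dimension is the number of training examples, which sums each weight's contribution across all examples.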

Several output nodes?

I have been trying to use two output nodes. I made some changes to the data to fit:

self.outputLayerSize = 2

and

Training Data:

trainX = np.array(([3,5], [5,1], [10,2], [6,1]), dtype=float)
trainY = np.array(([1,0], [0,1], [0,1], [1,0]), dtype=float)

Testing Data:

testX = np.array(([4, 5], [4,1], [9,2], [6, 2]), dtype=float)
testY = np.array(([1,0], [0,1], [0,1], [1,0]), dtype=float)

Normalize:

trainX = trainX/np.amax(trainX, axis=0)
trainY = trainY/np.amax(trainY, axis=0)

Normalize by max of training data:

testX = testX/np.amax(trainX, axis=0)
testY = testY/np.amax(trainY, axis=0)

However, I get some errors trying to use this when training the network

D:\exjobb\NN.py in train(self, trainX, trainY, testX, testY)
113 options = {'maxiter': 200, 'disp' : True}
114 _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS',
--> 115 args=(trainX, trainY), options=options, callback=self.callbackF)
116
117 self.N.setParams(_res.x)

.... [more callbacks]

resulting in

scalar_search_wolfe1(phi, derphi, phi0, old_phi0, derphi0, c1, c2, amax, amin, xtol)
153
154 if old_phi0 is not None and derphi0 != 0:
--> 155 alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
156 if alpha1 < 0:
157 alpha1 = 1.0

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So from the error it looks like optimize.minimize can't handle more than one element in Y, which sounds quite odd to me.

Is there anything else I need to change in order to make your NN work with several outputs?
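The traceback points at the cost rather than the data: with two output columns, the notebook's J = 0.5*sum((y-self.yHat)**2) reduces over rows only and returns a length-2 array, and BFGS's line search needs a scalar. A sketch of a cost that stays scalar for any number of output nodes (np.sum reduces over all axes):

    def costFunction(self, X, y):
        #Compute cost for given X, y; np.sum collapses both the example
        #axis and the output-node axis, so J is a scalar even with 2 outputs.
        self.yHat = self.forward(X)
        J = 0.5*np.sum((y - self.yHat)**2)
        return J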

Incorrect multiplication

  def forward(self, X):
        #Propagate inputs through network
        self.z2 = np.dot(X, self.W1)
        self.a2 = self.sigmoid(self.z2)
        self.z3 = np.dot(self.a2, self.W2)
        yHat = self.sigmoid(self.z3) 
        return yHat

I think you may want to be doing z2 = np.dot(X, self.W1.T) instead of z2 = np.dot(X, self.W1) given your explanation of weights.
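Whether the transpose is needed depends entirely on the storage convention. A minimal shape check (a sketch using the sizes from the series: 2 inputs, 3 hidden units):

    import numpy as np

    X = np.random.rand(4, 2)     # 4 examples, 2 input features
    W1 = np.random.rand(2, 3)    # stored as (inputLayerSize, hiddenLayerSize)

    z2 = np.dot(X, W1)           # (4x2) dot (2x3) -> (4x3), one row of hidden activities per example
    print(z2.shape)              # (4, 3)

Because the class initializes W1 as np.random.randn(self.inputLayerSize, self.hiddenLayerSize), np.dot(X, self.W1) is consistent as written; the .T form would only be needed if the weights were stored with one row per receiving unit.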

costFunction in Lecture 7

Hey everybody,

if J in the cost function is calculated as given in lecture 7

 J = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(self.W1**2)+sum(self.W2**2))

J ends up being a vector, which I believe causes the following error:

File "C:\Program Files\Anaconda\lib\site-packages\scipy\optimize\linesearch.py", line 148, in scalar_search_wolfe1
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The problem is solved by adding another sum() to the calculation of J

    J2 = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(sum(self.W1**2))+sum(self.W2**2))

after that J is a scalar again.
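An equivalent and arguably cleaner fix (a sketch; mathematically nothing changes) is np.sum, which reduces over every axis no matter the dimensionality of the weight matrices:

    J = 0.5*np.sum((y - self.yHat)**2)/X.shape[0] \
        + (self.Lambda/2)*(np.sum(self.W1**2) + np.sum(self.W2**2))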

Did you miss the sum(), or did I miss something (and end up doing something mathematically incorrect)?

Best regards and thanks in advance!

3d modeling

fig = plt.figure() followed by ax = fig.gca(projection='3d') needs to become ax = plt.figure().add_subplot(projection='3d') in part 7, and also in the other 3D plots, since Figure.gca() no longer accepts a projection argument in current Matplotlib:
#3D plot:
#Uncomment to plot out-of-notebook (you'll be able to rotate)
#%matplotlib qt

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.gca(projection='3d')

#Scatter training examples:
ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha = 1, s=30)

surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
cmap=cm.jet, alpha = 0.5)

ax.set_xlabel('Hours Sleep')
ax.set_ylabel('Hours Study')
ax.set_zlabel('Test Score')
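A sketch of the updated block, assuming a current Matplotlib where fig.gca(projection='3d') has been removed (everything apart from the axes creation is unchanged from the notebook):

    from matplotlib import cm
    import matplotlib.pyplot as plt

    #Create 3D axes with the modern API (replaces fig.gca(projection='3d')):
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')

    #Scatter training examples:
    ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha=1, s=30)

    surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
                           cmap=cm.jet, alpha=0.5)

    ax.set_xlabel('Hours Sleep')
    ax.set_ylabel('Hours Study')
    ax.set_zlabel('Test Score')

The explicit mpl_toolkits.mplot3d import is also no longer needed; current Matplotlib registers the '3d' projection automatically.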

No partSeven.py?

Just wanted to test out the final code for myself but didn't see a python file for the final version of the code.

Something wrong with costFunction

Hi Stephen, I watched your video about this code (very good, by the way),
but when I tried to implement it I got this error:

Traceback (most recent call last):
in computeNumericalGradient
numgrad[p] = (loss2 - loss1) / (2*e)
ValueError: setting an array element with a sequence.

and when I forked your code, the error persists. Should the costFunction return a number, or am I wrong?
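Yes, it should: computeNumericalGradient assigns (loss2 - loss1)/(2*e) into a single slot of numgrad, so each loss must be a scalar. A quick check (a sketch; NN, X and y are whatever you instantiated):

    J = NN.costFunction(X, y)
    assert np.ndim(J) == 0, "costFunction should return a scalar, got shape %s" % (np.shape(J),)

If the assertion fails, replacing the builtin sum in costFunction with np.sum (which reduces over all axes) makes J a scalar again.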

Thanks,
Felipe Melo

Dimension Error on Training the NN

Hi,

I tried to implement the same code, but changed the layers as shown:

    self.inputLayerSize = 3 
    self.outputLayerSize = 5 
    self.hiddenLayerSize = 5

This is because the data set shape is:
X = (4162, 3)
Y = (4162,)

However, after executing T.train(X, Y), I get the following error:


ValueError Traceback (most recent call last)
in ()
----> 1 T.train(X, Y)

in train(self, X, y)
26
27 options = {'maxiter': 200, 'disp' : True}
---> 28 _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS', args=(X, y), options=options, callback=self.callbackF)
29
30 self.N.setParams(_res.x)

//anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.pyc in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
439 return _minimize_cg(fun, x0, args, jac, callback, **options)
440 elif meth == 'bfgs':
--> 441 return _minimize_bfgs(fun, x0, args, jac, callback, **options)
442 elif meth == 'newton-cg':
443 return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
845 else:
846 grad_calls, myfprime = wrap_function(fprime, args)
--> 847 gfk = myfprime(x0)
848 k = 0
849 N = len(x0)

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in function_wrapper(*wrapper_args)
287 def function_wrapper(*wrapper_args):
288 ncalls[0] += 1
--> 289 return function(*(wrapper_args + args))
290
291 return ncalls, function_wrapper

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in derivative(self, x, *args)
69 return self.jac
70 else:
---> 71 self(x, *args)
72 return self.jac
73

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in __call__(self, x, *args)
61 def __call__(self, x, *args):
62 self.x = numpy.asarray(x).copy()
---> 63 fg = self.fun(x, *args)
64 self.jac = fg[1]
65 return fg[0]

in costFunctionWrapper(self, params, X, y)
10 def costFunctionWrapper(self, params, X, y):
11 self.N.setParams(params)
---> 12 cost = self.N.costFunction(X, y)
13 grad = self.N.computeGradients(X,y)
14

in costFunction(self, X, y)
29 #Compute cost for given X,y, use weights already stored in class.
30 self.yHat = self.forward(X)
---> 31 J = 0.5*np.sum((y-self.yHat)**2)
32 return J
33

ValueError: operands could not be broadcast together with shapes (4162,) (4162,5)

Would appreciate the help!
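The broadcast failure is between y with shape (4162,) and yHat with shape (4162, 5): with outputLayerSize = 5 the network predicts 5 numbers per example, so y must also be (4162, 5), or the output layer must be sized 1 and y reshaped to a column. A sketch of the second option (assuming a single regression target per example):

    self.outputLayerSize = 1      # one prediction per example

    Y = Y.reshape(-1, 1)          # (4162,) -> (4162, 1) so y - yHat broadcasts cleanly
    T.train(X, Y)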

a probable mistake?

In "Part 4 Backpropagation.ipynb, the 'variables' table":
Dimension of Cost, J, should always be (1,1), instead of '(1, outputLayerSize)', right?

where is predict function?

Dear Stephen,

I learnt lots of things from your videos and Python code. Thank you! However, I could not find a predict function in your code. Is it missing? I think it would be nice to add this function.
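In the meantime, the forward pass already acts as prediction once the network is trained; a sketch of a thin wrapper (the method name predict is a suggestion, not part of the repo):

    def predict(self, X):
        #Prediction is just a forward pass with the trained weights;
        #remember to normalize X the same way as the training data.
        return self.forward(X)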

Best,

Halil Agin.

ipython 4.0.0 loses "--pylab inline" functionality recommended in README.md

ipython notebook --pylab inline
[E 11:11:32.373 NotebookApp] Support for specifying --pylab on the command line has been removed.
[E 11:11:32.373 NotebookApp] Please use `%pylab inline` or `%matplotlib inline` in the notebook itself.
danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython notebookip
danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython --version
4.0.0

... this is recommended in https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/README.md presumably to include e.g. numpy as np.

I believe this setup should also work:

cat ~/.ipython/profile_default/startup/go.py 
import numpy as np
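And in the notebook itself, the replacement for the removed flag is the magic the error message suggests (a sketch; the star import approximates what --pylab used to inject):

    %matplotlib inline
    import numpy as np
    from matplotlib.pyplot import *   # brings plot(), figure(), etc. into scope like --pylab did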

Multiplying two matrices??

delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
This line appears in costFunctionPrime in the code. I noticed that z2 is a matrix (per your explanation). I am not that proficient in Python, so I may not know what it means, but how do np.dot and * differ? The reason I ask is that I hit an anomaly while implementing a model with inputLayerSize=1, hiddenLayerSize=2 and outputLayerSize=1. Essentially, it should give me predictions of y for the equation y=2x (without knowing the equation firsthand, of course). But when finding delta2 I am stuck, because W2 transpose is of order 1x2, delta3 is of order 1x1 and z2 is of order 1x2. How do I matrix-multiply delta3.W2.T and sigmoidPrime(z2)? (P.S. I was not implementing it in Python.) Thanks in advance.
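np.dot is matrix multiplication; * on NumPy arrays is elementwise (Hadamard) multiplication. With your sizes the shapes line up as follows (a sketch with hypothetical values):

    import numpy as np

    def sigmoidPrime(z):
        #Derivative of the sigmoid, applied elementwise
        return np.exp(-z)/((1 + np.exp(-z))**2)

    delta3 = np.array([[0.4]])            # 1x1
    W2     = np.array([[0.5], [0.3]])     # 2x1, so W2.T is 1x2
    z2     = np.array([[0.7, 0.2]])       # 1x2

    #Matrix-multiply first, then combine elementwise:
    delta2 = np.dot(delta3, W2.T) * sigmoidPrime(z2)   # (1x1) dot (1x2) -> 1x2, then elementwise with 1x2
    print(delta2.shape)                                # (1, 2)

So no matrix multiplication is needed between np.dot(delta3, W2.T) and sigmoidPrime(z2): both are 1x2, and they are combined entry by entry.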
