
neural-networks-demystified's People

Contributors

aaronjohnson, const-bon, forcebru, harshit-sh, jasonhamilton, kavikick, raviriley, stephencwelch


neural-networks-demystified's Issues

import statements missing?

I had to add an import for numpy as np, ... and now I see plot() used without a corresponding import. Am I missing something obvious? E.g. in part 2: plot(testInput, sigmoid(testInput), linewidth=2)
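For anyone hitting the same thing: the notebooks assume a pylab-style environment where NumPy and matplotlib names are already in scope. A minimal sketch of what makes the part 2 snippet run as a plain script (the import lines are my assumption about what the notebook environment provided):

    import numpy as np
    from matplotlib.pyplot import plot, show  # provides the bare plot() the notebooks use

    def sigmoid(z):
        #Apply sigmoid activation function elementwise (as defined in the notebooks)
        return 1/(1 + np.exp(-z))

    testInput = np.arange(-6, 6, 0.01)
    plot(testInput, sigmoid(testInput), linewidth=2)
    show()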

Regularization parameter typo?

Hey Stephen,
Thanks a LOT for this awesome series! :D I can't imagine the amount of time it must have taken to prepare all this content!

So, I think I noticed a typo in In [20] of part 7 ( https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/Part%207%20Overfitting%2C%20Testing%2C%20and%20Regularization.ipynb )

delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
#Add gradient of regularization term:
dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.lambd*self.W2

delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
#Add gradient of regularization term:
dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.lambd*self.W1

The regularization parameter is supposed to be self.Lambda but in the above snippet, you use self.lambd - I'm guessing this is a typo?

Also, I noticed that video 7 does not mention the '/X.shape[0]' term inside the costFunctionPrime function! (https://youtu.be/S4ZUwgesjS8?t=4m58s) Maybe you could add an annotation about this missing term there? BTW, in the video the regularization parameter is referred to as self.Lambda.
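For reference, a version of the snippet with the name made consistent (a sketch, assuming self.Lambda is the intended attribute, matching the video and the rest of the class):

    delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
    #Add gradient of regularization term (consistent parameter name):
    dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.Lambda*self.W2

    delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
    #Add gradient of regularization term:
    dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.Lambda*self.W1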

Cheers!
Jayanth Krishnan

basic linear algebra question

thank you for a legendary tutorial on the basics of neural nets and gradient descent. I understood the derivation of the gradient (3 applications of the chain rule! whoa!!), but why did you transpose the matrix ∂z3/∂W? At about 4:45 in part 4 you had to multiply the back-propagating error (delta3), which is a 3x1 matrix, by a3 (a 3x3 matrix). You commuted delta3 and a3, but matrix multiplication is not commutative. And you transposed a3 to boot!

These two operations seem arbitrary. Why are they valid? Why did you not simply take the dot product of delta3 and a3? That way you would get a 3x1 matrix:

    delta3 = [ S1 ]        a3 = [ a1-1  a1-2  a1-3 ]
             [ S2 ]             [ a2-1  a2-2  a2-3 ]
             [ S3 ]             [ a3-1  a3-2  a3-3 ]

    result = [ S1 * (a1-1 + a1-2 + a1-3) ]
             [ S2 * (a2-1 + a2-2 + a2-3) ]
             [ S3 * (a3-1 + a3-2 + a3-3) ]
And the result was a 3 x 1 matrix - I imagine this is the new gradient?
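A quick NumPy shape check (a sketch with made-up numbers; a stands in for the 3x3 activity matrix) shows why the transposed, commuted form is the one that works:

    import numpy as np

    delta3 = np.array([[0.1], [0.2], [0.3]])   # 3x1 back-propagating error
    a = np.random.rand(3, 3)                   # 3x3 activity matrix (3 examples x 3 hidden units)

    dJdW2 = np.dot(a.T, delta3)                # (3x3) dot (3x1) -> 3x1, same shape as the weights
    print(dJdW2.shape)                         # (3, 1)

np.dot(delta3, a) is not defined at all here: a 3x1 times a 3x3 has mismatched inner dimensions. So the transpose-and-commute is not arbitrary; it is the ordering whose inner dimension is the number of training examples, which sums each weight's contribution across all examples.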

Several output nodes?

I have been trying to use two output nodes. I made some changes to the data to fit:

self.outputLayerSize = 2

and

Training Data:

trainX = np.array(([3,5], [5,1], [10,2], [6,1]), dtype=float)
trainY = np.array(([1,0], [0,1], [0,1], [1,0]), dtype=float)

Testing Data:

testX = np.array(([4, 5], [4,1], [9,2], [6, 2]), dtype=float)
testY = np.array(([1,0], [0,1], [0,1], [1,0]), dtype=float)

Normalize:

trainX = trainX/np.amax(trainX, axis=0)
trainY = trainY/np.amax(trainY, axis=0)

Normalize by max of training data:

testX = testX/np.amax(trainX, axis=0)
testY = testY/np.amax(trainY, axis=0)

However, I get some errors trying to use this when training the network

D:\exjobb\NN.py in train(self, trainX, trainY, testX, testY)
113 options = {'maxiter': 200, 'disp' : True}
114 _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS',
--> 115 args=(trainX, trainY), options=options, callback=self.callbackF)
116
117 self.N.setParams(_res.x)

.... [more callbacks]

resulting in

scalar_search_wolfe1(phi, derphi, phi0, old_phi0, derphi0, c1, c2, amax, amin, xtol)
153
154 if old_phi0 is not None and derphi0 != 0:
--> 155 alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
156 if alpha1 < 0:
157 alpha1 = 1.0

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So from the error it looks like optimize.minimize can't handle more than one element in Y, which sounds quite odd to me.

Is there anything else I need to change in order to make your NN work with several outputs?
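The traceback points at the cost rather than the data: with two output columns, the notebook's J = 0.5*sum((y-self.yHat)**2) reduces over rows only and returns a length-2 array, and BFGS's line search needs a scalar. A sketch of a cost that stays scalar for any number of output nodes (np.sum reduces over all axes):

    def costFunction(self, X, y):
        #Compute cost for given X, y; np.sum collapses both the example
        #axis and the output-node axis, so J is a scalar even with 2 outputs.
        self.yHat = self.forward(X)
        J = 0.5*np.sum((y - self.yHat)**2)
        return J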

Incorrect multiplication

  def forward(self, X):
        #Propagate inputs through network
        self.z2 = np.dot(X, self.W1)
        self.a2 = self.sigmoid(self.z2)
        self.z3 = np.dot(self.a2, self.W2)
        yHat = self.sigmoid(self.z3) 
        return yHat

I think you may want to be doing z2 = np.dot(X, self.W1.T) instead of z2 = np.dot(X, self.W1) given your explanation of weights.
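Whether the transpose is needed depends entirely on the storage convention. A minimal shape check (a sketch using the sizes from the series: 2 inputs, 3 hidden units):

    import numpy as np

    X = np.random.rand(4, 2)     # 4 examples, 2 input features
    W1 = np.random.rand(2, 3)    # stored as (inputLayerSize, hiddenLayerSize)

    z2 = np.dot(X, W1)           # (4x2) dot (2x3) -> (4x3), one row of hidden activities per example
    print(z2.shape)              # (4, 3)

Because the class initializes W1 as np.random.randn(self.inputLayerSize, self.hiddenLayerSize), np.dot(X, self.W1) is consistent as written; the .T form would only be needed if the weights were stored with one row per receiving unit.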

costFunction in Lecture 7

Hey everybody,

if J in the cost function is calculated as given in lecture 7

 J = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(self.W1**2)+sum(self.W2**2))

J ends up being a vector, which I believe causes the following error:

File "C:\Program Files\Anaconda\lib\site-packages\scipy\optimize\linesearch.py", line 148, in scalar_search_wolfe1
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The problem is solved by adding another sum() to the calculation of J

    J2 = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(sum(self.W1**2))+sum(self.W2**2))

after that J is a scalar again.
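An equivalent and arguably cleaner fix (a sketch; mathematically nothing changes) is np.sum, which reduces over every axis no matter the dimensionality of the weight matrices:

    J = 0.5*np.sum((y - self.yHat)**2)/X.shape[0] \
        + (self.Lambda/2)*(np.sum(self.W1**2) + np.sum(self.W2**2))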

Did you miss the sum(), or did I miss something (and end up doing something mathematically incorrect)?

Best regards and thanks in advance!

3d modeling

fig = plt.figure() followed by ax = fig.gca(projection='3d') needs to become ax = plt.figure().add_subplot(projection='3d') in part 7, and also in the other 3D plots, since Figure.gca() no longer accepts a projection argument in current Matplotlib:
#3D plot:
#Uncomment to plot out-of-notebook (you'll be able to rotate)
#%matplotlib qt

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.gca(projection='3d')

#Scatter training examples:
ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha = 1, s=30)

surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
cmap=cm.jet, alpha = 0.5)

ax.set_xlabel('Hours Sleep')
ax.set_ylabel('Hours Study')
ax.set_zlabel('Test Score')
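A sketch of the updated block, assuming a current Matplotlib where fig.gca(projection='3d') has been removed (everything apart from the axes creation is unchanged from the notebook):

    from matplotlib import cm
    import matplotlib.pyplot as plt

    #Create 3D axes with the modern API (replaces fig.gca(projection='3d')):
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')

    #Scatter training examples:
    ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha=1, s=30)

    surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
                           cmap=cm.jet, alpha=0.5)

    ax.set_xlabel('Hours Sleep')
    ax.set_ylabel('Hours Study')
    ax.set_zlabel('Test Score')

The explicit mpl_toolkits.mplot3d import is also no longer needed; current Matplotlib registers the '3d' projection automatically.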

No partSeven.py?

Just wanted to test out the final code for myself but didn't see a python file for the final version of the code.

Something wrong with costFunction

Hi Stephen, I watched your video about this code (very good, by the way),
but when I tried to implement it I got this error:

Traceback (most recent call last):
in computeNumericalGradient
numgrad[p] = (loss2 - loss1) / (2*e)
ValueError: setting an array element with a sequence.

and when I forked your code, the error persists. Should the costFunction return a number, or am I wrong?
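Yes, it should: computeNumericalGradient assigns (loss2 - loss1)/(2*e) into a single slot of numgrad, so each loss must be a scalar. A quick check (a sketch; NN, X and y are whatever you instantiated):

    J = NN.costFunction(X, y)
    assert np.ndim(J) == 0, "costFunction should return a scalar, got shape %s" % (np.shape(J),)

If the assertion fails, replacing the builtin sum in costFunction with np.sum (which reduces over all axes) makes J a scalar again.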

Thanks,
Felipe Melo

Dimension Error on Training the NN

Hi,

I tried to implement the same code, but changed the layers as shown:

    self.inputLayerSize = 3 
    self.outputLayerSize = 5 
    self.hiddenLayerSize = 5

This is because the data set shape is:
X = (4162, 3)
Y = (4162,)

However, after executing T.train(X, Y), I get the following error:


ValueError Traceback (most recent call last)
in ()
----> 1 T.train(X, Y)

in train(self, X, y)
26
27 options = {'maxiter': 200, 'disp' : True}
---> 28 _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS', args=(X, y), options=options, callback=self.callbackF)
29
30 self.N.setParams(_res.x)

//anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.pyc in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
439 return _minimize_cg(fun, x0, args, jac, callback, **options)
440 elif meth == 'bfgs':
--> 441 return _minimize_bfgs(fun, x0, args, jac, callback, **options)
442 elif meth == 'newton-cg':
443 return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
845 else:
846 grad_calls, myfprime = wrap_function(fprime, args)
--> 847 gfk = myfprime(x0)
848 k = 0
849 N = len(x0)

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in function_wrapper(*wrapper_args)
287 def function_wrapper(*wrapper_args):
288 ncalls[0] += 1
--> 289 return function(*(wrapper_args + args))
290
291 return ncalls, function_wrapper

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in derivative(self, x, *args)
69 return self.jac
70 else:
---> 71 self(x, *args)
72 return self.jac
73

//anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in __call__(self, x, *args)
61 def __call__(self, x, *args):
62 self.x = numpy.asarray(x).copy()
---> 63 fg = self.fun(x, *args)
64 self.jac = fg[1]
65 return fg[0]

in costFunctionWrapper(self, params, X, y)
10 def costFunctionWrapper(self, params, X, y):
11 self.N.setParams(params)
---> 12 cost = self.N.costFunction(X, y)
13 grad = self.N.computeGradients(X,y)
14

in costFunction(self, X, y)
29 #Compute cost for given X,y, use weights already stored in class.
30 self.yHat = self.forward(X)
---> 31 J = 0.5*np.sum((y-self.yHat)**2)
32 return J
33

ValueError: operands could not be broadcast together with shapes (4162,) (4162,5)

Would appreciate the help!
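The broadcast failure is between y with shape (4162,) and yHat with shape (4162, 5): with outputLayerSize = 5 the network predicts 5 numbers per example, so y must also be (4162, 5), or the output layer must be sized 1 and y reshaped to a column. A sketch of the second option (assuming a single regression target per example):

    self.outputLayerSize = 1      # one prediction per example

    Y = Y.reshape(-1, 1)          # (4162,) -> (4162, 1) so y - yHat broadcasts cleanly
    T.train(X, Y)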

a probable mistake?

In "Part 4 Backpropagation.ipynb, the 'variables' table":
Dimension of Cost, J, should always be (1,1), instead of '(1, outputLayerSize)', right?

where is predict function?

Dear Stephen,

I learnt lots of things from your videos and Python code. Thank you! However, I could not find a predict function in your code. Is it missing? I think it would be nice to add this function.
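In the meantime, the forward pass already acts as prediction once the network is trained; a sketch of a thin wrapper (the method name predict is a suggestion, not part of the repo):

    def predict(self, X):
        #Prediction is just a forward pass with the trained weights;
        #remember to normalize X the same way as the training data.
        return self.forward(X)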

Best,

Halil Agin.

ipython 4.0.0 loses "--pylab inline" functionality recommended in README.md

ipython notebook --pylab inline
[E 11:11:32.373 NotebookApp] Support for specifying --pylab on the command line has been removed.
[E 11:11:32.373 NotebookApp] Please use `%pylab inline` or `%matplotlib inline` in the notebook itself.
danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython notebookip
danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython --version
4.0.0

... this is recommended in https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/README.md presumably to include e.g. numpy as np.

I believe this setup should also work:

cat ~/.ipython/profile_default/startup/go.py 
import numpy as np
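And in the notebook itself, the replacement for the removed flag is the magic the error message suggests (a sketch; the star import approximates what --pylab used to inject):

    %matplotlib inline
    import numpy as np
    from matplotlib.pyplot import *   # brings plot(), figure(), etc. into scope like --pylab did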

Multiplying two matrices??

delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
This line appears in costFunctionPrime in the code. I noticed that z2 is a matrix (per your explanation). I am not that proficient in Python, so I may not know what it means, but how do np.dot and * differ? The reason I ask is that I hit an anomaly while implementing a model with inputLayerSize=1, hiddenLayerSize=2 and outputLayerSize=1. Essentially, it should give me predictions of y for the equation y=2x (without knowing the equation firsthand, of course). But when finding delta2 I am stuck, because W2 transpose is of order 1x2, delta3 is of order 1x1 and z2 is of order 1x2. How do I matrix-multiply delta3.W2.T and sigmoidPrime(z2)? (P.S. I was not implementing it in Python.) Thanks in advance.
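np.dot is matrix multiplication; * on NumPy arrays is elementwise (Hadamard) multiplication. With your sizes the shapes line up as follows (a sketch with hypothetical values):

    import numpy as np

    def sigmoidPrime(z):
        #Derivative of the sigmoid, applied elementwise
        return np.exp(-z)/((1 + np.exp(-z))**2)

    delta3 = np.array([[0.4]])            # 1x1
    W2     = np.array([[0.5], [0.3]])     # 2x1, so W2.T is 1x2
    z2     = np.array([[0.7, 0.2]])       # 1x2

    #Matrix-multiply first, then combine elementwise:
    delta2 = np.dot(delta3, W2.T) * sigmoidPrime(z2)   # (1x1) dot (1x2) -> 1x2, then elementwise with 1x2
    print(delta2.shape)                                # (1, 2)

So no matrix multiplication is needed between np.dot(delta3, W2.T) and sigmoidPrime(z2): both are 1x2, and they are combined entry by entry.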
