peterroelants.github.io's Issues

Wrong sigma for functions

"This softmax function $\varsigma$ takes as input a $C$-dimensional vector $\mathbf{z}$ and outputs a $C$-dimensional vector $\mathbf{y}$ of real values between $0$ and $1$. This function is a normalized exponential and is defined as:"

The Greek letter ς is only used as the last letter of words that end with an s sound (a weird rule, I know). In math, functions are denoted with the regular sigma σ (`\sigma` in LaTeX), not the final sigma ς.

Error in GP example with noise

I believe there is an error in the GP example with noise. More precisely, in

Σ11 = kernel_func(X1, X1) + σ_noise * np.eye(n1)

one should add σ_noise**2 instead, because Σ11 is a covariance matrix and the noise contributes a variance, not a standard deviation.
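For what it's worth, a minimal sketch of the corrected line (assuming an exponentiated-quadratic `kernel_func`; the notebook's actual kernel may differ):

```python
import numpy as np

def kernel_func(X1, X2, length_scale=1.0):
    # Exponentiated-quadratic (RBF) kernel; a stand-in for the
    # kernel_func used in the notebook.
    sq_dists = np.sum(X1**2, 1).reshape(-1, 1) + np.sum(X2**2, 1) - 2 * X1 @ X2.T
    return np.exp(-0.5 * sq_dists / length_scale**2)

X1 = np.linspace(0, 1, 5).reshape(-1, 1)
n1 = X1.shape[0]
sigma_noise = 0.1

# The noise enters the covariance as a variance, so it is sigma_noise
# SQUARED that goes on the diagonal:
Sigma11 = kernel_func(X1, X1) + sigma_noise**2 * np.eye(n1)
```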

RNN part1

Hi, I think there is a mistake in the equation for dS_k/dS_{k-m}: in the last factor you have dS_{k-m+1}/dS_{k-1}, but I think it should be dS_{k-m+1}/dS_{k-m}.

Clarification with notes on "Understanding Gaussian Processes"

Hi Peter, first thanks so much for putting your notes on machine learning online - I found the article "Understanding Gaussian processes" particularly rigorous and helpful.

Can I please clarify two things in that particular post?

  1. In the section "Predictions from posterior", could you please verify whether the computations for the conditional distribution are correct? Specifically,

\mu_{2 | 1} = \mu_{2} + \Sigma_{21} \Sigma_{11}^{-1}\left(\mathbf{y}_{1} - \mu_{1}\right)

\Sigma_{2 | 1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}

should be

\mu_{2 | 1} = \mu_{2} + \Sigma_{12} \Sigma_{22}^{-1}\left(\mathbf{y}_{1} - \mu_{1}\right)

\Sigma_{2 | 1} = \Sigma_{22} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}

I've derived the computation based on your post on conditional distribution here.
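For reference, a sketch of the textbook Gaussian conditioning formulas in code (my own helper, not code from the post; note that Σ21 = Σ12ᵀ, which is why the two index conventions are easy to mix up):

```python
import numpy as np

def gp_conditional(mu1, mu2, Sigma11, Sigma12, Sigma22, y1):
    # Condition a jointly Gaussian [y1; y2] on observed y1, with
    # Sigma11 = Cov(y1), Sigma12 = Cov(y1, y2), Sigma21 = Sigma12.T.
    solved = np.linalg.solve(Sigma11, Sigma12)   # Sigma11^{-1} Sigma12
    mu_cond = mu2 + solved.T @ (y1 - mu1)        # mu2 + Sigma21 Sigma11^{-1} (y1 - mu1)
    Sigma_cond = Sigma22 - Sigma12.T @ solved    # Sigma22 - Sigma21 Sigma11^{-1} Sigma12
    return mu_cond, Sigma_cond
```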

  2. In the section "Predictions from posterior", you state: "Keep in mind that y_1 and y_2 are jointly Gaussian since they both should come from the same function." Can I please clarify that "same function" means that both y_1 and y_2 come from the same Gaussian distribution over functions f(x)?

Thanks for your time!

Neural Network Intercept Bias

I am going through an NN tutorial from this website, and I am confused about one particular paragraph on this page (screenshot below).

[screenshot of the paragraph in question omitted]

  1. Is the choice of the intercept bias of -1 purely arbitrary? I don't quite understand his explanation.

  2. It says in the screenshot that the RBF function maps all values to a range of [0, +infinity]. However, the RBF function only maps to a range of (0, 1]. Is this a mistake? And how does this positive range lead to the choice of a -1 intercept bias?

From: http://stackoverflow.com/q/41989488/1387612

Link to the IPYNB notebook mentioned in https://peterroelants.github.io/posts/rnn-implementation-part02/ is broken

Problem when running gaussian-process-kernel-fitting.ipynb

Thank you very much for the great repo!
When I try to run the code in the notebook "gaussian-process-kernel-fitting.ipynb" in the "Tuning the hyperparameters" section I get the exception
"RuntimeError: loss passed to Optimizer.compute_gradients should be a function when eager execution is enabled."

It seems to be related to the TensorFlow version, but I could not solve it myself.
I tried the solution mentioned here:
https://stackoverflow.com/questions/57858219/loss-passed-to-optimizer-compute-gradients-should-be-a-function-when-eager-exe
However, that creates another problem.

My environment:
python 3.6.9
tensorflow==2.1.0
tensorflow-estimator==2.1.0
tensorflow-probability==0.9.0

convert ipython notebook to jekyll pages

This is not an issue per se, but I am wondering what method you used to convert an IPython notebook into GitHub Pages. Could you briefly share your experience?

Thank you. This is a great project.
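Not the author, but a common approach is `jupyter nbconvert --to markdown` plus a script that prepends Jekyll front matter. A stdlib-only sketch of the idea (hypothetical helper; nbconvert handles the real cases such as cell outputs and images):

```python
import json
import pathlib

def notebook_to_jekyll(ipynb_path, title):
    # Pull cells out of the notebook JSON and wrap them in a Jekyll page.
    # Markdown cells pass through; code cells become fenced blocks.
    nb = json.loads(pathlib.Path(ipynb_path).read_text())
    fence = "`" * 3
    parts = ["---", f"title: {title}", "layout: post", "---", ""]
    for cell in nb["cells"]:
        src = "".join(cell["source"])
        if cell["cell_type"] == "markdown":
            parts.append(src)
        elif cell["cell_type"] == "code":
            parts.append(fence + "python\n" + src + "\n" + fence)
        parts.append("")
    return "\n".join(parts)
```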

Gaussian process tutorial notebook not working on local machine

Heya!

I tried to run the GP tutorial notebook on my local machine, but got the following error pop up:

NotJSONError('Notebook does not appear to be JSON: \'{\\n "cells": [\\n {\\n "cell_type": "m...')

The other notebooks in the directory work just fine. Tried to run the JSON through an online validator and it passed. Any ideas?

Thanks!
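A quick stdlib check that can narrow down this kind of NotJSONError (hypothetical helper; the usual culprits when the content otherwise looks like valid JSON are a UTF-8 BOM or leftover git merge-conflict markers):

```python
import json

def diagnose_notebook(path):
    # Report common reasons a notebook fails to parse as JSON.
    raw = open(path, "rb").read()
    if raw.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 BOM at start of file -- strip it"
    if b"<<<<<<<" in raw or b">>>>>>>" in raw:
        return "git merge-conflict markers present"
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError) as exc:
        return f"JSON/encoding error: {exc}"
    return "file parses fine as JSON"
```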

Part 1: weights diverge when using more input samples

First, thank you for these articles!
However, when playing with the code, if I change the number of input samples to 40 I get this result:

w(0): 0.1000     cost: 46.1816
w(1): 4.7754     cost: 92.1105
w(2): -1.8647    cost: 184.7509
w(3): 7.5657     cost: 371.6103
w(4): -5.8276    cost: 748.5129

I solved this by using a learning rate inversely proportional to the number of samples, i.e.
learning_rate = 2 / nb_of_samples
instead of a fixed 0.1.

I tested it with sample sizes from 5 to 10 million, and it seems to always converge now.
I don't know if this makes any mathematical sense, just want to let you know.
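This does make mathematical sense: with a sum-of-squares cost the gradient magnitude grows linearly with the number of samples, so the stable step size shrinks roughly as 1/n. A small sketch of the effect (my reconstruction of the Part 1 setup, not the exact tutorial code, with a deterministic perturbation standing in for the noise):

```python
import numpy as np

def train(n_samples, learning_rate, n_steps=20):
    # Linear model y = x * w fitted by gradient descent on the SUM of
    # squared errors over all samples.
    x = np.linspace(0, 1, n_samples)
    t = 2 * x + 0.1 * np.sin(7 * x)         # targets around w = 2
    w = 0.1
    for _ in range(n_steps):
        grad = 2 * np.sum(x * (x * w - t))  # magnitude grows with n_samples
        w -= learning_rate * grad
    return w

w_fixed = train(40, 0.1)       # fixed rate: the updates overshoot and diverge
w_scaled = train(40, 2 / 40)   # rate ~ 1/n keeps the update a contraction
```

Equivalently, defining the cost as a mean over samples instead of a sum makes the fixed learning rate scale-free, which is the more common convention.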

missing terms in partial derivatives?

Peter,

Thank you so much for the great RNN tutorial post. This might seem long, but it is very quick.

1 - For Part 1, you defined the state array S to be 1x1. How would your example change if one decided to use, say, 2 hidden states? The obvious final solution is that one of them will be turned off, but how would you define it? In that case your wRec would be 2x1, right?

  • In the example you provided for this section, you assume that the weight between the last state and the final output is already given and equals 1, right? All the RNN examples talk about a Wx, a Wrec, and a Wy that goes from hidden to output.
  • Also, how would you arrange the data if you want multi-dimensional input and multi-dimensional output at the same time? For example, each time step has a vector input and a vector output.

2 - In the same part, section "Compute the gradients with the backward step", you explain BPTT briefly, and it is not clear to me how you came up with the partial derivatives. I worked out a small 3-time-step example.

Questions:

  • Why does your summation start from 0?
  • Why is there a dc/dS_k term? (dc being the partial of the cost)
  • I found that there should be Wrec factors in the derivatives. Did you miss those, or are they included somewhere?

My Example,

dc/dwx = dc/dy * dy/dwx
dc/dy = 2(y - t)

but y in this example is nothing but (S3 * 1), so:

y = S3
y = x3 * wx + S2 * Wrec                                          ... substitute for S2
y = x3 * wx + (x2 * wx + S1 * Wrec) * Wrec                       ... expand
y = x3 * wx + x2 * wx * Wrec + S1 * Wrec^2                       ... substitute for S1
y = x3 * wx + x2 * wx * Wrec + (x1 * wx + S0 * Wrec) * Wrec^2    ... expand
y = x3 * wx + x2 * wx * Wrec + x1 * wx * Wrec^2 + S0 * Wrec^3

then,

dy/dwx = x3 + x2 * Wrec + x1 * Wrec^2
       = sum(xi * Wrec^(3-i)) where i = {1, 2, 3}

Best,
-M
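The closed-form gradient derived above can be checked numerically for the linear RNN S_k = x_k * wx + S_{k-1} * wrec (a sketch with made-up numbers, not the tutorial's code):

```python
import numpy as np

def forward(x, wx, wrec, s0=0.0):
    # Linear RNN: S_k = x_k * wx + S_{k-1} * wrec, output y = S_K
    # (the output weight is fixed to 1, as in the example).
    s = s0
    for xk in x:
        s = xk * wx + s * wrec
    return s

x = np.array([0.5, -1.0, 2.0])   # x1, x2, x3 (made up)
wx, wrec = 0.7, 1.3

# Closed form from the derivation: dy/dwx = sum_i x_i * wrec**(3 - i)
analytic = sum(xi * wrec**(3 - i) for i, xi in enumerate(x, start=1))

# Numerical check by central differences
eps = 1e-6
numeric = (forward(x, wx + eps, wrec) - forward(x, wx - eps, wrec)) / (2 * eps)
```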
