deepgenerativemodels / notes
Course notes
License: MIT License
I have been watching this course on YouTube, and it's really great. However, I can't find any homework assignments on the site. Would it be possible for you to share them?
Thanks
In https://deepgenerativemodels.github.io/notes/vae/, a paragraph states that

As we have seen previously, optimizing an empirical estimate of the KL divergence is equivalent to maximizing the marginal log-likelihood $\log p(x)$ over $\mathcal{D}$.

This isn't mentioned anywhere in the rest of the course notes. It would be useful for the learner to add a proof of this equivalence, or at least a reference to it.
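For reference, a minimal sketch of the equivalence being asked for, writing $\hat{p}_{\mathcal{D}}$ for the empirical distribution of the dataset $\mathcal{D}$ (a symbol not used in the notes themselves):

$$
D_{KL}(\hat{p}_{\mathcal{D}} \,\|\, p_\theta) = \mathbb{E}_{x \sim \hat{p}_{\mathcal{D}}}\left[\log \hat{p}_{\mathcal{D}}(x)\right] - \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} \log p_\theta(x)
$$

The first term does not depend on $\theta$, so minimizing the KL divergence over $\theta$ is the same as maximizing the average (equivalently, the sum of the) log-likelihood $\log p_\theta(x)$ over $\mathcal{D}$.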
The compiled HTML that is live at https://deepgenerativemodels.github.io/notes/introduction/ still contains errors like "our goal is to learn the paraemeters" (a misspelling of "parameters"), but this is not present in the markdown. We should recompile the markdown to update the site.
Reading the following paragraphs in the VAE notes:

Next, we introduce a variational family $\mathcal{Q}$ of distributions that approximate the true, but intractable posterior $p(z \mid x)$. Further, henceforth we will assume a parametric setting where any distribution in the model family $\mathcal{P}_{x,z}$ is specified via a set of parameters $\theta \in \Theta$ and distributions in the variational family $\mathcal{Q}$ are specified via a set of parameters $\lambda \in \Lambda$.

Given $\mathcal{P}_{x,z}$ and $\mathcal{Q}$, we note that the following relationships hold true for any $x$ and all variational distributions $q_\lambda(z) \in \mathcal{Q}$:

If $q_\lambda(z)$ is intended to approximate the distribution $p(z \mid x)$, then I'm confused as to why $q_\lambda(z)$ doesn't include $x$. Should it be $q_\lambda(z \mid x)$, or is it actually approximating the distribution $p(z)$?
Apologies if this sounds like an ignorant question--my understanding of probability notation isn't too sharp.
This is from the Autoregressive Models chapter:

To see why, let us consider the conditional for the last dimension, given by $p(x_n \mid x_{\lt n})$. In order to fully specify this conditional, we need to specify a probability for $2^{n-1}$ configurations of the variables $x_1, x_2, \ldots, x_{n-1}$. Since the probabilities should sum to $1$, the total number of parameters for specifying this conditional is given by $2^{n-1} - 1$. Hence, a tabular representation for the conditionals is impractical for learning the joint distribution factorized via chain rule.

Shouldn't it be $2^{n-1}$? The sum-to-one constraint applies within each conditional distribution over $x_n$, so each of the $2^{n-1}$ configurations of $x_1, \ldots, x_{n-1}$ still contributes one free parameter (say, $p(x_n = 1 \mid x_{\lt n})$).
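A quick sanity check of that count in code (a toy sketch for binary variables; the variable names are mine, not from the notes):

```python
from itertools import product

n = 4  # number of binary variables

# Tabular representation of p(x_n = 1 | x_1, ..., x_{n-1}):
# one free parameter per configuration of the conditioning variables.
# p(x_n = 0 | ...) is fixed by the sum-to-one constraint, so it adds
# no extra parameters -- the constraint acts per configuration, not globally.
table = {config: 0.5 for config in product([0, 1], repeat=n - 1)}

print(len(table))    # 8
print(2 ** (n - 1))  # 8, i.e. 2^(n-1) parameters, not 2^(n-1) - 1
```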
In the variational auto-encoder chapter, the notes give an expression for the gradient of the ELBO with respect to the encoder parameters $\lambda$. However, the term inside the expectation also depends on the parameter $\lambda$. According to formula (4) in "Gradient Estimation Using Stochastic Computation Graphs" (NIPS 2015), the gradient of such an expectation has two terms, and when estimating the encoder's gradient, the first gradient term is missing from the expression in the notes.
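To illustrate the two-term structure numerically, here is a toy sketch (my own example, not the model from the notes): with $q_\lambda(z) = \mathcal{N}(\lambda, 1)$ and an integrand $f(z, \lambda) = \lambda z$ that depends on $\lambda$ both through the sampling distribution and directly, the true gradient of $\mathbb{E}_{q_\lambda}[f]$ is $2\lambda$, and dropping either term gives a biased estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: q_lambda(z) = N(lam, 1), f(z, lam) = lam * z.
# E_q[f] = lam**2, so the true gradient w.r.t. lam is 2 * lam.
lam = 1.5
z = rng.normal(loc=lam, scale=1.0, size=1_000_000)

score = z - lam              # grad_lam log q_lambda(z)
f = lam * z                  # integrand; note it also depends on lam
df_dlam = z                  # direct dependence of f on lam

term_score = np.mean(f * score)   # E[f * grad_lam log q]  (score-function term)
term_direct = np.mean(df_dlam)    # E[grad_lam f]          (direct term)

print(term_score + term_direct)   # ~3.0 = 2 * lam: both terms kept, unbiased
print(term_direct)                # ~1.5: score-function term dropped, biased
```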
In the "Black-Box Variational Inference" section of the VAE notes:
We first do per-sample optimization of $q$ by iteratively applying the update

$\lambda^{(i)} \leftarrow \lambda^{(i)} + \tilde{\nabla}_\lambda \mathrm{ELBO}(x^{(i)}; \theta, \lambda^{(i)})$
We then perform a single update step based on the mini-batch

$\theta \leftarrow \theta + \tilde{\nabla}_\theta \sum_i \mathrm{ELBO}(x^{(i)}; \theta, \lambda^{(i)})$
If I understood correctly, $x^{(i)}$ is the $i$th sample from a batch $B$ of the dataset $\mathcal{D}$, and $\lambda$ is a vector of parameters of the distribution $q_\lambda(z)$. What is $\lambda^{(i)}$?
Is it the $i$th parameter of $\lambda$? That would imply that the length of $B$ is equal to the dimension of $\lambda$--if so, it's unclear to me why they would be equal.
Another possibility is that $\lambda^{(i)}$ is the $i$th update to $\lambda$. If so, perhaps it would be better rewritten like this:
$\lambda^{(i+1)} \leftarrow \lambda^{(i)} + \tilde{\nabla}_\lambda \mathrm{ELBO}(x^{(i)}; \theta, \lambda^{(i)})$
But if that's the case, then it's unclear to me why it appears in the $\theta$ update:
$\theta \leftarrow \theta + \tilde{\nabla}_\theta \sum_i \mathrm{ELBO}(x^{(i)}; \theta, \lambda^{(i)})$
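For what it's worth, my best reading is that there is one $\lambda^{(i)}$ per data point: $\lambda^{(i)}$ parameterizes $q_{\lambda^{(i)}}(z)$, the approximate posterior for $x^{(i)}$, and the inner loop optimizes each of them before the single $\theta$ step. A toy sketch of that reading (the model, step sizes, and names here are mine, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model (not the one in the notes):
#   p_theta(z) = N(theta, 1),  p(x | z) = N(z, 1),  q_lambda(z) = N(lambda, 1).
def log_joint(x, z, theta):
    return -0.5 * (z - theta) ** 2 - 0.5 * (x - z) ** 2  # up to additive constants

def log_q(z, lam):
    return -0.5 * (z - lam) ** 2  # up to additive constants

def grad_lambda(x, theta, lam, n=500):
    # Score-function estimate of grad_lambda ELBO(x; theta, lambda).
    z = rng.normal(lam, 1.0, size=n)
    f = log_joint(x, z, theta) - log_q(z, lam)
    return np.mean(f * (z - lam))

def grad_theta(x, theta, lam, n=500):
    # Monte Carlo estimate of grad_theta ELBO(x; theta, lambda).
    z = rng.normal(lam, 1.0, size=n)
    return np.mean(z - theta)

batch = [0.5, -1.0, 2.0]          # mini-batch x^(1), x^(2), x^(3)
theta = 0.0
lam = [0.0 for _ in batch]        # one lambda^(i) PER data point x^(i)

step = 0.05
for _ in range(200):              # inner loop: per-sample optimization of lambda^(i)
    for i, x in enumerate(batch):
        lam[i] += step * grad_lambda(x, theta, lam[i])

# single outer step on theta, reusing the optimized per-sample lambda^(i)
theta += step * sum(grad_theta(x, theta, lam[i]) for i, x in enumerate(batch))
print(lam, theta)
```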
Apologies if I've missed something obvious here. Also, thanks for the notes--they've been very helpful!