
depthfirstlearning.com's Introduction

Depth First Learning

This is the open source code repository running Depth First Learning, powered by Jekyll and the Tale theme.

Run the site locally with bundle exec jekyll serve.

Contributing

We're looking for people to help make DFL better. Here are some ways you can help:

  • Run a study group based on one of our guides, and let us know about it. What worked well? What didn't?
  • Check out the open issues, where we keep track of improvements that we'd like to see.
  • Tell us what our guides are missing, or where they may be confusing, by filing issues.
  • Improve our guides by filing pull requests. We are especially looking for Colabs with lots of explanatory text that explains and reproduces results. All of the content is stored in markdown under the posts/ directory.
  • Add guides for other papers! Reach out to us at [email protected] and let's talk.

depthfirstlearning.com's People

Contributors

aslanides, avital, bhairavmehta95, chisness, cinjon, dependabot[bot], dniku, hemildesai, jamesallingham, kumarkrishna, learnaifromscratch, luca-venturi, maxisawesome, micpie, p-patil, rileyedmunds, skroon, sschoenholz, suryabhupa, wgierke


depthfirstlearning.com's Issues

Additional materials/posts related to some of the common papers.

This looks like a great initiative, and it's something that was much needed.

It would also be really helpful to add explanations and resources for some of the 'elementary' research papers (by elementary I do not mean simple, but rather basic/foundational), such as the papers introducing CNNs or RNNs, or regularization techniques like dropout.

The reason is that to learn about GANs (say, a specific use case like neural style transfer), you need the prerequisite knowledge of CNNs.

I know that explaining each and every paper in ML is not feasible, but maybe this could be done for some of the foundational papers.

Error in Problem Set 1 2e?

I think there may be an error in Problem Set 1, 2e:

[screenshot of Problem Set 1, 2e]

Specifically, x = phi(h), whereas the integrand term has phi(x). If this is a mistake, it carries forward to the next part:

[screenshot of the follow-on part]

Add curriculum for Neural Mechanics paper

I'd like to add a curriculum for the recent paper Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics. There is a lot of depth to this paper, and the payoff (being able to calculate the exact training dynamics of certain parameter combinations) makes it worth studying. Like the Resurrecting the Sigmoid curriculum, I'd probably structure it mostly around problem sets, but with more interactive (coding/Jupyter notebook) problems. If this sounds good, I can get started.

Add more solutions?

Thanks for the exceptional depthfirstlearning material!

I am currently working through the InfoGAN material and have put together a small repo with my solutions.
If the solutions are of interest for the InfoGAN page, I could add them to your repo.

I would be happy to contribute to this great project.

Kind regards
Michael

InfoGAN: Looking for good explanation for relationship between JS divergence, Jensen's inequality and Shannon Entropy

Why is the Jensen-Shannon divergence called that?

The answer is something like this: https://dit.readthedocs.io/en/latest/measures/divergences/jensen_shannon_divergence.html#derivation (where "x" is a convex combination of P and Q, and the expectation is taken over the binary variable that selects which of {P, Q} to draw from).

We'd like a clear write-up of this definition of the JS divergence, including a proof of the equivalence between this definition and the others.
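
As a starting point, here is a sketch of the key identity, assuming the standard definitions and writing M = (P + Q)/2 for the equal-weight mixture; it shows that the JS divergence is exactly the gap in Jensen's inequality applied to the (concave) Shannon entropy H:

    % Sketch, assuming the standard definitions; M = (P + Q)/2 is the mixture.
    \begin{align*}
    \mathrm{JSD}(P \,\|\, Q)
      &= \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M) \\
      &= \tfrac{1}{2}\sum_x P(x)\log P(x) + \tfrac{1}{2}\sum_x Q(x)\log Q(x)
         - \sum_x M(x)\log M(x) \\
      &= H(M) - \tfrac{1}{2}\bigl(H(P) + H(Q)\bigr) \;\ge\; 0.
    \end{align*}

The inequality in the last line is Jensen's inequality for the concave entropy H, hence the name. The same quantity also equals the mutual information I(Z; X) between a sample X ~ M and the binary variable Z indicating whether X came from P or Q, which is the expectation over a binary variable mentioned above.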

New paper suggestion for study: "A correspondence between Random neural networks and statistical field theory"

Hey,

As I am a strong supporter of Depth First Learning and am very intrigued by the material, I wanted to suggest including a curriculum on "A Correspondence Between Random Neural Networks and Statistical Field Theory" (Pennington, Schoenholz, 2017).

Even though there are some more recent developments on statistical field theory for neural networks (e.g. Helias, Dahmen, 2020, or Grosvenor, Jefferson, 2021), I think that the Pennington and Schoenholz paper provides rich opportunities to create a curriculum covering not only neural networks and statistical field theory but also a basic introduction to replica methods.

As the publication is in the realm of the material I study for my PhD, I could design the curriculum.

Greetings,
Javed Lindner (PhD student, RWTH Aachen University)

Vanishing gradients of GANs?

I have been trying to wrap my head around the explanation of the vanishing gradients problem of GANs for quite some time:

The current solution pdf document plots how the loss function values vary with the input of the sigmoid function to explain the (non-)saturating behavior. However, I wonder whether that plot only captures the saturation of the sigmoid function, and not the saturation behavior of the G loss function itself.

The NIPS 2016 GAN tutorial shows in figure 16 (p. 26) an explanation of the saturating loss that does not take the sigmoid function into account. With this explanation, I guess, the saturation behavior is explained through the gradients for G when G is not (yet) able to generate good fakes and D can easily identify them as fake (x = 0 or close to 0).
See a plot of the saturating and non-saturating loss functions and their derivatives. There, the saturating loss has a small gradient of around -1, while the non-saturating loss has a gradient approaching -infinity at x = 0.
When I plot the gradients over the course of training for both loss functions, I also get higher gradient means and higher standard deviations for the non-saturating loss compared to the saturating loss (see notebook).
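
For reference, here is a minimal numpy sketch (my own, not from the curriculum) of the two generator losses parameterized directly by x = D(G(z)), as in the tutorial figure; it reproduces the gradient behavior described above:

    import numpy as np

    # x = D(G(z)) in (0, 1): the discriminator's output for a fake sample.
    x = np.array([0.01, 0.1, 0.3, 0.5, 0.9])

    # Saturating generator loss:      L_sat(x)    =  log(1 - x)
    # Non-saturating generator loss:  L_nonsat(x) = -log(x)
    grad_sat = -1.0 / (1.0 - x)   # d/dx log(1 - x)
    grad_nonsat = -1.0 / x        # d/dx -log(x)

    for xi, gs, gn in zip(x, grad_sat, grad_nonsat):
        print(f"D(G(z)) = {xi:.2f}   saturating grad = {gs:8.2f}   "
              f"non-saturating grad = {gn:8.2f}")

    # Near x = 0 (D confidently rejects the fakes, as early in training),
    # the saturating loss has a bounded gradient of about -1 while the
    # non-saturating loss has a gradient tending to -infinity, so the
    # non-saturating loss gives G the stronger learning signal there.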

Maybe I am missing something?

I would be happy if somebody could point me in the right direction.

Typos in "About" page

http://www.depthfirstlearning.com/about/
Typo 1:
Second paragraph, third sentence, extraneous "is".
"We spent some time understanding each paper and writing down the core concepts on which they were is built."

Typo 2:
Fourth paragraph, fourth sentence, duplicate "out".
"Check out out github for examples and open an issue detailing which paper you want to study."

Broken links in Stein Variational Gradient Descent

Hey all!

I've just been taking a look at the SVGD curriculum and unfortunately, there are a number of broken links. Hopefully, some of these can be fixed without too much trouble! Here is a non-exhaustive list:

  1. https://www.win.tue.nl/~rvhassel/Onderwijs/Old-Onderwijs/2DE08-1011/ConTeXt-OWN-FA-201209-Bib/Literature/sigma-algebra/gc_06_measure_theory.pdf
  2. https://www.ims.nus.edu.sg/Programs/stein09/files/A%20Gentle%20Introduction%201.pdf
  3. https://www.cs.dartmouth.edu/~qliu/PDF/ksd_short.pdf
  4. https://www.cs.dartmouth.edu/~qliu/PDF/slides_ksd_icml2015.pdf
  5. https://www.cs.dartmouth.edu/~qliu/PDF/steinslides16.pdf

I found (I think) an alternative source for 1) here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.124.3854&rep=rep1&type=pdf, but I didn't manage to find replacements for the rest. There could also be more broken links.
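
In case it helps with triage, here is a minimal Python sketch (my own; it assumes the third-party requests package is installed) that checks the status of the listed URLs:

    # Quick link check for the URLs above (pip install requests).
    import requests

    urls = [
        "https://www.win.tue.nl/~rvhassel/Onderwijs/Old-Onderwijs/2DE08-1011/ConTeXt-OWN-FA-201209-Bib/Literature/sigma-algebra/gc_06_measure_theory.pdf",
        "https://www.ims.nus.edu.sg/Programs/stein09/files/A%20Gentle%20Introduction%201.pdf",
        "https://www.cs.dartmouth.edu/~qliu/PDF/ksd_short.pdf",
        "https://www.cs.dartmouth.edu/~qliu/PDF/slides_ksd_icml2015.pdf",
        "https://www.cs.dartmouth.edu/~qliu/PDF/steinslides16.pdf",
    ]

    for url in urls:
        try:
            # HEAD keeps the check cheap; some servers only answer GET properly.
            status = requests.head(url, allow_redirects=True, timeout=10).status_code
        except requests.RequestException as exc:
            status = f"error ({type(exc).__name__})"
        print(status, url)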

Cheers,

James
