
edu's Issues

Fix typos in autoencoder-mnist.ipynb

  1. Choosing an output activation
    • Review the logged outputs of your neural network in the Weights & Biases interface and look for issues caused by this choice of activation.
    • For example, here they are always between 0 and 1, to match the data they are compared against, while the activation values have no such restriction.
  2. Change Hyperparameters
    • Better results can be obtained by tweaking the hyperparameters.
  3. Challenge: Regularization and Learned Weights
    • With the default settings, the learned filters don't look like that at all.

https://github.com/wandb/edu/blob/4a438532a4cd53a025c348ecabed93bcc5dee646/lightning/autoencoder/autoencoder-mnist.ipynb

Add more autograded exercises to calculus notebook

Ideas (both sketched in code below)

  1. Little-o exercises. They'd be multiple choice, basically. And/or: provide a function that is little-o of x^2? I can check this with sympy.
  2. Linearity of the gradient approximation. Write a function to compute gradient approx, given the function and its gradient. Compare to finite differences?
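
A hedged sketch of how both ideas could be autograded; the function names are illustrative, not the notebook's actual API, and the little-o check takes the limit at 0:

```python
import numpy as np
import sympy as sp

x = sp.symbols("x")

# Idea 1: f is little-o of x^2 (as x -> 0) iff lim f(x)/x^2 = 0.
def is_little_o_of_x_squared(f_expr):
    return sp.limit(f_expr / x**2, x, 0) == 0

assert is_little_o_of_x_squared(x**3)      # x^3 is o(x^2) near 0
assert not is_little_o_of_x_squared(x**2)  # x^2 is not o(x^2)

# Idea 2: compare a submitted gradient against a finite-difference estimate.
def check_gradient(f, grad, x0, h=1e-5, rtol=1e-3):
    fd = (f(x0 + h) - f(x0)) / h           # forward finite difference
    return np.isclose(fd, grad(x0), rtol=rtol)

assert check_gradient(lambda t: t**2, lambda t: 2 * t, x0=1.0)
```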

rethink the logging code from the LoggedLitModule

This was written when I had half as much experience with Lightning as I have now, and before the most recent integration.

I should rethink it, with emphasis on the following:

  • moving material into callbacks (a sketch follows this list)
  • cutting down on the magic parameter inference
  • improving robustness (see #25)
  • reorganizing and making the structure hierarchical (see #20)
  • writing some tests, including e2e tests that write runs to wandb and pull metrics
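
For example, the loss logging could move into a small Callback along these lines (a sketch only: hook signatures vary across Lightning versions, and the metric name is made up):

```python
import pytorch_lightning as pl

class LossLoggingCallback(pl.Callback):
    """Log training loss from a callback instead of inside LoggedLitModule."""

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # `outputs` is whatever training_step returned; assume a {"loss": ...} dict
        if outputs is not None:
            pl_module.log("train/loss", outputs["loss"])

trainer = pl.Trainer(callbacks=[LossLoggingCallback()])
```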

run Colab-specific installs only on colab

right now, the installs are run regardless of environment, but they should only be run in Colab.

just need to move the !pip install commands into the appropriate if branch and then apply %%capture to the cell
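
Something like this sketch (the package list is a placeholder for whatever the notebook actually installs):

```python
%%capture
# %%capture suppresses the noisy pip output; the install only runs on Colab.
import sys

if "google.colab" in sys.modules:  # standard Colab detection
    %pip install wandb pytorch-lightning
```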

Consider creating a nn utilities file

On the one hand, it would reduce code duplication across Colabs; on the other, it really needs to be done well if it's going to be used everywhere -- we have to make sure it's e.g. DDP-compatible, the logging could be done better (more callbacks?), and we want to use PL best practices as much as possible.

Organize Math4ML into folders

We're adding more content and at the same time de-emphasizing existing content. We don't want to delete anything, so let's make a nested structure with folders and subfolders:

-- 00_{topic}/
   |  exercises.ipynb
   -- extras/
      |  {subtopic}.ipynb
-- 01_{topic}/

Better model saving for PyTorch

Caught on a dilemma with saving PyTorch models for viewing in Netron

  • OT1H, just saving as a .pt file results in unreliable performance by Netron, which can't really be expected to handle all the possible choices in both major libraries, and so prefers ONNX and lets the libraries handle the conversion, but
  • OTOH, ONNX can't handle e.g. the AdaptiveAvgPool2d layer, which is important for CNNs that are easy to play with. Seems like a pretty fundamental limitation w.r.t. adaptive layers.

In particular, for the MLP and CNN in PyTorch, I want to emphasize reusability and enable easy extension, and so I'm using ModuleLists and custom Modules. Neither is really gonna play nicely with Netron.

For now, I'm sticking with .pt files, which save but aren't visualized well (the ModuleLists aren't expanded, and that's where all the action is!).
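
To make the trade-off concrete, a sketch of both export paths on a toy model (whether the ONNX export of the adaptive layer fails or silently degrades depends on the opset and output size):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),
    nn.AdaptiveAvgPool2d((4, 4)),  # the adaptive layer ONNX struggles with
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 10),
)

# Path 1: plain .pt -- always saves, but Netron renders it poorly.
torch.save(model, "model.pt")

# Path 2: ONNX -- renders nicely in Netron when it works, but adaptive
# pooling to a general output size may fail to convert.
dummy = torch.randn(1, 1, 28, 28)
torch.onnx.export(model, dummy, "model.onnx")
```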

Make extras compatible with Colab

Would like to drop reliance on the W&B Hub, and Colab seems like the least-bad choice.

Options:

  1. Local. No lol. Installation issues are a dealbreaker.
  2. Binder. Ephemerality will be frustrating for students, and is a dealbreaker for classes where the HW is done between sessions.
  3. Colab. Avoids ephemerality. The limitation to the matplotlib inline backend for interactive charts is frustrating but not a showstopper.
  4. Gradient. Requires a login, possibly expensive, unclear if better, GPUs not needed for this class.

Deduplicate projects

Architecture search was put under cnn/ when it fits much better under projects/. For now it's duplicated, but it should only be in projects/.

There are currently two versions of the FER project, but there should only be one. Once the more robust NN utils are in place (#20, #25, #26), we can deduplicate.

improve the robustness of the utils

Networks that are being pruned with torch.nn.utils.prune break certain assumptions in e.g. the parameter counting, as do, I believe, the quantized networks.

These should be resolved (incorporating fixes from the relevant notebooks, when possible) so that the utils are more robust.
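
A sketch of the failure mode: pruning reparameterizes the module, so the usual parameter names disappear and naive counts go wrong.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(4, 4)
print([name for name, _ in layer.named_parameters()])
# ['weight', 'bias']

prune.l1_unstructured(layer, name="weight", amount=0.5)
print([name for name, _ in layer.named_parameters()])
# ['bias', 'weight_orig'] -- 'weight' is now a derived attribute
print([name for name, _ in layer.named_buffers()])
# ['weight_mask']

# Any utility that looks up "weight" in named_parameters(), or that assumes
# every parameter element is a live weight, now gives misleading counts.
```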

Make a better test for the array_is_pmf question

Right now, the test doesn't appropriately check for values above 1 or below 0, because the example arrays don't add up to 1 -- and summing to 1 is what most folks check first.

np.array([-1, 2]) and/or np.array([-0.5, 1.1, 0.2, 0.2]) would do it: both sum to 1 but contain out-of-range entries.
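
For concreteness, a reference implementation plus the proposed cases (a sketch; the notebook's actual signature may differ):

```python
import numpy as np

def array_is_pmf(arr):
    # a PMF: entries in [0, 1] that sum to 1
    return bool(np.all(arr >= 0) and np.all(arr <= 1) and np.isclose(arr.sum(), 1.0))

# Both proposed arrays sum to 1, so they get past the sum check and
# actually exercise the bounds checks.
assert not array_is_pmf(np.array([-1.0, 2.0]))            # negative entry
assert not array_is_pmf(np.array([-0.5, 1.1, 0.2, 0.2]))  # entry above 1
assert array_is_pmf(np.array([0.25, 0.25, 0.5]))          # a genuine PMF
```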

fix mixed 2-space and 4-space indentation

Colab defaults to 2-space indentation, but much of the code is in 4-space -- and other Jupyter instances don't like 2-space indentation.

Affected in the calculus exercises: is_little_o, identity, constant.

Move SVD material

The SVD material is interesting, but hard to make concrete and compelling with the constraints we have (unless I come up with a slick "LA-as-programming" explanation of kernels and maybe also eigenvalues, which is tougher).

I should move it into a separate notebook.

Better DataLoader practices

We should make the dataloaders more configurable for the AbstractMNISTDataModule -- we can probably fix pin_memory to True, but should allow configuration of num_workers (with a default of 2 or nproc, depending on how far we want to go).
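
A sketch of the proposed defaults (make_loader is an illustrative name; the real change would live in the DataModule's *_dataloader methods):

```python
from torch.utils.data import DataLoader

def make_loader(dataset, batch_size=128, num_workers=None):
    if num_workers is None:
        num_workers = 2  # or os.cpu_count() for the "nproc" option
    return DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=num_workers,
        pin_memory=True,  # fixed to True: a cheap win for GPU transfer
    )
```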

Add more "Linear Algebra as Programming" Exercises

This is the core idea of the lecture slides, but there aren't enough exercises for it. They require a certain amount of creativity, but here are a few possibilities (ideas 2 and 5 are sketched in code after the list):

  1. Shapes and types. Debug a shape issue? Write code that checks a shape, by analogy with checking a type?
  2. Batch dimensions/broadcasting and for loops. Write a for loop that does the same thing as a broadcasted multiply. Make it a "batch" application.
  3. Convolutions and zips. Harder: write a zip that's equivalent to a convolution.
  4. Parallel composition. Use concat/stack operations to implement applying two "functions" to the same input. Something else with the Kronecker product?
  5. repeat. Use matrix multiplication (an outer product?) to copy the input k times.
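
Hedged numpy sketches of ideas 2 and 5 (the exercise framing would differ):

```python
import numpy as np

# Idea 2: a broadcasted/batched matrix apply agrees with an explicit loop.
W = np.random.rand(3, 4)
xs = np.random.rand(10, 4)               # a "batch" of 10 input vectors
batched = xs @ W.T                       # one broadcasted application
looped = np.stack([W @ x for x in xs])   # the equivalent for loop
assert np.allclose(batched, looped)

# Idea 5: k copies of x via an outer product with a vector of ones.
x = np.array([1.0, 2.0, 3.0])
k = 4
assert np.allclose(np.outer(np.ones(k), x), np.tile(x, (k, 1)))
```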

add more exercises to probability nb

Ideas (the last two are sketched in code below):

  1. PMFs and PDFs. Dictionaries versus densities.
  2. Softmax. Define a softmax function for a PMF.
  3. Cross entropy loss. Implement it, based on a formula? Based on PMF or PDF?
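
Hedged sketches of ideas 2 and 3 (illustrative names, not the notebook's API):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()         # nonnegative and sums to 1: a valid PMF

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_i p_i * log(q_i), for PMFs over the same support
    return -np.sum(p * np.log(q + eps))

p = softmax(np.array([2.0, 1.0, 0.1]))
assert np.isclose(p.sum(), 1.0)
# Gibbs' inequality: cross entropy is minimized when q = p.
assert cross_entropy(p, p) <= cross_entropy(p, softmax(np.zeros(3)))
```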

Add implementing gradient descent to the calculus nb

Given the gradient and parameters, apply one gradient descent step and return the new parameters.

Check that (a sketch follows the list):

  • Stationary at a maximum and a minimum
  • Stationary if lr is 0
  • Optimizes quadratic in one step with the right lr
  • Works on vector-valued versus scalar-valued inputs?
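
A sketch of the function and its checks (gradient_descent_step is an illustrative name):

```python
import numpy as np

def gradient_descent_step(params, grad, lr):
    return params - lr * grad  # one step of vanilla gradient descent

x = np.array([3.0])

# stationary when the gradient is zero (at a max or a min) ...
assert np.allclose(gradient_descent_step(x, np.zeros_like(x), lr=0.1), x)
# ... and when the learning rate is zero
assert np.allclose(gradient_descent_step(x, np.array([2.0]), lr=0.0), x)

# optimizes f(x) = a/2 * x^2 in one step with lr = 1/a, since grad = a*x
a = 4.0
assert np.allclose(gradient_descent_step(x, a * x, lr=1.0 / a), 0.0)

# works elementwise on vector-valued parameters too
v = np.array([1.0, -2.0, 0.5])
assert gradient_descent_step(v, 2 * v, lr=0.1).shape == v.shape
```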

fix Binder compatibility

The Binder setup needs to be tweaked now that we're in v2. The Dockerfile is no longer in the right place, which is going to be a PITA to fix. That Dockerfile also needs to be updated.

move MLP utilities to Lightning

The utils I wrote for the lightning/mlp/ notebooks are useful more broadly in the lightning material.

They should perhaps be moved up to the lightning/ folder. This will require changing some code in the extant lightning colabs.
