recurrent_pytorch

This repository holds code for my RNN experiments in PyTorch (which I am learning in parallel). It is meant for weird experiment ideas and, eventually, blog posts/papers/garbage.

To-do:

  • Install PyTorch at home
  • Write input pipeline to shuffle pixels with a fixed random pattern
  • Write basic RNN
    • Make initial state learnable
    • Use layer norm for stability
  • Write GRU
  • Add zoneout
  • Baseline experiments with RNN & GRU
    • Write function for plotting training curves during training
    • Figure out how to save model checkpoints to enable reloading trained model
  • Post-training analysis of saved model
    • Write function for visualizing the important pixels used for classification
    • Write function to save collage of images w/ arbitrary printout below
      • Make collages of images with lowest and highest loss
    • Write function to generate gif to visualize test case:
      • On the left: the image appears one pixel at a time
      • On the right: the network outputs as a bar chart with the correct label highlighted
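
The pixel-shuffling item above could be sketched roughly as follows. This is a minimal stdlib-only illustration, not the repository's pipeline: the names `make_pixel_permutation` and `shuffle_pixels`, the seed value, and the 784-pixel image size are all assumptions made for the example. The point is that the ordering is drawn once from a seeded generator and then reused for every image, so the pattern stays fixed across training and evaluation.

```python
import random


def make_pixel_permutation(n_pixels, seed=0):
    """Return a random ordering of pixel indices that is reproducible from the seed."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    order = list(range(n_pixels))
    rng.shuffle(order)
    return order


def shuffle_pixels(flat_image, order):
    """Reorder a flattened image so that position k holds pixel order[k]."""
    return [flat_image[i] for i in order]


# Hypothetical flattened 28x28 image (the size is an assumption for illustration).
order = make_pixel_permutation(784, seed=0)
image = list(range(784))
shuffled = shuffle_pixels(image, order)
```

Building the ordering once and passing it around, rather than reshuffling per batch, is what keeps the pattern fixed. The same indexing idea carries over to tensors.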

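For the zoneout item, here is a rough sketch of the update rule as it is usually described: during training each hidden unit keeps its previous value with some probability, and at evaluation time the expected value of that stochastic update is used instead. The function name `zoneout`, its parameters, and the plain-list representation of hidden states are assumptions for illustration. A real version would operate on tensors inside the recurrent cell.

```python
import random


def zoneout(h_prev, h_new, rate, training, rng=random):
    """Mix previous and candidate hidden states unit by unit.

    h_prev, h_new: equal-length lists of floats (hypothetical representation).
    rate: probability that a unit keeps its previous value.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    if training:
        # Each unit independently keeps its old value with probability `rate`.
        return [p if rng.random() < rate else n for p, n in zip(h_prev, h_new)]
    # At evaluation time, use the expected value of the stochastic update.
    return [rate * p + (1.0 - rate) * n for p, n in zip(h_prev, h_new)]
```

With `rate=0.0` this reduces to the ordinary update, and with `rate=1.0` the hidden state never changes, which gives two easy sanity checks.
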
Miscellaneous ideas:

  • Can we learn intermediate labels (i.e. half-way through the sequence, something that starts to resemble true label) for additional supervision?
    • Add a fully-connected net to map intermediate hidden states to final states
    • Use the output weights from the RNN to produce soft intermediate labels from the predicted final states for each intermediate hidden state
  • Add minimalRNN (https://arxiv.org/abs/1711.06788)
  • Active learning in noisy datasets where the model is allowed to reject up to a certain percentage of the training examples after seeing the loss (need to add corrupted labels)
    • Separate meta net takes model output and loss, produces reject probability
    • Start with high temperature (exploration) and reduce as learning progresses
  • Are recurrent nets more or less susceptible to adversarial examples than convolutional nets?
    • Note: input is lower dimensionality, but recurrence may lead to exploitable instabilities (could go either way)
  • Add aleatoric uncertainty
    • Backprop uncertainty to pixels and highlight highly-confusing pixels in validation examples with highest uncertainty
  • Surprise gate: the network predicts the next input from its hidden state & gives higher value to less predictable inputs
  • Figure out how to obtain/plot information theoretic results:
    • Mutual information between hidden state and input/output/next hidden state
    • Does mutual information track with weight correlation/symmetry (between the input/output/recurrent weight vectors for each unit)?
    • Track redundancies in the hidden units throughout training, where redundancies are defined as 2*H(o|h_i,h_j)-H(o|h_i)-H(o|h_j) or maybe I(h_i;h_j)
    • How does the mutual information between hidden unit activations relate to the Hessian? Can we calculate the Hessian in the case where n_hidden is small? Measure "how diagonal" the Hessian is and compare to: redundancies, long-term memory, training speed
  • Explore multiplicative terms and hypernets...
    • Implement hyperRNN and multiplicative RNN
    • Plot gradient stats and redundancies throughout training, then try to figure out what happened
  • Experiment with evolutionary strategies for training
  • Experiment with decoupled neural interfaces
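
One possible starting point for the I(h_i;h_j) redundancy idea above is a plug-in estimate computed from discretized activations. The sketch below is only that: the helper names `mutual_information` and `binarize`, and the choice to threshold activations at zero, are assumptions for illustration. Plug-in estimates are also biased when there are few samples, so the numbers are best read as relative rather than absolute.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def binarize(activations, threshold=0.0):
    """Discretize activations to 0/1 (the threshold is an assumed choice)."""
    return [1 if a > threshold else 0 for a in activations]


# Hypothetical activations of two hidden units over four inputs.
h_i = binarize([-0.3, 0.8, -0.1, 0.5])
h_j = binarize([-0.9, 0.2, -0.4, 0.7])
redundancy = mutual_information(h_i, h_j)
```

Two units that always fire together score the full entropy of either unit, while statistically independent units score zero, so tracking this value for each pair during training gives a crude redundancy map.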

Contributors: jrbtaylor
