RNN Character Level Language Model

Language model based on "The Unreasonable Effectiveness of Recurrent Neural Networks" from Andrej Karpathy's blog.
A general TensorFlow implementation of an LSTM-based character-level language model, which models the probability distribution of the next character in a sequence given the previous characters.
[Figure: charrnn] The above image is taken from the mentioned blog.
For the complete details of the dataset, preprocessing, network architecture and implementation, refer to this Wiki.
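
As a rough illustration of what "modelling the probability distribution of the next character" means, the sketch below turns one score per character into probabilities. It assumes the network ends in a softmax over the character vocabulary, which is the usual choice for this kind of model but is not confirmed here; the function name, the vocabulary and the scores are all made up for the example.

```python
import math


def next_char_distribution(logits, vocab):
    """Turn one score (logit) per character into a probability distribution."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return {char: value / total for char, value in zip(vocab, exps)}


# Hypothetical vocabulary and scores, as if produced after reading "hell".
vocab = ["h", "e", "l", "o"]
logits = [0.1, 0.2, 1.0, 3.0]

probs = next_char_distribution(logits, vocab)
best = max(probs, key=probs.get)  # the most likely next character
```

Sampling from `probs` instead of taking the most likely character is what produces varied generated text.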

Requirements

What's Interesting

This implementation will:

  • provide support for arbitrary length input sequences by training the recurrent network using Truncated Backpropagation Through Time (TBPTT). Backpropagating over a fixed number of steps at a time keeps the memory and compute per update bounded and reduces the problem of vanishing gradients for very long input sequences.

  • provide support for stacked LSTM layers with residual connections for efficient training of the network.

  • provide support for introducing different types of *random mutations in the input sequence to simulate real-world data, such as:

    1. dropping characters in the input sequence
    2. introducing additional white spaces between two words
  • use an input pipeline based on TensorFlow primitive readers and queue runners, which prefetch the data and make training up to 1.5-2X faster on hardware accelerators. Prefetching reduces the total stall time of the accelerators, so they are used more efficiently.

*Random mutations in the input sequence improve the robustness of the trained model against real-world data.
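
The two mutations listed above can be sketched in plain Python as follows. This is only an illustration of the idea, not the code used in this repository: the function names, the probabilities and the `max_extra` parameter are assumptions made for the example.

```python
import random


def drop_characters(text, drop_prob, rng):
    """Drop each character independently with probability drop_prob."""
    return "".join(ch for ch in text if rng.random() >= drop_prob)


def add_whitespace(text, insert_prob, rng, max_extra=2):
    """After a space, add 1..max_extra more spaces with probability insert_prob."""
    out = []
    for ch in text:
        out.append(ch)
        if ch == " " and rng.random() < insert_prob:
            out.append(" " * rng.randint(1, max_extra))
    return "".join(out)


def mutate(text, drop_prob=0.05, insert_prob=0.1, seed=None):
    """Apply both mutations; a fixed seed makes the result reproducible."""
    rng = random.Random(seed)
    return add_whitespace(drop_characters(text, drop_prob, rng), insert_prob, rng)


clean = "the quick brown fox"
noisy = mutate(clean, seed=0)
```

Keeping the random number generator explicit makes the mutations easy to reproduce while debugging and easy to vary between training epochs.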

Implementation

  • tf.train.SequenceExample for storing and reading input sequences of arbitrary length

  • tf.contrib.training.batch_sequences_with_states for splitting and batching input sequences for TBPTT while maintaining the state of the recurrent network for each input example

  • tf.nn.dynamic_rnn for dynamic unrolling of each input example up to its actual length and not over the padding at the end. This is more for correctness than for efficiency
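
The splitting, state carrying and padding behaviour described above can be illustrated without TensorFlow. The sketch below is a toy under stated assumptions: `toy_cell` stands in for an LSTM step, the helper names are hypothetical, and `num_unroll` is borrowed from the argument of the same name in tf.contrib.training.batch_sequences_with_states. It shows only the forward pass; in real TBPTT the gradients would also stop at each segment boundary.

```python
def split_into_segments(sequence, num_unroll, pad=0):
    """Yield (segment, length) pairs of size num_unroll; the last one is padded."""
    for start in range(0, len(sequence), num_unroll):
        chunk = list(sequence[start:start + num_unroll])
        length = len(chunk)
        chunk += [pad] * (num_unroll - length)
        yield chunk, length


def toy_cell(state, x):
    """Stand-in for an LSTM step: returns (output, new_state)."""
    new_state = 0.5 * state + x
    return new_state, new_state


def run_segment(segment, length, state):
    """Unroll the cell over one segment, skipping the padded tail."""
    outputs = []
    for t, x in enumerate(segment):
        if t < length:
            out, state = toy_cell(state, x)
        else:
            out = 0.0  # past the true length: zero output, state copied through
        outputs.append(out)
    return outputs, state


sequence = [1, 2, 3, 4, 5, 6, 7]
state = 0.0
for segment, length in split_into_segments(sequence, num_unroll=3):
    # The final state of one segment is the initial state of the next.
    outputs, state = run_segment(segment, length, state)

# Reference: process the whole sequence in a single pass.
full_state = 0.0
for x in sequence:
    _, full_state = toy_cell(full_state, x)
```

Because the padded steps leave the state untouched, the segmented run ends in the same state as the single pass, which is the correctness point made about padding above.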
