my_seq2seq's Introduction

Seq2Seq models

This project is an exercise in implementing different seq2seq (s2s) models in TensorFlow.

It is intended for learning only, so expect it to contain bugs. For experiments and for training seq2seq models, I suggest using the nmt project, which is listed in the Reference section.

Experiments

I am experimenting with CopyNet and the Pointer-Generator (pg) model on the LCSTS dataset; the code is in the lcsts branch.

Issues and suggestions are welcome.

Models

I have implemented the following models:

  • Basic seq2seq model
    • A model with a bidirectional RNN encoder and an attention mechanism
  • Seq2seq model
    • Same as the basic model, but uses a tf.data pipeline to process input data
  • GNMT model
    • Uses residual connections and the same attention as the GNMT model to speed up training
    • Refer to GNMT for more details
  • Pointer-Generator model
  • CopyNet model
    • A model that also supports a copy mechanism
    • Refer to CopyNet for more details
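
To make the copy mechanism concrete, here is a minimal sketch of the mixing step used by pointer-generator style models: the final word distribution blends the decoder's vocabulary softmax with the attention weights over the source tokens. This is an illustration in plain Python, not the code in this repository; the function name and the toy values are assumptions, and handling of out-of-vocabulary source words (the extended vocabulary) is left out. CopyNet combines generate and copy scores differently, so treat this as the pointer-generator variant only.

```python
def mix_copy_distribution(p_vocab, attention, source_ids, p_gen):
    """Blend generation and copy probabilities over the vocabulary.

    p_vocab:    decoder softmax over the vocabulary (list of floats)
    attention:  attention weights over source positions (list of floats)
    source_ids: vocabulary id of each source token (list of ints)
    p_gen:      probability of generating rather than copying, in [0, 1]
    """
    # Scale the generation distribution by p_gen.
    final = [p_gen * p for p in p_vocab]
    for weight, token_id in zip(attention, source_ids):
        # Attention mass on a source position is credited to that token's id.
        final[token_id] += (1.0 - p_gen) * weight
    return final


# Hypothetical toy values: a vocabulary of 3 words and 2 source tokens.
p_vocab = [0.1, 0.6, 0.3]
attention = [0.5, 0.5]
source_ids = [2, 2]  # both source tokens are word id 2
final = mix_copy_distribution(p_vocab, attention, source_ids, p_gen=0.8)
```

Because both inputs are probability distributions, the blended result still sums to one; words that appear in the source gain probability in proportion to the attention they receive.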

For implementation details, refer to the ReadMe in the model folder.

Structure

A typical sequence-to-sequence (seq2seq) model contains an encoder, a decoder, and an attention structure. TensorFlow provides many useful APIs for implementing a seq2seq model; you will usually need the following:

  • tf.contrib.rnn
    • Different RNNs
  • tf.contrib.seq2seq
    • Provides different attention mechanisms and a good implementation of beam search
  • tf.data
    • Data preprocessing pipeline APIs
  • Other APIs you need to build and train a model

Encoder

Use one of the following:

  • Multi-layer RNN
    • Use the last state of the last RNN layer as the initial decoder state
  • Bidirectional RNN
    • Use a Dense layer to convert the forward (fw) and backward (bw) states to the initial decoder state
  • GNMT encoder
    • A bidirectional RNN followed by several RNN layers with residual connections
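
The bidirectional case can be sketched as follows: the forward and backward final states are concatenated and passed through a dense projection so that the result matches the decoder's state size. This is a plain-Python illustration rather than the repository's TensorFlow code; the function name, the tanh activation, and the toy weights are all assumptions.

```python
import math


def dense_bridge(fw_state, bw_state, weights, bias):
    """Project concatenated forward/backward states to the decoder state size.

    weights has one row per decoder unit and
    len(fw_state) + len(bw_state) columns.
    """
    concat = fw_state + bw_state  # list concatenation: [fw ; bw]
    return [
        # tanh is an assumed activation; a linear projection also works.
        math.tanh(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]


# Hypothetical toy values: 2-unit fw/bw states, 2-unit decoder state.
fw_state = [0.5, -0.5]
bw_state = [1.0, 0.0]
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
bias = [0.0, 0.0]
init_state = dense_bridge(fw_state, bw_state, weights, bias)
```

The projection is what lets the encoder and decoder use different hidden sizes; without it, the concatenated state would be twice as wide as a single-direction decoder state.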

Decoder

  • Use a multi-layer RNN, and set the initial state of each layer to the initial decoder state
  • GNMT decoder
    • Apply attention only to the bottom layer of the decoder, so that multiple GPUs can be used during training

Attention

  • Bahdanau
  • Luong
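
The two attention types differ mainly in how they score a decoder state (the query) against each encoder output (the keys). The sketch below shows the additive Bahdanau score and the multiplicative (dot) Luong score in plain Python, followed by the shared softmax and context computation. It is an illustration under simplifying assumptions, not the tf.contrib.seq2seq implementation; all function names, the identity weight matrices, and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def luong_score(query, key):
    """Multiplicative (dot) score: query . key"""
    return sum(q * k for q, k in zip(query, key))


def bahdanau_score(query, key, w_query, w_key, v):
    """Additive score: v . tanh(W_q query + W_k key)"""
    projected = zip(matvec(w_query, query), matvec(w_key, key))
    hidden = [math.tanh(a + b) for a, b in projected]
    return sum(vi * hi for vi, hi in zip(v, hidden))


def attend(query, keys, score_fn):
    """Return attention weights and the weighted sum of the keys."""
    weights = softmax([score_fn(query, key) for key in keys])
    size = len(keys[0])
    context = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(size)]
    return weights, context


# Hypothetical toy values: one decoder state and two encoder outputs.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

luong_weights, luong_context = attend(query, keys, luong_score)
bahdanau_weights, bahdanau_context = attend(
    query, keys,
    lambda q, k: bahdanau_score(q, k, identity, identity, [1.0, 1.0]),
)
```

With the dot score, the first encoder output gets more weight because it points in the same direction as the query. The additive score depends on the learned matrices and vector, so its ranking on these toy values is only an artifact of the identity weights chosen here.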

Metrics

Right now only cross-entropy loss is implemented. I plan to add the following metrics:

  • BLEU
    • for translation problems
  • ROUGE
    • for summarization problems
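
Both planned metrics are built on n-gram overlap, which can be sketched in a few lines. The example below computes a BLEU-style clipped n-gram precision and a ROUGE-N recall against a single reference. It is a simplified illustration, not the metric code this project will use: full BLEU also combines several n-gram orders and applies a brevity penalty, and the function names and sentences here are assumptions.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_overlap(candidate, reference, n):
    """Number of candidate n-grams also in the reference, clipped by reference counts."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    return sum(min(count, ref[gram]) for gram, count in cand.items())


def ngram_precision(candidate, reference, n):
    """BLEU-style modified n-gram precision: overlap / candidate n-grams."""
    total = max(len(candidate) - n + 1, 0)
    return clipped_overlap(candidate, reference, n) / total if total else 0.0


def rouge_n_recall(candidate, reference, n):
    """ROUGE-N recall: overlap / reference n-grams."""
    total = max(len(reference) - n + 1, 0)
    return clipped_overlap(candidate, reference, n) / total if total else 0.0


# Hypothetical example sentences.
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
p1 = ngram_precision(candidate, reference, 1)  # 5 of 6 unigrams match
r2 = rouge_n_recall(candidate, reference, 2)   # 3 of 5 reference bigrams match
```

Precision asks how much of the output is supported by the reference, which suits translation; recall asks how much of the reference is covered by the output, which suits summarization.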

Dependency

  • TensorFlow 1.4 (tf-1.4)
  • Python 3

Run

Run the model on a toy dataset, i.e. reversing a sequence.

train:

python -m bin.toy_train

inference:

python -m bin.toy_inference
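
For readers unfamiliar with the reversal task, the sketch below shows what such a toy dataset could look like: each source is a random sequence of token ids and the target is the same sequence reversed. This is an assumed illustration, not necessarily how bin.toy_train builds its data; the function name and parameters are hypothetical.

```python
import random


def make_reverse_pairs(num_examples, max_len, vocab_size, seed=0):
    """Build (source, target) pairs where target is source reversed."""
    rng = random.Random(seed)  # seeded for reproducible toy data
    pairs = []
    for _ in range(num_examples):
        length = rng.randint(1, max_len)  # inclusive on both ends
        source = [rng.randrange(vocab_size) for _ in range(length)]
        pairs.append((source, source[::-1]))
    return pairs


pairs = make_reverse_pairs(num_examples=3, max_len=5, vocab_size=10)
for source, target in pairs:
    print(source, "->", target)
```

A task like this is useful for debugging because a correct model with attention should reach near-perfect accuracy quickly, so failures point to bugs rather than to data difficulty.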

You can also run on the en-vi dataset; refer to en_vietnam_train.py in bin for more details.

You can find more training scripts in the bin directory.

Reference

Thanks to the following resources:
