mimick's People

Contributors

dhgarrette avatar kelseyball avatar muralibalusu12 avatar ruyimarone avatar yuvalpinter avatar


mimick's Issues

Code runs very slow on GPU

I ran the Mimick algorithm on a small dataset, and one epoch takes 5 minutes on CPU but 40 minutes on GPU. Is there a way to fix this?

Can we increase the batch size?

Transformer Models

Hi, is it possible to integrate Mimick with transformer-based models, such as a variant of BERT?

params in MomentumSGDTrainer

Hi, I was trying out your demo when I ran into an error at line 166 of Mimick/mimick/model.py:
trainer = dy.MomentumSGDTrainer(model.model, options.learning_rate, 0.9, 0.1)

The error message shows that MomentumSGDTrainer takes only three parameters:
MomentumSGDTrainer(ParameterCollection &m, real learning_rate = 0.01, real mom = 0.9)

I'm wondering whether this is a version conflict, but I installed DyNet v2.0 following your README.

So what is the last parameter, 0.1? Can I simply delete it?

Thanks in advance!
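If DyNet 2.0's signature really is MomentumSGDTrainer(m, learning_rate, mom) as the error suggests, then the trailing 0.1 (presumably the old edecay argument, dropped from trainer constructors in DyNet 2.0) can simply be deleted. A possible patch, assuming the line quoted above:

```diff
- trainer = dy.MomentumSGDTrainer(model.model, options.learning_rate, 0.9, 0.1)
+ trainer = dy.MomentumSGDTrainer(model.model, options.learning_rate, 0.9)
```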

Call `initial_state()` on all levels of BiLSTM

Currently, none of the DyNet LSTM code in the tagging task (model.py) makes use of the initial_state() method. This means:

  • Word-level LSTM states are not reset between sentences (although the computation graph is renewed, so there is no backprop across sentences).
  • The char-level LSTM in char2tag or both mode keeps its state across words for the entire dataset. Within sentences, this also means backprop crosses word boundaries, since there is no call to renew_cg(). This effect may be insignificant due to the <PAD> characters, but I don't know for sure.
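The intended behavior can be illustrated with a stand-in RNN in plain Python (ToyRNN and its numbers are made up, not DyNet code): calling initial_state() once per sentence keeps the first sentence's hidden state from leaking into the second.

```python
# Stand-in RNN illustrating the proposed fix: reset state once per
# sentence, analogous to calling builder.initial_state() in DyNet.
class ToyRNN:
    def __init__(self):
        self.h = 0.0            # hidden state

    def initial_state(self):
        self.h = 0.0            # fresh state for a new sequence
        return self

    def step(self, x):
        self.h = 0.5 * self.h + x
        return self.h

rnn = ToyRNN()
sentences = [[1.0, 1.0], [1.0]]
outs = []
for sent in sentences:
    state = rnn.initial_state()  # reset between sentences
    outs.append([state.step(x) for x in sent])
print(outs)  # [[1.0, 1.5], [1.0]] -- sentence 1's state does not carry over
```

Without the per-sentence reset, the second sentence would start from h = 1.5 and produce 1.75 instead of 1.0.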

Compatibility with Python 3

Finally got round to experimenting with Mimick, only to discover that it targets Python 2 exclusively. (Insert rant that Python 3 is already a decade old.) Do you by any chance plan to add support for Python 3?

Thanks!

in_vocab count is set to zero

The variable in_vocab is set to zero on line 80 of make_dataset.py. As a result, when there are OOV words in the vocab file, the "words in training" count in the output is always zero.
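For reference, the intended bookkeeping looks something like this (a hypothetical sketch with made-up data, not the actual make_dataset.py code):

```python
# in_vocab should count vocab words found in the training embeddings,
# rather than being reset to zero after the loop.
training_words = {"cat", "dog"}       # words with pre-trained embeddings
vocab = ["cat", "dog", "axolotl"]     # words in the vocab file

in_vocab = sum(1 for w in vocab if w in training_words)
oov = len(vocab) - in_vocab
print(in_vocab, oov)  # 2 1
```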

Find best Trainer

Since the upgrade to DyNet 2.0, training loss doesn't seem to converge for the Mimick algorithm (the tagger code is fine, and its models also make sense).

This seems to be due to the change in learning-rate behavior in DyNet's trainers. The current implementation uses AdamTrainer, but SGDTrainer and AdaGradTrainer exhibit the same issue.

Char2Tag takes wrong representations from backward LSTM

This line should not concatenate char_embs[-1]; it should instead use dy.concatenate([char_embs[-1][:h], char_embs[0][h:]]) for the appropriate h.

The in-model Mimick code is fine, since it uses separate forward and backward char-level models rather than DyNet's built-in BiRNNBuilder. The word-level BiLSTM is also fine, because it performs sequence prediction.
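The proposed slicing can be checked with plain lists standing in for DyNet expressions (the numbers are made up; h is the per-direction hidden size):

```python
h = 2  # hidden size of each LSTM direction
# Hypothetical per-character BiLSTM outputs for a 3-character word;
# each step is [forward half ; backward half].
char_embs = [
    [1, 2, 10, 20],  # first char: its BACKWARD half has read the whole word
    [3, 4, 30, 40],
    [5, 6, 50, 60],  # last char: its FORWARD half has read the whole word
]

# Buggy: the backward half of the last step has only seen one character.
buggy = char_embs[-1]
# Fixed: forward half of the last step + backward half of the first step.
fixed = char_embs[-1][:h] + char_embs[0][h:]
print(fixed)  # [5, 6, 10, 20]
```

This way, both halves of the word representation summarize the full character sequence.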
