deeplearning-papernotes's Issues

Hope updates are coming

Hi, Denny! I have been following your notes for a long time, since I was a college student.
Hope I can see some new thoughts from you!

Response to notes on "Neural Machine Translation with Reconstruction"

Hi Denny,

Thank you for your interest in our reconstruction work. I am the first author of this paper, and I will try to answer your questions.

  • I feel like "adequacy" is a somewhat strange description of what the authors try to optimize. Wouldn't "coverage" be more appropriate?

Adequacy and/or fluency evaluations are regularly used to assess the quality of machine translation. Adequacy measures how much of the meaning expressed in the source is also expressed in the target translation. It is well known that NMT favors fluent but inadequate translations, which suffer not only from coverage problems (e.g., over-translation and under-translation) but also from mis-translation (e.g., wrong sense and unusual usage) and spurious translation (i.e., translation segments without any reference in the source).

  • In Table 1, why does BLEU score still decrease when length normalization is applied? The authors don't go into detail on this.

As shown in Table 1, likelihood with length normalization favors long translations, which may face over-translation problems.
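To illustrate why length normalization can favor longer outputs, here is a minimal sketch. It assumes normalization means dividing the summed log-probability by the number of target tokens, which may differ from the exact scoring used in the paper; the function `score` and the token log-probabilities are hypothetical.

```python
def score(token_logprobs, length_normalize=False):
    """Sum of token log-probabilities, optionally divided by the length."""
    total = sum(token_logprobs)
    return total / len(token_logprobs) if length_normalize else total

# Hypothetical per-token log-probabilities for two candidate translations.
short = [-0.5, -0.5, -0.5]   # 3 tokens, total -1.5, average -0.5
long_ = [-0.375] * 6         # 6 tokens, total -2.25, average -0.375

candidates = [short, long_]
raw_best = max(candidates, key=score)
norm_best = max(candidates, key=lambda c: score(c, length_normalize=True))

print(len(raw_best), len(norm_best))
```

In this toy case the raw likelihood picks the short candidate, while the normalized score picks the long one, because each extra token no longer adds to the total penalty.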

  • The training curves are a bit confusing/missing. I would've liked to see a standard training curve that shows the MLE objective loss and the finetuning with reconstruction objective side-by-side.

We try to show that the increase in translation performance is indeed due to the improvement of reconstruction over time. We care more about the improvement in translation performance, since it is the ultimate goal of NMT.

  • The training procedure is somewhat confusing. They say "We further train the model for 10 epochs" with reconstruction objective, but then "we use a trained model at iteration 110k". I'm assuming they do early-stopping at 110k * 80 = 8.8M steps. Again, would've liked to see the loss curves for this, not just BLEU curves.

During training, we validate translation performance every 10K iterations. We select the models that yield the best performance on the validation set. This is a standard procedure for selecting a well-trained model in NMT.

Again, we care more about the ultimate goal of NMT -- translation performance measured in BLEU scores.
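The selection procedure described above can be sketched roughly as follows. This is only an illustration: the helper `select_best_checkpoint` and the BLEU values are made up for the example and are not taken from the paper.

```python
def select_best_checkpoint(validation_bleu):
    """Return the (iteration, BLEU) pair with the highest validation BLEU."""
    return max(validation_bleu.items(), key=lambda item: item[1])

# Hypothetical validation BLEU scores, measured every 10K iterations.
validation_bleu = {
    100_000: 30.1,
    110_000: 30.6,
    120_000: 30.4,
}

best_iteration, best_bleu = select_best_checkpoint(validation_bleu)
print(best_iteration, best_bleu)
```

The checkpoint is chosen by validation BLEU alone; the training loss plays no role in the selection.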

  • I would've liked to see model performance on more "standard" NMT datasets like EN-FR and EN-DE, etc.

We only test on the Chinese-English translation task, which uses the same data as in our previous work, "Modeling Coverage for Neural Machine Translation" and "Context Gates for Neural Machine Translation".

  • Is there perhaps a smarter way to do reconstruction iteratively by looking at what's missing from the reconstructed output? Training the reconstructor with MLE has some of the same drawbacks as training standard enc-dec with MLE and teacher forcing.

Really good point! There should be a better way to model the reconstruction (e.g., focusing only on the wrong part). We will study this in the future.

MLE favors fluent but inadequate translations, and thus is not an optimal objective for NMT. It is necessary to introduce a better objective, such as sentence-level BLEU (Shen et al., 2016), or an auxiliary objective such as reconstruction (in this work) or a coverage penalty (in the GNMT paper).
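As a rough sketch of what an auxiliary reconstruction objective could look like, the snippet below assumes the reconstruction log-probability is simply added to the translation log-likelihood with a weight. The function `joint_loss`, the weight `lam`, and all numbers are assumptions for illustration, not the exact formulation of the paper.

```python
def joint_loss(translation_logprob, reconstruction_logprob, lam=1.0):
    """Negative of (translation log-likelihood + lam * reconstruction log-probability)."""
    return -(translation_logprob + lam * reconstruction_logprob)

# Hypothetical scores: (log P(target | source), log P(source | decoder states)).
fluent_but_inadequate = (-2.0, -8.0)  # likely output, source is hard to reconstruct
adequate = (-2.5, -3.0)               # slightly less likely, source is easy to reconstruct

for lam in (0.0, 1.0):
    losses = {
        "fluent_but_inadequate": joint_loss(*fluent_but_inadequate, lam=lam),
        "adequate": joint_loss(*adequate, lam=lam),
    }
    print(lam, min(losses, key=losses.get))
```

With `lam=0.0` the loss reduces to plain MLE and prefers the fluent but inadequate candidate; with `lam=1.0` the reconstruction term shifts the preference to the adequate one.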

Add link to code used in the paper

Could it be useful to also have the code used in the paper, when provided by the authors, next to the paper link? What do you think about a format like this for the link's variable name: webRepository_[framework/library|Language]_model/packageName?
(e.g. the tf-seq2seq link variable would be: [GH_TF_Tf-Seq2Seq])
Or, if you prefer something in the KISS way, a simple code as the variable name would be perfect.
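For illustration, the proposed naming scheme could be generated with a small helper like the one below. The function `link_variable` and its parameter names are hypothetical; only the resulting format follows the example given above.

```python
def link_variable(repository, framework, package):
    """Build a link variable name as [repository_framework_package]."""
    return f"[{repository}_{framework}_{package}]"

# Abbreviations taken from the example above: GH for the web repository, TF for the framework.
name = link_variable("GH", "TF", "Tf-Seq2Seq")
print(name)
```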
