dennybritz / deeplearning-papernotes
Summaries and notes on Deep Learning research papers
Hi, Denny! I have been following your notes for a long time, ever since I was a college student.
Hope I can see some new thoughts from you!
Hi Denny,
Thank you for your interest in our reconstruction work. I am the first author of this paper, and I will try to answer your questions.
Adequacy and/or fluency evaluations are regularly employed for assessing the quality of machine translation. Adequacy measures how much of the meaning expressed in the source is also expressed in the target translation. It is well known that NMT favors fluent but inadequate translations, which suffer not only from coverage problems (e.g., over-translation and under-translation) but also from mistranslation (e.g., wrong sense and unusual usage) and spurious translation (i.e., translation segments without any reference in the source).
As shown in Table 1, likelihood with length normalization favors long translations, which may face over-translation problems.
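To illustrate why length normalization can favor longer outputs, here is a minimal sketch. It assumes that "length normalization" means dividing the total log-likelihood by the number of target tokens; the function name and the per-token log-probabilities are hypothetical values chosen for illustration, not figures from the paper.

```python
from typing import List


def normalized_score(token_log_probs: List[float]) -> float:
    """Average log-probability per target token (assumed normalization)."""
    return sum(token_log_probs) / len(token_log_probs)


# Hypothetical per-token log-probabilities for two candidate translations.
short_candidate = [-1.0] * 3   # 3 tokens, total log-likelihood -3.0
long_candidate = [-0.5] * 8    # 8 tokens, total log-likelihood -4.0

# Raw likelihood prefers the short candidate ...
raw_prefers_short = sum(short_candidate) > sum(long_candidate)

# ... while the length-normalized score prefers the long one.
normalized_prefers_long = (
    normalized_score(long_candidate) > normalized_score(short_candidate)
)

print(raw_prefers_short, normalized_prefers_long)
```

Under this assumption, a long candidate made of many moderately likely tokens can outrank a shorter one even though its total likelihood is lower, which is one way over-translation can slip through.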
We try to show that the increase of translation performance is indeed due to the improvement of reconstruction over time. We care more about the improvement of translation performance, since it is the ultimate goal of NMT.
During training, we validate the translation performance every 10K iterations. We select the models that yield the best performance on the validation set. This is a standard procedure for selecting a well-trained model in NMT.
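The selection procedure described above can be sketched roughly as follows. This is an illustrative sketch only: `select_best_checkpoint`, the `validate` callback, and the BLEU values are hypothetical and do not come from the authors' code.

```python
from typing import Callable, Iterable, Tuple


def select_best_checkpoint(
    checkpoints: Iterable[int],
    validate: Callable[[int], float],
) -> Tuple[int, float]:
    """Return (iteration, score) for the checkpoint with the highest validation score."""
    best_iter, best_score = -1, float("-inf")
    for iteration in checkpoints:
        score = validate(iteration)  # e.g., BLEU on the validation set
        if score > best_score:
            best_iter, best_score = iteration, score
    return best_iter, best_score


# Hypothetical validation BLEU scores, one per 10K iterations (illustration only).
fake_bleu = {10_000: 28.1, 20_000: 30.4, 30_000: 29.9}
best = select_best_checkpoint(sorted(fake_bleu), fake_bleu.__getitem__)
print(best)  # (20000, 30.4)
```

Passing the validation step as a callback keeps the sketch independent of any particular training framework.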
Again, we care more about the ultimate goal of NMT -- translation performance measured in BLEU scores.
We only test on the Chinese-English translation task, which uses the same data as in our previous work Modeling Coverage for Neural Machine Translation and Context Gates for Neural Machine Translation.
Really good point! There should be a better way to model the reconstruction (e.g., focusing only on the wrong part). We will study this in the future.
MLE favors fluent but inadequate translations, and thus is not an optimal objective for NMT. It is necessary to introduce a better objective, such as sentence-level BLEU (Shen et al., 2016), or an auxiliary objective such as reconstruction (in this work) or a coverage penalty (in the GNMT paper).
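As a rough sketch of what combining likelihood with an auxiliary reconstruction objective could look like, one can write a joint training objective of the following form. The notation here is an assumption made for illustration: $\theta$ and $\gamma$ denote the translation and reconstruction parameters, $s^{n}$ the decoder hidden states for the $n$-th sentence pair, and $\lambda$ a weight balancing the two terms.

```latex
J(\theta, \gamma) = \sum_{n=1}^{N} \Big\{
    \underbrace{\log P(y^{n} \mid x^{n}; \theta)}_{\text{likelihood}}
    + \lambda \,
    \underbrace{\log R(x^{n} \mid s^{n}; \gamma)}_{\text{reconstruction}}
\Big\}
```

With $\lambda = 0$ this reduces to plain MLE; larger values put more weight on how well the source can be recovered from the translation.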
Could it be useful to also have the code used in the paper, when provided by the authors, next to the paper link? What do you think about this format for the link's variable name: webRepository_[framework/library|Language]_model/packageName?
(e.g., the tf-seq2seq link variable would be [GH_TF_Tf-Seq2Seq])
Or, if you prefer to keep it simple (KISS), a plain "code" as the variable name would be perfect.
These notes would be well suited to shortscience, the goal on that site being to provide concise summaries of papers in markdown.
I have read the paper, but I cannot understand how model2 works. You said you think you understand what it does, so could you please explain the cond. model in more detail? Thanks very much.
Thanks for this useful summary page. Have you considered also providing links to the relevant GitXiv entries, when available?
If you are not familiar with GitXiv, have a look: http://www.gitxiv.com/
For example, here is the WaveNet entry: http://www.gitxiv.com/posts/W5ax2LzW9Ht3mtTk2/wavenet-a-generative-model-for-raw-audio
Thanks again!