cot's Issues

TypeError: reduce_max() got an unexpected keyword argument 'keepdims'

Hi,

When I try to run python cot.py, I get the following error:
File "F:\study\EBGAN\CoT\Cooperative-Training\generator.py", line 57, in _g_recurrence
T_t = tf.stop_gradient(tf.reduce_max(-log_prob, axis=-1, keepdims=True))

TypeError: reduce_max() got an unexpected keyword argument 'keepdims'

Thank you for your help.
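
This error usually means the installed TensorFlow predates the NumPy-style keepdims keyword (added around TF 1.5); older releases only accept the legacy spelling keep_dims. A minimal workaround sketch under that assumption — the wrapper name max_with_keepdims is hypothetical, not part of the repo:

```python
# Version-agnostic reduce_max that keeps the reduced axis.
# Assumes the TypeError comes from an older TF (roughly pre-1.5)
# that only knows the legacy `keep_dims` keyword.
import tensorflow as tf

def max_with_keepdims(x, axis=-1):
    """reduce_max(..., keepdims=True) on both old and new TF."""
    try:
        return tf.reduce_max(x, axis=axis, keepdims=True)
    except TypeError:
        # Legacy spelling for TF releases that predate `keepdims`.
        return tf.reduce_max(x, axis=axis, keep_dims=True)
```

The failing line in generator.py would then read T_t = tf.stop_gradient(max_with_keepdims(-log_prob)). Alternatively, upgrading TensorFlow (the next issue reports running on 1.13.1) avoids the patch entirely.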

tf 1.6.0 not working, and did not get a good NLL oracle result

Hi,

First I tried tf 1.6.0, but ran into a complicated tf bug, so I switched to tf 1.13.1.

After running overnight, I got the output below; I don't know whether it's over-fitting or some other issue. It would be good if the authors could state (in the README) at which batch we should expect a good oracle NLL. (A small log-parsing sketch follows the output.)

Thanks!

batch: 87700 nll_oracle: 9.902623
batch: 87700 nll_test 7.6996207
mediator cooptrain iter#87700, balanced_nll 6.823340
mediator cooptrain iter#87710, balanced_nll 6.853920
mediator cooptrain iter#87720, balanced_nll 6.838597
mediator cooptrain iter#87730, balanced_nll 6.765410
mediator cooptrain iter#87740, balanced_nll 6.852599
mediator cooptrain iter#87750, balanced_nll 6.825665
mediator cooptrain iter#87760, balanced_nll 6.850584
mediator cooptrain iter#87770, balanced_nll 6.827829
mediator cooptrain iter#87780, balanced_nll 6.859410
mediator cooptrain iter#87790, balanced_nll 6.784107
batch: 87800 nll_oracle: 9.896609
batch: 87800 nll_test 7.7063065
mediator cooptrain iter#87800, balanced_nll 6.833647
mediator cooptrain iter#87810, balanced_nll 6.837624
mediator cooptrain iter#87820, balanced_nll 6.833254
cooptrain epoch# 563 jsd 6.7449245
mediator cooptrain iter#87830, balanced_nll 6.858107
mediator cooptrain iter#87840, balanced_nll 6.871158
mediator cooptrain iter#87850, balanced_nll 6.824977
mediator cooptrain iter#87860, balanced_nll 6.804533
mediator cooptrain iter#87870, balanced_nll 6.796575
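
For what it's worth, a small sketch (my own, not part of the repo) that pulls the nll_oracle curve out of a log like the one above, so over-fitting shows up as the batch where the curve bottoms out; the file name train.log is an assumption:

```python
# Parse (batch, nll_oracle) pairs out of a saved training log and
# report the batch with the lowest oracle NLL. Assumes stdout was
# redirected to a plain-text file such as "train.log" (hypothetical).
import re

def parse_nll_oracle(log_path):
    """Return (batch, nll_oracle) pairs from lines like
    'batch: 87700 nll_oracle: 9.902623'."""
    pattern = re.compile(r"batch:\s*(\d+)\s+nll_oracle:\s*([\d.]+)")
    points = []
    with open(log_path) as f:
        for line in f:
            m = pattern.search(line)
            if m:
                points.append((int(m.group(1)), float(m.group(2))))
    return points

curve = parse_nll_oracle("train.log")
best_batch, best_nll = min(curve, key=lambda p: p[1])
print("best oracle NLL %.4f at batch %d" % (best_nll, best_batch))
```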

Why is the hidden dimension so small (64/32)?

The paper claims that CoT training is, in some sense, more stable than that of an ordinary GAN, SeqGAN/LeakGAN, or MLE. Yet the recommended hidden dimension for CoT is 64 for M and 32 for G, even smaller than in LeakGAN. Don't the training stability and the low computational cost allow a larger architecture to be trained?

By the way, it would be very helpful if you could release sample sentences.

Regarding the relative self-BLEU scores you calculated

Isn't an RSBLEU close to 1.0 much more important than one below 1.0, since too little diversity (mode collapse) and too much diversity (exposure bias) are equally bad? Aren't MaliGAN and RankGAN then superior to CoT in Table 3, even though their test losses are much worse?

In fact, I just realized that Texygen has two options, get_bleu_fast and get_bleu; the latter uses the whole test set as references rather than 500 sentences sampled from it. I hope all the published BLEU scores for WMT News came from get_bleu: the original BLEU paper by Papineni et al. notes that using different numbers of reference sentences produces different results. Texygen also lower-cases all sentences, which I hope you did too.

I computed the self-BLEU-2 of the WMT test set and obtained 0.862. On the other hand, from the BLEU-2 of MLE in your survey paper and the self-BLEU-2 of MLE in your CoT paper, I calculated the self-BLEU-2 of the test set to be 0.875. This is strange, since the values should match exactly. What do you think causes this discrepancy? If possible, could you also tell me the self-BLEU-n of the test set for other n?
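
For concreteness, here is a minimal self-BLEU-2 sketch (my own, not Texygen's exact implementation) that makes both effects visible: capping the reference pool, as get_bleu_fast's 500-sentence sample effectively does, and lower-casing both move the score.

```python
# Minimal self-BLEU-2: score each sentence against the others as
# references, then average. A sketch for illustration only; Texygen's
# actual implementation may differ in tokenization and smoothing.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def self_bleu2(sentences, max_refs=None):
    """Average BLEU-2 of each sentence against all the others."""
    smooth = SmoothingFunction().method1
    tokenized = [s.lower().split() for s in sentences]  # lower-case, as Texygen does
    scores = []
    for i, hyp in enumerate(tokenized):
        refs = tokenized[:i] + tokenized[i + 1:]
        if max_refs is not None:  # e.g. 500, mimicking a capped reference pool
            refs = refs[:max_refs]
        scores.append(sentence_bleu(refs, hyp, weights=(0.5, 0.5),
                                    smoothing_function=smooth))
    return sum(scores) / len(scores)

print(self_bleu2(["the cat sat on the mat",
                  "the dog sat on the log",
                  "a completely different sentence"]))
```

Running this with and without max_refs on the same corpus gives different numbers, which is exactly why the get_bleu vs. get_bleu_fast distinction matters for the published scores.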
