
dl4mt-multi-src's People

Contributors

kyunghyuncho, orhanf


dl4mt-multi-src's Issues

Training doesn't start

I am training on the default data set provided, in multi-text mode, but training does not seem to progress at all. On startup, the code prints the following and then just waits without proceeding further. The training log shows received_first_batch as False, so I suspect something is wrong with the part of the code that supplies the batches.

Using cuDNN version 6021 on context None
Mapped name None to device cuda2: GeForce GTX 1080 Ti (0000:84:00.0)
INFO:main:Model options:
{'additional_excludes': OrderedDict([('es.fr_en', [])]),
'alpha_c': OrderedDict([('es.fr_en', 0.0)]),
'att_dim': 1200,
'attend_merge_act': 'tanh',
'attend_merge_op': 'mean',
'batch_sizes': OrderedDict([('es.fr_en', 80)]),
'bokeh_port': 3333,
'cgs': ['es.fr_en'],
'dec_embed_sizes': OrderedDict([('en', 620)]),
'dec_nhids': OrderedDict([('en', 1000)]),
'dec_rnn_type': 'gru_cond_mCG',
'decay_c': OrderedDict([('es.fr_en', 0.0)]),
'drop_input': OrderedDict([('es.fr_en', 0.0)]),
'dropout': 1.0,
'enc_embed_sizes': OrderedDict([('es', 620), ('fr', 620)]),
'enc_nhids': OrderedDict([('es', 1000), ('fr', 1000)]),
'exclude_encs': OrderedDict([('es', False), ('fr', False)]),
'finish_after': 2000000,
'finit_act': 'tanh',
'finit_code_dim': 500,
'finit_mid_dim': 600,
'hook_samples': 2,
'incremental_dump': True,
'init_merge_act': 'tanh',
'init_merge_op': 'mean',
'lctxproj_act': 'tanh',
'ldecoder_act': 'tanh',
'learning_rate': 0.0002,
'lencoder_act': 'tanh',
'load_accumulators': True,
'log_prob_bs': 10,
'log_prob_freq': 2000,
'log_prob_sets': OrderedDict([('es.fr_en', {'fr': 'data/dev/newstest2011.fr.tok.bpe20k', 'en': 'data/dev/newstest2011.en.tok.bpe20k', 'es': 'data/dev/newstest2011.es.tok.bpe20k'})]),
'min_seq_lens': OrderedDict([('es.fr_en', 0)]),
'multi_latent': True,
'num_decs': 1,
'num_encs': 2,
'plot': False,
'readout_dim': 1000,
'reload': True,
'representation_act': 'linear',
'representation_dim': 1200,
'sampling_freq': 17,
'save_accumulators': True,
'save_freq': 5000,
'saveto': 'esfr2en_mSrc',
'schedule': OrderedDict([('es.fr_en', 1)]),
'seq_len': 50,
'sort_k_batches': 12,
'src_datas': OrderedDict([('es.fr_en', {'fr': 'data/europarl-v7.esfr-en.fr.tok.bpe20k', 'es': 'data/europarl-v7.esfr-en.es.tok.bpe20k'})]),
'src_eos_idxs': OrderedDict([('es', 0), ('fr', 0), ('es.fr', 0)]),
'src_vocab_sizes': OrderedDict([('es', 20624), ('fr', 20335)]),
'src_vocabs': OrderedDict([('es', 'data/europarl-v7.es-en.es.tok.bpe20k.vocab.pkl'), ('fr', 'data/europarl-v7.fr-en.fr.tok.bpe20k.vocab.pkl')]),
'step_clipping': 1,
'step_rule': 'uAdam',
'stream': 'multiCG_stream',
'take_last': True,
'trg_datas': OrderedDict([('es.fr_en', {'en': 'data/europarl-v7.esfr-en.en.tok.bpe20k'})]),
'trg_eos_idxs': OrderedDict([('en', 0)]),
'trg_vocab_sizes': OrderedDict([('en', 20212)]),
'trg_vocabs': OrderedDict([('en', 'data/europarl-v7.fr-en.en.tok.bpe20k.vocab.pkl')]),
'unk_id': 1,
'val_burn_in': 1,
'weight_noise_ff': False,
'weight_noise_rec': False,
'weight_scale': 0.01}
INFO:mcg.stream:Building training stream for cg:[es.fr_en]
INFO:mcg.stream: ... src:[es] - [data/europarl-v7.esfr-en.es.tok.bpe20k]
INFO:mcg.stream: ... src:[fr] - [data/europarl-v7.esfr-en.fr.tok.bpe20k]
INFO:mcg.stream: ... trg:[en] - [data/europarl-v7.esfr-en.en.tok.bpe20k]
INFO:mcg.stream:Building logprob stream for cg:[es.fr_en]
INFO:mcg.stream: ... src:[es] - [data/dev/newstest2011.es.tok.bpe20k]
INFO:mcg.stream: ... src:[fr] - [data/dev/newstest2011.fr.tok.bpe20k]
INFO:mcg.stream: ... trg:[en] - [data/dev/newstest2011.en.tok.bpe20k]
INFO:mcg.models: Encoder-Decoder: building training models
INFO:mcg.models: MultiEncoder: building training models
INFO:mcg.models: ... MultiSourceEncoder [es.fr] building training models
INFO:mcg.models: ... BidirectionalEncoder [es] building training models
INFO:mcg.models: ... ... using [gru] layer
INFO:mcg.models: ... BidirectionalEncoder [fr] building training models
INFO:mcg.models: ... ... using [gru] layer
INFO:mcg.models: MultiDecoder: building training models
INFO:mcg.models: ... using initializer merger [mean] for encoders: ['es', 'fr']
INFO:mcg.models: ... using post-context merger [mean] for encoders: ['es', 'fr']
INFO:mcg.models: ... ... using [gru_cond_mCG_mSrc] layer
INFO:mcg.models: Encoder-Decoder: building sampling models
INFO:mcg.models: MultiEncoder: building sampling models
INFO:mcg.models: ... MultiSourceEncoder [es.fr] building sampling models
INFO:mcg.models: ... BidirectionalEncoder [es] building sampling models
INFO:mcg.models: ... ... using [gru] layer
INFO:mcg.models: ... BidirectionalEncoder [fr] building sampling models
INFO:mcg.models: ... ... using [gru] layer
INFO:mcg.models: MultiDecoder: building sampling models
INFO:mcg.models: ... using initializer merger [mean] for encoders: ['es', 'fr']
INFO:mcg.models:Building f_init for CG[es.fr-en]...
INFO:mcg.models: ... using post-context merger [mean] for encoders: ['es', 'fr']
INFO:mcg.models: ... ... using [gru_cond_mCG_mSrc] layer
INFO:mcg.models:Building f_next for decoder[es.fr-en]..
INFO:mcg.models:Parameter shapes for computation graph[es.fr_en]
INFO:mcg.models: (1000,) : 9
INFO:mcg.models: (1000, 1000) : 6
INFO:mcg.models: (620, 1000) : 6
INFO:mcg.models: (2000,) : 5
INFO:mcg.models: (620, 2000) : 5
INFO:mcg.models: (1000, 2000) : 5
INFO:mcg.models: (1200, 1200) : 3
INFO:mcg.models: (1200,) : 3
INFO:mcg.models: (2000, 1200) : 2
INFO:mcg.models: (1200, 1000) : 2
INFO:mcg.models: (500, 600) : 1
INFO:mcg.models: (1200, 1) : 1
INFO:mcg.models: (20212,) : 1
INFO:mcg.models: (1000, 20212) : 1
INFO:mcg.models: (20624, 620) : 1
INFO:mcg.models: (20212, 620) : 1
INFO:mcg.models: (620, 1200) : 1
INFO:mcg.models: (20335, 620) : 1
INFO:mcg.models: (1200, 2000) : 1
INFO:mcg.models: (1200, 500) : 1
INFO:mcg.models: (1000, 1200) : 1
INFO:mcg.models: (600, 1000) : 1
INFO:mcg.models:Total number of parameters for computation graph[es.fr_en]: 58
INFO:mcg.models:Parameter names for computation graph[es.fr_en]:
INFO:mcg.models: (20212,) : ff_logit_en_b
INFO:mcg.models: (1000, 20212) : ff_logit_en_W
INFO:mcg.models: (1000,) : ff_logit_ctx_en_b
INFO:mcg.models: (1200, 1000) : ff_logit_ctx_en_W
INFO:mcg.models: (1200,) : ctx_embedder_fr_b
INFO:mcg.models: (2000, 1200) : ctx_embedder_fr_W
INFO:mcg.models: (1000, 1000) : encoder_r_fr_Ux
INFO:mcg.models: (1000, 2000) : encoder_r_fr_U
INFO:mcg.models: (20335, 620) : Wemb_fr
INFO:mcg.models: (1000,) : encoder_r_fr_bx
INFO:mcg.models: (620, 1000) : encoder_r_fr_Wx
INFO:mcg.models: (2000,) : encoder_r_fr_b
INFO:mcg.models: (620, 2000) : encoder_r_fr_W
INFO:mcg.models: (1000, 1000) : encoder_fr_Ux
INFO:mcg.models: (1000, 2000) : encoder_fr_U
INFO:mcg.models: (1000,) : encoder_fr_bx
INFO:mcg.models: (620, 1000) : encoder_fr_Wx
INFO:mcg.models: (2000,) : encoder_fr_b
INFO:mcg.models: (620, 2000) : encoder_fr_W
INFO:mcg.models: (1200,) : ctx_embedder_es_b
INFO:mcg.models: (2000, 1200) : ctx_embedder_es_W
INFO:mcg.models: (1000, 1000) : encoder_r_es_Ux
INFO:mcg.models: (1000, 2000) : encoder_r_es_U
INFO:mcg.models: (20624, 620) : Wemb_es
INFO:mcg.models: (1000,) : encoder_r_es_bx
INFO:mcg.models: (620, 1000) : encoder_r_es_Wx
INFO:mcg.models: (2000,) : encoder_r_es_b
INFO:mcg.models: (620, 2000) : encoder_r_es_W
INFO:mcg.models: (1000, 1000) : encoder_es_Ux
INFO:mcg.models: (1000, 2000) : encoder_es_U
INFO:mcg.models: (1000,) : encoder_es_bx
INFO:mcg.models: (620, 1000) : encoder_es_Wx
INFO:mcg.models: (2000,) : encoder_es_b
INFO:mcg.models: (620, 2000) : encoder_es_W
INFO:mcg.models: (1200,) : decoder_en_b_att
INFO:mcg.models: (1200, 1200) : decoder_en_Le_att
INFO:mcg.models: (1000, 1000) : decoder_en_Ux
INFO:mcg.models: (1200, 1000) : decoder_en_Wcx
INFO:mcg.models: (1200, 2000) : decoder_en_Wc
INFO:mcg.models: (1200, 1200) : decoder_en_Wp_att
INFO:mcg.models: (1200, 1) : decoder_en_U_att
INFO:mcg.models: (1200, 1200) : decoder_en_Ld_att
INFO:mcg.models: (1000, 1200) : decoder_en_Wd_dec
INFO:mcg.models: (1000, 2000) : decoder_en_U
INFO:mcg.models: (20212, 620) : Wemb_dec_en
INFO:mcg.models: (1000,) : ff_init_en_c
INFO:mcg.models: (600, 1000) : ff_init_en_U
INFO:mcg.models: (500, 600) : ff_init_en_U_shared
INFO:mcg.models: (1200, 500) : ff_init_en_W_shared
INFO:mcg.models: (620, 1200) : decoder_en_Wi_dec
INFO:mcg.models: (1000,) : decoder_en_bx
INFO:mcg.models: (620, 1000) : decoder_en_Wx
INFO:mcg.models: (2000,) : decoder_en_b
INFO:mcg.models: (620, 2000) : decoder_en_W
INFO:mcg.models: (1000,) : ff_logit_prev_en_b
INFO:mcg.models: (620, 1000) : ff_logit_prev_en_W
INFO:mcg.models: (1000,) : ff_logit_lstm_en_b
INFO:mcg.models: (1000, 1000) : ff_logit_lstm_en_W
INFO:mcg.models:Total number of parameters for computation graph[es.fr_en]: 58
INFO:mcg.models:Total number of excluded parameters for CG[es.fr_en]: [0]
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_en_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_en_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_ctx_en_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_ctx_en_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: ctx_embedder_fr_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ctx_embedder_fr_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_Ux
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: Wemb_fr
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_bx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_Wx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_fr_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_Ux
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_bx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_Wx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_fr_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: ctx_embedder_es_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ctx_embedder_es_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_Ux
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: Wemb_es
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_bx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_Wx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_r_es_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_Ux
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_bx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_Wx
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: encoder_es_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_b_att
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Le_att
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Ux
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wcx
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wc
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wp_att
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_U_att
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Ld_att
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wd_dec
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: Wemb_dec_en
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_init_en_c
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_init_en_U
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_init_en_U_shared
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_init_en_W_shared
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wi_dec
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_bx
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_Wx
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: decoder_en_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_prev_en_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_prev_en_W
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_lstm_en_b
INFO:mcg.models:Training parameter from CG[es.fr_en]: ff_logit_lstm_en_W
INFO:mcg.models:Total number of parameters will be trained for CG[es.fr_en]: [58]
INFO:mcg.algorithm:Initializing the training algorithm [es.fr_en]
INFO:mcg.algorithm:...computing gradient
INFO:mcg.algorithm:...clipping gradients
INFO:mcg.algorithm:...building optimizer
INFO:mcg.algorithm: took: 65.878868103 seconds
/home1/debajyoty/codes/dl4mt-multi-src.old/mcg/models.py:1123: UserWarning: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 6 is not part of the computational graph needed to compute the outputs: src_selector.
To make this warning into an error, you can pass the parameter on_unused_input='raise' to theano.function. To disable it completely, use on_unused_input='ignore'.
outputs=cost, on_unused_input='warn')
/home1/debajyoty/codes/dl4mt-multi-src.old/mcg/models.py:1123: UserWarning: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 7 is not part of the computational graph needed to compute the outputs: trg_selector.
To make this warning into an error, you can pass the parameter on_unused_input='raise' to theano.function. To disable it completely, use on_unused_input='ignore'.
outputs=cost, on_unused_input='warn')
INFO:mcg.algorithm:Entered the main loop


BEFORE FIRST EPOCH

Training status:
batch_interrupt_received: False
epoch_interrupt_received: False
epoch_started: True
epochs_done: 0
iterations_done: 0
received_first_batch: False
training_started: True
Log records from the iteration 0:
time_initialization: 1.69277191162e-05
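Since the status printout shows `received_first_batch: False`, a quick way to confirm the data pipeline (rather than the model) is the culprit is to pull one batch from the training stream by hand with a timeout. The sketch below assumes the stream object returned by mcg.stream's `multiCG_stream` builder exposes `get_epoch_iterator()`, as Fuel data streams do; the helper name is made up for illustration:

```python
import queue
import threading

def probe_first_batch(stream, timeout=60):
    """Pull one batch from a Fuel-style data stream in a helper thread.

    If nothing arrives within `timeout` seconds, the data pipeline itself
    is stuck -- matching `received_first_batch: False` in the status
    printout. `stream` is assumed to expose get_epoch_iterator().
    """
    result = queue.Queue()

    def worker():
        iterator = stream.get_epoch_iterator()
        result.put(next(iterator))  # blocks here if the pipeline is stuck

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        return result.get(timeout=timeout)
    except queue.Empty:
        raise RuntimeError(
            "no batch produced within %ss - suspect the stream, not the model"
            % timeout)
```

If `probe_first_batch(tr_stream)` times out, the hang is in the stream construction (file paths, `sort_k_batches` buffering, or the vocabulary pickles) rather than in the Theano graph.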

Error [Errno 2] No such file or directory: 'esfr2en_single/params.npz'

I preprocessed the data and then ran the model. It trained for a day (the first epoch), then stopped the next day. When I then tried to decode, I got the following error:
ERROR:mcg.models: Error [Errno 2] No such file or directory: 'esfr2en_single/params.npz'
INFO:translate:Output file: [translation.esfr2en.out.early]
INFO:translate:Translating from [fr]-[dl4mt-multi-src/data/dev/newstest2012.fr.tok.bpe20k]...
INFO:translate:Translating from [es]-[dl4mt-multi-src/data/dev/newstest2012.es.tok.bpe20k]...
INFO:translate:Using [8] processes...
Traceback (most recent call last):
File "dl4mt-multi-src/translate.py", line 390, in
beam_size=args.beam_size)
File "dl4mt-multi-src/translate.py", line 297, in main
src_vocabs_list, src_vocabs_sizes_list)
File "dl4mt-multi-src/translate.py", line 241, in _send_jobs
return idx+1
UnboundLocalError: local variable 'idx' referenced before assignment
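The `UnboundLocalError` is the classic symptom of a `for` loop whose body never runs: `_send_jobs` evidently does `return idx + 1` after its loop, and when the input produces no lines (here, likely a downstream effect of the missing `params.npz`), `idx` is never bound. The functions below are an illustrative sketch of the failure and a guard, not the actual translate.py code:

```python
def send_jobs(lines):
    # Mirrors the pattern in translate.py's _send_jobs: if `lines` is
    # empty, the loop body never executes and `idx` is never assigned,
    # so the return statement raises UnboundLocalError.
    for idx, line in enumerate(lines):
        pass  # ... enqueue the translation job ...
    return idx + 1

def send_jobs_guarded(lines):
    idx = -1  # bind idx up front so an empty input returns 0 jobs
    for idx, line in enumerate(lines):
        pass  # ... enqueue the translation job ...
    return idx + 1
```

Note that fixing the guard only changes the symptom: the underlying problem is that decoding was attempted without the saved parameters in `esfr2en_single/params.npz`.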

config import error in mcg/algorithm.py

Hi there!

It seems there is an error at line 13 of mcg/algorithm.py. It works if I change
from blocks import config as cfg
to
from blocks.config import config as cfg
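A version-tolerant variant, sketched here on the assumption that the two import paths correspond to different Blocks releases, probes both module layouts instead of hard-coding one:

```python
import importlib

def load_blocks_config():
    """Return Blocks' `config` object under either module layout:
    blocks.config.config (newer releases) or blocks.config (older ones)."""
    for module_path in ("blocks.config", "blocks"):
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # this layout is absent; try the next one
        if hasattr(module, "config"):
            return module.config
    raise ImportError("could not locate Blocks' config object")
```

In mcg/algorithm.py one could then write `cfg = load_blocks_config()` in place of the failing import.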

Regards,
Maria
