
dl4mt-c2c's People

Contributors

jaseleephd, kyunghyuncho

dl4mt-c2c's Issues

Using char-to-char model for Hindi-English translation task and working on Google Colab

Hey @jasonleeinf! Thank you so much for sharing this repository. I'm a student trying to experiment with the code a little.
I want to use this repo for a Hindi-English translation task. I have been able to run build_vocabulary_char.py so far, and now I want to train the char-to-char model on Google Colab, as I don't have a machine with much computing capacity or a GPU.
Could you please guide me on the next steps?

Which files do I need to upload to Colab, and what changes need to be made?
I would be really grateful if you could help.
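
For anyone attempting the same setup, a minimal sketch of what the Colab cells might look like, assuming the repository's train_bi_char2char.py entry point and the -translate flag shown in the README (de_en is used as a stand-in; a new pair such as Hindi-English would first need its file paths registered in the wmts dictionary used by the training scripts):

    !git clone https://github.com/nyu-dl/dl4mt-c2c.git
    %cd dl4mt-c2c
    # Upload the preprocessed corpora and dictionaries first, then train on
    # the Colab GPU (flags and paths here are assumptions, not tested):
    !THEANO_FLAGS=floatX=float32,device=gpu python char2char/train_bi_char2char.py -translate de_en

Note that the codebase is Python 2 / Theano, so the Colab runtime would also need a Python 2 environment with Theano installed.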

Per-instance weighted loss

Hi,

Could you please help me? I'd like to add a weight for each seq2seq pair (i.e., a weighted loss). How can I do that?

I realize that the 'cost' should be multiplied by a weight vector from 'prepare_data' before the 'mean', but I have failed to accomplish this in this code.

Thanks in advance!
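
For reference, a minimal sketch of the idea being asked about, assuming cost is the per-sentence cost as produced in build_model (where the per-timestep cost is masked and summed per sentence via cost = (cost * y_mask).sum(0)), and that prepare_data is extended to return a per-example weight vector; the w variable is hypothetical:

    import theano.tensor as tensor

    def weighted_mean_cost(cost, w):
        # cost: per-sentence negative log-likelihood, shape (n_samples,),
        # as built in build_model before the final mean.
        # w: hypothetical per-example weight vector, one entry per sentence,
        # which prepare_data would have to return alongside x and y.
        return (cost * w).mean()

The weight vector would be declared as w = tensor.vector('w', dtype='float32') and added to the inputs of f_grad_shared so it can be fed at each update.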

Preprocessing for other datasets

Many thanks for sharing your code.
Now I want to use the model on other datasets, such as Chinese-English. I know I need to run the preprocess.sh script. I'm using your WMT'15 preprocessed corpora at the moment, but I don't know how to process a new dataset (including the training data and dictionaries) into the following files. In particular, I don't know how the three dictionaries are generated separately.

/home/miki/dl4mt-c2c-master/deen/models/de_en/bi-bpe2char
/home/miki/dl4mt-c2c-master/deen/train/all_de-en.de.tok.bpe.shuf
/home/miki/dl4mt-c2c-master/deen/train/all_de-en.en.tok.shuf
/home/miki/dl4mt-c2c-master/deen/dev/newstest2013.de.tok.bpe
/home/miki/dl4mt-c2c-master/deen/dev/newstest2013.en.tok
/home/miki/dl4mt-c2c-master/deen/train/all_de-en.de.tok.bpe.word.pkl
/home/miki/dl4mt-c2c-master/deen/train/all_de-en.en.tok.300.pkl

Also, what is the form of pure character input? The file all_de-en.de.tok.shuf is the input of char2char; why isn't the training data entered as pure characters?
Thanks very much!
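
For later readers, a minimal sketch of how a character dictionary pickle of this kind is typically built, following the general pattern of preprocess/build_dictionary_char.py (the reserved indices for eos and UNK are an assumption, not verified against the repo):

    import cPickle as pkl
    from collections import OrderedDict

    def build_char_dictionary(corpus_path, dict_path):
        # Count every character in the tokenized training corpus.
        counts = {}
        with open(corpus_path, 'r') as f:
            for line in f:
                for c in line.decode('utf-8').strip():
                    counts[c] = counts.get(c, 0) + 1

        # Most frequent characters get the lowest indices; 0 and 1 are
        # conventionally reserved for the end-of-sentence and unknown symbols.
        worddict = OrderedDict()
        worddict['eos'] = 0
        worddict['UNK'] = 1
        for i, c in enumerate(sorted(counts, key=counts.get, reverse=True)):
            worddict[c] = i + 2

        with open(dict_path, 'wb') as f:
            pkl.dump(worddict, f)

The BPE-side dictionary (the .bpe.word.pkl file) would be built the same way, but counting whitespace-separated BPE tokens instead of characters.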

Some Errors

Hi Jason,
Thanks for your paper and for sharing your code. I encountered the following error:


Model path: /prg/dl4mt-c2c-master/DataSet/models/many_en/many_en/

Traceback (most recent call last):
File "/home/prg/dl4mt-c2c-master/char2char/train_multi_char2char.py", line 173, in <module>
main(0, args)
File "/home/prg/dl4mt-c2c-master/char2char/train_multi_char2char.py", line 94, in main
prepare_data=prepare_data,
File "/home/prg/dl4mt-c2c-master/char2char/nmt_many.py", line 135, in train
with open(dd, 'rb') as f:
IOError: [Errno 2] No such file or directory: '/prg/dl4mt-c2c-master/DataSet/multi-wmt15/dic/source.404.pkl'


but the file source.404.pkl does exist!

Thanks for your help.

Request on Theano and pygpu version

Hi, I would like to know which version of Theano you used for this project. I tried Theano 0.9.0 and 0.8.2; however, both report errors when running on the GPU. The error reports are:

Theano 0.9.0 + pygpu 0.6.9:
AttributeError: ('The following error happened while compiling the node', DnnVersion(), '\n', "'module' object has no attribute '_get_ndarray_c_version'")
Theano 0.8.2 + pygpu 0.7.6:
RuntimeError: ('Wrong major API version for gpuarray:', 2, 'Make sure Theano and libgpuarray/pygpu are in sync.')

I'm using Python 2.7.16 with CUDA 9.1 and cuDNN 7.0.5.

Could you please provide more details about the environment version?

Many thanks!

output?

Hi @jasonleeinf
Thanks for sharing the code. Could you please tell me which line of your code produces the output of the encoder? Thanks for your time and help.

No module named mixer

I'm not a Python expert, but I assume it's having a problem importing from mixer.py in dl4mt-c2c/char2char/?

$ python translate/translate_char2char.py
Traceback (most recent call last):
  File "translate/translate_char2char.py", line 13, in <module>
    from mixer import *
ImportError: No module named mixer
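
That is most likely it: translate/translate_char2char.py does from mixer import *, but mixer.py lives in char2char/, which is not on Python's module search path when the script is run from the repository root. A minimal sketch of a fix (the relative path is an assumption about the repo layout):

    import os
    import sys

    # At the top of translate/translate_char2char.py, before
    # 'from mixer import *': make char2char/ importable relative to this file.
    sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                    '..', 'char2char'))

Setting the path externally should work too, e.g. PYTHONPATH=char2char python translate/translate_char2char.py.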

Preprocessing for other datasets

Many thanks for sharing your code.
If I want to use the model for other datasets, do I need to perform the punctuation-normalization and tokenization steps present in the preprocess.sh script?
For example, when "translating" a char sequence into another char sequence (the char sequences may not be actual natural-language sentences!).

Use your own parallel corpus for training and translation

Hi,
I am a beginner in machine translation, and I would like to ask how to use my own parallel corpus for training and translation. What are the specific commands and steps for training and for translation? Is there a manual?
Thank you very much; I hope you can give an answer!
Thanks again.

Using custom datasets

Hi,

thanks for sharing the code! I have some questions about the preprocess/build_dictionary_char.py script.

  • Can you provide some example program options? Currently, the main method is never called.

  • The main method takes three arguments; what exactly is expected for short_list and src? A guess at the intended usage is sketched below.

I would really like to document all the steps necessary to preprocess custom datasets (I would extend the documentation later) :)
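
In case it helps, a hypothetical driver for the script; the signature and the argument interpretations are guesses inferred from the names and from the .300.pkl suffix on the dictionary files elsewhere in this tracker (a short list of the 300 most frequent symbols), not confirmed by the repo:

    # Hypothetical wrapper; main(filename, short_list, src) and the meaning
    # of each argument are assumptions, not the verified interface.
    import sys
    from build_dictionary_char import main

    if __name__ == '__main__':
        corpus = sys.argv[1]  # e.g. train/all_de-en.en.tok.shuf
        # short_list: cap the vocabulary at the N most frequent symbols;
        # src: presumably flags whether this is the source side of the pair.
        main(corpus, short_list=300, src=False)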

f_init gets stuck during translation

Hello,

During translation the system sometimes gets stuck. In particular (after printing some messages to figure out where the halt occurs), it seems to get stuck at the f_init call (ret = f_init(x)) in char2char/char_base.py, line 286.

However, sometimes it just starts... Any idea why, and what the solution might be?

Thanks,
Dimitar

Error when running on CPU

I have changed the source to run on the CPU, using conv2d and pool_2d.
But when I run model training, I get an error. Please help me:

Traceback (most recent call last):
File "train_bi_char2char.py", line 175, in <module>
main(0, args)
File "train_bi_char2char.py", line 84, in main
prepare_data=prepare_data,
File "/home/dungdx/workspace2/abstract_sentence_convolution/char2char/nmt.py", line 390, in train
cost, not_finite, clipped = f_grad_shared(x, x_mask, y, y_mask)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
outputs = self.fn()
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 951, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 940, in <lambda>
self, node)
File "theano/scan_module/scan_perform.pyx", line 220, in theano.scan_module.scan_perform.perform (/home/dungdx/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:2442)
ValueError: ('Sequence is shorter then the required number of steps : (n_steps, seq, seq.shape):', 92, array([[ 1., 1., 1., ..., 1., 1., 1.],
[ 1., 1., 1., ..., 1., 1., 1.],
[ 1., 1., 1., ..., 1., 1., 1.],
...,
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.]], dtype=float32), (90, 33))
Apply node that caused the error: for{cpu,encoder__layers}(Subtensor{int64}.0, Subtensor{:int64:}.0, Subtensor{:int64:}.0, Subtensor{:int64:}.0, IncSubtensor{Set;:int64:}.0, encoder_U, encoder_Ux)
Toposort index: 805
Inputs types: [TensorType(int64, scalar), TensorType(float32, matrix), TensorType(float32, 3D), TensorType(float32, 3D), TensorType(float32, 3D), TensorType(float32, matrix), TensorType(float32, matrix)]
Inputs shapes: [(), (90, 33), (92, 33, 1024), (92, 33, 512), (93, 33, 512), (512, 1024), (512, 512)]
Inputs strides: [(), (132, 4), (135168, 4096, 4), (67584, 2048, 4), (67584, 2048, 4), (4096, 4), (2048, 4)]
Inputs values: [array(92), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[Subtensor{int64::}(for{cpu,encoder__layers}.0, Constant{1}), Elemwise{second,no_inplace}(for{cpu,encoder__layers}.0, DimShuffle{x,x,x}.0)]]

Backtrace when the node is created (use Theano flag traceback.limit=N to make it longer):
File "train_bi_char2char.py", line 175, in <module>
main(0, args)
File "train_bi_char2char.py", line 84, in main
prepare_data=prepare_data,
File "/home/dungdx/workspace2/abstract_sentence_convolution/char2char/nmt.py", line 235, in train
build_model(tparams, model_options)
File "/home/dungdx/workspace2/abstract_sentence_convolution/char2char/char_base.py", line 120, in build_model
proj = get_layer('gru')[1](tparams, hw_out, options, prefix='encoder', mask=x_mask)
File "/home/dungdx/workspace2/abstract_sentence_convolution/char2char/mixer.py", line 614, in gru_layer
strict=True)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Error during training

Thank you for sharing the code. During training I got the following error; do you know the reason, or how to fix it?

Epoch 0 Update 5000 Cost 118.37625885 NaN_in_grad 0 NaN_in_cost 0 Gradient_clipped 5000 UD 1436.17706203 33.74 sentence/s

Source 0 : I a m h a p p y I m a d e t h i s d e c i s i o n .
Truth 0 : Traceback (most recent call last):
File "/lium/buster1/aransa/workspace/dl4mt-c2c/bpe2char/train_bi_bpe2char.py", line 145, in
main(0, args)
File "/lium/buster1/aransa/workspace/dl4mt-c2c/bpe2char/train_bi_bpe2char.py", line 82, in main
gen_sample=gen_sample,
File "/lium/buster1/aransa/workspace/dl4mt-c2c/bpe2char/nmt.py", line 459, in train
print "".join(truth_)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 24: ordinal not in range(128)
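
A minimal sketch of the usual fix for this (an assumption, not an official patch): the sample-printing code joins unicode characters and prints them through Python 2's default ascii codec, so encoding explicitly to UTF-8 avoids the crash:

    # In bpe2char/nmt.py, at the print shown in the traceback (line 459),
    # encode explicitly instead of relying on the default ascii codec:
    print "".join(truth_).encode('utf-8')

Alternatively, setting PYTHONIOENCODING=utf-8 in the environment before training has the same effect for all prints.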

lower BLEU

Thanks for your great work! I downloaded the trained models and evaluated the cs-en model. However, I got 22.70 and 23.58 on 'dev' and 'test1' respectively, which is lower than the paper (by about 0.6 BLEU). I have no idea what causes the gap; do I need to train the model further?

What is wmts in the code?

in train_bi_char2char.py:

def main(job_id, args):
    save_file_name = args.model_name
    source_dataset = args.data_path + wmts[args.translate]['train'][0][0]
    target_dataset = args.data_path + wmts[args.translate]['train'][0][1]
    valid_source_dataset = args.data_path + wmts[args.translate]['dev'][0][0]
    valid_target_dataset = args.data_path + wmts[args.translate]['dev'][0][1]
    source_dictionary = args.data_path + wmts[args.translate]['dic'][0][0]
    target_dictionary = args.data_path + wmts[args.translate]['dic'][0][1]

What is wmts? Where is it created?
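
From the indexing pattern above, wmts is evidently a nested dictionary, imported into the training script, that maps a language pair to (source, target) file-path pairs for the training set, the dev set, and the dictionaries. A sketch of its apparent shape (the de_en entry and file names are illustrative, taken from the file list earlier in this tracker, not copied from the repo):

    wmts = {
        'de_en': {
            'train': [('deen/train/all_de-en.de.tok.shuf',
                       'deen/train/all_de-en.en.tok.shuf')],
            'dev':   [('deen/dev/newstest2013.de.tok',
                       'deen/dev/newstest2013.en.tok')],
            'dic':   [('deen/train/all_de-en.de.tok.pkl',
                       'deen/train/all_de-en.en.tok.300.pkl')],
        },
    }

Grepping the repo for 'wmts =' should reveal where it is actually defined (a wmt_path-style module, if memory serves, though that is not confirmed here).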

One question about highway

Hello, I have read your paper. The segment embedding is fed into the highway layers in your paper, and I found the definition of hwlayer in char2char/mixer.py. However, I cannot find where the function is called. Can you tell me? Thank you.

Strange decoding output using the pre-trained model for multi-char2char

I used the pre-trained models you provide ("Pre-trained models (6.0GB)"), ran the multi-char2char model on the original data, and decoded the dev set. The output file I got contains only weird characters like the following. I have not tried other models yet. Do you know what is wrong?

<96>║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║
<96>║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║║
<96>

Why save the best model only when saveFreq != validFreq ?

Hi~ Thank you for your great work!
I have a question regarding line 613 in char2char/nmt.py.
It seems that your code saves the best model only when saveFreq != validFreq, but by default saveFreq and validFreq are both set to 5000, which means that by default the best model is never saved. Is there any reason why you don't save the best model when they are the same?
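
For readers hitting the same thing, a runnable illustration of the guard being described (paraphrased from the issue, not a verified excerpt of nmt.py):

    # With the default frequencies the condition is False, so the
    # best-model branch never runs.
    saveFreq, validFreq = 5000, 5000

    if saveFreq != validFreq:
        print 'saving best model'       # never reached with the defaults
    else:
        print 'best model is never saved'

Passing a saveFreq different from validFreq on the command line, or removing the guard locally, appears to be the workaround.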

Training-Question

Hi @jasonleeinf
Thanks for your code, and congratulations on your paper. I got an error during training!

/home/dl4mt-c2c-master/char2char/nmt.py(407)train()

Could you please help me?

Decode on CPU

I was trying to pull down a trained model and test it out on a machine with no GPU (to avoid all the extra cost of cloud GPU machines), but I got an error:

Traceback (most recent call last):
File "translate/translate_char2char.py", line 293, in <module>
interactive=args.interactive,
File "translate/translate_char2char.py", line 185, in main
init_translation_model(model, options, init_params, build_sampler)
File "translate/translate_char2char.py", line 40, in init_translation_model
f_init, f_next = build_sampler(tparams, options, trng, use_noise)
File "/home/jrthom18/data/char_model/dl4mt-c2c/char2char/char_base.py", line 187, in build_sampler
conv_out = get_layer('multi_scale_conv_encoder')[1](tparams, emb, options, prefix='multi_scale_conv_enc1', width=options['conv_width'], nkernels=options['conv_nkernels'], pool_window=options['pool_window'], pool_stride=options['pool_stride'])
File "/home/jrthom18/data/char_model/dl4mt-c2c/char2char/mixer.py", line 487, in multi_scale_conv_encoder
output.append(dnn_conv(data, W[idx], border_mode='half', precision='float32'))
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/dnn.py", line 1160, in dnn_conv
return GpuDnnConv(algo=algo)(img, kerns, out, desc)
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/dnn.py", line 328, in __init__
if version() < (3000, 3000):
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 407, in dnn_version
dnn_available.msg)
Exception: ("We can't determine the cudnn version as it is not available", 'CUDA not available')

This was after I had tried setting the Theano device flag to cpu. Is there a quick way around this?
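
One plausible workaround (a sketch, under the assumption that this conv call is the GPU-bound piece on the decoding path): mixer.py calls dnn_conv from theano.sandbox.cuda.dnn unconditionally, which requires CUDA/cuDNN even with device=cpu. The backend-neutral theano.tensor.nnet.conv2d provides the same 'half' padding:

    from theano.tensor.nnet import conv2d

    def conv_half(data, kernel):
        # CPU-friendly stand-in for the dnn_conv call in mixer.py's
        # multi_scale_conv_encoder; border_mode='half' matches the padding,
        # and float32 precision follows from the inputs' dtype.
        return conv2d(data, kernel, border_mode='half')

so the line in mixer.py would become output.append(conv_half(data, W[idx])). Note the "Error when running on CPU" issue above suggests the pooling path also needs a matching CPU substitution.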
