Comments (7)
Yes, I reduced the epoch size to just 5000 for quick testing, rather than waiting a long time for the final result only to hit an error :). That's why it only took 3 seconds.
And thanks for the answer, it works for me.
from muse.
Can you have a look at #31 and see if that helps? In particular, try setting --dico_max_rank 1500 as a parameter. The idea is that the model tries to build a dictionary for the top 10k most frequent words, but if you have only 4k of them then it will raise an error.
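To make the failure mode above concrete, here is a minimal sketch of clamping the requested rank to the vocabulary that is actually available. The helper name `effective_max_rank` is hypothetical and is not part of MUSE; it only illustrates why asking for the top 10k words fails when fewer exist.

```python
def effective_max_rank(requested_rank, src_vocab_size, tgt_vocab_size):
    """Clamp the requested rank so it never exceeds either vocabulary.

    Hypothetical helper for illustration only; MUSE does not define it.
    """
    smallest_vocab = min(src_vocab_size, tgt_vocab_size)
    return min(requested_rank, smallest_vocab)


# With only 4,000 words available, a request for the top 10,000 is clamped.
print(effective_max_rank(10000, 4000, 4000))  # 4000
# A request that already fits is left unchanged.
print(effective_max_rank(1500, 4000, 4000))   # 1500
```

Passing --dico_max_rank 1500 by hand achieves the same effect as long as 1500 does not exceed the size of either vocabulary.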
Also, if you do not have any parallel dictionary you should indeed disable the word translation evaluation by simply commenting out this line:
https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L190
But you probably want to find a small dictionary before you start experiments; otherwise it will be very difficult for you to know whether your model is working or not.
Thanks for the reply. It works, but I think there is a minor bug in the code. When I print the whole "params" variable, I get this:
Namespace(adversarial=True, batch_size=32, cuda=True, dico_build='S2T', dico_max_rank=10000, dico_max_size=10000, dico_method='nn', dico_min_size=0, dico_threshold=0, dis_clip_weights=0, dis_dropout=0.0, dis_hid_dim=2048, dis_input_dropout=0.1, dis_lambda=1, dis_layers=2, dis_most_frequent=50, dis_optimizer='sgd,lr=0.1', dis_smooth=0.1, dis_steps=5, emb_dim=30, epoch_size=4000, exp_name='debug', exp_path='/home/nghibui/codes/MUSE/dumped/debug/egzngeby1i', export='txt', lr_decay=0.98, lr_shrink=0.5, map_beta=0.001, map_id_init=True, map_optimizer='sgd,lr=0.1', max_vocab=200000, min_lr=1e-06, n_epochs=5, n_refinement=5, normalize_embeddings='', seed=-1, src_dico=<src.dictionary.Dictionary object at 0x7f8087dac588>, src_emb='data/cpp_vectors_30D_15_ae_train_trees.txt', src_lang='cpp', src_mean=None, tgt_dico=<src.dictionary.Dictionary object at 0x7f8087dac4e0>, tgt_emb='data/java_vectors_30D_15_ae_train_trees.txt', tgt_lang='java', tgt_mean=None, verbose=2)
10000
0 10000 128
128 10000 128
256 10000 128
384 10000 128
512 10000 128
I definitely passed --dico_max_rank 1500 on the command line, but the code still uses 10000. I had to hard-code dico_max_rank = 1500 on that line to make it work.
It's also strange that in the initial log dico_max_rank is already 1500, yet the value 10000 is still used:
INFO - 04/15/18 20:44:41 - 0:00:00 - adversarial: True
batch_size: 32
cuda: True
dico_build: S2T&T2S
dico_max_rank: 1500
dico_max_size: 100
dico_method: csls_knn_10
dico_min_size: 0
dico_threshold: 0
dis_clip_weights: 0
dis_dropout: 0.0
dis_hid_dim: 2048
dis_input_dropout: 0.1
dis_lambda: 1
dis_layers: 2
dis_most_frequent: 50
dis_optimizer: sgd,lr=0.1
dis_smooth: 0.1
dis_steps: 5
emb_dim: 30
epoch_size: 1000000
exp_name: debug
exp_path: /home/nghibui/codes/MUSE/dumped/debug/nucxafkaic
export: txt
lr_decay: 0.98
lr_shrink: 0.5
map_beta: 0.001
map_id_init: True
map_optimizer: sgd,lr=0.1
max_vocab: 200000
min_lr: 1e-06
n_epochs: 1
n_refinement: 5
normalize_embeddings:
seed: -1
src_emb: data/cpp_vectors_30D_15_ae_train_trees.txt
src_lang: cpp
tgt_emb: data/java_vectors_30D_15_ae_train_trees.txt
tgt_lang: java
verbose: 2
OK, I think this line is the reason: https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L163
Is this a bug, or am I doing something wrong?
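The symptom described here (the startup log shows 1500, but 10000 is used later) is what you would see if an evaluation routine copies the parsed parameters and overwrites one field. The sketch below reproduces that pattern with argparse. It is an assumption about the mechanism, not a copy of the linked line; the function names, the argparse default, and the vocabulary size of 1500 are all hypothetical.

```python
import argparse
from copy import deepcopy

# Parse the flag the same way a command line would supply it.
parser = argparse.ArgumentParser()
parser.add_argument("--dico_max_rank", type=int, default=15000)  # default is illustrative
params = parser.parse_args(["--dico_max_rank", "1500"])


def eval_params_hardcoded(params):
    """Copy the params and overwrite the rank with a constant (the suspected pattern)."""
    _params = deepcopy(params)
    _params.dico_max_rank = 10000  # silently ignores the user's value
    return _params


def eval_params_clamped(params, vocab_size):
    """One possible fix: keep the constant but never exceed the vocabulary."""
    _params = deepcopy(params)
    _params.dico_max_rank = min(10000, vocab_size)
    return _params


print(params.dico_max_rank)                             # 1500 (what the startup log shows)
print(eval_params_hardcoded(params).dico_max_rank)      # 10000 (what evaluation uses)
print(eval_params_clamped(params, 1500).dico_max_rank)  # 1500
```

Because the override is applied to a copy, the original params object still reports 1500, which would explain why the two outputs disagree.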
Also, after the discriminator training finishes, I get an error in the refinement step:
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (nn method, S2T build, 10000 max size): 0.99295
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (csls_knn_10 method, S2T build, 10000 max size): 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target predictions: 0.31108 / 0.28520
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target / global accuracy: 0.00376 / 1.00000 / 0.30379
INFO - 04/15/18 20:57:41 - 0:00:03 - __log__:{"n_epoch": 0, "precision_at_1-nn": 0.0, "precision_at_5-nn": 0.0, "precision_at_10-nn": 0.0, "precision_at_1-csls_knn_10": 0.0, "precision_at_5-csls_knn_10": 0.0, "precision_at_10-csls_knn_10": 0.0, "mean_cosine-nn-S2T-10000": 0.9929460287094116, "mean_cosine-csls_knn_10-S2T-10000": 0.9929250478744507, "dis_accu": 0.30378963650425367, "dis_src_pred": 0.3110812323752522, "dis_tgt_pred": 0.2851960619994782}
INFO - 04/15/18 20:57:41 - 0:00:03 - * Best value for "mean_cosine-csls_knn_10-S2T-10000": 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - * Saving the mapping to /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - End of epoch 0.
INFO - 04/15/18 20:57:41 - 0:00:03 - Decreasing learning rate: 0.10000000 -> 0.09800000
INFO - 04/15/18 20:57:41 - 0:00:03 - ----> ITERATIVE PROCRUSTES REFINEMENT <----
INFO - 04/15/18 20:57:41 - 0:00:03 - * Reloading the best model from /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - Starting refinement iteration 0...
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
WARNING - 04/15/18 20:57:41 - 0:00:03 - Empty intersection ...
Traceback (most recent call last):
  File "unsupervised.py", line 168, in <module>
    trainer.procrustes()
  File "/home/nghibui/codes/MUSE/src/trainer.py", line 174, in procrustes
    A = self.src_emb.weight.data[self.dico[:, 0]]
TypeError: 'NoneType' object is not subscriptable
Any explanation for this? Thanks
Yes, you are right: the hard-coded 10000 in the code is an issue we need to fix for cases where the vocabulary size is smaller. Sorry about that.
Regarding your error, dico_build is set to S2T&T2S, which means that the model will generate a source -> target dictionary and a target -> source dictionary, and take the intersection of both to get reliable translation pairs. In your case, the warning "Empty intersection" means that the intersection was empty, which means that the alignment totally failed. You can pass --dico_build S2T to avoid that, but usually when the intersection is empty it means that the alignment is bad.
Also, did you really do the discriminator training? Your error happens after 3 seconds of running, which is not enough time for the discriminator to train.
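The S2T&T2S behaviour described above can be sketched with plain Python sets. This is a simplified illustration, not MUSE's implementation: the function `build_pairs` and the toy word indices are hypothetical, and returning None for an empty result is an assumption made to mirror the "Empty intersection" warning followed by the NoneType error in the log.

```python
def build_pairs(s2t, t2s, mode):
    """Combine candidate translation pairs according to the build mode.

    s2t: set of (src_id, tgt_id) pairs from source -> target retrieval.
    t2s: set of (tgt_id, src_id) pairs from target -> source retrieval.
    Returns a sorted list of (src_id, tgt_id) pairs, or None if empty.
    """
    if mode == "S2T":
        pairs = set(s2t)
    elif mode == "S2T&T2S":
        # Flip the target -> source pairs so both sets use (src, tgt) order,
        # then keep only the pairs that both directions agree on.
        flipped = {(src, tgt) for (tgt, src) in t2s}
        pairs = set(s2t) & flipped
    else:
        raise ValueError("unsupported mode: " + mode)
    return sorted(pairs) if pairs else None


# Toy example of a failed alignment: the two directions never agree.
s2t = {(0, 5), (1, 7)}
t2s = {(6, 0), (8, 1)}

dico = build_pairs(s2t, t2s, "S2T&T2S")
if dico is None:
    # Guarding here avoids indexing into None during refinement.
    print("Empty intersection: alignment likely failed")

print(build_pairs(s2t, t2s, "S2T"))  # [(0, 5), (1, 7)]
```

Switching to S2T always yields pairs, but as noted above, an empty intersection is usually a sign that the mapping itself is poor, so the resulting dictionary may not be reliable.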