Can you try adding the option --normalize_embeddings center to your python unsupervised.py command and see if that helps?
Thank you @glample, I am trying it.
I have some questions about the parameters:

--dis_dropout: Discriminator dropout.

For this parameter, I only found something related to --dis_input_dropout, namely: "As a result, we only feed the discriminator with the 50,000 most frequent words." But I did not find any clue about --dis_dropout.

--dis_lambda: Discriminator loss feedback coefficient.
--dis_clip_weights: Clip discriminator weights (0 to disable)

I could not work out what these two parameters do from the paper.

--dico_max_rank: Maximum dictionary words rank (0 to disable)
--dico_min_size: Minimum generated dictionary size (0 to disable)
--dico_max_size: Maximum generated dictionary size (0 to disable)

As I understand it, the generated dictionary is used for validation, so dico_min_size and dico_max_size specify its size. But what is dico_max_rank for?
Thank you @glample
Specifying --normalize_embeddings center did not help; I still get results like:
INFO - 04/25/18 11:06:52 - 0:50:04 - 1500 source words - csls_knn_10 - Precision at k = 1: 0.000000
INFO - 04/25/18 11:06:53 - 0:50:04 - 1500 source words - csls_knn_10 - Precision at k = 5: 0.000000
INFO - 04/25/18 11:06:53 - 0:50:04 - 1500 source words - csls_knn_10 - Precision at k = 10: 0.000000
Mmm I think the problem is that the default epoch size we set in the code is too big. I forgot what we used in the paper exactly, but I remember we used a quite small epoch size (and consequently more epochs). This way, the evaluations are more frequent, and you are more likely to get a good model. I just tried the command:
CUDA_VISIBLE_DEVICES=0 python unsupervised.py --src_lang en --tgt_lang zh --src_emb data/wiki.en.vec --tgt_emb data/wiki.zh.vec --n_refinement 10 --n_epochs 10 --epoch_size 250000 --normalize_embeddings center
and after half an hour I get ~33% accuracy P@1. I guess an even smaller epoch size like 100000 might be even better. I just noticed that there is a small bug when saving the embeddings while using center for --normalize_embeddings; I'll fix this tomorrow, sorry about that.
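For reference, the center normalization simply subtracts the per-dimension mean vector from every embedding. A minimal pure-Python sketch of the idea (MUSE itself does this on PyTorch tensors, and reuses the stored mean at export time):

```python
def center_embeddings(emb):
    """Center a list of embedding vectors: subtract the per-dimension mean.

    emb: list of equal-length lists of floats (rows = words, cols = dims).
    Returns (centered_emb, mean) so the same mean can be reused later,
    e.g. when exporting embeddings normalized the same way.
    Pure-Python illustration, not MUSE's actual implementation.
    """
    n, dim = len(emb), len(emb[0])
    mean = [sum(row[d] for row in emb) / n for d in range(dim)]
    centered = [[row[d] - mean[d] for d in range(dim)] for row in emb]
    return centered, mean

# Tiny example: two 2-d vectors whose mean is (2.0, 3.0).
vecs = [[1.0, 2.0], [3.0, 4.0]]
centered, mean = center_embeddings(vecs)
```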
Regarding --dico_max_rank, --dico_min_size and --dico_max_size: these are parameters of the synthetic dictionaries we build during the refinement steps. In particular:
--dico_max_rank 15000 # never consider a pair where either the source or the target word is outside the 15000 most frequent words
--dico_max_size 10000 # never keep more than 10000 pairs in total
--dico_min_size 1000 # always keep at least 1000 pairs (used in combination with --dico_threshold, which removes translations we are not confident about)
You can check all this in dico_builder.py.
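The interplay of the three flags can be sketched roughly like this (a simplified pure-Python illustration of the filtering logic, not the real dico_builder.py code; the pair scores and the --dico_threshold confidence cutoff are assumptions here):

```python
def filter_dictionary(pairs, max_rank=15000, max_size=10000,
                      min_size=1000, threshold=0.0):
    """Filter candidate (src_rank, tgt_rank, score) translation pairs.

    Simplified sketch of the constraints behind --dico_max_rank,
    --dico_max_size and --dico_min_size; not the real MUSE code.
    """
    # --dico_max_rank: drop pairs where either word falls outside the
    # max_rank most frequent words (0 would disable this filter).
    if max_rank:
        pairs = [p for p in pairs if p[0] < max_rank and p[1] < max_rank]

    # Sort by confidence score, best pairs first.
    pairs = sorted(pairs, key=lambda p: -p[2])

    # --dico_min_size: keep at least min_size pairs even if their score
    # is below the confidence threshold; beyond min_size, enforce it.
    kept = [p for i, p in enumerate(pairs)
            if i < min_size or p[2] >= threshold]

    # --dico_max_size: never keep more than max_size pairs in total.
    if max_size:
        kept = kept[:max_size]
    return kept

# Example: one pair is dropped by max_rank, one by the threshold.
pairs = [(1, 2, 0.9), (20000, 3, 0.95), (4, 5, 0.1), (6, 7, 0.8)]
result = filter_dictionary(pairs, max_rank=15000, max_size=2,
                           min_size=1, threshold=0.5)
```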
Regarding --dis_dropout, this is just the dropout between the discriminator layers.
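In other words, --dis_input_dropout acts on the input embeddings while --dis_dropout sits between the hidden layers. A schematic layer list (layer names, sizes, and defaults here are illustrative, not MUSE's exact model code):

```python
def discriminator_layers(emb_dim=300, dis_hid_dim=2048, dis_layers=2,
                         dis_input_dropout=0.1, dis_dropout=0.0):
    """Schematic layer sequence for an MLP discriminator.

    Returns (layer_name, value) tuples showing where the two dropout
    flags act; illustrative sketch only, not MUSE's actual architecture.
    """
    layers = [("dropout", dis_input_dropout)]    # --dis_input_dropout: on the inputs
    in_dim = emb_dim
    for _ in range(dis_layers):
        layers.append(("linear", (in_dim, dis_hid_dim)))
        layers.append(("leaky_relu", 0.2))
        layers.append(("dropout", dis_dropout))  # --dis_dropout: between hidden layers
        in_dim = dis_hid_dim
    layers.append(("linear", (in_dim, 1)))       # final real/fake score
    return layers
```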
Thank you @glample, training becomes normal with the latest code (downloaded 4/23/2018):
INFO - 04/25/18 23:42:52 - 3:13:13 - 1500 source words - csls_knn_10 - Precision at k = 1: 33.733333
INFO - 04/25/18 23:42:52 - 3:13:13 - 1500 source words - csls_knn_10 - Precision at k = 5: 53.533333
INFO - 04/25/18 23:42:52 - 3:13:13 - 1500 source words - csls_knn_10 - Precision at k = 10: 59.533333
But an error always occurs at the last step, when exporting the mapping with the best model:
Traceback (most recent call last):
File "unsupervised.py", line 184, in <module>
trainer.export()
File "/home/jack/software/MUSE/src/trainer.py", line 255, in export
normalize_embeddings(src_emb, params.normalize_embeddings, mean=params.src_mean)
File "/home/jack/software/MUSE/src/utils.py", line 419, in normalize_embeddings
emb.sub_(mean)
RuntimeError: inconsistent tensor size, expected r_ [332647 x 300], t [332647 x 300] and src [200000 x 300] to have the same number of elements, but got 99794100, 99794100 and 60000000 elements respectively at /pytorch/torch/lib/TH/generic/THTensorMath.c:1008
Is this the bug you mentioned above? Thank you!
Yes, this is the bug I mentioned :) This is fixed: a620cc8
Closing for now, feel free to reopen if you still face this issue.
I'm unable to reproduce the results above, even with the changes to epoch size and number of epochs.
I am running: python unsupervised.py --src_lang en --tgt_lang zh --src_emb wiki.en.vec --tgt_emb wiki.zh.vec --n_refinement 10 --n_epochs 10 --epoch_size 250000 --normalize_embeddings center
and after 10 epochs (before refinement) only getting the following results:
INFO - 03/25/19 21:23:48 - 0:27:10 - 2230 source words - nn - Precision at k = 1: 1.165919
INFO - 03/25/19 21:23:48 - 0:27:10 - 2230 source words - nn - Precision at k = 5: 2.780269
INFO - 03/25/19 21:23:48 - 0:27:10 - 2230 source words - nn - Precision at k = 10: 3.587444
INFO - 03/25/19 21:23:48 - 0:27:10 - Found 2230 pairs of words in the dictionary (1500 unique). 0 other pairs contained at least one unknown word (0 in lang1, 0 in lang2)
INFO - 03/25/19 21:24:10 - 0:27:31 - 2230 source words - csls_knn_10 - Precision at k = 1: 1.121076
INFO - 03/25/19 21:24:10 - 0:27:31 - 2230 source words - csls_knn_10 - Precision at k = 5: 3.228700
INFO - 03/25/19 21:24:10 - 0:27:31 - 2230 source words - csls_knn_10 - Precision at k = 10: 4.843049
What's also interesting is that the mean cosine validation metric seems to decrease as the precision improves. The last epoch had the following value:
Mean cosine (csls_knn_10 method, S2T build, 10000 max size): 0.38720
while the first epoch (with worse precision) had:
Mean cosine (csls_knn_10 method, S2T build, 10000 max size): 0.62325
Any idea what's going on here?
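For context, the "Mean cosine" criterion averages the cosine similarity over the pairs of the generated dictionary; a simplified pure-Python sketch (the real criterion builds the pair list with CSLS, which is omitted here):

```python
import math

def mean_cosine(pairs):
    """Average cosine similarity over (src_vec, tgt_vec) pairs.

    Simplified version of the unsupervised validation criterion MUSE
    logs as "Mean cosine"; illustrative only.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)
    return sum(cos(u, v) for u, v in pairs) / len(pairs)

# One perfectly aligned pair (cos = 1) and one orthogonal pair (cos = 0).
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
score = mean_cosine(pairs)
```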