
Comments (11)

ricardorei avatar ricardorei commented on May 9, 2024

Hi @i55code, most COMET models (the only exception is our ranking model) are trained on z-scores from the WMT shared tasks. Z-scores are unbounded, so COMET predictions can go above 1 and below 0.

Negative scores are actually more common than scores above 1, but scores above 1 do sometimes happen when a translation is really good.
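
To illustrate why z-scores are unbounded, here is a minimal sketch. It assumes hypothetical raw annotator scores on a 0-100 scale (not actual WMT data), and `z_normalize` is an illustrative helper, not part of COMET:

```python
from statistics import mean, pstdev


def z_normalize(raw_scores):
    """Standardize raw scores to zero mean and unit standard deviation."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    return [(s - mu) / sigma for s in raw_scores]


# Hypothetical raw scores from a single annotator (0-100 scale).
raw = [20, 50, 55, 60, 65, 98]
z = z_normalize(raw)

# The best translation lands above 1 and the worst lands below 0,
# because a z-score measures distance from the mean in standard deviations.
print(max(z), min(z))
```

A model trained to regress onto targets like these can therefore predict values outside the [0, 1] range.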

from comet.

i55code avatar i55code commented on May 9, 2024

Thank you so much for explaining! This helps a lot:) @ricardorei

I would also like to ask two more questions:

  1. Is it possible for the data not to have a "src" field? Does using a different "src" have a big impact on COMET? BLEU does not require "src".
  2. The score format is [seg_scores, sys_score]. When we talk about the COMET score, it is the sys_score, right? What do the seg_scores do? In what cases would knowing the seg_scores be helpful?

Thanks!


ricardorei avatar ricardorei commented on May 9, 2024
  1. COMET compares the MT with both the source and the reference in a shared embedding space. If the source is completely different, the score should ideally drop. COMET metrics are designed to use the source, and some metrics (the reference-free ones) require only the source. We don't have any model that uses only the reference. Are you trying to use COMET in a particular use case where you don't have access to the source?

  2. The sys_score is the average of the seg_scores. The seg_scores are the quality assessments for each individual hypothesis (MT). They can be useful in applications where we want to compare two (or more) systems at the segment level.
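
The relationship between the two kinds of score can be sketched as follows. This is a minimal illustration with hypothetical segment scores for two systems; it does not call COMET itself:

```python
from statistics import mean

# Hypothetical segment-level scores for two MT systems on the same three segments.
seg_scores_a = [0.62, -0.15, 0.91]
seg_scores_b = [0.58, 0.40, 0.35]

# The system-level score is the average of the segment-level scores.
sys_score_a = mean(seg_scores_a)
sys_score_b = mean(seg_scores_b)

# Segment-level comparison: indices of the segments where system B beats system A.
b_wins = [i for i, (a, b) in enumerate(zip(seg_scores_a, seg_scores_b)) if b > a]

print(sys_score_a, sys_score_b, b_wins)
```

In this made-up example system A has the higher system-level score, yet system B is clearly better on the second segment, which is the kind of detail the averaged score hides.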


i55code avatar i55code commented on May 9, 2024

Thank you, @ricardorei, this helps a lot. I am trying to evaluate in a case where the source is not available. In my work, I am interested in how good a translation is given some reference text. If there is future development on source-free metrics, feel free to let me know.

One more question is about the list of covered languages: do these languages refer to the source language? If, say, a language is not covered, how much can people trust the output? I read the "uncertainty-aware" paper; is there a simple way to print out the bounds from COMET's repo?


ricardorei avatar ricardorei commented on May 9, 2024

The language list covers both source and target languages.

Let's imagine you are evaluating translations from Maori into English (or vice versa). Maori is not covered by XLM-R, so the results might be unreliable. This is especially important if the target language is the uncovered language and/or if you are using a reference-free metric.

Regarding the question "how much can people trust the output?", this is very hard to answer. In WMT20 we evaluated the model on Inuktitut and the results were good, yet Tom Kocmi from Microsoft tested several COMET models on several languages and reported that they were not stable for languages not covered by XLM-R (#18).

> I read the "uncertainty-aware" paper, is there a simple way to print out the bounds from COMET's repo?

The code used in that paper is based on a previous COMET codebase, and you can find everything here: UA_COMET

In the current codebase, you can only replicate the Monte Carlo Dropout results, using the --mc_dropout flag. This should give you reasonable bounds without having to run several models.


i55code avatar i55code commented on May 9, 2024

Thank you so much, @ricardorei. I mainly work with low-resource languages, and COMET sometimes gives surprisingly good results when BLEU is low, so I would be interested in getting good bounds. Thank you, I will try UA_COMET.

Thank you for your explanation, I really appreciate it and have a good weekend!


ricardorei avatar ricardorei commented on May 9, 2024

Thanks @i55code, have a nice weekend!


i55code avatar i55code commented on May 9, 2024

@ricardorei Hi Ricardo, I have one last question. For --mc_dropout, I saw the output on the command line, but is there a simple way to invoke it through the Python interface? I see that there is model.predict(...) and model.mc_dropout(); what is a simple command in Python to get bounds? Thanks!


ricardorei avatar ricardorei commented on May 9, 2024

It's an argument of the predict function:
mean_scores, std_scores, sys_score = model.predict(data, batch_size=8, gpus=1, mc_dropout=30)
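
One way to turn the per-segment means and standard deviations into bounds is sketched below. It assumes the MC dropout scores are roughly Gaussian and uses a 1.96 multiplier for an approximate 95% interval; neither assumption comes from the COMET codebase, and `confidence_bounds` and the sample values are hypothetical:

```python
def confidence_bounds(mean_scores, std_scores, z=1.96):
    """Return a (lower, upper) pair per segment.

    Assumes the MC dropout scores for each segment are roughly Gaussian,
    so mean +/- z * std approximates a confidence interval.
    """
    return [(m - z * s, m + z * s) for m, s in zip(mean_scores, std_scores)]


# Hypothetical values standing in for the mean_scores and std_scores
# returned by the predict call shown above.
mean_scores = [0.50, -0.20]
std_scores = [0.10, 0.05]

bounds = confidence_bounds(mean_scores, std_scores)
for (low, high), m in zip(bounds, mean_scores):
    print(f"score={m:.3f} bounds=[{low:.3f}, {high:.3f}]")
```

A wide interval for a segment suggests the model is uncertain about that prediction, which is worth checking when working with languages that XLM-R does not cover.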


i55code avatar i55code commented on May 9, 2024

Thank you, @ricardorei Ricardo! I really learned a lot today. I hope to use COMET more in my future research. Take good care and stay safe!


ricardorei avatar ricardorei commented on May 9, 2024

Thanks @i55code, feel free to reach out with questions!
