JingqingZ commented on August 17, 2024
  1. The PEGASUS model masks several tokens in the input and lets the encoder predict the masked tokens with an MLM loss. However, the paper states that the pegasus_large model drops the MLM loss. Does that mean the final model feeds masked inputs to the encoder, and the decoder uses the shifted target text and the encoder outputs to generate the target text with only the next-token prediction loss? If so, what is the difference between the PEGASUS pretrained model and the BART model (BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension)? The BART model also feeds noised text to its encoder and uses the next-token prediction loss on the decoder outputs.

Yes, in the final model, only GSG is applied and there is no MLM.

Document: ABCDE

  • BART: input: A_C_E, target: ABCDE
  • PEGASUS: input: A_C_E, target: BD
  2. What is the strategy for selecting target sentences as shown in Fig. 1? Is that the Gap Sentence Generation procedure?

Fig. 1 demonstrates both GSG and MLM. It does not use a specific sentence selection strategy, as the figure is for demonstration purposes only. Any strategy (e.g. random, lead, or principal) would fit the procedure shown in Fig. 1.
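To make the three strategy names concrete, here is a minimal sketch of how gap sentences could be picked from a document. It is not the repository's implementation: the function names are hypothetical, and the "principal" scorer uses a plain unigram-overlap F1 against the rest of the document as a simplified stand-in for a ROUGE-style importance score. See Section 3.1 of the paper for the actual definitions.

```python
import random
from collections import Counter


def select_random(sentences, m, seed=0):
    """Pick m sentence indices uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    m = min(m, len(sentences))
    return sorted(rng.sample(range(len(sentences)), m))


def select_lead(sentences, m):
    """Pick the first m sentence indices."""
    return list(range(min(m, len(sentences))))


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 on whitespace tokens (assumed stand-in for ROUGE-1 F1)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_principal(sentences, m):
    """Pick the m sentences that overlap most with the rest of the document."""
    scored = []
    for i, sentence in enumerate(sentences):
        rest = " ".join(s for j, s in enumerate(sentences) if j != i)
        scored.append((unigram_f1(sentence, rest), i))
    # Highest score first; ties broken by earlier position.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:m]
    return sorted(i for _, i in top)


doc = ["the cat sat", "the cat sat on the mat", "dogs bark"]
lead_idx = select_lead(doc, 2)
random_idx = select_random(doc, 2)
principal_idx = select_principal(doc, 2)
```

Whichever function returns the indices, the selected sentences are then masked in the encoder input and concatenated to form the target.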

Hope this may answer your questions. Thanks!

from pegasus.

PingYu-iris commented on August 17, 2024

  1. I do not understand "PEGASUS: input: A_C_E, target: BD".
    Since you removed the MLM loss, if the input for the decoder is "ABCDE", then the target would be "ABCDE". So what does "target: BD" mean?

  2. In Fig. 1, I don't understand how the input text for the encoder and the decoder is selected.
    In this example, your text is "Pegasus is mythical. It is pure white. It names the model."
    The input for the encoder is the masked "Pegasus is mythical. It names the model."
    The input for the decoder is "It is pure white."
    The target for the decoder is "It is pure white."

If I have a paragraph, how do I select the sentences for the encoder input and the decoder input?

JingqingZ commented on August 17, 2024
  1. Each letter actually represents one whole sentence. So the target "BD" has two sentences B and D, which are masked in the input.

  2. How to select sentences for encoder input and decoder input? The sentence selection strategies (from a document) are described in Section 3.1 GSG in the paper.

PingYu-iris commented on August 17, 2024

I still don't understand. Suppose our paragraph is "ABCDE":

In the BART paper, the encoder input is "A_B_E", the decoder input is "ABCD", and the decoder output is "ABCDE".

Taking this as an example, what are your encoder input, decoder input, and decoder output?

JingqingZ commented on August 17, 2024

As mentioned above,

Document: ABCDE

  • BART: encoder input: A_C_E, decoder input/target: ABCDE (it seems BART masks tokens or spans, not sentences)
  • PEGASUS: encoder input: A_C_E, decoder input/target: BD (we mask sentences, not tokens)

The decoder input and the decoder output (target) are mostly the same, except that the decoder input is shifted right (i.e. the token <START> is inserted at its beginning).
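The ABCDE example can be sketched in a few lines of Python. This is a toy illustration, not the PEGASUS or BART code: each letter stands for one unit (a sentence for PEGASUS), the `_` mask symbol and the function names are hypothetical, and real models shift at the token level rather than per sentence.

```python
MASK = "_"          # hypothetical mask placeholder, as written in the example above
START = "<START>"   # start token inserted when shifting the target right


def mask_units(units, masked_idx):
    """Replace the units at masked_idx with the mask symbol."""
    return [MASK if i in masked_idx else u for i, u in enumerate(units)]


def shift_right(target):
    """Decoder input is the target shifted right, with START in front."""
    return [START] + target[:-1]


def bart_style(units, masked_idx):
    """Target is the full original document."""
    target = list(units)
    return mask_units(units, masked_idx), shift_right(target), target


def pegasus_style(units, masked_idx):
    """Target is only the masked units (the gap sentences)."""
    target = [u for i, u in enumerate(units) if i in masked_idx]
    return mask_units(units, masked_idx), shift_right(target), target


doc = list("ABCDE")
masked = {1, 3}  # mask B and D

bart_enc, bart_dec_in, bart_target = bart_style(doc, masked)
peg_enc, peg_dec_in, peg_target = pegasus_style(doc, masked)

print("BART   ", "".join(bart_enc), bart_dec_in, "".join(bart_target))
print("PEGASUS", "".join(peg_enc), peg_dec_in, "".join(peg_target))
```

Both schemes see the same encoder input `A_C_E`; they differ only in the target, which is `ABCDE` for the BART-style setup and `BD` for the PEGASUS-style setup.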

PingYu-iris commented on August 17, 2024

Thanks
