
Comments (6)

zinengtang avatar zinengtang commented on September 16, 2024

Generative models search for answers in a very large space, whereas discriminative models only search within the context or a limited label pool. But there are two cases of QA.

In the case of extractive QA, i.e., where the answer is in the context, we can give the generative model a prompt telling it that the task is extractive. The model would then know that it should obtain the answer only from the context, and it can be more confident. This prompt needs to appear in pretraining, of course.

But for QA where the answer is not in the context, we can only use generative models. It is exactly the nature of generative models to produce versatile outputs. For this part, it depends on the tasks you are dealing with and the dataset the model was trained on. What specific example can you give?

from i-code.

logan-markewich avatar logan-markewich commented on September 16, 2024

Yes, I should have clarified my interest was in extractive QA, like the DocVQA dataset. When asking questions to a model trained on DocVQA, it can be very helpful to use the confidence to filter the answer and only show an end user the most confident predictions. Of course, this only works when predicting simple start/end positions of answers.

The paper doesn't specify exactly, but I'm assuming you used UDOP as a generative model when training/evaluating on DocVQA?


zinengtang avatar zinengtang commented on September 16, 2024

Yes, it is a generative model.
Usually we use beam search or greedy search to obtain the best sequence. The confidence here is the joint probability of the searched words. I am not sure why finding the confidence is an issue.
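For concreteness, the joint probability described here can be sketched in plain Python, assuming you have access to the per-token log-probabilities the decoder assigned to the tokens it picked (the numbers below are hypothetical):

```python
import math

def sequence_confidence(token_logprobs):
    """Joint probability of a decoded sequence: exp of the sum of the
    per-token log-probabilities chosen by greedy/beam search."""
    return math.exp(sum(token_logprobs))

def length_normalized_confidence(token_logprobs):
    """Geometric mean of the per-token probabilities, so longer answers
    are not penalized simply for having more tokens."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probs for a 3-token answer.
logprobs = [math.log(0.9), math.log(0.8), math.log(0.95)]
joint = sequence_confidence(logprobs)            # 0.9 * 0.8 * 0.95 = 0.684
normed = length_normalized_confidence(logprobs)  # geometric mean, ~0.881
```

Note the length dependence: the raw joint probability shrinks with every extra token, which is one reason practitioners often length-normalize before comparing confidences across answers.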


logan-markewich avatar logan-markewich commented on September 16, 2024

Finding the confidence is easy enough, for sure! In the past, I've just found that the confidence of the generated string is less helpful than the traditional start/end logits. Most recently I was using the Donut model after training on DocVQA. Often the confidences did not correlate with an answer being right or wrong (>95% confidence on a wrong answer was a common finding). Instead, I had to rely on the model predicting an "answer not found" token, but even that was not as helpful.

Contrasted with traditional start/end position predictions, those confidences are an excellent indicator of a correct or wrong answer.
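The start/end confidence being contrasted here is typically computed as the product of the softmax probabilities of the chosen start and end positions. A minimal sketch in plain Python, with hypothetical logits over a short context:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def span_confidence(start_logits, end_logits):
    """Pick the most probable start and end positions independently and
    return the span plus the product of their probabilities."""
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    s = max(range(len(p_start)), key=p_start.__getitem__)
    e = max(range(len(p_end)), key=p_end.__getitem__)
    return (s, e), p_start[s] * p_end[e]

# Hypothetical start/end logits over a 5-token context.
start_logits = [0.1, 5.0, 0.2, 0.0, -1.0]
end_logits = [-1.0, 0.0, 0.3, 6.0, 0.1]
span, conf = span_confidence(start_logits, end_logits)  # span (1, 3)
```

(Real QA heads also enforce start <= end and cap the span length; this sketch skips that bookkeeping.)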

In any case, I am excited to see if this situation has improved with UDOP :)


zinengtang avatar zinengtang commented on September 16, 2024

Why are confidences an excellent indicator in traditional start/end position predictions? Do you have any experiments or stats to support this claim? For example, you could show that choosing the most confident answer beats choosing from the top 5 in a generative model. >95% does not mean anything in absolute terms, only relative ones; >95% does not mean the model is confident.


logan-markewich avatar logan-markewich commented on September 16, 2024

Yes, 95% is relative. But if you don't use softmax, you can get a better indication of absolute confidence. So maybe I shouldn't be using percentages here, haha; it's more of a signal strength, then.

I've been doing a lot of work with extractive QA. Unfortunately, it's all NDA and whatnot 🙄, but broadly speaking, I am extracting specific fields from documents, and each field has multiple questions that might lead to a desired answer. However, there is no guarantee each field is in the document.

For each field, one can determine an appropriate threshold, after testing on a large enough test set, so that the extracted answer is most likely correct.

Usually, these confidence thresholds range from 20-60. But in my experiments with generative models like Donut, this process starts to fall apart. I wish I could share the data, but it seems others on the Donut GitHub have shared similar observations.
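The per-field thresholding workflow described above can be sketched as follows; the scores and the `pick_threshold` helper are hypothetical, not from any shared codebase, and assume you have labeled a held-out test set with whether each extraction was correct:

```python
def pick_threshold(scored, min_precision=0.95):
    """Find the lowest confidence threshold whose accepted predictions
    reach a target precision on a held-out test set.

    `scored` is a list of (confidence, is_correct) pairs for one field.
    Returns None if no threshold reaches the target precision.
    """
    for thresh in sorted({conf for conf, _ in scored}):
        accepted = [ok for conf, ok in scored if conf >= thresh]
        if accepted and sum(accepted) / len(accepted) >= min_precision:
            return thresh
    return None

# Hypothetical field-level results from a test set.
scored = [(0.9, True), (0.8, True), (0.6, True), (0.5, False),
          (0.4, True), (0.3, False), (0.2, False)]
threshold = pick_threshold(scored)  # 0.6: above it, 3/3 answers correct
```

In production you would then only surface a field's extracted answer when its confidence clears that field's threshold; anything below it is treated as "not found" and routed to manual review.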

In any case, this is just my thoughts/experiences from a commercial perspective. I'm still very excited to take UDOP for a spin 💪🏻

