regarding attention concatenation in beam search (a-pytorch-tutorial-to-image-captioning, closed)

sgrvinod commented on June 1, 2024


Hi! If I understand your question correctly, you are wondering if it makes sense that different candidate words arising from the same preceding sequence have the same attention map.

Yes, definitely. As you know, the attention map is not based on the candidate words. It is the other way around - the candidate words were generated as likely possibilities for that particular attention map. Therefore, by definition, this will always be the case.

For example, assume we're looking at an image of a man standing next to a New York City taxi.

Assume that, so far, the model has generated "a man stands next to a".

Based on this generated sequence, it decides to attend to the taxi in the image and creates the corresponding attention map. Then, based on this attention map, there can be more than one valid choice for the next word in the sequence:

a man stands next to a car
a man stands next to a cab
a man stands next to a taxi
a man stands next to a yellow (...car)
a man stands next to a new (...york taxi)

All of these choices are based on the same attention map.
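
To make this concrete, here is a minimal sketch of a single beam expansion, assuming a toy decoder in plain Python. The names `expand_beam`, `hidden`, `regions` and `word_vectors` are hypothetical and are not taken from the tutorial's code, and the dot-product attention is only a stand-in for whatever attention network the model really uses. The point it illustrates is that the attention map is computed once from the preceding sequence and the image, and every candidate word that grows out of that sequence carries a copy of it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expand_beam(prefix, hidden, regions, word_vectors, k):
    """Expand one partial caption into its k best continuations (toy, hypothetical)."""
    # The attention map depends only on the decoder state (which summarises
    # the prefix) and the image regions. No candidate word is involved yet.
    alpha = softmax([sum(h * r for h, r in zip(hidden, region)) for region in regions])

    # Attention-weighted image context.
    dims = len(regions[0])
    context = [sum(a * region[d] for a, region in zip(alpha, regions)) for d in range(dims)]

    # Candidate words are scored *from* the attended context.
    scores = {w: sum(c * v for c, v in zip(context, vec)) for w, vec in word_vectors.items()}
    top_words = sorted(scores, key=scores.get, reverse=True)[:k]

    # Every continuation of this prefix is paired with the same attention map.
    return [(prefix + [w], list(alpha)) for w in top_words]


# Two image regions: index 0 is "the man", index 1 is "the taxi" (made-up features).
regions = [[1.0, 0.0], [0.0, 1.0]]
hidden = [0.0, 2.0]  # assumed decoder state after "a man stands next to a"
word_vectors = {
    "taxi": [0.0, 1.0],
    "cab": [0.0, 0.9],
    "car": [0.0, 0.8],
    "man": [1.0, 0.0],
}

candidates = expand_beam("a man stands next to a".split(), hidden, regions, word_vectors, k=3)
for words, alpha in candidates:
    print(" ".join(words), [round(a, 3) for a in alpha])
```

Each printed candidate ends in a different word but shows identical attention weights, because the weights were fixed before any word was chosen. In a real beam search the same step would run for every sequence currently in the beam, and candidates descending from different sequences would generally have different attention maps.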
