What do you do if you have a lot of videos? about clip4clip HOT 6 CLOSED

arrowluo commented on May 20, 2024

What do you do if you have a lot of videos?

from clip4clip.

Comments (6)

ArrowLuo commented on May 20, 2024

@lonngxiang The question is not clear to me. Could you describe it in more detail?
I guess your problem is how to deal with the memory cost when there are a lot of videos as retrieval candidates. If that is the case, I think the loose type solves this naturally with cached video features: the video features are computed once and stored, and the retrieval scores can then be calculated from those precomputed features. Besides, hashing methods can also be used to speed up retrieval.
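A minimal sketch of scoring a query against cached video features, assuming each video has already been encoded offline into one fixed-size vector. This is not code from the CLIP4Clip repository: the helper names `l2_normalize` and `retrieve_top_k`, the 512-dimensional size, and the random stand-in features are all hypothetical, and NumPy is used only for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def retrieve_top_k(text_vec, video_feats, k=5):
    """Rank cached video vectors against one text vector.

    Returns the indices of the k best videos and their scores, best first.
    """
    scores = l2_normalize(video_feats) @ l2_normalize(text_vec)
    k = min(k, scores.shape[0])
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Hypothetical cache: random stand-ins for features computed once and stored.
rng = np.random.default_rng(0)
video_feats = rng.standard_normal((1000, 512)).astype(np.float32)

# Hypothetical query: a text vector that lies close to video 42.
text_vec = video_feats[42] + 0.1 * rng.standard_normal(512).astype(np.float32)

indices, scores = retrieve_top_k(text_vec, video_feats, k=3)
```

With this layout the video encoder never runs at query time. Only the text is encoded, and the ranking is a single matrix-vector product over the cache.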

lonngxiang commented on May 20, 2024

What I mean is: a video is divided into many image frames. If every query is matched at the frame level, the results become very complicated. I would like to know whether I can get a single vector for the video directly, so that I can recall the whole video whose content matches the query.

ArrowLuo commented on May 20, 2024

Yes, in our paper we aggregate all frame features via mean pooling, an LSTM, a Transformer, etc. Thus, a video is indeed encoded as a single vector. Please see Section 3.3 for more information.
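As a rough illustration of the simplest of these options, mean pooling, the sketch below averages per-frame features into one vector per video. It is an assumption-laden example rather than the repository's implementation: the function name `mean_pool_frames`, the mask convention (1 for a real frame, 0 for padding), and the toy numbers are hypothetical.

```python
import numpy as np

def mean_pool_frames(frame_feats, frame_mask):
    """Average the valid frame features of each video into a single vector.

    frame_feats: array of shape (num_videos, num_frames, dim)
    frame_mask:  array of shape (num_videos, num_frames), 1 = real frame, 0 = padding
    """
    mask = frame_mask[..., None].astype(frame_feats.dtype)
    summed = (frame_feats * mask).sum(axis=1)
    # Guard against division by zero for a video with no valid frames.
    counts = np.maximum(mask.sum(axis=1), 1.0)
    return summed / counts

# Toy input: one video, three frame slots, the last slot is padding.
frame_feats = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
frame_mask = np.array([[1, 1, 0]])

video_vec = mean_pool_frames(frame_feats, frame_mask)
```

The padded frame is ignored, so the result is the average of the first two frames only. The LSTM and Transformer variants replace this plain average with a learned aggregation.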

lonngxiang commented on May 20, 2024

OK, thank you for your patient reply.

ArrowLuo commented on May 20, 2024

Thank you for your attention. Feel free to reopen this issue if you have any other questions.

lonngxiang commented on May 20, 2024

So mean pooling, an LSTM, or a Transformer is used to generate a video vector. Does the paper then perform retrieval by similarity, i.e. the dot product of the text vector with the video vectors?
