
Comments (7)

tianyu0207 commented on July 17, 2024

Can you kindly explain the process you followed for generating the I3D features of the ShanghaiTech dataset, so that we can follow the same process for other datasets and videos as well?

video frames from non-overlapping sliding windows (16 frames each) are passed through the I3D network; features
are extracted from the ‘Mix 5c’ network layer, that are then reshaped to 2048-D vectors
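For anyone trying to reproduce this, the windowing step described in the quote can be sketched as below. This is a minimal NumPy illustration under stated assumptions: the I3D forward pass itself is omitted, trailing frames that do not fill a whole 16-frame window are dropped (a common convention; RTFM's exact handling of the remainder is not specified here), and the name `make_clips` is hypothetical, not from the RTFM code.

```python
import numpy as np

def make_clips(frames: np.ndarray, clip_len: int = 16) -> np.ndarray:
    """Split a video into non-overlapping clips of `clip_len` frames.

    frames: array of shape (T, H, W, C). Trailing frames that do not
    fill a complete clip are dropped (assumption, see note above).
    Returns an array of shape (T // clip_len, clip_len, H, W, C),
    ready to be fed clip-by-clip through an I3D network.
    """
    n_clips = frames.shape[0] // clip_len
    return frames[: n_clips * clip_len].reshape(
        n_clips, clip_len, *frames.shape[1:]
    )

# Example: a 100-frame video yields 6 complete 16-frame clips.
video = np.zeros((100, 240, 320, 3), dtype=np.uint8)
clips = make_clips(video)
print(clips.shape)  # (6, 16, 240, 320, 3)
```

Each clip would then be passed through I3D and the 'Mix 5c' output pooled into a single feature vector per clip.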

from rtfm.

GowthamGottimukkala commented on July 17, 2024

First of all, I want to thank you for your work.
If I'm not wrong, the 'Mix 5c' layer of the I3D network outputs a 1024-D vector for every 16 frames.

  1. How do I reshape that to 2048-D? (Concatenate the features generated from RGB and flow -> 1024 + 1024?)
  2. Also, your .npy files for the ShanghaiTech dataset have shape (k, 10, 2048). What does each dimension indicate? The paper says the proposed RTFM receives a T×D feature matrix (2 dimensions) per video, so I don't understand why the uploaded features have 3 dimensions.
  3. Were the features for the ShanghaiTech dataset in the given OneDrive link generated using only RGB frames, without optical-flow images?

Kindly explain these so that more of us can implement this. Thanks in advance


tianyu0207 commented on July 17, 2024


  1. Hi, please use the I3D network with ResNet-50 as the backbone to extract the features.
  2. To be consistent with previous works, we use 10-crop augmentation; hence, 10 corresponds to the ten crops of each frame, and k is the number of 16-frame clips.
  3. The provided features use only the RGB frames.
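For reference, "10-crop" augmentation is conventionally the four corner crops plus the centre crop, each together with its horizontal flip. The sketch below assumes RTFM follows this standard recipe; the function name `ten_crop` and the 224-pixel crop size are illustrative assumptions, not taken from the repository.

```python
import numpy as np

def ten_crop(frame: np.ndarray, size: int) -> np.ndarray:
    """Return the 4 corner crops + centre crop of `frame`, plus the
    horizontal flip of each: output shape (10, size, size, C).

    frame: (H, W, C) with H, W >= size.
    """
    h, w = frame.shape[:2]
    ch, cw = (h - size) // 2, (w - size) // 2
    origins = [(0, 0), (0, w - size), (h - size, 0),
               (h - size, w - size), (ch, cw)]
    crops = [frame[y:y + size, x:x + size] for y, x in origins]
    crops += [c[:, ::-1] for c in crops]   # horizontal flips
    return np.stack(crops)

crops = ten_crop(np.zeros((240, 320, 3), dtype=np.uint8), 224)
print(crops.shape)  # (10, 224, 224, 3)
```

Applying this to every frame of every 16-frame clip, and running I3D on each crop stream separately, is what produces the (k, 10, 2048) feature shape described above.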


GowthamGottimukkala commented on July 17, 2024

Thank you very much for clarifying this. It really helps.
In another issue in this repo, I found that we need to divide each video into 32 snippets, which means that for any given video I'll get a 32×2048 feature matrix. So wouldn't k be fixed at 32, rather than variable as you mentioned in the second point? Sorry if I misunderstood.


tianyu0207 commented on July 17, 2024


Hi, the features are first extracted with I3D over every 16 frames, so k = total_frames / 16. Then, during training, we process each video into 32 segments using the process_feat function in util.py. This is the same as in the paper 'Real-World Anomaly Detection in Surveillance Videos'.
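The resampling such a process_feat function performs can be sketched as below: a variable-length (k, D) clip-feature matrix is mapped onto a fixed number of segments by averaging the clip features that fall into each span. This is an assumption about its behaviour based on the description above, not the repository's exact code; the name `to_fixed_segments` and the edge-case handling are illustrative.

```python
import numpy as np

def to_fixed_segments(feat: np.ndarray, n_seg: int = 32) -> np.ndarray:
    """Resample a (k, D) clip-feature matrix to a fixed (n_seg, D) one.

    The k clips are partitioned into n_seg roughly equal spans and the
    features within each span are averaged. If the video has fewer
    clips than segments, some spans are empty and the nearest clip
    feature is repeated (an assumed convention).
    """
    k, d = feat.shape
    out = np.zeros((n_seg, d), dtype=feat.dtype)
    bounds = np.linspace(0, k, n_seg + 1, dtype=int)
    for i in range(n_seg):
        lo, hi = bounds[i], bounds[i + 1]
        if lo == hi:                      # fewer clips than segments
            out[i] = feat[min(lo, k - 1)]
        else:
            out[i] = feat[lo:hi].mean(axis=0)
    return out

# A video with k = 100 clips becomes a fixed 32x2048 matrix.
seg = to_fixed_segments(np.arange(100.0 * 2048).reshape(100, 2048))
print(seg.shape)  # (32, 2048)
```

This explains why k is variable in the stored .npy files but every video is 32×2048 by the time it reaches the model during training.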


DungVo1507 commented on July 17, 2024

Hi @GowthamGottimukkala
I am also looking for a way to extract I3D features on my own dataset. Have you succeeded in extracting them? If so, could you guide me?
Thank you so much!


ro1406 commented on July 17, 2024

Hey @GowthamGottimukkala @DungVo1507, did either of you find a way to successfully extract the features for your own datasets? If so, could you guide me, or point me to any links that would make it easy to do the same?


