Comments (7)
Can you kindly explain the process you followed for generating the I3D features of the ShanghaiTech dataset, so that we can follow the same process for other datasets and videos as well?
Video frames from non-overlapping sliding windows (16 frames each) are passed through the I3D network; features
are extracted from the 'Mix 5c' network layer, which are then reshaped to 2048-D vectors.
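For concreteness, the non-overlapping 16-frame windowing described in that quote can be sketched as follows; the frame array and its dimensions here are hypothetical placeholders, not the repository's code:

```python
import numpy as np

# Hypothetical illustration: split a video of T frames into
# non-overlapping 16-frame clips, dropping any leftover frames.
# `frames` stands in for decoded RGB frames of shape (T, H, W, 3).
T, H, W = 100, 240, 320
frames = np.zeros((T, H, W, 3), dtype=np.uint8)

clip_len = 16
num_clips = T // clip_len            # k = total_frames // 16
clips = frames[: num_clips * clip_len].reshape(
    num_clips, clip_len, H, W, 3
)
print(clips.shape)                   # (6, 16, 240, 320, 3)
```

Each of the resulting 16-frame clips would then be fed through I3D to produce one feature vector per clip.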
from rtfm.
First of all, I want to thank you for your work.
If I'm not wrong, the 'Mix 5c' layer of the I3D network outputs a 1024-D vector for every 16 frames.
- How do I reshape that to 2048-D? (By concatenating the RGB and flow features -> 1024+1024?)
- Also, your .npy files for the ShanghaiTech dataset have shape (k, 10, 2048). What does each dimension indicate? The paper says the proposed RTFM receives a T x D feature matrix (two dimensions) per video, so I don't understand why the uploaded features have three dimensions.
- Were the features in the given OneDrive link generated using only RGB frames, without optical-flow images?
Kindly explain these points so that more of us can implement this. Thanks in advance.
- Hi, please use the I3D network with a ResNet-50 backbone to extract features.
- To be consistent with previous works, we use 10-crop augmentation; hence, 10 represents the crops of each frame and k represents the number of 16-frame clips.
- The generated features use only the RGB stream.
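Putting the answer above together, the layout of the stored features can be sketched like this; `fake_i3d` is a hypothetical stand-in for the real I3D (ResNet-50) forward pass, used only to show the resulting shapes:

```python
import numpy as np

# Stand-in for the real I3D (ResNet-50) forward pass: the real code
# would run one 16-frame crop through the network and take the pooled
# 'Mix 5c' output; here we just return a 2048-D vector.
def fake_i3d(crop_clip):
    return np.zeros(2048, dtype=np.float32)

k = 6            # number of 16-frame clips in the video
num_crops = 10   # 10-crop augmentation of each clip

features = np.stack([
    np.stack([fake_i3d(None) for _ in range(num_crops)])
    for _ in range(k)
])
print(features.shape)   # (6, 10, 2048): the (k, 10, 2048) .npy layout
```

So the 10 in the stored `(k, 10, 2048)` arrays indexes the crops, and 2048 comes directly from the ResNet-50 I3D backbone (no RGB+flow concatenation).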
Thank you very much for clarifying this. It really helps.
In another issue in this repo, I found that we need to divide each video into 32 snippets, which means any given video yields a 32 x 2048 feature matrix. So wouldn't k be fixed at 32, rather than variable as you mentioned in your second point? Sorry if I didn't understand it correctly.
Hi, features are first extracted from every 16 frames using I3D, so k = total_frames / 16. Then, during training, we process each video into 32 segments using the process_feat function in util.py. This is the same as in the paper 'Real-world Anomaly Detection in Surveillance Videos'.
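The segmentation step mentioned above (pooling a variable-length `(k, D)` feature array into a fixed 32 segments) might look roughly like this sketch; the details are an assumption based on the description, not necessarily the repository's exact `process_feat` code:

```python
import numpy as np

# Sketch of a process_feat-style function: average-pool k clip
# features into a fixed number of temporal segments.
def process_feat(feat, length=32):
    new_feat = np.zeros((length, feat.shape[1]), dtype=np.float32)
    # split the k clip indices into `length` roughly equal segments
    r = np.linspace(0, len(feat), length + 1, dtype=int)
    for i in range(length):
        if r[i] != r[i + 1]:
            new_feat[i] = feat[r[i]:r[i + 1]].mean(axis=0)
        else:
            # fewer clips than segments: repeat the nearest clip
            new_feat[i] = feat[r[i]]
    return new_feat

feat = np.arange(64, dtype=np.float32)[:, None].repeat(2, axis=1)  # (64, 2)
out = process_feat(feat, 32)
print(out.shape)        # (32, 2): fixed 32 segments regardless of k
```

This is why k can vary per video in the stored .npy files while the model still sees a fixed 32 x D matrix during training.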
Hi @GowthamGottimukkala
I am also looking for a way to extract I3D features on my dataset. Were you able to extract them successfully? If so, could you guide me?
Thank you so much!
Hey @GowthamGottimukkala @DungVo1507, did either of you find a way to successfully extract features for your own datasets? If so, could you guide me or point me to any helpful links for doing the same?
Related Issues (20)
- feature extraction
- Results can't be reproduced for ShanghaiTech
- Feature Extraction
- Generation of MATLAB Labels
- AUC for UCF-Crime is too low and can't be reproduced
- Can't download UCF_Crime test dataset
- Customized dataset
- Parameter setting for UCSD Ped2 dataset
- Some questions about extracting I3D features
- xd-violence
- How to implement inference code?
- Where do I find dir test_frame_mask, or make it?
- How to make ground truth file and list file for the ShanghaiTech dataset
- UCSD-Ped2 experiment settings
- Custom .mat files
- Can not reproduce the provided I3D features extracted from UCF-Crime on the RTFM GitHub
- The UCF-Crime checkpoints link is disabled
- Only ShanghaiTech features and checkpoint for ShanghaiTech can be used in the link
- UCF-Crime train I3D features on Google Drive: download failed (access forbidden)
- Feature extraction results did not meet expectations