Comments (7)
@Enclavet First, the TimeSformer results shown in this repo come from a model pretrained on K600; on K400 it reaches around 77%. Whether you get similar performance depends largely on your hparams. Could you show me the hparams logged before training starts?
from videotransformer-pytorch.
Attaching hparams:
Namespace(lr=0.005, epoch=15, gpus=-1, nccl_ifname='lan2', batch_size=8, num_workers=4, log_interval=30, save_ckpt_freq=20, num_class=400, num_samples_per_cls=10000, arch='timesformer', attention_type='divided_space_time', pretrain='vit', optim_type='sgd', lr_schedule='cosine', objective='supervised', resume=False, resume_from_checkpoint=None, num_frames=8, frame_interval=40, seed=0, train_data_path='/home/ec2-user/train_list.txt', val_data_path='/home/ec2-user/val_list.txt', test_data_path=None, root_dir='/home/ec2-user/workdir')
@Enclavet The hparams are almost the same as my experiment settings, except that I set the epoch count to 30 for the cosine lr schedule and the frame interval to 32. You can try the default settings to see the final result. By the way, why did you choose 40 for the frame interval? In my opinion, 32 is enough to cover the entire video at 25fps. So my question is how you performed the data preparation for K400, and whether you aligned the fps of each video sample.
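To make the coverage argument concrete, here is a rough sketch of the arithmetic. It assumes that one sampled clip spans `num_frames * frame_interval` source frames; that sampling model is an assumption for illustration and is not confirmed in this thread. The helper name `clip_span_seconds` is hypothetical.

```python
def clip_span_seconds(num_frames: int, frame_interval: int, fps: float) -> float:
    """Seconds of source video spanned by one clip.

    Assumes the clip covers num_frames * frame_interval source frames
    (an illustrative assumption, not the repo's confirmed behavior).
    """
    return num_frames * frame_interval / fps


if __name__ == "__main__":
    # Settings mentioned in this thread, at the two frame rates discussed.
    for num_frames, frame_interval in [(8, 32), (8, 40), (12, 20)]:
        for fps in (25, 30):
            span = clip_span_seconds(num_frames, frame_interval, fps)
            print(f"{num_frames} x {frame_interval} @ {fps}fps -> {span:.2f}s")
```

Under that assumption, 8 frames at interval 32 span 256 source frames, which is slightly more than a 10-second video at 25fps, while the same setting spans less than 10 seconds of a 30fps video. This is one reason the fps of the samples matters.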
@mx-mark My data comes from this repo: https://github.com/cvdfoundation/kinetics-dataset.
Videos appear to be 10 seconds in length at around 25-30fps (not all the same). Are you doing any more data preparation beyond downloading the video + cutting the relevant section?
As mentioned, a frame interval of 32 with 8 frames should cover most videos; I was using 40 as a test. I have also trained with 32 and got similar performance. Actually, the best val acc_top1 was ~75 after 15 epochs, not ~73 as mentioned earlier.
Do you think more epochs will help? I notice that at some point acc does not improve with more epochs and can actually decrease.
@Enclavet Normally, we resample all videos to the same fps.
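One possible way to do this resampling is with ffmpeg's `fps` video filter. The sketch below only illustrates the idea; the thread does not say which tool was actually used, and the helper names `build_resample_cmd` and `resample` and the example file names are hypothetical. It requires `ffmpeg` on the PATH to actually run the conversion.

```python
import subprocess
from pathlib import Path


def build_resample_cmd(src, dst, fps: int = 25) -> list:
    """Build an ffmpeg command that re-encodes src to a constant frame rate."""
    # -y overwrites dst if it exists; -vf fps=N resamples the video stream.
    return ["ffmpeg", "-y", "-i", str(src), "-vf", f"fps={fps}", str(dst)]


def resample(src, dst, fps: int = 25) -> None:
    """Run the resampling command; raises CalledProcessError on failure."""
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_resample_cmd(src, dst, fps), check=True)


if __name__ == "__main__":
    # Print the command for a hypothetical input file without executing it.
    print(" ".join(build_resample_cmd("input.mp4", "resampled/input.mp4")))
```

Calling `resample` on every file in the train and val lists before training would give all samples the same frame rate, so a fixed `frame_interval` covers the same duration in each video.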
I aligned my dataset to 225 dimensions and 25fps and ran training on K400. I was able to achieve >76 top-1 acc.
Running it with K600 now.
I was never able to achieve >78 top-1 acc without modifying num_frames and frame_interval.
I was able to achieve >78 top-1 acc on K600 with num_frames = 12 and frame_interval = 20.
This is with a dataset from https://github.com/cvdfoundation/kinetics-dataset resampled to 25fps and aligned to 225 dimensions.
Closing this as I am happy with the performance.