irhum / r2plus1d-pytorch

PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

License: MIT License

Python 100.00%

r2plus1d-pytorch's Introduction

R2Plus1D-PyTorch

Link to original: paper and code

NOTE: This repository has been archived, although forks and other work that build on it remain welcome.

Requirements

R2Plus1D-PyTorch has the following requirements:

PyTorch 0.4 and dependencies
OpenCV (tested on 3.4.0.12)
tqdm (for progress bars)

About this repository

This repository consists of four Python files:

module.py - Contains an implementation of the factored R2Plus1D convolution that the entire implementation is built around. It is designed as a replacement for nn.Conv3d in the appropriate scenarios.
network.py - Uses module.py to build the residual network described in the paper.
dataset.py - Implements a PyTorch dataset that can load videos with the appropriate labels from a given directory.
trainer.py - A lightly modified version of the training script from the PyTorch tutorials, used to train the model. It supports saving and restoring training progress.
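
To make the factorization concrete, here is a minimal sketch of the channel arithmetic behind a (2+1)D block, following the formula given in the paper: a t x d x d 3D convolution is replaced by a 1 x d x d spatial convolution into M intermediate channels, followed by a t x 1 x 1 temporal convolution, with M chosen so the weight count roughly matches the full 3D convolution. The function names are hypothetical and are not taken from module.py, which may compute this differently.

```python
def intermediate_channels(in_channels, out_channels, t=3, d=3):
    """Number of channels M between the spatial and temporal convolutions."""
    numerator = t * d * d * in_channels * out_channels
    denominator = d * d * in_channels + t * out_channels
    return numerator // denominator


def conv3d_params(in_channels, out_channels, t=3, d=3):
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return in_channels * out_channels * t * d * d


def r2plus1d_params(in_channels, out_channels, t=3, d=3):
    """Weights in the factored pair: 1 x d x d spatial, then t x 1 x 1 temporal."""
    mid = intermediate_channels(in_channels, out_channels, t, d)
    spatial = in_channels * mid * d * d
    temporal = mid * out_channels * t
    return spatial + temporal


print(intermediate_channels(64, 64))                    # 144
print(conv3d_params(64, 64), r2plus1d_params(64, 64))   # 110592 110592
```

For a 64-to-64 channel layer with 3 x 3 x 3 kernels the two weight counts match exactly; in general the floor division makes the factored block slightly smaller than the 3D convolution it replaces.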

Training on Kinetics-400/600

This repository does not include a crawler or downloader for the Kinetics-400/600 dataset; however, one can be found here. It is strongly recommended to downsample the videos before training (not on the fly), using a tool such as ffmpeg. If using the crawler, this can be done by adding "-vf", "scale=172:128" to the ffmpeg command list in the download clip function.
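
For videos that are already on disk, the same scale filter can be applied with a small wrapper. This is only a sketch: the function names and file paths are hypothetical, and only the "-vf", "scale=172:128" arguments come from the text above.

```python
import subprocess


def build_downsample_command(src, dst, width=172, height=128):
    """Return an ffmpeg argument list that rescales src and writes dst."""
    return [
        "ffmpeg",
        "-i", src,                                    # input video
        "-vf", "scale={}:{}".format(width, height),   # resize filter
        dst,                                          # output video
    ]


def downsample(src, dst):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_downsample_command(src, dst), check=True)
```

Calling downsample("in.mp4", "out.mp4") requires ffmpeg to be installed and on the PATH.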

Training in general

This repository is designed for the ResNet to be trained on any dataset of videos in general, using the VideoDataloader class from dataset.py. It expects the videos to be arranged as: directory -> [train/val] folders -> [class_label] folders (one per class) -> videos (the files themselves).
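
The expected layout can be illustrated with a short sketch that walks such a directory and assigns an integer label to each class folder. The function list_videos is hypothetical and is not the code in dataset.py; it assumes labels are assigned in alphabetical order of the class folder names.

```python
from pathlib import Path


def list_videos(root, split="train"):
    """Collect video paths and integer labels from root/split/class_label/*."""
    split_dir = Path(root) / split
    # One sub-folder per class; sort so the label indices are reproducible.
    class_names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}

    paths, labels = [], []
    for name in class_names:
        for video in sorted((split_dir / name).iterdir()):
            if video.is_file():
                paths.append(str(video))
                labels.append(class_to_index[name])
    return paths, labels, class_to_index
```

With folders train/archery and train/bowling, every file under archery gets label 0 and every file under bowling gets label 1.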

Forks and fixes of this repo are highly welcome!

r2plus1d-pytorch's Issues

bug in my implementation.

Hi team (@yechanp, @irhum), I am trying out this project, but the following error occurred when I tried to feed my data into the network.
Could you help me with this error?
Thanks very much.

Traceback (most recent call last):
  File "train_r2p1d_ucf.py", line 17, in <module>
    train_model(model, train_dataloader, val_dataloader, path=save_path)
  File "/usr/R2plus1D_TSN_combine-master/R2plus1D_TSN_combine-master/trainer.py", line 139, in train_model
    for inputs, labels in dataloaders[phase]:
  File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/root/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/R2plus1D_TSN_combine-master/R2plus1D_TSN_combine-master/dataset.py", line 48, in __getitem__
    buffer = self.loadvideoframe(self.fnames[index])
  File "/usr/R2plus1D_TSN_combine-master/R2plus1D_TSN_combine-master/dataset.py", line 60, in loadvideoframe
    im_path_pattern = self.get_im_path_pattern(fname)
  File "/usr/R2plus1D_TSN_combine-master/R2plus1D_TSN_combine-master/dataset.py", line 58, in get_im_path_pattern
    return os.path.join(self.im_path_root, vid_name, 'img*.jpg')
  File "/root/anaconda3/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
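
Reading the last frames of this traceback, os.path.join received None as its first argument, which suggests im_path_root was never set on the dataset object. The paths also point to a fork (R2plus1D_TSN_combine) rather than to this repository's dataset.py. A minimal sketch of a guard that would surface the problem earlier is shown below; it is written as a standalone function for illustration, and the error message and the way the root is passed in are assumptions, not the fork's actual code.

```python
import os


def get_im_path_pattern(im_path_root, vid_name):
    """Build the glob pattern for one video's frames, failing early on a missing root."""
    if im_path_root is None:
        # Assumption: the frame directory was meant to be supplied when the
        # dataset was constructed, and was left at a default of None.
        raise ValueError(
            "im_path_root is None; pass the frame directory when building the dataset"
        )
    return os.path.join(im_path_root, vid_name, "img*.jpg")
```

With a check like this, the failure appears as a clear message instead of a TypeError raised inside a DataLoader worker.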

Training Command

Which Python file do we run to train the model on a dataset? I don't see a main function. A terminal command would be helpful.

How does performance compare with the original implementation?

Great implementation. Could you provide reproduced results that can be compared with the original implementation in Caffe2? Thanks.

Where is dataloader.py?

Where is the dataloader.py file?
How can I input video data into the network?

Does this project provide a pretrained model?

Please forgive me for opening this issue. Are pretrained weights provided, or do we need to train from scratch ourselves? Would you mind sharing them with all of us? Some people, like me, do not have many GPUs because of limited funding. Thanks for your generosity!

Have you trained the model for the UCF101 from scratch?

I trained for just 45 epochs with the learning rate set to 0.01 on UCF101, but got really terrible results. First, the training loss is stable around 4 and does not go lower (since the dataset is small, the model should overfit). Second, the test accuracy is around 1% even after 45 epochs. I followed your instructions to process the dataset. I don't know what is wrong. Could you please share your results if you have any?
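
As a point of reference for these numbers, the chance-level values for a 101-class problem can be worked out directly. This sketch assumes a cross-entropy loss and roughly balanced classes, neither of which is stated in the issue.

```python
import math

num_classes = 101  # UCF101 has 101 action classes

# A model that predicts a uniform distribution over all classes has
# cross-entropy ln(num_classes) and accuracy 1 / num_classes.
chance_loss = math.log(num_classes)
chance_accuracy = 1.0 / num_classes

print(round(chance_loss, 2))            # 4.62
print(round(100 * chance_accuracy, 2))  # 0.99 (percent)
```

Under those assumptions, an accuracy of about 1% is what random guessing would give, and a loss that stays near 4 is close to the chance-level value, so the figures reported here look like a model that is barely learning rather than one that is overfitting.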