raivokoot / video-dataset-loading-pytorch

Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.

License: BSD 2-Clause "Simplified" License

Python 100.00%

pytorch machine-learning dataloader deep-learning video-dataset action-recognition videos

video-dataset-loading-pytorch's Issues

END_FRAME

Thanks for your code, it's great. When creating the annotation file, must the END_FRAME of each video be known in advance?
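One way to avoid looking up END_FRAME by hand is to count the extracted frame files in each video folder. The sketch below is only an illustration: the helper name build_annotation_lines, the four-column "PATH START_FRAME END_FRAME LABEL" layout (inferred from the annotation example further down this page), and the assumption that frames are numbered contiguously from first_index are all assumptions, not something the repository confirms here.

```python
from pathlib import Path


def build_annotation_lines(root, labels, extension=".jpg", first_index=1):
    """Return one 'PATH START_FRAME END_FRAME LABEL' line per video folder.

    Hypothetical helper: assumes every sub-folder of `root` holds the frames
    of one video, numbered contiguously starting at `first_index`.
    `labels` maps folder name -> integer class label.
    """
    lines = []
    for video_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Count only files with the expected image extension.
        num_frames = sum(
            1 for f in video_dir.iterdir() if f.suffix.lower() == extension
        )
        if num_frames == 0:
            continue  # skip folders without frames
        end_frame = first_index + num_frames - 1
        label = labels[video_dir.name]
        lines.append(f"{video_dir.name} {first_index} {end_frame} {label}")
    return lines
```

The returned lines can then be written to annotations.txt with "\n".join(lines).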

Assigning values from data loader in training loop causing error

First of all, I apologize if this does not belong here. I am new to ML and PyTorch, so I still have a lot to learn. I am going to paste an image of my stack trace to give a feel for what the error says, and then explain what I am seeing on my end.

For starters, this issue appears to occur in only one case or very few cases. I say that because when I had shuffle=True in my dataloader, I would sometimes make it almost the whole way through the loop and crash towards the end; other times it would crash within the first few iterations. I have since set shuffle to False so that the failure stays in one place while I gather more information. It is probably irrelevant to the problem, but with shuffle off it fails on the 6th iteration.

After doing so, I entered debug mode and began to step into functions. After quite a few steps in and looking at values in the debug window, I don't see an issue so far. (There is definitely a possibility that I am missing something.)

If I step over (rather than into) the line "for video_batch, labels in train_loader:" in the debugger, execution leaves my training function (which contains this line) and ends up stopping at the line "if __name__ == '__main__': main()" in my main.py.

It likely shouldn't make a difference, but I am using the UCF11 dataset. My images, file directories, and text files are all formatted as the documentation states.

It looks like empty values are being returned, but as I was in debug mode, I saw which file directory it was looking into, and verified that the images that were supposed to be there were in that directory. They were.

If there is any other information you would like, please let me know and I'd be glad to post it.
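A failure that moves around with shuffle=True and stays put with shuffle=False usually points at one specific sample. A minimal sketch for narrowing it down, assuming the dataset supports len() and integer indexing: index it directly, outside the DataLoader, and record which indices raise. The helper name find_failing_samples is hypothetical.

```python
def find_failing_samples(dataset, stop_at_first=True):
    """Index every sample directly and collect (index, error) pairs.

    Hypothetical debugging helper: bypasses the DataLoader so the original
    exception is not wrapped by worker processes or batching.
    """
    failures = []
    for index in range(len(dataset)):
        try:
            dataset[index]  # triggers loading and transforms for one sample
        except Exception as error:  # deliberately broad: this is a probe
            failures.append((index, repr(error)))
            if stop_at_first:
                break
    return failures
```

Each reported index corresponds to a line of the annotation file (in file order, if the dataset reads it sequentially), which is the first place to check for a wrong frame range or a missing image.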

reporting of different issues encountered while working

Hello.
I'd like to present something to fix:

TO FIX:

The quick demo (demo.py) at this link shows an incorrect way to use the VideoFrameDataset class. Specifically:

image_template should be changed to imagefile_template
the parameter random_shift does not exist
the annotations.txt description in the file asks the user to use the following structure per line:

but on GitHub the following is reported instead:

This is a discrepancy for a user who is trying to follow the instructions step by step.

Thanks for reading, have a nice day.

demo with transformations not working

AttributeError: 'list' object has no attribute 'size'

Why subtract 'frames_per_segment' to calculate 'segment_duration' ?

Hi. Why do you subtract 'frames_per_segment' from 'num_frames' and then divide by 'num_segments' to calculate 'segment_duration' ? Can we not directly divide 'num_frames' by 'num_segments' to get the 'segment_duration' ? Thanks!

Video-Dataset-Loading-Pytorch/video_dataset.py

Line 155 in 97b54d8

 segment_duration = (record.num_frames - self.frames_per_segment + 1) // self.num_segments 
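A small worked example may help with this question. The sketch below assumes (this is a reading of the quoted line, not confirmed by the source) that each segment starts at segment_index * segment_duration plus a random offset in [0, segment_duration), and that frames_per_segment consecutive frames are read from that start. Under that assumption, dividing num_frames directly can push the last segment past the end of the video, while subtracting frames_per_segment first cannot. The function name frames_needed is hypothetical.

```python
def frames_needed(num_frames, num_segments, frames_per_segment, subtract):
    """Return how many frames the worst-case sampling would require.

    Hypothetical illustration of the assumed sampling scheme:
    start = segment_index * segment_duration + offset, with
    0 <= offset < segment_duration, then read frames_per_segment frames.
    """
    if subtract:
        usable = num_frames - frames_per_segment + 1
    else:
        usable = num_frames
    segment_duration = usable // num_segments
    # Worst case: last segment combined with the largest possible offset.
    worst_start = (num_segments - 1) * segment_duration + (segment_duration - 1)
    return worst_start + frames_per_segment


# 10 frames, 3 segments, 4 consecutive frames per segment.
print(frames_needed(10, 3, 4, subtract=False))  # exceeds the 10 available frames
print(frames_needed(10, 3, 4, subtract=True))   # stays within the 10 frames
```

In general the worst case needs num_segments * segment_duration - 1 + frames_per_segment frames, and with the subtraction num_segments * segment_duration is at most num_frames - frames_per_segment + 1, so the total never exceeds num_frames.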

Using a dataset with different widths and heights for each frame

Hello,

My dataset is pre-processed: for each frame of a video, a detection is cropped out. This means my dataset has frames of slightly different sizes, because the bounding boxes differ. As a result, I get an error when using the ImglistToTensor() function:

'RuntimeError: stack expects each tensor to be equal size, but got [3, 105, 111] at entry 0 and [3, 109, 115] at entry 1'

I tried resizing them all before using this function, but then I get a type error:

'TypeError: img should be PIL Image. Got <class 'list'>'

I'm unsure if there is anything I can do to still make use of this custom dataset as it is just what I need for my project.
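The second error says the resize transform received a list rather than a single PIL image, which suggests the dataset hands the transform a list of frames. One possible workaround, sketched under that assumption, is a small wrapper that applies a single-image transform to every frame in the list before the list is stacked. The class name PerFrame is hypothetical and not part of the repository.

```python
class PerFrame:
    """Apply a single-image transform to every frame in a list.

    Hypothetical helper: `transform` is any callable that accepts one image
    and returns one image (for example a resize to a fixed size).
    """

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, frames):
        return [self.transform(frame) for frame in frames]


# Stand-in demo: "frames" are (width, height) tuples instead of PIL images,
# and the stand-in "resize" maps every frame to the same fixed size.
to_fixed_size = PerFrame(lambda size: (112, 112))
print(to_fixed_size([(111, 105), (115, 109)]))
```

With torchvision, the same wrapper could be placed first in the pipeline, for example Compose([PerFrame(Resize((112, 112))), ImglistToTensor()]), so that all frames have equal size by the time they are stacked. The target size 112 x 112 is only an example value.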

Extra whitespace characters in the annotation file breaks

The program breaks when there are extra whitespace characters in the annotation file, e.g.

bs 2        5       1
bs 6        8       2
bs 9        12      3
bs 13       16      4

The extra whitespace makes the annotation file more readable and easier to edit in column selection mode. I therefore think it would be nice to tolerate such extra whitespace.
If you think it's an issue, I can handle it.
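A minimal sketch of a tolerant parser, assuming four whitespace-separated columns (path, start frame, end frame, integer label) and paths that contain no spaces. The function name parse_annotation_line is hypothetical. The key point is that str.split() without an argument splits on any run of whitespace, whereas split(" ") yields empty strings for repeated spaces.

```python
def parse_annotation_line(line):
    """Parse 'PATH START END LABEL', tolerating repeated spaces and tabs.

    Hypothetical helper; assumes exactly four columns and no spaces in PATH.
    """
    fields = line.split()  # no argument: any run of whitespace is one separator
    if len(fields) != 4:
        raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
    path, start_frame, end_frame, label = fields
    return path, int(start_frame), int(end_frame), int(label)


aligned = "bs 2        5       1"
print(aligned.split(" "))              # contains empty strings
print(parse_annotation_line(aligned))  # four clean fields
```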

Support iterable / webdataset

What would be required to make this work for very large video sets? Would it be some integration with web dataset?

Problems with batch size

Hello,
First of all, thank you for this nice video DataLoader.

I want to use it in my project, but I am currently running into problems.
As you described, I preprocessed my videos into frames and created the .txt file.
When I now try to load the data in my project, I get a RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [125, 5, 3, 224, 224]
The 125 is my batch size, the 5 is my num_segments, and so on.

I tried to reshape my video batch like this:
batch_size, frames, channels, height, width = video.shape
video = video.reshape(batch_size * frames, channels, height, width)
But then I get problems with my labels in the batch:
ValueError: Target size (torch.Size([64, 1])) must be the same as input size (torch.Size([320, 1]))

Do you know how to fix it, or did I do something wrong?

Here's the part of my code I am currently using:

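import torch
from torchvision.transforms import (ColorJitter, Compose, Normalize,
                                    RandomHorizontalFlip, RandomResizedCrop,
                                    RandomRotation)
# Module name taken from video_dataset.py in the repository.
from video_dataset import ImglistToTensor, VideoFrameDataset

def train_dataloader(self):
    # ImglistToTensor comes first; the remaining transforms then operate
    # on the stacked frame tensor of one video.
    preprocess = Compose([
        ImglistToTensor(),
        RandomResizedCrop(224, scale=(0.8, 1.0)),
        RandomHorizontalFlip(),
        RandomRotation(degrees=15),
        ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
        Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    dataset = VideoFrameDataset(
        root_path=self.video_path_prefix,
        annotationfile_path=self.annotation_file_train,
        num_segments=5,
        frames_per_segment=1,
        imagefile_template='frame_{:04d}.jpg',
        transform=preprocess,
        test_mode=False
    )
    loader = torch.utils.data.DataLoader(
        dataset=dataset,
        batch_size=self.batch_size,
        shuffle=True,
        num_workers=self.num_worker,
        pin_memory=True
    )
    # Return the loader so the caller can iterate over it.
    return loader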

Thanks in advance :)
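The sizes in the second error fit the reshape: flattening to batch_size * frames makes a 2D model emit one output per frame (320 = 64 x 5), while the labels remain one per video (64). One common option, offered here as a sketch and not as the repository's recommendation, is to group the per-frame outputs back by video and average them. The pure-Python helper below only shows the bookkeeping; the name average_per_video is hypothetical, and it assumes the flattened frames of each video stay contiguous, which holds for the reshape quoted above.

```python
def average_per_video(frame_scores, frames_per_video):
    """Collapse one score per frame into one score per video by averaging.

    Hypothetical helper: `frame_scores` is a flat list of length
    batch_size * frames_per_video, ordered video by video.
    """
    if len(frame_scores) % frames_per_video != 0:
        raise ValueError("number of scores is not a multiple of frames_per_video")
    return [
        sum(frame_scores[i:i + frames_per_video]) / frames_per_video
        for i in range(0, len(frame_scores), frames_per_video)
    ]


# Two videos with five frames each -> two per-video scores.
print(average_per_video([0, 2, 4, 6, 8, 1, 1, 1, 1, 1], frames_per_video=5))
```

On tensors, the equivalent step would be output.reshape(batch_size, frames, -1).mean(dim=1), applied to the model output before it is passed to the loss.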

bounding boxes

Hi there, this looks like a really good dataloader, but I was wondering why bounding boxes aren't loaded from the annotations. I have read through the README briefly, and it doesn't look like bounding boxes can be included in the annotation files. If that isn't the case, please let me know so I can avoid doing extra work. Thank you.

demo fix error

In line 131, just replace frames with frame_tensor:
plot_video(rows=1, cols=5, frame_list=frame_tensor, plot_width=15., plot_height=3.)

Question about annotations.txt

Hey,

great repo! thank you!
Just a question about annotations.txt: am I right in assuming that if I do NOT include a datapoint in annotations.txt, it will not be included in the dataset, even though it is in the same folder? That would be great for things like cross-validation.

By the way, the pytorch-lightning version that works with the repo in 2023 is 1.7.7 (at least for me). Maybe include that in the README.

All the best!
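If the assumption in the question holds, i.e. the dataset contains only the videos listed in the annotation file, cross-validation reduces to writing one pair of annotation files per fold. A minimal sketch under that assumption follows; the function name split_into_folds is hypothetical.

```python
def split_into_folds(annotation_lines, num_folds):
    """Return a list of (train_lines, val_lines) pairs, one per fold.

    Hypothetical helper: line i goes to the validation set of fold
    i % num_folds and to the training set of every other fold.
    """
    folds = []
    for k in range(num_folds):
        val_lines = annotation_lines[k::num_folds]
        train_lines = [
            line for i, line in enumerate(annotation_lines) if i % num_folds != k
        ]
        folds.append((train_lines, val_lines))
    return folds


lines = [f"video_{i} 1 10 0" for i in range(6)]
for train_lines, val_lines in split_into_folds(lines, num_folds=3):
    print(len(train_lines), len(val_lines))
```

Each train_lines and val_lines list can be written to its own annotation file and passed to a separate dataset instance. Shuffling the lines once before splitting is advisable when the file is sorted by class.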
