raivokoot / video-dataset-loading-pytorch
Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.
License: BSD 2-Clause "Simplified" License
Thanks for your code, it's great. When creating the annotation file, must the END_FRAME of each video be known in advance?
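One way to avoid looking up END_FRAME by hand is to derive it from the number of frame images in each video folder. The sketch below is only an illustration: the helper name `build_annotation_rows`, the `labels` mapping, the `.jpg` extension, and the assumption that frames are numbered consecutively from 1 are all hypothetical and not taken from the repository.

```python
import os

def build_annotation_rows(root, labels, start_index=1, extension=".jpg"):
    """Return 'PATH START_FRAME END_FRAME LABEL' rows, one per video folder.

    `labels` maps a folder name (relative to `root`) to its class id.
    Assumes every folder holds consecutively numbered frames starting
    at `start_index`; folders without frames are skipped.
    """
    rows = []
    for video_dir in sorted(labels):
        folder = os.path.join(root, video_dir)
        frames = [f for f in os.listdir(folder) if f.lower().endswith(extension)]
        if not frames:
            continue
        # END_FRAME follows from the frame count, so it need not be known beforehand.
        end_frame = start_index + len(frames) - 1
        rows.append(f"{video_dir} {start_index} {end_frame} {labels[video_dir]}")
    return rows
```

The returned rows can then be written to the annotation file with one row per line.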
First of all, I apologize if this does not belong here. I am new to ML and PyTorch, so I still have a lot to learn. I am going to paste an image of my stack trace to give a feel for what the error is saying and then explain what I am seeing on my end.
For starters, this issue appears to affect only one case or very few cases. I say that because when I had shuffle=True in my dataloader, I would sometimes make it almost the whole way through the loop and crash towards the end. Other times it would crash within the first few iterations. I have since set shuffle to False to narrow down where this originates, instead of having the problem move around on me. I assume it is irrelevant to the problem, but with shuffle off the crash happens on the 6th iteration.
After doing so, I entered debug mode and began to step into functions. After quite a few steps in and looking at values in the debug window, I don't see an issue so far. (There is definitely a possibility that I am missing something.)
If I step over (rather than into) the line "for video_batch, labels in train_loader:" in the debugger, execution leaves my training function (which contains this line) and ends up at the line "if __name__ == '__main__': main()" in my main.py.
It likely shouldn't make a difference, but I am using the UCF11 dataset. My images, file directories, and text files are all formatted as the documentation states.
It looks like empty values are being returned, but as I was in debug mode, I saw which file directory it was looking into, and verified that the images that were supposed to be there were in that directory. They were.
If there is any other information you would like, please let me know and I'd be glad to post it.
Hello.
I'd like to present something to fix:
TO FIX:
Thanks for reading, have a nice day.
AttributeError: 'list' object has no attribute 'size'
Hi. Why do you subtract 'frames_per_segment' from 'num_frames' and then divide by 'num_segments' to calculate 'segment_duration' ? Can we not directly divide 'num_frames' by 'num_segments' to get the 'segment_duration' ? Thanks!
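A small worked example may help with this question. The sketch below assumes that each segment picks a start index inside its own slot of length `segment_duration` and then reads `frames_per_segment` consecutive frames from there; the function names and the `+ 1` in the subtracting variant are assumptions for illustration, not a copy of the repository's code. It compares the last frame index that could be requested under both formulas.

```python
def worst_case_starts(num_frames, num_segments, frames_per_segment, subtract):
    """Latest possible 0-based start index of each segment.

    Assumes the resulting segment duration is at least 1.
    """
    usable = num_frames - frames_per_segment + 1 if subtract else num_frames
    duration = usable // num_segments
    # Worst case: every segment starts at the last position of its slot.
    return [i * duration + (duration - 1) for i in range(num_segments)]

def last_frame_needed(num_frames, num_segments, frames_per_segment, subtract):
    """Highest 0-based frame index that may be read in the worst case."""
    starts = worst_case_starts(num_frames, num_segments, frames_per_segment, subtract)
    return max(starts) + frames_per_segment - 1

# 10 frames (valid indices 0..9), 3 segments, 4 frames per segment.
with_subtraction = last_frame_needed(10, 3, 4, subtract=True)      # stays within 0..9
without_subtraction = last_frame_needed(10, 3, 4, subtract=False)  # runs past index 9
```

Under these assumptions, dividing `num_frames` directly can place the last segment so late that reading `frames_per_segment` frames runs past the end of the video, while subtracting first keeps every read inside the valid range.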
Hello,
My dataset is pre-processed and takes frames of a video and crops out a detection from the video. This means my dataset has frames of slightly different sizes due to the bounding boxes being different. Due to this, I am getting an error when using the ImglistToTensor() function. The error being:
'RuntimeError: stack expects each tensor to be equal size, but got [3, 105, 111] at entry 0 and [3, 109, 115] at entry 1'
I try to initially resize them all before using this function but I get a type error:
'TypeError: img should be PIL Image. Got <class 'list'>'
I'm unsure if there is anything I can do to still make use of this custom dataset as it is just what I need for my project.
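Both errors above are consistent with the transform receiving a list of PIL images rather than a single image: a single-image transform rejects the list, and stacking fails when the images differ in size. A minimal sketch of one possible workaround is a wrapper that applies a single-image transform to every element of the list. The class name `ApplyToEach` is hypothetical and not part of the repository.

```python
class ApplyToEach:
    """Apply a single-item transform to every element of a list.

    Intended to sit before a list-to-tensor step so that all frames
    have the same size by the time they are stacked.
    """

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, items):
        # Return a new list; the input list is left unmodified.
        return [self.transform(item) for item in items]
```

With torchvision, this could be used as `Compose([ApplyToEach(Resize((112, 112))), ImglistToTensor()])`, where the target size 112x112 is only an example value.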
The program breaks when there are extra whitespace characters in the annotation file, e.g.
bs 2 5 1
bs 6 8 2
bs 9 12 3
bs 13 16 4
The extra whitespace characters make the annotation file more readable and easier to edit in column selection mode. I therefore think it would be nice to tolerate such extra whitespace.
If you think it's an issue, I can handle it.
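A whitespace-tolerant parser can be sketched with `str.split()` called without arguments, which splits on any run of spaces or tabs and ignores leading and trailing whitespace. The function name `parse_annotation_line` and the four-column layout (path, start frame, end frame, label) are assumptions based on the example rows above, not the repository's actual parsing code.

```python
def parse_annotation_line(line):
    """Parse 'PATH START_FRAME END_FRAME LABEL' with arbitrary spacing."""
    # split() with no separator collapses consecutive whitespace characters.
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(fields)}: {line!r}")
    path, start_frame, end_frame, label = fields[:4]
    return path, int(start_frame), int(end_frame), int(label)
```

By contrast, `line.split(" ")` produces empty strings between consecutive spaces, which is one way padded columns can break a parser.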
What would be required to make this work for very large video sets? Would it be some integration with web dataset?
Hello,
First of all, thank you for this nice video DataLoader.
I want to use it in my project, but I am currently running into problems.
As you described, I preprocessed my videos into frames and created the .txt file.
When I now try to load the data in my project, I get a RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [125, 5, 3, 224, 224]
The 125 is my batch size, 5 my num_segments, and so on.
I tried to reshape my video batch like this:
batch_size, frames, channels, height, width = video.shape
video = video.reshape(batch_size * frames, channels, height, width)
But then I get problems with my labels in the batch:
ValueError: Target size (torch.Size([64, 1])) must be the same as input size (torch.Size([320, 1]))
Do you know how to fix it, or did I do something wrong?
Here is the part of my code that I am currently using:
def train_dataloader(self):
    preprocess = Compose([
        ImglistToTensor(),
        RandomResizedCrop(224, scale=(0.8, 1.0)),
        RandomHorizontalFlip(),
        RandomRotation(degrees=15),
        ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
        Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    dataset = VideoFrameDataset(
        root_path=self.video_path_prefix,
        annotationfile_path=self.annotation_file_train,
        num_segments=5,
        frames_per_segment=1,
        imagefile_template='frame_{:04d}.jpg',
        transform=preprocess,
        test_mode=False
    )
    loader = torch.utils.data.DataLoader(
        dataset=dataset,
        batch_size=self.batch_size,
        shuffle=True,
        num_workers=self.num_worker,
        pin_memory=True
    )
Thanks in advance :)
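The label mismatch described above arises because flattening the frames into the batch dimension yields one prediction per frame, while there is only one label per video. One possible way to reconcile the two, sketched here under assumptions, is to reshape the per-frame outputs back to (batch, frames, ...) and average over the frame dimension. The tiny `frame_model` and the helper `video_logits` are hypothetical stand-ins for the poster's 2D CNN, and the tensor sizes are made up for the demonstration.

```python
import torch
from torch import nn

# Hypothetical stand-in for a 2D CNN that emits one logit per image.
frame_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 1),
)

def video_logits(model, video):
    """Run a 2D model on every frame and average the outputs per video."""
    batch_size, frames, channels, height, width = video.shape
    # Fold the frame dimension into the batch so conv2d receives 4D input.
    flat = video.reshape(batch_size * frames, channels, height, width)
    per_frame = model(flat)  # shape: (batch_size * frames, 1)
    # Unfold again and average over frames, giving one prediction per video.
    return per_frame.reshape(batch_size, frames, -1).mean(dim=1)

video = torch.randn(2, 5, 3, 32, 32)          # (batch, frames, C, H, W)
labels = torch.randint(0, 2, (2, 1)).float()  # one label per video
logits = video_logits(frame_model, video)
loss = nn.BCEWithLogitsLoss()(logits, labels)
```

An alternative with a different training signal is to keep the per-frame predictions and repeat each label once per frame, for example with `labels.repeat_interleave(frames, dim=0)`, so that every frame is scored against its video's label.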
Hi there, this looks like a really good dataloader, but I was wondering why bounding boxes aren't loaded from the annotations. I have read through the README briefly, and it doesn't look like bounding boxes can be included in the annotation files. If this isn't the case, please let me know so I can avoid doing extra work. Thank you.
In line 131, just replace frames with frame_tensor:
plot_video(rows=1, cols=5, frame_list=frame_tensor, plot_width=15., plot_height=3.)
Hey,
Great repo! Thank you!
Just a question about annotations.txt. Am I right in assuming that if I do NOT include a datapoint in annotations.txt, it will not be included in the dataset even though it is in the same folder? That would be great for things like cross-validation.
By the way, the pytorch-lightning version that works with the repo in 2023 is 1.7.7 (at least for me). Maybe include that in the README.
All the best!
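If the assumption in this question holds, that is, if only rows listed in the annotation file are loaded, then cross-validation could be set up by writing one train and one validation annotation file per fold. The following sketch only splits the annotation rows; the function name `kfold_annotation_splits` is hypothetical and the behavior of the dataset itself is not verified here.

```python
def kfold_annotation_splits(lines, k):
    """Yield (train_rows, val_rows) pairs for k-fold cross-validation.

    `lines` are the rows of a full annotation file; blank rows are ignored.
    Rows are assigned to folds round-robin, so fold sizes differ by at most 1.
    """
    rows = [line.strip() for line in lines if line.strip()]
    folds = [rows[i::k] for i in range(k)]
    for i in range(k):
        val_rows = folds[i]
        train_rows = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train_rows, val_rows
```

Each pair can be written to its own pair of annotation files and passed to separate dataset instances that share the same frame folders.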