I was trying to run training for UCF-101 RGB split-1 but the model seems to be running

Running out of memory about tsn-pytorch HOT 8 CLOSED

yjxiong commented on July 21, 2024

Running out of memory

from tsn-pytorch.

Comments (8)

yjxiong commented on July 21, 2024 2

Hi, we usually use 4 GPUs to train TSN. If you put a minibatch of 128 samples on one GPU, it will take about 24 GB of memory. So in your case you can try lowering the batch size, e.g. -b 64.
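
To put rough numbers on that advice, here is a minimal sketch that scales the figure quoted above (128 samples, about 24 GB) to other GPU sizes. It assumes memory grows linearly with batch size, which is only an approximation since model weights and framework overhead do not shrink with the batch. The function names are hypothetical and not part of tsn-pytorch.

```python
# Reference point quoted in the thread: a minibatch of 128 samples
# on one GPU takes about 24 GB.
REF_BATCH = 128
REF_MEM_GB = 24.0


def max_batch_size(gpu_mem_gb, ref_batch=REF_BATCH, ref_mem_gb=REF_MEM_GB):
    """Largest batch that fits, ASSUMING memory scales linearly with batch size."""
    per_sample_gb = ref_mem_gb / ref_batch
    return int(gpu_mem_gb // per_sample_gb)


def per_gpu_mem_gb(batch_size, num_gpus, ref_batch=REF_BATCH, ref_mem_gb=REF_MEM_GB):
    """Approximate memory per GPU when the batch is split evenly across GPUs."""
    per_sample_gb = ref_mem_gb / ref_batch
    return (batch_size / num_gpus) * per_sample_gb


if __name__ == "__main__":
    for mem_gb in (12, 16, 24):
        print(mem_gb, "GB ->", max_batch_size(mem_gb), "samples (upper bound)")
    print("128 samples on 4 GPUs:", per_gpu_mem_gb(128, 4), "GB per GPU")
```

Treat the result as an upper bound and leave some headroom. Under this estimate a 12 GB card tops out at 64 samples, which is consistent with the -b 64 suggestion.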

from tsn-pytorch.

yjxiong commented on July 21, 2024 1

It was written before the 0.2.0 release but has been tested under version 0.2.0.

from tsn-pytorch.

yjxiong commented on July 21, 2024 1

@woshihucheng

This is possible. Please try setting the flag --test_crops 1.
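
As a rough illustration of why fewer crops helps: at test time each video is expanded into (segments x crops) images that pass through the network together, so the crop count multiplies the memory needed per video. The sketch below only does that arithmetic. The values of 25 segments, 10 crops, and 224x224 RGB float32 inputs are assumptions for illustration and are not stated in this thread; the function names are hypothetical.

```python
def images_per_forward(num_segments, num_crops):
    """Number of images fed through the network for one video."""
    return num_segments * num_crops


def input_bytes(num_segments, num_crops, channels=3, height=224, width=224,
                bytes_per_value=4):
    """Size of the input tensor alone (float32 by default), in bytes.

    Intermediate activations are much larger than the input, but they
    scale with the same image count.
    """
    n_images = images_per_forward(num_segments, num_crops)
    return n_images * channels * height * width * bytes_per_value


if __name__ == "__main__":
    # ASSUMED values: 25 segments, 10 crops versus 1 crop.
    for crops in (10, 1):
        n = images_per_forward(25, crops)
        mb = input_bytes(25, crops) / 1e6
        print(f"crops={crops}: {n} images, {mb:.1f} MB of input")
```

Under these assumptions, going from 10 crops to 1 cuts the per-video image count, and with it the activation memory, by about a factor of ten. Test accuracy will typically be somewhat lower with a single crop.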

from tsn-pytorch.

utsavgarg commented on July 21, 2024

Okay, thanks. What PyTorch version are you using?

from tsn-pytorch.

woshihucheng commented on July 21, 2024

Hi @yjxiong
I am ready to test the model, but it also runs out of memory. I am using a GPU with 16 GB of VRAM. Do you have any ideas?
My command is:
sudo python test_models.py hmdb51 RGB hmdb51_rgb_val_split_1.txt hmdb51_bninception_rgb_checkpoint.pth.tar --arch BNInception --save_scores rgbscores --workers 1

The result is:

model epoch 80 best prec@1: 52.3529411864
Freezing BatchNorm2D except the first one.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
  File "test_models.py", line 128, in <module>
    rst = eval_video((i, data, label))
  File "test_models.py", line 116, in eval_video
    rst = net(input_var).data.cpu().numpy().copy()
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 58, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/tsn/tsn-pytorch/models.py", line 197, in forward
    base_out = self.base_model(input.view((-1, sample_len) + input.size()[-2:]))
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/tsn/tsn-pytorch/tf_model_zoo/bninception/pytorch_load.py", line 49, in forward
    data_dict[op[2]] = getattr(self, op[0])(data_dict[op[-1]])
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/pooling.py", line 505, in forward
    self.padding, self.ceil_mode, self.count_include_pad)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 264, in avg_pool2d
    ceil_mode, count_include_pad)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/thnn/pooling.py", line 360, in forward
    ctx.ceil_mode, ctx.count_include_pad)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66

from tsn-pytorch.

woshihucheng commented on July 21, 2024

Thank you very much. It is OK now. Thanks for your reply. @yjxiong

from tsn-pytorch.

dandingol03 commented on July 21, 2024

@utsavgarg Hi, I trained UCF101 on Google Colab and also limited the batch size to 64. However, my final prec@1 accuracy is 84%. I was wondering what accuracy you finally got. Thanks in advance.

from tsn-pytorch.

HaneenElyamani commented on July 21, 2024

@dandingol03
Could you please tell me which Google Colab tier you used (Free, Pro, or Pro+)?

from tsn-pytorch.
