ahirner / pytorch-retraining
Transfer Learning Shootout for PyTorch's model zoo (torchvision)
License: BSD 3-Clause "New" or "Revised" License
In PyTorch 0.4.1 and later, BatchNorm layers have a num_batches_tracked buffer that is absent from the old model_zoo checkpoints, so diff_states reports errors.
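One way around the mismatch is to skip the bookkeeping buffer when comparing state dicts. A minimal sketch (the helper name `diff_states_filtered` is an assumption, not the repo's `diff_states`):

```python
import torch
import torch.nn as nn

def diff_states_filtered(pretrained_state, model_state):
    """Yield parameters in model_state that are missing from (or shaped
    differently than) pretrained_state, ignoring the num_batches_tracked
    buffers that BatchNorm gained in PyTorch 0.4.1."""
    for name, tensor in model_state.items():
        if name.endswith("num_batches_tracked"):
            continue  # bookkeeping buffer, absent from old model_zoo checkpoints
        if name not in pretrained_state or pretrained_state[name].shape != tensor.shape:
            yield name, tensor

# a BatchNorm layer's state dict carries the extra buffer in recent PyTorch
bn = nn.BatchNorm2d(4)
new_state = bn.state_dict()
# simulate an old checkpoint by dropping the buffer
old_state = {k: v for k, v in new_state.items()
             if not k.endswith("num_batches_tracked")}
mismatches = dict(diff_states_filtered(old_state, new_state))  # empty: buffers ignored
```

With the buffer filtered out, only genuinely missing or reshaped weights are reported as differences.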
What would be the best way to save trained models and then use a saved model to make predictions?
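The usual pattern is to save only the weights with `state_dict`, then rebuild the architecture and load them back for inference. A sketch with a stand-in network (the model, file name, and sizes here are assumptions, not the repo's code):

```python
import os
import tempfile
import torch
import torch.nn as nn

# a stand-in for the fine-tuned network (architecture and sizes are assumed)
model = nn.Sequential(nn.Linear(8, 3))

# save only the weights -- the usual, architecture-agnostic way
path = os.path.join(tempfile.gettempdir(), "retrained.pth")
torch.save(model.state_dict(), path)

# later: rebuild the same architecture, load the weights, switch to eval mode
restored = nn.Sequential(nn.Linear(8, 3))
restored.load_state_dict(torch.load(path))
restored.eval()  # freezes BatchNorm/Dropout behavior for inference

# predict with autograd disabled
with torch.no_grad():
    preds = restored(torch.randn(2, 8)).argmax(dim=1)
```

Saving the whole pickled model with `torch.save(model, path)` also works, but the state-dict route is more robust across code and version changes.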
I have a dataset consisting of 20 classes. How can I use this code for my dataset? Thanks in advance.
Train-
    class1
        1.jpg
    class2
        1.jpg
Test-
    class1
        1.jpg
    class2
        1.jpg
DenseNets are giving an error:
Traceback (most recent call last):
File "retrain.py", line 340, in <module>
model_pretrained, diff = load_model_merged(name, num_classes)
TypeError: 'DenseNet' object is not iterable
Hi,
Thanks for this wonderful script. It is really helpful for testing various models!
I have an issue with running out of GPU memory. I know this is not exactly a bug; it is a CUDA memory issue.
Is there any way to reduce GPU memory usage? I only have 2 GB on my GeForce GTX 1050.
It only happens when training from scratch and when training in Deep mode.
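The standard way to fit training onto a 2 GB card is to shrink the batch and accumulate gradients across micro-batches, so the optimizer step still sees the same effective batch. A sketch with a stand-in network (the model, sizes, and learning rate are assumptions):

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 10)  # stand-in network (assumed)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

data = torch.randn(32, 64)
labels = torch.randint(0, 10, (32,))

# run 4 micro-batches of 8 instead of one batch of 32: peak activation
# memory shrinks roughly 4x while the update sees the same 32 samples
accum, micro = 4, 8
opt.zero_grad()
for i in range(accum):
    x = data[i * micro:(i + 1) * micro]
    y = labels[i * micro:(i + 1) * micro]
    loss = loss_fn(model(x), y) / accum  # scale so gradients average correctly
    loss.backward()                      # gradients accumulate in .grad
opt.step()
```

Deep mode and from-scratch training are the memory-hungry cases because all layers keep activations for backprop; shallow retraining only backpropagates through the classifier head.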
This is the error:
[29, 30] loss: nan [0.0044375000000000005]
[30, 30] loss: nan [0.0043333333333333392]
[31, 30] loss: nan [0.0011041666666666609]
[32, 30] loss: nan [0.0041250000000000002]
Finished Training
Evaluating...
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "retrain.py", line 380, in <module>
CLR=use_clr)
File "retrain.py", line 322, in train_eval
stats_eval = evaluate_stats(net, testloader)
File "retrain.py", line 304, in evaluate_stats
outputs = net(Variable(images))
File "/usr/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 58, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/lib/python3.6/site-packages/torchvision/models/inception.py", line 81, in forward
x = self.Conv2d_2b_3x3(x)
File "/usr/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/lib/python3.6/site-packages/torchvision/models/inception.py", line 325, in forward
x = self.bn(x)
File "/usr/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/lib64/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 37, in forward
self.training, self.momentum, self.eps)
File "/usr/lib64/python3.6/site-packages/torch/nn/functional.py", line 639, in batch_norm
return f(input, weight, bias)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66
[tomppa@localhost pytorch-retraining]$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.22                 Driver Version: 387.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0  On |                  N/A |
| 54%   58C    P0    N/A /  75W |   1942MiB /  1998MiB |     84%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1405    G   /usr/libexec/Xorg                               18MiB |
|    0      1444    G   /usr/bin/gnome-shell                            42MiB |
|    0      1776    G   /usr/libexec/Xorg                              114MiB |
|    0      1870    G   /usr/bin/gnome-shell                            87MiB |
|    0      6652    G   gnome-control-center                             1MiB |
|    0      7139    C   python3                                       1665MiB |
+-----------------------------------------------------------------------------+
CUDA version:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
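The traceback above crashes inside `evaluate_stats` at `net(Variable(images))`, i.e. during evaluation, where autograd state is pure overhead. Evaluating in small batches with gradients disabled usually resolves exactly this failure; on the 0.3-era PyTorch shown in the traceback the equivalent was `net(Variable(images, volatile=True))`. A sketch with stand-in model and data (names and sizes are assumptions):

```python
import torch
import torch.nn as nn

net = nn.Linear(64, 10)        # stand-in for the evaluated model (assumed)
images = torch.randn(100, 64)  # stand-in for the test set (assumed)

# batch the test set and disable autograd so no activation graph is kept
net.eval()
chunks = []
with torch.no_grad():
    for start in range(0, len(images), 25):
        chunks.append(net(images[start:start + 25]).argmax(dim=1))
preds = torch.cat(chunks)
```

The nan losses in the log are a separate symptom; from-scratch training on a small dataset typically needs a lower learning rate (and possibly gradient clipping) than fine-tuning.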