
Comments (10)

sacmehta commented on August 23, 2024

You should not include the image reading and writing time, because those operations are slow. As a standard convention, none of the models report inference time with image reading and writing operations.
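
For example, a minimal sketch of that convention, assuming a PyTorch model and an input batch that has already been loaded onto the GPU (the torch.cuda.synchronize() calls are an extra precaution for asynchronous CUDA execution, not something required by the convention itself):

import time
import torch

def time_forward_pass(model, input_batch):
    # Time only the forward pass; image decoding and disk I/O are
    # deliberately excluded from the measurement.
    torch.cuda.synchronize()   # make sure pending GPU work has finished
    start_time = time.time()
    output = model(input_batch)
    torch.cuda.synchronize()   # wait for the forward pass to complete
    return time.time() - start_time

Called once per batch, the returned times can then be averaged into an fps figure, as in the code below.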


nithishc829 commented on August 23, 2024

No sir, I am not including them.

Relevant part of the code:

start_time = time.time()
# run the model
output1 = model(input)
time_taken = time.time() - start_time
time_list.append(time_taken)

Function used to measure inference time:

def val(args, val_loader, model, criterion):
    '''
    :param args: general arguments
    :param val_loader: loader for the validation dataset
    :param model: model
    :param criterion: loss function
    :return: average epoch loss, overall pixel-wise accuracy, per class accuracy, per class iu, and mIOU
    '''
    # switch to evaluation mode
    model.eval()

    iouEvalVal = iouEval(args.classes)

    epoch_loss = []
    time_list = []
    total_batches = len(val_loader)
    blist = helpers.get_label_info_new(args.csvfile)
    for i, (input, target) in enumerate(val_loader):
        if args.onGPU:
            print('Non_blocking')
            input = input.cuda(non_blocking=True)    # torch.autograd.Variable(input, volatile=True)
            target = target.cuda(non_blocking=True)  # torch.autograd.Variable(target, volatile=True)
        else:
            print('Blocking')
            input = input.cuda()
            target = target.cuda()

        start_time = time.time()
        # run the model
        output1 = model(input)
        time_taken = time.time() - start_time
        time_list.append(time_taken)
        # compute the loss
        loss = criterion(output1, target)
        epoch_loss.append(loss.item())
        # compute the confusion matrix
        iouEvalVal.addBatch(output1.max(1)[1].data, target.data)
        print('[%d/%d] loss: %.3f time: %.4f' % (i, total_batches, loss.item(), time_taken))

    average_epoch_loss_val = sum(epoch_loss) / len(epoch_loss)
    overall_acc, per_class_acc, per_class_iu, mIOU = iouEvalVal.getMetric()
    print('Average fps ', 1 / np.mean(np.array(time_list)))
    return average_epoch_loss_val, overall_acc, per_class_acc, per_class_iu, mIOU


sacmehta commented on August 23, 2024

Do not include the first iteration because PyTorch has some initialization time
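
For example, a small sketch of dropping the warm-up iterations before averaging (the choice of 5 warm-up iterations here is arbitrary):

import numpy as np

warmup = 5                             # skip the first few iterations (CUDA/cuDNN initialization)
times = np.array(time_list[warmup:])   # time_list as collected in the validation loop above
print('Average fps (excluding warm-up): %.2f' % (1.0 / times.mean()))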


nithishc829 commented on August 23, 2024

Yes, you are right sir. I saw that too. I removed the first 5 samples from the profiling and the average is still 53 fps.


sacmehta commented on August 23, 2024

What versions of CUDA and cuDNN are you using?


nithishc829 commented on August 23, 2024

CUDA (output of nvcc --version):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

cuDNN: version 7 (libcudnn.so.7)


sacmehta commented on August 23, 2024

Use the code below to check the cuDNN version. Not all versions have cuDNN-optimized depth-wise convolutions.

import torch
print(torch.backends.cudnn.version())
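
For completeness, a couple of other standard PyTorch checks that can help when comparing timings across machines:

import torch

print(torch.version.cuda)               # CUDA version PyTorch was built against
print(torch.backends.cudnn.version())   # cuDNN version, e.g. 7402
print(torch.cuda.get_device_name(0))    # name of GPU device 0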


nithishc829 commented on August 23, 2024

import torch
print(torch.backends.cudnn.version())
# output: 7402


sacmehta commented on August 23, 2024

I assume you are using PyTorch 0.4+. Could you try making the following changes to your code and see what happens:

device = torch.device('cuda')
with torch.no_grad():
    for i, (input, _) in enumerate(val_loader):
        input = input.to(device=device)
        start_time = time.time()
        # run the model
        output1 = model(input)
        time_taken = time.time() - start_time
        time_list.append(time_taken)

I also noticed that your GPU is different from the one we used. The TitanX is considerably faster than the GTX 1080, so that explains the difference in speed.


nithishc829 commented on August 23, 2024

I made the changes you mentioned; however, the fps is the same. I understand that the TitanX is faster. I thought some implementation detail was causing this performance issue, but the difference in fps is too large, i.e. I got almost 3x less speed, which I was not expecting...

Anyhow, thanks for the quick response and support. You can close this issue, sir.
Really good paper, sir.

How can I contact you for more information?

