
Comments (4)

pengsida commented on August 14, 2024

CUDA_LAUNCH_BLOCKING=1 python -m pdb tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls db
When the error is encountered, print the shape of loss_seg.
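
As an alternative to stepping through pdb by hand, the failing call can be wrapped so that input shapes are printed when it raises. This is only a rough sketch: `describe` and `run_with_shape_report` are hypothetical helper names, not part of pvnet, and the code assumes only that the arguments expose a `.shape` attribute, as torch tensors and NumPy arrays do.

```python
import os

# Assumption: this runs before CUDA is initialised (simplest: before importing
# torch), so kernel launches are synchronous and the traceback points at the
# operation that actually failed.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def describe(name, value):
    """Return a one-line shape summary of a tensor-like object (hypothetical helper)."""
    shape = getattr(value, "shape", None)
    return f"{name}: shape={tuple(shape) if shape is not None else 'n/a'}"


def run_with_shape_report(fn, **named_inputs):
    """Call fn(**named_inputs); on RuntimeError, print input shapes and re-raise."""
    try:
        return fn(**named_inputs)
    except RuntimeError:
        for name, value in named_inputs.items():
            print(describe(name, value))
        raise
```

The wrapper re-raises the original exception, so pdb's post-mortem mode still works as before.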

from pvnet.

F-jie commented on August 14, 2024

Following your suggestion, the output looks like this:
(Pdb) c
torch.Size([4, 304, 456])
tensor([0.4812, 0.4532, 0.4802, 0.4539], device='cuda:0', grad_fn=)
torch.Size([4, 320, 392])
tensor([0.4567, 0.4309, 0.4438, 0.4450], device='cuda:0', grad_fn=)
torch.Size([4, 432, 312])
tensor([0.4205, 0.4445, 0.4547, 0.4104], device='cuda:0', grad_fn=)
torch.Size([4, 320, 632])
THCudaCheck FAIL file=/pytorch/aten/src/THC/generated/../generic/THCTensorMathReduce.cu line=18 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/pdb.py", line 1667, in main
pdb._runscript(mainpyfile)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/pdb.py", line 1548, in _runscript
self.run(statement)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/bdb.py", line 431, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 374, in <module>
train_net()
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 358, in train_net
train(net, optimizer, train_loader, epoch)
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 150, in train
seg_pred, vertex_pred, loss_seg, loss_vertex, precision, recall = net(image, mask, vertex, vertex_weights)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 91, in forward
loss_seg = torch.mean(loss_seg.view(loss_seg.shape[0],-1),1)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generated/../generic/THCTensorMathReduce.cu:18
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program

/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py(91)forward()
-> loss_seg = torch.mean(loss_seg.view(loss_seg.shape[0],-1),1)
(Pdb) print(loss_seg.shape)
torch.Size([4, 320, 632])
(Pdb)


F-jie commented on August 14, 2024

Could you tell me what may cause this error? Did you ever encounter this problem? Thanks, @pengsida.


F-jie commented on August 14, 2024

Thanks! I finally solved this problem by following the suggestions in the issue "masks are having out-of-bounds memory accesses".
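
The title of that issue suggests mask values falling outside the valid class range, which is a common cause of illegal memory accesses in CUDA loss kernels. Below is a minimal sketch of a check that could run before training. It assumes the mask is available as a NumPy array of integer class labels before it is converted to a tensor; `check_mask_labels` and `num_classes` are hypothetical names, not pvnet's API.

```python
import numpy as np


def check_mask_labels(mask, num_classes):
    """Raise ValueError if any label in `mask` lies outside [0, num_classes)."""
    mask = np.asarray(mask)
    if mask.size == 0:
        return  # nothing to validate
    lo, hi = int(mask.min()), int(mask.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"mask labels span [{lo}, {hi}], expected values in [0, {num_classes - 1}]"
        )
```

For example, a binary mask stored as 0/255 rather than 0/1 would fail this check with `num_classes=2`, and the error would surface on the CPU with a readable message instead of inside a CUDA kernel.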

