Comments (4)
CUDA_LAUNCH_BLOCKING=1 python -m pdb tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls db
When the error occurs, print the shape of loss_seg.
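If setting the variable on the command line is inconvenient, the same effect can usually be had from inside the script. This is only a minimal sketch, and it assumes the assignment runs before any CUDA work starts; placing it above the torch import is the simplest way to guarantee that.

```python
import os

# Make CUDA kernel launches synchronous so the Python traceback points at the
# operation that actually failed. Set this before CUDA is initialised
# (simplest: before `import torch`).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

Remove the line again once the bug is found, since synchronous launches slow training down.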
from pvnet.
Following your suggestion, here is the output:
(Pdb) c
torch.Size([4, 304, 456])
tensor([0.4812, 0.4532, 0.4802, 0.4539], device='cuda:0', grad_fn=)
torch.Size([4, 320, 392])
tensor([0.4567, 0.4309, 0.4438, 0.4450], device='cuda:0', grad_fn=)
torch.Size([4, 432, 312])
tensor([0.4205, 0.4445, 0.4547, 0.4104], device='cuda:0', grad_fn=)
torch.Size([4, 320, 632])
THCudaCheck FAIL file=/pytorch/aten/src/THC/generated/../generic/THCTensorMathReduce.cu line=18 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/pdb.py", line 1667, in main
pdb._runscript(mainpyfile)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/pdb.py", line 1548, in _runscript
self.run(statement)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/bdb.py", line 431, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 374, in <module>
train_net()
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 358, in train_net
train(net, optimizer, train_loader, epoch)
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 150, in train
seg_pred, vertex_pred, loss_seg, loss_vertex, precision, recall = net(image, mask, vertex, vertex_weights)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/tender/anaconda3/envs/pvnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py", line 91, in forward
loss_seg = torch.mean(loss_seg.view(loss_seg.shape[0],-1),1)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generated/../generic/THCTensorMathReduce.cu:18
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
/home/tender/deep_learning/pose_estimation/pv/pvnet/tools/train_linemod.py(91)forward()
-> loss_seg = torch.mean(loss_seg.view(loss_seg.shape[0],-1),1)
(Pdb) print(loss_seg.shape)
torch.Size([4, 320, 632])
(Pdb)
Could you tell me what may cause this error? Have you ever encountered this problem? Thanks, @pengsida.
Thanks! I finally solved this problem by following the suggestions in the issue "masks are having out-of-bounds memory accesses".
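For anyone who lands here with the same error, a check along these lines might catch the problem before it reaches the GPU. It is only a sketch: it assumes the cause is a mask containing label values outside the range the segmentation head predicts, which the issue title suggests but this thread does not confirm. `check_mask_labels` and `num_classes` are hypothetical names, not part of pvnet.

```python
def check_mask_labels(mask, num_classes):
    """Raise ValueError if any label in `mask` falls outside [0, num_classes).

    Hypothetical helper, not part of pvnet. `mask` can be a torch tensor or
    a NumPy array; only its .min() and .max() methods are used.
    """
    lo, hi = int(mask.min()), int(mask.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            "mask labels span [{}, {}] but only {} classes are predicted".format(
                lo, hi, num_classes
            )
        )
    return lo, hi
```

Calling it on each batch in the training loop, before the forward pass, turns the opaque illegal memory access into a readable Python exception that names the offending label range.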