Comments (8)
Hi, in my experience, this is mainly because a label value falls outside the range 0 to class_num - 1. You should check whether there is a negative value, or a value greater than or equal to class_num, by printing the label values.
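A minimal sketch of that check, assuming the label map is available as a NumPy integer array and that 255 is used as the ignore value (both are assumptions; `find_invalid_labels`, `class_num` and `ignore_label` are illustrative names, not part of TorchSeg):

```python
import numpy as np


def find_invalid_labels(gt, class_num, ignore_label=255):
    """Return the sorted label values in `gt` that a loss with `class_num`
    classes cannot index: anything outside 0..class_num-1, except the
    (assumed) ignore value."""
    values = np.unique(np.asarray(gt))
    valid = ((values >= 0) & (values < class_num)) | (values == ignore_label)
    return values[~valid].tolist()


# Hypothetical 2x3 label map that still contains a raw Cityscapes id (33).
gt = np.array([[7, 8, 255], [33, 0, 11]])
print(find_invalid_labels(gt, class_num=19))
```

An empty list only means every value is in range; it does not prove that the ids have the intended meaning.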
from torchseg.
@ycszen Hi, I am trying the Cityscapes dataset, and the label values are indeed outside the range 0-class_num. It seems that the Cityscapes dataset code does not do the label_id to idx conversion? I remapped the labels to the range 0-class_num, but I still get this error:
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
[00:00<?,?it/s]Traceback (most recent call last):
File "train.py", line 123, in <module>
loss = model(imgs, gts)
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/apex-0.1-py3.5-linux-x86_64.egg/apex/parallel/distributed.py", line 459, in forward
result = self.module(*inputs, **kwargs)
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/USER/Projects/TorchSeg-BiSeNet/model/bisenet/cityscapes.bisenet.R18/network.py", line 105, in forward
aux_loss0 = self.ohem_criterion(self.heads[0](pred_out[0]), label)
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/USER/Projects/TorchSeg-BiSeNet/furnace/seg_opr/loss_opr.py", line 84, in forward
index = mask_prob.argsort()
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/tensor.py", line 248, in argsort
return torch.argsort(self, dim, descending)
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/functional.py", line 648, in argsort
return torch.sort(input, -1, descending)[1]
RuntimeError: CUDA error: device-side assert triggered
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 157, in <module>
config.log_dir_link)
File "/home/USER/Projects/TorchSeg-BiSeNet/furnace/engine/engine.py", line 154, in __exit__
torch.cuda.empty_cache()
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/__init__.py", line 374, in empty_cache
torch._C._cuda_emptyCache()
RuntimeError: CUDA error: device-side assert triggered
Exception ignored in: <bound method Event.__del__ of <torch.cuda.Event 0x34cf360>>
Traceback (most recent call last):
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/streams.py", line 167, in __del__
File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/__init__.py", line 208, in check_error
torch.cuda.CudaError: device-side assert triggered (59)
terminate called without an active exception
Any suggestions?
@hubutui Could you show more details? Did you do the remap correctly?
@ycszen Hi, I use skimage.segmentation.relabel_sequential to remap the labels:
diff --git a/furnace/datasets/BaseDataset.py b/furnace/datasets/BaseDataset.py
index 7d8f6ef..bfcaa6b 100755
--- a/furnace/datasets/BaseDataset.py
+++ b/furnace/datasets/BaseDataset.py
@@ -10,6 +10,7 @@ import time
 import cv2
 import torch
 import numpy as np
+from skimage.segmentation import relabel_sequential
 import torch.utils.data as data
@@ -62,6 +63,15 @@ class BaseDataset(data.Dataset):
         if self.preprocess is not None and extra_dict is not None:
             output_dict.update(**extra_dict)
+        gt = gt.cpu().numpy()
+        trans_labels = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27,
+                        28, 31, 32, 33, 0]
+        trans_labels = np.array(trans_labels)
+        _, fw, inv = relabel_sequential(trans_labels)
+        gt[gt==255] = 0
+        gt = fw[gt]
+        gt = torch.from_numpy(np.ascontiguousarray(gt)).long()
+        output_dict['label'] = gt
         return output_dict
     def _fetch_data(self, img_path, gt_path, dtype=None):
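For comparison, a lookup-table remap can be sketched as below. It assumes the 19 non-zero ids listed in the diff above are the classes to train on, in that order, and sends every other id to an ignore value of 255 (an assumption). `VALID_IDS`, `IGNORE_LABEL`, `build_id_to_train_id` and `remap` are illustrative names, not TorchSeg or cityscapesscripts functions.

```python
import numpy as np

# The 19 non-zero raw label ids from the diff above, in training order.
VALID_IDS = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22,
             23, 24, 25, 26, 27, 28, 31, 32, 33]
IGNORE_LABEL = 255  # assumed ignore value


def build_id_to_train_id(valid_ids, ignore_label=IGNORE_LABEL):
    """Build a 256-entry table: raw id -> 0..len(valid_ids)-1, else ignore."""
    lut = np.full(256, ignore_label, dtype=np.uint8)
    for train_id, raw_id in enumerate(valid_ids):
        lut[raw_id] = train_id
    return lut


def remap(gt, lut):
    """Apply the table to an array of raw ids (values must lie in 0..255)."""
    return lut[np.asarray(gt, dtype=np.int64)]


lut = build_id_to_train_id(VALID_IDS)
raw = np.array([[7, 8, 0], [33, 255, 26]])
print(remap(raw, lut))
```

This yields values in 0..18 plus 255. As far as I know, relabel_sequential numbers the non-zero labels starting from 1 by default and leaves 0 unchanged, so the two mappings are not equivalent.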
Hi, have you solved this problem? I ran into the same issue as above; the error output is as follows:
File "train.py", line 137, in <module>
loss = model(imgs, gts)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
raise output
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
output = module(*input, **kwargs)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/root/PyCharm_Projs/TorchSeg/model/bisenet/cityscapes.bisenet.R18.speed/network.py", line 109, in forward
aux_loss0 = self.ohem_criterion(self.heads[0](pred_out[0]), label)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/root/PyCharm_Projs/TorchSeg/furnace/seg_opr/loss_opr.py", line 84, in forward
index = mask_prob.argsort()
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/tensor.py", line 248, in argsort
return torch.argsort(self, dim, descending)
File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/functional.py", line 648, in argsort
return torch.sort(input, -1, descending)[1]
RuntimeError: merge_sort: failed to synchronize: device-side assert triggered
], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Any suggestions would be appreciated. Thanks.
I have solved the problem by mapping the labels with cityscapesscripts. Thanks a lot.
@blueardour Hi, I have run into the same problem. Could you tell me how you solved it? Thanks a lot.
@jhch1995 It might be caused by a mismatch between label ids. You can try createTrainIdLabelImgs.py to generate the *_gtFine_labelTrainIds.png files and use these files as training labels.
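Once those files exist, the label paths used for training would need to point at them. A small sketch, assuming the standard Cityscapes file naming and label paths that currently end in `_gtFine_labelIds.png`; `to_train_id_path` is a hypothetical helper, not part of TorchSeg or cityscapesscripts:

```python
from pathlib import PurePosixPath


def to_train_id_path(label_path):
    """Swap the `_gtFine_labelIds.png` suffix for `_gtFine_labelTrainIds.png`."""
    path = PurePosixPath(label_path)
    old_suffix = "_gtFine_labelIds.png"
    if not path.name.endswith(old_suffix):
        raise ValueError("unexpected label file name: " + path.name)
    new_name = path.name[: -len(old_suffix)] + "_gtFine_labelTrainIds.png"
    return str(path.with_name(new_name))


# Example path following the Cityscapes directory layout.
print(to_train_id_path("gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png"))
```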