
Comments (8)

ycszen commented on July 24, 2024

Hi, in my experience, this is mainly because the label values are outside the range 0 - class_num. You should check whether there are negative values or values greater than class_num by printing the label values.
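
A minimal sketch of the check described above, assuming the label is available as a NumPy array, that valid class indices are 0 to num_classes - 1, and that 255 is used as the ignore value. `find_invalid_labels` is a hypothetical helper for illustration, not part of TorchSeg:

```python
import numpy as np


def find_invalid_labels(label, num_classes, ignore_index=255):
    """Return the sorted unique label values that are neither a valid
    class index in [0, num_classes) nor the ignore value.

    ignore_index=255 is an assumption; use whatever your config defines.
    """
    values = np.unique(np.asarray(label))
    out_of_range = (values < 0) | (values >= num_classes)
    not_ignored = values != ignore_index
    return values[out_of_range & not_ignored].tolist()


# Toy label map: 33 and -1 are not valid for a 19-class setup.
gt = np.array([[0, 18, 255], [33, -1, 7]], dtype=np.int64)
print(find_invalid_labels(gt, num_classes=19))  # [-1, 33]
```

An empty list means every value is either a valid class index or the ignore value. Running this on a few ground-truth files before training is usually cheaper than decoding a device-side assert.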

from torchseg.

hubutui commented on July 24, 2024

@ycszen Hi, I am trying the Cityscapes dataset, and the label values are indeed out of the range 0-class_num. It seems that the Cityscapes dataset code does not do the label_id to idx conversion? I remapped the labels to the range 0-class_num, but I still get this error:

26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
26 21:25:43 PyTorch Version 1.0.1.post2, Furnace Version 0.1.1
[00:00<?,?it/s]Traceback (most recent call last):
  File "train.py", line 123, in <module>
    loss = model(imgs, gts)
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/apex-0.1-py3.5-linux-x86_64.egg/apex/parallel/distributed.py", line 459, in forward
    result = self.module(*inputs, **kwargs)
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/Projects/TorchSeg-BiSeNet/model/bisenet/cityscapes.bisenet.R18/network.py", line 105, in forward
    aux_loss0 = self.ohem_criterion(self.heads[0](pred_out[0]), label)
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/USER/Projects/TorchSeg-BiSeNet/furnace/seg_opr/loss_opr.py", line 84, in forward
    index = mask_prob.argsort()
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/tensor.py", line 248, in argsort
    return torch.argsort(self, dim, descending)
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/functional.py", line 648, in argsort
    return torch.sort(input, -1, descending)[1]
RuntimeError: CUDA error: device-side assert triggered

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 157, in <module>
    config.log_dir_link)
  File "/home/USER/Projects/TorchSeg-BiSeNet/furnace/engine/engine.py", line 154, in __exit__
    torch.cuda.empty_cache()
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/__init__.py", line 374, in empty_cache
    torch._C._cuda_emptyCache()
RuntimeError: CUDA error: device-side assert triggered

Exception ignored in: <bound method Event.__del__ of <torch.cuda.Event 0x34cf360>>
Traceback (most recent call last):
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/streams.py", line 167, in __del__
  File "/home/USER/BiSeNet-official-env/lib/python3.5/site-packages/torch/cuda/__init__.py", line 208, in check_error
torch.cuda.CudaError: device-side assert triggered (59)
terminate called without an active exception

Any suggestions?


ycszen commented on July 24, 2024

@hubutui Could you show more details? Did you do a correct remap?


hubutui commented on July 24, 2024

@ycszen Hi, I use skimage.segmentation.relabel_sequential to remap the labels:

diff --git a/furnace/datasets/BaseDataset.py b/furnace/datasets/BaseDataset.py
index 7d8f6ef..bfcaa6b 100755
--- a/furnace/datasets/BaseDataset.py
+++ b/furnace/datasets/BaseDataset.py
@@ -10,6 +10,7 @@ import time
 import cv2
 import torch
 import numpy as np
+from skimage.segmentation import relabel_sequential
 
 import torch.utils.data as data
 
@@ -62,6 +63,15 @@ class BaseDataset(data.Dataset):
         if self.preprocess is not None and extra_dict is not None:
             output_dict.update(**extra_dict)
 
+        gt = gt.cpu().numpy()
+        trans_labels = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27,
+                        28, 31, 32, 33, 0]
+        trans_labels = np.array(trans_labels)
+        _, fw, inv = relabel_sequential(trans_labels)
+        gt[gt==255] = 0
+        gt = fw[gt]
+        gt = torch.from_numpy(np.ascontiguousarray(gt)).long()
+        output_dict['label'] = gt
         return output_dict
 
     def _fetch_data(self, img_path, gt_path, dtype=None):


hanamizukigakki commented on July 24, 2024

Hi, have you solved this problem? I am running into the same issue as above. The error output is as follows:

  File "train.py", line 137, in <module>
    loss = model(imgs, gts)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/PyCharm_Projs/TorchSeg/model/bisenet/cityscapes.bisenet.R18.speed/network.py", line 109, in forward
    aux_loss0 = self.ohem_criterion(self.heads[0](pred_out[0]), label)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/PyCharm_Projs/TorchSeg/furnace/seg_opr/loss_opr.py", line 84, in forward
    index = mask_prob.argsort()
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/tensor.py", line 248, in argsort
    return torch.argsort(self, dim, descending)
  File "/root/conda/envs/conda_py36/lib/python3.6/site-packages/torch/functional.py", line 648, in argsort
    return torch.sort(input, -1, descending)[1]
RuntimeError: merge_sort: failed to synchronize: device-side assert triggered
], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.

Any suggestions would be appreciated. Thanks.
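
The "index out of bounds" assertion at the end of the log is a bounds check on an indexing operation. As a rough illustration only, the sketch below uses NumPy as a CPU stand-in to show how a single label equal to the number of classes breaks a per-pixel probability lookup. `gather_label_prob` and the toy shapes are hypothetical and are not the TorchSeg loss code:

```python
import numpy as np


def gather_label_prob(prob, label):
    """Pick, for each pixel, the probability of its labelled class.

    prob:  (num_classes, num_pixels) array of class probabilities
    label: (num_pixels,) array of integer class indices
    """
    return np.take_along_axis(prob, label[None, :], axis=0)[0]


num_classes = 19
prob = np.full((num_classes, 4), 1.0 / num_classes)

ok = np.array([0, 5, 18, 7])    # all indices in [0, 19)
bad = np.array([0, 5, 19, 7])   # 19 is one past the last valid index

print(gather_label_prob(prob, ok).shape)  # (4,)

try:
    gather_label_prob(prob, bad)
except IndexError as err:
    # On the CPU the failure is an ordinary exception that names the
    # offending index, instead of an asynchronous device-side assert.
    print("IndexError:", err)
```

On the GPU the same kind of failure is reported asynchronously, which would explain why the traceback points at a later call such as argsort rather than at the indexing step itself.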


hanamizukigakki commented on July 24, 2024

I solved the problem by mapping the labels with cityscapesscripts. Thanks a lot.


jhch1995 commented on July 24, 2024

@blueardour Hi, I am running into the same problem as you. Can you tell me how you solved it? Thanks a lot.


tkingcer commented on July 24, 2024

@jhch1995 It might be caused by a mismatch between label ids. You can try createTrainIdLabelImgs.py to generate the ***_gtFine_labelTrainIds.png files and use these files as training labels.
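
For readers who want to see what the id to train id mapping amounts to, here is a hedged NumPy sketch. It assumes the label image is a uint8 array of Cityscapes label ids, that the 19 evaluated ids are the ones listed earlier in this thread, and that every other id should become the ignore value 255. `build_lut` and `ids_to_train_ids` are hypothetical names; the authoritative mapping is the one shipped with cityscapesscripts:

```python
import numpy as np

# The 19 Cityscapes label ids used for training, in train id order
# (taken from the list posted earlier in this thread, without the 0).
EVAL_IDS = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27,
            28, 31, 32, 33]
IGNORE = 255  # assumed ignore value


def build_lut(eval_ids=EVAL_IDS, ignore=IGNORE):
    """Build a 256-entry lookup table: label id -> train id or ignore."""
    lut = np.full(256, ignore, dtype=np.uint8)
    for train_id, label_id in enumerate(eval_ids):
        lut[label_id] = train_id
    return lut


def ids_to_train_ids(label_ids, lut):
    """Map a uint8 label-id image to train ids via the lookup table."""
    return lut[np.asarray(label_ids, dtype=np.uint8)]


lut = build_lut()
gt = np.array([[7, 0, 33], [255, 26, 1]], dtype=np.uint8)
print(ids_to_train_ids(gt, lut))
# [[  0 255  18]
#  [255  13 255]]
```

With this mapping the result contains only values 0 to 18 plus 255, so it matches a 19-class setup as long as the loss is configured to ignore 255.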

