xujiacong / pidnet
This is the official repository for our recent work: PIDNet
License: MIT License
Hi,
Thank you for sharing this work with the community!
I am trying to fine-tune PIDNet on the KITTI dataset using the 200 images provided in the link.
To this end, I made the following additions/changes:
I can run the code with the default parameters on the Cityscapes dataset, but I get the following error when I run it on the KITTI dataset:
...
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:145: operator(): block: [262,0,0], thread: [79,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
...
Traceback (most recent call last):
File "tools/train.py", line 218, in <module>
main()
File "tools/train.py", line 182, in main
trainloader, optimizer, model, writer_dict)
File "/cluster/home/oilter/PIDNet/tools/../utils/function.py", line 43, in train
losses, _, acc, loss_list = model(images, labels, bd_gts)
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/cluster/home/oilter/PIDNet/tools/../utils/utils.py", line 49, in forward
loss_s = self.sem_loss(outputs[:-1], labels)
File "/cluster/home/oilter/miniconda3/envs/ss/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/cluster/home/oilter/PIDNet/tools/../utils/criterion.py", line 93, in forward
for (w, x, func) in zip(balance_weights, score, functions)
File "/cluster/home/oilter/PIDNet/tools/../utils/criterion.py", line 93, in
for (w, x, func) in zip(balance_weights, score, functions)
File "/cluster/home/oilter/PIDNet/tools/../utils/criterion.py", line 72, in _ohem_forward
pred, ind = pred.contiguous().view(-1,)[mask].contiguous().sort()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Any help would be highly appreciated.
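An "index out of bounds" assertion inside a gather kernel often means that some label value falls outside [0, NUM_CLASSES), for example raw dataset ids that were never remapped to train ids. That is only a guess for this report, not something confirmed in the thread. A minimal sketch for checking a label map before training; `find_invalid_labels` is a hypothetical helper, and the ignore id of 255 is an assumption:

```python
import numpy as np

def find_invalid_labels(label, num_classes, ignore_label=255):
    """Return the label ids that are neither valid classes nor the ignore id."""
    values = np.unique(label)
    return [int(v) for v in values
            if v != ignore_label and not (0 <= v < num_classes)]

# Toy label map: 34 is outside [0, 19) and is not the ignore id.
toy = np.array([[0, 7, 255], [18, 34, 7]], dtype=np.int64)
invalid = find_invalid_labels(toy, num_classes=19)
print(invalid)
```

Running this over every label image in the new dataset would show whether a remapping step is missing.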
Hi, PIDNet is really interesting! Thanks for kindly sharing it!
I am using BiSeNet for other tasks. In the paper, I see that ADB-Bag brings a 1.3 mIoU improvement for BiSeNet, so I want to add this module to BiSeNet, but I am not sure the details of my implementation are correct.
Could you please release the code of BiSeNet equipped with ADB and Bag?
To find contours in the label (datasets/base_dataset.py:109, paper Fig. 4), you use a Canny edge detector.
Canny edge detectors are typically used for finding edges in non-uniformly colored objects.
Example:
However, the Cityscapes labels are uniformly colored objects:
In such a case, a topological contour-finding algorithm would be more appropriate, for example cv2.findContours(). It would provide more accurate contours, possibly at a lower computational cost.
Is there a reason you chose Canny edge detection instead of topological contour finding?
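To make the point about uniformly colored labels concrete, here is a small NumPy-only sketch. It is neither Canny nor cv2.findContours(); it is a simpler swapped-in technique that marks every pixel with a 4-neighbour of a different label, which is exact for label maps. `label_boundaries` is a hypothetical helper, not part of the repository:

```python
import numpy as np

def label_boundaries(label):
    """Mark pixels that have a 4-neighbour carrying a different label."""
    edge = np.zeros(label.shape, dtype=bool)
    diff_x = label[:, :-1] != label[:, 1:]   # horizontal label changes
    diff_y = label[:-1, :] != label[1:, :]   # vertical label changes
    edge[:, :-1] |= diff_x
    edge[:, 1:] |= diff_x
    edge[:-1, :] |= diff_y
    edge[1:, :] |= diff_y
    return edge

# Left half is class 0, right half is class 1.
toy = np.array([[0, 0, 1, 1]] * 4)
edges = label_boundaries(toy)
print(edges.astype(int))
```

The resulting boundary is two pixels wide (one on each side of the label change); a dilation step could widen it if needed.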
Hello.
Recently, I have been trying to train this model on my custom dataset.
Before that, I ran the example code first. Semantic Loss and SB Loss are always 0; is this normal?
Acc and BCE Loss are output normally.
Excellent work!
Could you provide a download link for the 11-category CamVid dataset?
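I just want to run this network once on my own computer, but after adjusting the configuration for a long time I still keep getting the error "The gpu numbers do not match!"
Could you tell me which parameters I should adjust and how to configure this network?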
I tried training on the CamVid dataset with the ImageNet and Cityscapes pretrained models, and the best_mIoU in both cases is over 0.9 (I don't know whether that is a good result or a sign of overconfidence).
So I want to know the mIoU on the CamVid dataset without any pretrained model.
I changed MODEL.PRETRAINED in the config file to false, but I got the error message below:
Traceback (most recent call last):
File "tools/train.py", line 218, in <module>
main()
File "tools/train.py", line 50, in main
args = parse_args()
File "tools/train.py", line 44, in parse_args
update_config(config, args)
File "/home/ubuntu/workdir/nur/PIDNet/tools/../configs/default.py", line 94, in update_config
cfg.merge_from_file(args.cfg)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/yacs/config.py", line 213, in merge_from_file
self.merge_from_other_cfg(cfg)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
_merge_a_into_b(cfg_other, self, self, [])
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
_merge_a_into_b(v, b[k], root, key_list + [k])
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/yacs/config.py", line 474, in _merge_a_into_b
v = _check_and_coerce_cfg_value_type(v, b[k], k, full_key)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/yacs/config.py", line 534, in _check_and_coerce_cfg_value_type
raise ValueError(
ValueError: Type mismatch (<class 'str'> vs. <class 'bool'>) with values (pretrained_models/imagenet/PIDNet_S_ImageNet.pth.tar vs. False) for config key: MODEL.PRETRAINED
Which part of the code should I change to train on the CamVid dataset without any pretrained model?
Thanks
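The ValueError above comes from the config merge: the default for MODEL.PRETRAINED is a string, so a boolean override is rejected. The sketch below imitates that check in plain Python; it is a simplified stand-in, not the yacs implementation, and `merge_value` is a hypothetical name. It shows that an empty string would pass the type check. Whether the training script then skips weight loading for an empty path is not confirmed here and may need a guard in the model-loading code.

```python
def merge_value(default, override, key):
    """Accept `override` only if it has the same type as `default` (simplified)."""
    if type(override) is not type(default):
        raise ValueError(
            "Type mismatch ({} vs. {}) for config key: {}".format(
                type(default), type(override), key))
    return override

default_pretrained = "pretrained_models/imagenet/PIDNet_S_ImageNet.pth.tar"

# A boolean override fails the type check, as in the traceback.
try:
    merge_value(default_pretrained, False, "MODEL.PRETRAINED")
    bool_accepted = True
except ValueError:
    bool_accepted = False

# An empty string has the same type as the default, so it is accepted.
empty_accepted = merge_value(default_pretrained, "", "MODEL.PRETRAINED") == ""
print(bool_accepted, empty_accepted)
```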
Hello,
Thank you for your work.
Did you try a bigger architecture than PIDNet-L?
If not, why not? And if so, are the results interesting?
I'm interested in a trade-off between speed and accuracy at the processing time of a BiSeNet with ResNet101 (which is fast enough, but has lower precision than PIDNet-L). Is a PIDNet-XL imaginable?
Thank you for your great effort.
I get the following error when I run inference on an image whose size differs from the Cityscapes images:
RuntimeError: The size of tensor a (110) must match the size of tensor b (109) at non-singleton dimension 2
Should I resize the image before passing it to the model?
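An off-by-one mismatch such as 110 vs. 109 typically appears when the input height or width is not divisible by the network's total downsampling factor, so two branches end up one pixel apart. A minimal sketch of padding an image to a safe size before inference; the factor of 64 is an assumption rather than a value taken from the repository, and `pad_to_multiple` is a hypothetical helper:

```python
import numpy as np

def pad_to_multiple(image, multiple=64, value=0):
    """Pad an H x W x C image on the bottom/right so H and W divide by `multiple`."""
    h, w = image.shape[:2]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)),
                    mode="constant", constant_values=value)
    return padded, (h, w)

image = np.zeros((870, 1000, 3), dtype=np.uint8)
padded, original_size = pad_to_multiple(image)
print(padded.shape, original_size)
```

After inference the prediction can be cropped back to `original_size`. Resizing the image to a compatible size is an alternative, at the cost of changing the image scale.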
Hi, thanks for providing such a useful repo.
I was wondering if you could provide more details on how you fine-tune. The paper mentions early stopping, but how do you change the model? Do you just swap out the segment-head (prediction head) layers, or do you also unfreeze other layers?
Thanks!
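I noticed that all of your well-performing models use ImageNet pretrained weights,
but this part is not reflected in your code.
I tried designing a custom network, but it did not perform very well,
which may be related to good pretraining.
Could you tell me how to pretrain on ImageNet and share the relevant details? Many thanks.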
Hello, I put some of my photos in data/cityscapes/images/train, data/cityscapes/images/val, data/cityscapes/label/train_labels and data/cityscapes/val_labels.
(Absolute path to a png = D:\GraduateWork\PIDNet\data\cityscapes\images\train\0010.png,
to its label = D:\GraduateWork\PIDNet\data\cityscapes\labels\train_labels\0010.png, and so on)
Then I created the .lst files: train.lst and val.lst.
But when I try to start training, I get an error: NotImplementedError: Got <class 'NoneType'>, but expected numpy array or torch tensor.
As far as I understand, the error is in the .lst files; I am most likely pointing to the wrong paths. But I don't understand what exactly I'm doing wrong.
I tried full paths, for example the first row of the train.lst file:
D:\GraduateWork\PIDNet\data\cityscapes\images\train\0010.png D:\GraduateWork\PIDNet\data\cityscapes\labels\train_labels\0010.png
I tried paths starting inside the data/ folder:
cityscapes/images/train/0010.png cityscapes/labels/train_labels/0010.png
Or including it:
data/cityscapes/images/train/0010.png data/cityscapes/labels/train_labels/0010.png
Please tell me what I am doing wrong. How do I write the paths to images correctly in .lst files?
PS: I am using pidnet_large_cityscapes.yaml, and everything inside it is right:
DATASET:
  DATASET: cityscapes
  ROOT: data/
  TEST_SET: 'list/cityscapes/val.lst'
  TRAIN_SET: 'list/cityscapes/train.lst'
  NUM_CLASSES: 19
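A NoneType reaching the data pipeline is what cv2.imread produces when it cannot read a path, so the paths in the .lst files are a reasonable suspect. The sketch below lists every entry of a .lst file that does not exist under a given directory. `missing_files` is a hypothetical helper, and which directory the loader actually joins the paths with (for example ROOT plus the dataset folder) is an assumption to verify against the dataset class.

```python
import os
import tempfile

def missing_files(list_path, root):
    """Return (line_no, path) pairs from a .lst file that do not exist under `root`."""
    missing = []
    with open(list_path) as f:
        for line_no, line in enumerate(f, start=1):
            for rel in line.split():
                full = os.path.join(root, rel)
                if not os.path.isfile(full):
                    missing.append((line_no, full))
    return missing

# Demo on a throwaway directory: a.png exists, b.png does not.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "a.png"), "wb").close()
    list_path = os.path.join(root, "train.lst")
    with open(list_path, "w") as f:
        f.write("a.png b.png\n")
    missing = missing_files(list_path, root)
    missing_names = [(n, os.path.basename(p)) for n, p in missing]
print(missing_names)
```

Trying the check with a few candidate roots (data/, data/cityscapes/) would show which convention the list files need to follow. Note that the whitespace split assumes paths without spaces.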
Hi, thank you for sharing your work. I tried to train the model on my custom dataset derived from CamVid, and I get an error while computing the loss when the target consists only of the background class (255). The error appears on line 73 of utils/criterion.py.
pred = pred.gather(1, tmp_target.unsqueeze(1))
pred, ind = pred.contiguous().view(-1,)[mask].contiguous().sort()
min_value = pred[min(self.min_kept, pred.numel() - 1)] # --- here is the error because pred is an empty array and mask only consist of False values
threshold = max(min_value, self.thresh)
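The failure mode described above can be reproduced and guarded against outside the training loop. The sketch below mirrors the quoted thresholding logic in NumPy rather than PyTorch; it is an illustration, not a patch for utils/criterion.py, and `ohem_threshold` is a hypothetical name. Returning None for an all-ignored target leaves it to the caller to skip the batch or emit a zero loss, which is one possible design choice.

```python
import numpy as np

def ohem_threshold(pred, mask, min_kept, thresh):
    """Return the OHEM probability threshold, or None when no pixel is valid."""
    valid = np.sort(pred.reshape(-1)[mask.reshape(-1)])
    if valid.size == 0:
        # Every pixel is ignored: indexing valid[-1] would raise, so bail out early.
        return None
    min_value = valid[min(min_kept, valid.size - 1)]
    return max(float(min_value), thresh)

pred = np.array([0.9, 0.1, 0.5])
all_ignored = ohem_threshold(pred, np.zeros(3, dtype=bool), min_kept=1, thresh=0.3)
normal = ohem_threshold(pred, np.ones(3, dtype=bool), min_kept=1, thresh=0.3)
print(all_ignored, normal)
```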
Hello,
I am facing the same error. Can someone tell me exactly what to do? I tried to translate the above answer and also tried model = model.cuda(), but this did not work for me. A screenshot of my error and the details of my graphics card are shown below.
Error message
Graphics card and CUDA details
I have noticed the same issue reported by other users:
#40
Also, @XuJiacong, I have seen that you mentioned the hardware and framework on which inference was done. Can you comment on the hardware and framework requirements used for training? Or simply upload a requirements.txt file in a new commit to the repo?
Lines 114 to 128 in 803851b
Hi!
In the validate function inside utils/function.py, the code calculates and logs the IoU:
for i in range(nums):
    pos = confusion_matrix[..., i].sum(1)
    res = confusion_matrix[..., i].sum(0)
    tp = np.diag(confusion_matrix[..., i])
    IoU_array = (tp / np.maximum(1.0, pos + res - tp))
    mean_IoU = IoU_array.mean()
    logging.info('{} {} {}'.format(i, IoU_array, mean_IoU))
writer = writer_dict['writer']
global_steps = writer_dict['valid_global_steps']
writer.add_scalar('valid_loss', ave_loss.average(), global_steps)
writer.add_scalar('valid_mIoU', mean_IoU, global_steps)
writer_dict['valid_global_steps'] = global_steps + 1
return ave_loss.average(), mean_IoU, IoU_array
The part that I don't understand is this: I think the nums variable holds the number of GPUs, and the code only reports the output of the second GPU. If that is the case, why? Shouldn't it be the average of the two?
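The quoted loop can be run on a toy confusion matrix to see what is actually returned. The sketch assumes `confusion_matrix` has shape (num_classes, num_classes, nums), with one slice per index i. What the last axis enumerates (GPUs, or possibly the model's output heads) is not established in this thread. Either way, `mean_IoU` is overwritten on every iteration, so the value logged to the writer and returned is the one from the last slice.

```python
import numpy as np

def per_slice_iou(confusion_matrix):
    """Compute (IoU_array, mean_IoU) for each slice along the last axis."""
    results = []
    for i in range(confusion_matrix.shape[-1]):
        cm = confusion_matrix[..., i]
        pos = cm.sum(1)
        res = cm.sum(0)
        tp = np.diag(cm)
        iou = tp / np.maximum(1.0, pos + res - tp)
        results.append((iou, iou.mean()))
    return results

# Two classes, two slices: slice 0 has some confusion, slice 1 is perfect.
cm = np.zeros((2, 2, 2))
cm[..., 0] = [[3, 1], [1, 3]]
cm[..., 1] = [[4, 0], [0, 4]]
results = per_slice_iou(cm)
print([round(float(m), 3) for _, m in results])
```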
Thank you for sharing the code; it is great work.
I wonder if there are any guiding scripts for training on a custom dataset with only one class.
Hi, thank you for your excellent work. I used the model you trained on Cityscapes as a pretrained model to learn a new task with only one category. The labels are binary grayscale images, where the background pixel value is 0 and the foreground pixel value is 255. I rewrote cityscapes.py to adapt it to the new dataset. The main changes are as follows:
# ignore_label=255
self.label_mapping = {0: ignore_label,
255: 0} # cloud: 0, bg:255
self.class_weights = None
I also set NUM_CLASSES=1 in the .yaml file.
But the output loss is nan, and I don't know why. Do the authors have any suggestions?
log:
Epoch: [0/484] Iter:[0/40], Time: 1.96, lr: [0.01], Loss: nan, Acc:0.516756, Semantic loss: nan, BCE loss: 3.564931, SB loss: nan
Traceback (most recent call last):
File "tools/train.py", line 218, in <module>
main()
File "tools/train.py", line 182, in main
trainloader, optimizer, model, writer_dict)
File "/home/shen/network/cloud_detection/PIDNet-main/tools/../utils/function.py", line 43, in train
losses, _, acc, loss_list = model(images, labels, bd_gts)
File "/home/shen/software/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/shen/software/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/shen/software/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/shen/network/cloud_detection/PIDNet-main/tools/../utils/utils.py", line 54, in forward
loss_sb = self.sem_loss(outputs[-2], bd_label)
File "/home/shen/software/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/shen/network/cloud_detection/PIDNet-main/tools/../utils/criterion.py", line 96, in forward
return sb_weights * self._ohem_forward(score[0], target)
File "/home/shen/network/cloud_detection/PIDNet-main/tools/../utils/criterion.py", line 73, in _ohem_forward
min_value = pred[min(self.min_kept, pred.numel() - 1)]
IndexError: index -1 is out of bounds for dimension 0 with size 0
Hello, when I load the cityscapes_M_pretrained model for testing, the logger shows that I loaded 0 parameters, so I looked up the following code:
if imgnet_pretrained:
    pretrained_state = torch.load(cfg.MODEL.PRETRAINED, map_location='cpu')['state_dict']
    model_dict = model.state_dict()
    pretrained_state = {k: v for k, v in pretrained_state.items() if (k in model_dict and v.shape == model_dict[k].shape)}
    model_dict.update(pretrained_state)
    msg = 'Loaded {} parameters!'.format(len(pretrained_state))
    logging.info('Attention!!!')
    logging.info(msg)
    logging.info('Over!!!')
    model.load_state_dict(model_dict, strict = False)
else:
    pretrained_dict = torch.load(cfg.MODEL.PRETRAINED, map_location='cpu')
    if 'state_dict' in pretrained_dict:
        pretrained_dict = pretrained_dict['state_dict']
    model_dict = model.state_dict()
    pretrained_dict = {k[6:]: v for k, v in pretrained_dict.items() if (k[6:] in model_dict and v.shape == model_dict[k[6:]].shape)}
    msg = 'Loaded {} parameters!'.format(len(pretrained_dict))
    logging.info('Attention!!!')
    logging.info(msg)
    logging.info('Over!!!')
    model_dict.update(pretrained_dict)
    model.load_state_dict(model_dict, strict = False)
When imgnet_pretrained is false, why does the key k of pretrained_dict start from index 6? Is this the reason I load 0 params? I look forward to your answer, thank you.
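The slice `k[6:]` drops the first six characters of every key, which happens to be the length of a "model." prefix. Assuming the released checkpoints were saved from a wrapper that stores the network under an attribute named `model` (an assumption, not confirmed in this thread), the slice restores the bare parameter names. If a checkpoint's keys carry no such prefix, the same slice mangles them and nothing matches, which is one possible cause of "Loaded 0 parameters". A plain-Python sketch with hypothetical key names:

```python
def count_matches_after_slice(checkpoint_keys, model_keys, n=6):
    """Count checkpoint keys that match a model key after dropping n characters."""
    return sum(1 for k in checkpoint_keys if k[n:] in model_keys)

model_keys = {"conv1.weight", "conv1.bias"}

# Keys saved with a "model." prefix: slicing six characters recovers the names.
prefixed = ["model.conv1.weight", "model.conv1.bias"]
# Keys saved without a prefix: slicing turns "conv1.weight" into "weight".
unprefixed = ["conv1.weight", "conv1.bias"]

matched_prefixed = count_matches_after_slice(prefixed, model_keys)
matched_unprefixed = count_matches_after_slice(unprefixed, model_keys)
print(matched_prefixed, matched_unprefixed)
```

Printing a few keys of the loaded checkpoint next to a few keys of `model.state_dict()` would show which case applies.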
I encountered this problem while running train.py. The training ended shortly after the normal start, which has been bothering me for a long time. Has anyone encountered this problem?
I only modified the following parameters in the config file:
GPUS (my GPU is a 3060 with 6G, a single GPU, and the computer has 16G of memory);
TRAIN.BATCH_SIZE_PER_GPU = 36 ----> 3;
workers = 6 ---> 4
I can't find where the semantic loss is generated.
Hello, thank you for your research and papers.
I have two questions for you.
Thank you.
Hi,
I've been having this issue on multiple machines: when I start training a model (custom dataset), the training just hangs partway through, without any error. It simply stops working, the GPU temperature goes down, and no progress in epochs or iterations is observed.
Sometimes it happens after 20 epochs, and once it managed to get to 150 before stopping.
Has anyone seen something similar? I suspect it might be related to my CUDA/PyTorch version. What versions would you recommend?
Thanks!
Line 282-283, models/others/ddrnet23_adb_bag.py
#self.bag = model_utils.BagFM(planes * 4, planes * 2, planes * 4)
self.dfm = model_utils.DFM3(planes * 4, planes * 4)
Neither of these classes is available.
Thank you @XuJiacong for your wonderful work on PIDNet. I want to train on a Cityscapes-like dataset with only one class; it has gtFine label-id masks and leftImg8bit images. My question is how to modify cityscapes.py to match my dataset with only one class.
self.label_mapping = {-1: ignore_label, 0: ignore_label,
1: ignore_label, 2: ignore_label,
3: ignore_label, 4: ignore_label,
5: ignore_label, 6: ignore_label,
7: 0, 8: 1, 9: ignore_label,
10: ignore_label, 11: 2, 12: 3,
13: 4, 14: ignore_label, 15: ignore_label,
16: ignore_label, 17: 5, 18: ignore_label,
19: 6, 20: 7, 21: 8, 22: 9, 23: 10, 24: 11,
25: 12, 26: 13, 27: 14, 28: 15,
29: ignore_label, 30: ignore_label,
31: 16, 32: 17, 33: 18}
self.class_weights = torch.FloatTensor([0.8373, 0.918, 0.866, 1.0345,
1.0166, 0.9969, 0.9754, 1.0489,
0.8786, 1.0023, 0.9539, 0.9843,
1.1116, 0.9037, 1.0865, 1.0955,
1.0865, 1.1529, 1.0507]).cuda()
I changed the number of classes in the configuration to 1, and I also changed this part of the code to the following:
self.label_mapping = {-1: ignore_label, 0: ignore_label}
self.class_weights = torch.FloatTensor([0.8373]).cuda()
It did not work and raised this error:
cu:111: block: [211,0,0], thread: [96,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
I saw issue #2, but my case is a bit different.
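In the attempt above, the mapping only covers ids -1 and 0. If the dataset class leaves unmapped ids unchanged (an assumption about its label conversion), raw ids such as 7 or 26 would reach the loss and exceed NUM_CLASSES, which would be consistent with the assertion. One common way to set up a single foreground class is to train with two classes, background and target, and send every other id to background. A minimal NumPy sketch; `convert_label` here is a hypothetical standalone function, and the raw id 26 is only an illustrative choice:

```python
import numpy as np

def convert_label(label, mapping, default=0):
    """Map raw ids to train ids; ids missing from `mapping` become `default`."""
    out = np.full(label.shape, default, dtype=np.int64)
    for raw_id, train_id in mapping.items():
        out[label == raw_id] = train_id
    return out

# Raw label ids as found in a gtFine-style mask; 26 is the target class here.
raw = np.array([[0, 26], [33, 26]])
train = convert_label(raw, mapping={26: 1}, default=0)
print(train)
```

With this layout NUM_CLASSES would be 2 and the class-weight tensor would need two entries. Passing the ignore id as `default` instead would exclude all non-target pixels from the loss.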
I want to learn how to train on a video with only two classes, and I have a ground truth for the video.
Do you have a simple way to do it?
Thanks
I found that all BatchNorm2d operations are commented out for the speed test in the file pidnet_speed.py. Does commenting out BatchNorm2d have no effect on the result?
For example, #BatchNorm2d(planes, momentum=bn_mom).
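One common reason for omitting batch normalization in a speed benchmark is that, at inference time, a BN layer is a fixed per-channel affine transform that can be folded into the preceding convolution, so the deployed network has no separate BN layers. Whether that is the intent in pidnet_speed.py is an assumption. The sketch below checks the folding identity numerically for a linear layer (equivalent to a 1x1 convolution at a single pixel); all names and values are illustrative.

```python
import numpy as np

def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (Wx + b - mean) / sqrt(var + eps) + beta into one affine map."""
    scale = gamma / np.sqrt(var + eps)
    return weight * scale[:, None], (bias - mean) * scale + beta

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)
x = rng.normal(size=3)

# Reference: linear layer followed by inference-mode batch normalization.
reference = gamma * (W @ x + b - mean) / np.sqrt(var + 1e-5) + beta
# Folded: a single linear layer with adjusted weight and bias.
W_f, b_f = fold_bn(W, b, gamma, beta, mean, var)
fused = W_f @ x + b_f
outputs_match = bool(np.allclose(reference, fused))
print(outputs_match)
```

The folded layer costs the same as the original linear layer alone, which is why timing a network without BN can reflect the deployed speed.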
Hello! Your work is excellent. I am reproducing your code now and can train successfully, but I don't know how to display the segmentation results of your network. I tried to run the custom.py file, but it failed. How can I visualize the segmentation results? Could you please provide detailed visualization source code?
Hi @XuJiacong , thanks for this resource.
For the final project of an Intro to ML course I'm doing I want to develop a segmentation model for flooded roads. I have a labeled dataset of around 60 images of a road in North Carolina. I have around 8 classes in the images, see an example attached.
As a starting point, I would like to use your PIDNet-M model pretrained on CamVid. Do you have code for doing this, or something similar? If not, can you please tell me whether defining the model and then loading the weights (https://tinyurl.com/b8w8wbrb) as a state_dict is the correct way to proceed? Can you also recommend which parts of the net to modify for my task? Unfortunately, I don't have a large computer to experiment with.
Thank you very much!
Tomas
When training on multiple GPUs, GPU memory usage blows up starting from the second epoch.
Hello
How are you?
Thanks for contributing to this project.
I want to customize PIDNet network in order to use it on a mobile device.
I think that it is difficult to run even PIDNet-S model on a mobile device.
Is there any solution to reduce PIDNet network size?
Thanks
When training on the training set, the accuracy on the validation set can reach 78.5, whereas when training from scratch on the training and validation sets together, the accuracy only reaches 77.2 on the test server. What is the reason?
Please provide requirements YAML or text file.
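Hello, I noticed that in the figures of your paper the regions with the ignore label are shown in black, which I think makes the results more intuitive.
How did you do this? Could you release the code?
My email is [email protected]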
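One way to obtain such figures is to colorize the label map with a palette and leave every ignored pixel black. This is a sketch of that idea and not the authors' visualization code. `colorize` is a hypothetical helper, the ignore id of 255 is an assumption, and the two palette colors are only illustrative.

```python
import numpy as np

def colorize(label, palette, ignore_label=255):
    """Turn an H x W label map into an H x W x 3 image; ignored pixels stay black."""
    color = np.zeros(label.shape + (3,), dtype=np.uint8)
    valid = label != ignore_label
    color[valid] = palette[label[valid]]
    return color

palette = np.array([[128, 64, 128], [244, 35, 232]], dtype=np.uint8)
label = np.array([[0, 255], [1, 0]])
image = colorize(label, palette)
print(image[0, 0], image[0, 1], image[1, 0])
```

To blacken a prediction in the same regions, the prediction can first be overwritten with the ignore id wherever the ground truth is ignored, for example with `np.where(gt == 255, 255, pred)`, and then colorized.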
Thank you so much @XuJiacong for your work in creating PIDNET. I want to test the pretrained model on images and videos but there is no straightforward code on doing that. Looking forward to your response.
Hello! Thank you for open-sourcing this work 😊 I was wondering if you'd be interested in mirroring your pretrained checkpoints over on the Hugging Face model hub? It would help our users find your work, which I'm sure they'd love to tinker with! If you'd like, there could also be a hosted interactive demo on Hugging Face Spaces. We've found that having demos to showcase models has been really impactful, and some work has even gone quite viral (e.g. Stable Diffusion).
We've got guides on how to upload models and on how to create spaces, but I'm more than happy to help you out with doing it! Let me know if you're interested 🤗
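Hello, thanks for sharing this good work. Could you please explain how you visualized the features in Figure 8 and Figure 9?
Hello! During multi-GPU training we found that, when entering the validate stage, the memory usage of GPU 0 increases significantly (very unbalanced compared with the other GPUs). Have you run into this as well, and is there a good way to solve it?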
Thanks for your work and for providing the code. Is it possible to train on a single GPU using the Cityscapes dataset?
Is there a Python API for users?
There seems to be no code for experimenting with the Pascal Context dataset.
Can you release the code for the experiments?
And have you tried experimenting with PIDNet-S on the Pascal Context dataset?
I believe a ReLU activation function is missing in _make_layer and _make_single_layer.
In both the _make_layer and _make_single_layer methods, there should be an activation function after the batchnorm of the downsample path. Previous work used one.
Can you confirm this?
I couldn't find the boundary generation code in this repository.
Please check this issue.