
lesps's Introduction

Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision

Pytorch implementation of our Label Evolution with Single Point Supervision (LESPS). [Paper]   [Web]

News: We recommend our newly-released repository BasicIRSTD, an open-source and easy-to-use toolbox for infrared small target detection. [link]

Overview

The Mapping Degeneration Phenomenon


Fig. 1. Illustrations of mapping degeneration under point supervision. In the early training stage, CNNs tend to segment a cluster of pixels near the targets with low confidence, and then gradually learn to predict the ground-truth point labels with high confidence.


Fig. 2. Quantitative and qualitative illustrations of mapping degeneration in CNNs.

The Label Evolution Framework


Fig. 3. Illustrations of Label Evolution with Single Point Supervision (LESPS). During training, intermediate predictions of CNNs are used to progressively expand point labels to mask labels. Black arrows represent each round of label updates.
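For readers who prefer code to figures, the following is a minimal conceptual sketch of one label-update round as described above. It is an assumption-laden illustration, not the repository's implementation: the function name `update_labels`, the fixed threshold, and the global merge are all simplifications (in the paper the evolution threshold is adaptive and updates are restricted to regions around the labeled points).

```python
import torch

def update_labels(pred, cur_label, threshold=0.5):
    """Conceptual LESPS-style label update (illustrative sketch only).

    pred      : network prediction in [0, 1], shape (B, 1, H, W)
    cur_label : current (point or partially evolved) label, same shape
    threshold : confidence above which predicted pixels are merged into the label
                (the actual evolution threshold in the paper is adaptive).
    """
    with torch.no_grad():
        confident = (pred > threshold).float()
        # Expand the current label with confidently predicted pixels.
        new_label = torch.clamp(cur_label + confident, max=1.0)
    return new_label
```

In the actual framework, such updates are performed only every few epochs during training; this sketch omits those scheduling details.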

Requirements

  • Python 3
  • PyTorch (1.2.0) and torchvision (0.4.0) or higher
  • numpy, PIL

Datasets

SIRST3 is used for training. It is a combination of the NUAA-SIRST, NUDT-SIRST, and IRSTD-1K datasets. Please first download the datasets via Baidu Drive (key: 1113) and place them in the folder ./datasets/.

To generate the centroid annotations, run the MATLAB script centroid_anno.m.

To generate the coarse annotations, run the MATLAB script coarse_anno.m.
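If MATLAB is not available, a rough Python equivalent of the centroid-annotation step might look like the sketch below. This is an assumption about the intended behavior, not a port of centroid_anno.m: it labels connected components in each full mask and marks a single centroid pixel per target.

```python
import os
import numpy as np
from PIL import Image
from scipy import ndimage

mask_dir = './datasets/SIRST3/masks'           # full masks
out_dir = './datasets/SIRST3/masks_centroid'   # single-point centroid labels
os.makedirs(out_dir, exist_ok=True)

for name in os.listdir(mask_dir):
    mask = np.array(Image.open(os.path.join(mask_dir, name)).convert('L')) > 0
    labeled, num = ndimage.label(mask)  # connected components = individual targets
    centroid_mask = np.zeros(mask.shape, dtype=np.uint8)
    for i in range(1, num + 1):
        cy, cx = ndimage.center_of_mass(mask, labeled, i)
        centroid_mask[int(round(cy)), int(round(cx))] = 255  # one point per target
    Image.fromarray(centroid_mask).save(os.path.join(out_dir, name))
```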

  • Our project has the following structure:
    ├──./datasets/SIRST3/
    │    ├── images
    │    │    ├── XDU0.png
    │    │    ├── Misc_1.png
    │    │    ├── ...
    │    │    ├── 001327.png
    │    ├── masks
    │    │    ├── XDU0.png
    │    │    ├── Misc_1.png
    │    │    ├── ...
    │    │    ├── 001327.png
    │    ├── masks_centroid
    │    │    ├── XDU0.png
    │    │    ├── Misc_1.png
    │    │    ├── ...
    │    │    ├── 001327.png
    │    ├── masks_coarse
    │    │    ├── XDU0.png
    │    │    ├── Misc_1.png
    │    │    ├── ...
    │    │    ├── 001327.png
    │    ├── img_idx
    │    │    ├── train_SIRST3.txt
    │    │    ├── test_SIRST3.txt  
    │    │    ├── test_NUAA-SIRST.txt
    │    │    ├── test_NUDT-SIRST.txt
    │    │    ├── test_IRSTD-1K.txt
    

Train

python train.py --model_names DNANet ALCNet ACM --dataset_names SIRST3 --label_type 'centroid'

python train.py --model_names DNANet ALCNet ACM --dataset_names SIRST3 --label_type 'coarse'

Test

python test.py --model_names DNANet ALCNet ACM --pth_dirs None --dataset_names NUAA-SIRST NUDT-SIRST IRSTD-1K

python test.py --model_names DNANet ALCNet ACM --pth_dirs SIRST3/DNANet_full.pth.tar SIRST3/DNANet_LESPS_centroid.pth.tar SIRST3/DNANet_LESPS_coarse.pth.tar SIRST3/ALCNet_full.pth.tar SIRST3/ALCNet_LESPS_centroid.pth.tar SIRST3/ALCNet_LESPS_coarse.pth.tar SIRST3/ACM_full.pth.tar SIRST3/ACM_LESPS_centroid.pth.tar SIRST3/ACM_LESPS_coarse.pth.tar --dataset_names NUAA-SIRST NUDT-SIRST IRSTD-1K

Model Analyses

Analyses of Mapping Degeneration


Fig. 4. IoU and visualization results of mapping degeneration with respect to different target characteristics (i.e., (a) intensity, (b) size, (c) shape, and (d) local background clutter) and point labels (i.e., (e) number and (f) location). We visualize the zoomed-in target regions of input images with GT point labels (i.e., red dots in images) and the corresponding CNN predictions (at the epoch reaching maximum IoU).

Analyses of the Label Evolution Framework

Effectiveness

Table 1. Average results achieved by DNA-Net with (w/) and without (w/o) LESPS under centroid and coarse point supervision, together with the results under full supervision.


Fig. 5. Quantitative and qualitative results of evolved target masks.


Fig. 6. Visualizations of regressed labels during training and network predictions during inference with centroid and coarse point supervision.

Parameters


Fig. 7. PA (P) and IoU (I) results of LESPS with respect to (a) initial evolution epoch, (b) Tb and (c) k of evolution threshold, and (d) evolution frequency.

Comparison Results

Comparison to SIRST Detection Methods

Table 2. Results of different methods. “CNN Full”, “CNN Centroid”, and “CNN Coarse” represent CNN-based methods under full supervision, centroid point supervision, and coarse point supervision, respectively. “+” denotes CNN-based methods equipped with LESPS.


Fig. 8. Visual detection results of different methods. Correctly detected targets and false alarms are highlighted by red and orange circles, respectively.

Comparison to Fixed Pseudo Labels

Table 3. Results of DNA-Net trained with pseudo labels generated by input intensity thresholding, LCM-based methods, and LESPS under centroid and coarse point supervision. Best results are shown in boldface.


Citation

@inproceedings{LESPS,
  author    = {Ying, Xinyi and Liu, Li and Wang, Yingqian and Li, Ruojing and Chen, Nuo and Lin, Zaiping and Sheng, Weidong and Zhou, Shilin},
  title     = {Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023},
}

Contact

Please feel free to raise an issue or email [email protected] with any questions.


lesps's Issues

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 197 but got size 196 for tensor number 1 in the list.

Hello, during training the following error is raised somewhere between epoch 30 and 50 (the exact epoch at which it occurs changes from run to run):
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 197 but got size 196 for tensor number 1 in the list.

The error occurs at the third line of the forward function in model_DNANet.py: x0_1 = self.conv0_1(torch.cat([x0_0, self.up(x1_0)], 1))

By printing the shapes, I found that when the error occurs, x0_0 has size [1, 16, 197, 320] while self.up(x1_0) has size [1, 32, 196, 320]. How should I fix this? Looking forward to your reply.

Testing on custom images

How can I test my own images? Could you upload a demo?

Questions about reproduced experimental results

Hello, I am trying to reproduce this paper. I ran DNANet with centroid supervision on the IRSTD-1K dataset and obtained a confusing result:
(screenshot attached)

Pd and Fa are close to the values reported in the paper, but the IoU differs greatly. I also found that the training loss keeps rising after epoch 200 and finally exceeds 200. I hope you can clarify this. Thanks!

Training on a new dataset is slow, and the training time differs greatly from that recorded in the logs

When a new dataset is loaded, its mean and standard deviation are first computed for normalization, which is quite time-consuming, as shown in the screenshot below.
(screenshot attached)
I added the part in the red box: once a new dataset has been loaded and its statistics computed, the mean and standard deviation are printed. If you then add the computed values at the place marked by the blue box, the next training run reads them directly instead of recomputing them.
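A minimal sketch of the caching idea described above. The cache file name stats_cache.json and the helper get_mean_std are hypothetical and not part of the repository; they only illustrate computing per-dataset statistics once and reusing them in later runs.

```python
import json
import os
import numpy as np
from PIL import Image

def get_mean_std(dataset_name, img_dir, cache_file='./stats_cache.json'):
    """Compute a dataset's mean/std once and cache them for later runs."""
    cache = {}
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            cache = json.load(f)
    if dataset_name in cache:
        return cache[dataset_name]['mean'], cache[dataset_name]['std']

    # Slow path: scan all images once.
    pixels = []
    for name in os.listdir(img_dir):
        img = np.array(Image.open(os.path.join(img_dir, name)).convert('L'),
                       dtype=np.float32)
        pixels.append(img.ravel())
    pixels = np.concatenate(pixels)
    mean, std = float(pixels.mean()), float(pixels.std())

    cache[dataset_name] = {'mean': mean, 'std': std}
    with open(cache_file, 'w') as f:
        json.dump(cache, f)
    return mean, std
```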

Please provide the complete test set index files

The img_idx folder in the Baidu Drive dataset does not contain test_NUAA-SIRST.txt, test_NUDT-SIRST.txt, or test_IRSTD-1K.txt. Could you please provide them?

train.py and test.py do not work as intended

I cloned the repo, prepared the dataset as instructed, and test.py and train.py do not work as intended. There are many syntax and logic errors in the provided code, and the argument structure is almost completely wrong. If README.md does not help us reproduce your results and get started with your code, what is the point? Is there a proper way to reproduce your paper's results, or am I missing something?

Dimension mismatch between down-sampling and up-sampling

Thank you very much for your excellent work.
However, when reproducing the code I encountered: RuntimeError: Sizes of tensors must match except in dimension 2. Got 210 and 211 (The offending index is 0)
After debugging, I found that no problem occurs when the spatial size is an even number such as 2^n.
Otherwise, down-sampling followed by up-sampling causes a dimension mismatch at concatenation, e.g., 211 -> 105 -> 210.
Have you encountered this problem?
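A common workaround for this kind of down/up-sampling mismatch (an assumption on my part, not an official fix from the repository) is to pad the input so that both spatial dimensions are multiples of the network's total down-sampling factor, and crop the prediction back afterwards:

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(x, multiple=32):
    """Right/bottom-pad a (B, C, H, W) tensor so H and W are multiples of `multiple`."""
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(x, (0, pad_w, 0, pad_h)), (h, w)

# Usage sketch: pad before the forward pass, crop the output afterwards.
# img, (h, w) = pad_to_multiple(img)
# pred = net(img)[..., :h, :w]
```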

ALCNet

Hello, when testing with ALCNet I get the following error:
File "LESPS/model/ACM/model_ALCNet.py", line 182, in forward
[transforms.Resize([hei//16, wid//16])(out),
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 244, in __call__
return F.resize(img, self.size, self.interpolation)
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 319, in resize
raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
TypeError: img should be PIL Image. Got <class 'torch.Tensor'>

How can I fix this? Thanks.
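For context: with the torchvision version pinned in the requirements (0.4), transforms.Resize only accepts PIL images; tensor inputs are supported from torchvision 0.8 onwards. One possible workaround, not an official patch, is to resize the feature map with torch.nn.functional.interpolate instead:

```python
import torch
import torch.nn.functional as F

def resize_feature(out, hei, wid):
    # Tensor-friendly replacement for transforms.Resize([hei // 16, wid // 16])(out)
    return F.interpolate(out, size=(hei // 16, wid // 16),
                         mode='bilinear', align_corners=False)
```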

Optimal number of training epochs

Hello, I have a question. Figure 5 shows how IoU changes over epochs under different settings. How should we determine the epoch at which IoU reaches its maximum?

line 198 error

Hello, when train.py reaches line 198, opt.f = open(opt.save + '/' + opt.dataset_name + '_' + opt.model_name + '_LESPS_' + opt.label_type + '_' + (time.ctime()).replace(' ', '_') + '.txt', 'w'), it raises OSError: [Errno 22] Invalid argument: "./log/[[LESPS'centroid'_Tue_Jan_16_09:43:03_2024.txt". How can I solve this?
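The Errno 22 here most likely comes from the colons that time.ctime() puts into the log file name, which Windows does not allow in paths. A hedged workaround (my suggestion, not the repository's) is to build the timestamp with time.strftime using only filename-safe characters:

```python
import time

# Colon-free timestamp (Windows forbids ':' in file names),
# e.g. 'Tue_Jan_16_09-43-03_2024', replacing time.ctime().replace(' ', '_').
timestamp = time.strftime('%a_%b_%d_%H-%M-%S_%Y')

# opt.f = open(opt.save + '/' + opt.dataset_name + '_' + opt.model_name +
#              '_LESPS_' + opt.label_type + '_' + timestamp + '.txt', 'w')
```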

Problem with replicating the results

Hi @XinyiYing,

Here are the steps we have done till now to run the code:

  1. We cloned the repository
  2. We trained it on the datasets and obtained various .pth.tar model files in /log4 directory
  3. While running the test we used DNANet_LESPS_centroid_400.pth.tar and DNANet_LESPS_coarse_400.pth.tar as arguments in test.py

However, we noticed that the results are not consistent with those of the DNANet_LESPS_centroid.pth.tar and DNANet_LESPS_coarse.pth.tar model files given in the /log directory.

Also, how do we obtain the DNANet_full.pth.tar model file present in the /log directory during the training phase?

How to replicate the results?

About the function F.conv2d

In F.conv2d(cur_point_mask, weight=torch.ones(1, 1, background_length, background_length)), why is the weight set to all ones, and what does this operation do?
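This is not an answer from the authors, but a small illustration of what an all-ones kernel computes: convolving a binary mask with a kernel of ones is a box filter, so each output pixel equals the number of mask pixels (the local sum) inside a background_length × background_length neighborhood around it.

```python
import torch
import torch.nn.functional as F

# A 1x1x7x7 binary mask with a single "point label" in the middle.
mask = torch.zeros(1, 1, 7, 7)
mask[0, 0, 3, 3] = 1.0

k = 3  # stands in for background_length
ones_kernel = torch.ones(1, 1, k, k)

# Box filter: each output pixel is the count of mask pixels in its k x k neighborhood.
local_count = F.conv2d(mask, weight=ones_kernel, padding=k // 2)

print(local_count[0, 0])  # non-zero only in the 3x3 region around the point
```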

About test.py

Is there a syntax error in lines 46 and 47 of test.py?
(screenshot attached)

About the mapping degeneration analysis in the experiments

Hello, following the paper I would like to carry out the mapping degeneration analysis from the experiment section, i.e., explore how target intensity, target size, point label location, and the number of points per label affect mapping degeneration. How should I do this? Looking forward to your reply.

full_path

How is full_path.dir generated for my own dataset?

About the loss function

Hello, may I ask what the formula of the loss function used in the paper is?

Errors when running the commands

This issue has been fixed; the code can be run with the following commands:
python train.py --model_names DNANet ALCNet ACM --dataset_names SIRST3 --label_type 'centroid'
python train.py --model_names DNANet ALCNet ACM --dataset_names SIRST3 --label_type 'coarse'
python test.py --model_names DNANet ALCNet ACM --pth_dirs None --dataset_names NUAA-SIRST NUDT-SIRST IRSTD-1K
python test.py --model_names DNANet ALCNet ACM --pth_dirs SIRST3/DNANet_full.pth.tar SIRST3/DNANet_LESPS_centroid.pth.tar SIRST3/DNANet_LESPS_coarse.pth.tar SIRST3/ALCNet_full.pth.tar SIRST3/ALCNet_LESPS_centroid.pth.tar SIRST3/ALCNet_LESPS_coarse.pth.tar SIRST3/ACM_full.pth.tar SIRST3/ACM_LESPS_centroid.pth.tar SIRST3/ACM_LESPS_coarse.pth.tar --dataset_names NUAA-SIRST NUDT-SIRST IRSTD-1K
