
segment-any-anomaly's Introduction

Segment Any Anomaly

Open In Colab HuggingFace Space

This repository contains the official implementation of Segment Any Anomaly without Training via Hybrid Prompt Regularization, SAA+.

SAA+ aims to segment any anomaly without the need for training. We achieve this by adapting existing foundation models, namely Grounding DINO and Segment Anything, with hybrid prompt regularization.

🔥What's New

💎Framework

We found that a simple assembly of foundation models suffers from severe language ambiguity. Therefore, we introduce hybrid prompts derived from domain expert knowledge and target image context to alleviate the language ambiguity. The framework is illustrated below:

Framework
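As a minimal illustration of how property-style prompts regularize raw foundation-model outputs, consider the following Python sketch. The Region class, function names, and thresholds here are illustrative assumptions, not the repository's API: it only shows the idea of filtering candidate regions by expert priors (maximum anomaly count, maximum area) after Grounding DINO and SAM have produced candidates.

```python
from dataclasses import dataclass

@dataclass
class Region:
    score: float       # confidence for the anomaly phrase (from the detector)
    area_ratio: float  # mask area relative to the object area (from the segmenter)

def apply_property_prompts(regions, max_anomalies=2, max_area_ratio=1.0):
    # Property prompts encode expert priors: drop regions that are too
    # large to plausibly be a defect, then keep only the top-scoring candidates.
    kept = [r for r in regions if r.area_ratio <= max_area_ratio]
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:max_anomalies]

candidates = [Region(0.9, 0.05), Region(0.8, 1.5), Region(0.4, 0.02), Region(0.7, 0.1)]
kept = apply_property_prompts(candidates)  # the 1.5-area candidate is rejected
```

The same filtering pattern extends to other priors, e.g. confidence thresholds derived from the target image context.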

Quick Start

🏦Dataset Preparation

We evaluate SAA+ on four public datasets: MVTec-AD, VisA, KSDD2, and MTD. In addition, SAA+ was among the winning entries in the VAND workshop challenge, which provides a dedicated dataset, VisA-Challenge. To prepare the datasets, please follow the instructions below:

By default, we save the data in the ../datasets directory.

cd $ProjectRoot # e.g., /home/SAA
cd ..
mkdir datasets
cd datasets

Then, follow the corresponding instructions to prepare each individual dataset.

🔨Environment Setup

You can use our script for one-click setup of the environment and downloading the checkpoints.

cd $ProjectRoot
bash install.sh

📄Reproduce the public results

MVTec-AD

python run_MVTec.py

VisA-Public

python run_VisA_public.py

VisA-Challenge

python run_VAND_workshop.py

The submission files can be found in ./result_VAND_workshop/visa_challenge-k-0/0shot.

KSDD2

python run_KSDD2.py

MTD

python run_MTD.py

📄Demo Results

Run the following command to produce the demo results:

python demo.py

Demo

🎯Performance

Results

Qualitative Results

🔨 Todo List

We have planned the following features to be added in the near future:

  • Update repository for SAA+
  • Detail the zero-shot anomaly detection framework.
  • Evaluate on other image anomaly detection datasets.
  • Add UI for easy evaluation.
  • Update Colab demo.
  • HuggingFace demo.

💘 Acknowledgements

Our work is largely inspired by the following projects. Thanks for their admirable contributions.

Stargazers over time


Citation

If you find this project helpful for your research, please consider citing the following BibTeX entry.

@article{cao_segment_2023,
  title = {Segment Any Anomaly without Training via Hybrid Prompt Regularization},
  author = {Cao, Yunkang and Xu, Xiaohao and Sun, Chen and Cheng, Yuqi and Du, Zongwei and Gao, Liang and Shen, Weiming},
  journal = {arXiv:2305.10724},
  year = {2023},
  url = {http://arxiv.org/abs/2305.10724}
}

@article{kirillov2023segany,
  title={Segment Anything}, 
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}

@inproceedings{ShilongLiu2023GroundingDM,
  title={Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection},
  author={Shilong Liu and Zhaoyang Zeng and Tianhe Ren and Feng Li and Hao Zhang and Jie Yang and Chunyuan Li and Jianwei Yang and Hang Su and Jun Zhu and Lei Zhang},
  year={2023}
}

segment-any-anomaly's People

Contributors

caoyunkang

segment-any-anomaly's Issues

Few shot training with SAA+

Hello,

I've been reading the paper on the model, and there isn't much mention of few-shot learning in it. However, the code has an argument k_shot; does this refer to few-shot learning? If so, could you shed some light on how it is implemented in this context? If not, how might it be done in SAA+?

Use for an image similarity task?

How can I use this to detect anomalies between two images?

That is, input two images and then detect the anomalies between them as an image-similarity comparison.

Code sharing request

Hello, could you please share the code for the paper "Dual-path Frequency Discriminators for Few-shot Anomaly Detection"?

How to support unseen defect types?

From the prompts, I found that SAA/SAA+ is quite limited to known defect types.

Here is an example: SAA/SAA+ has no concept of "stripped screws".

image_path = 'assets/screws.png'
textual_prompts = ['dirty. stripped. spot. ', 'screw'] # detect prompts, filtered phrase
property_text_prompts = 'the image of screw have 4 similar screw, with a maximum of 2 anomaly. The anomaly would not exceed 1. object area. '
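Extending coverage to an unseen defect then amounts to appending its phrase to the defect list before the prompt is assembled. A minimal sketch, assuming only the period-separated two-part prompt format shown in the example above (the helper name itself is hypothetical, not the repository's API):

```python
def build_textual_prompts(defects, obj):
    # Hypothetical helper: assemble the two-part textual prompt
    # (period-separated defect phrases, then the object name).
    return ['. '.join(defects) + '. ', obj]

# Append a new phrase for an unseen defect type such as a stripped thread.
prompts = build_textual_prompts(['dirty', 'stripped', 'spot', 'stripped thread'], 'screw')
```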


huggingface_hub.utils._errors.LocalEntryNotFoundError

When I run python run_VisA_public.py, it raises huggingface_hub.utils._errors.LocalEntryNotFoundError: "Connection error, and we cannot find the requested files in the disk cache. Please try again or make sure your Internet connection is on."

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM, except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:


Best Wishes,

Qiao

About Train or Finetune

Does SAA+ need training or fine-tuning to segment any anomaly? What should I do if so? If not, why not?

Prompt Engineering Suggestion

I am trying to use Grounding DINO to solve a fabric anomaly-detection problem. Besides 'defect', do you have any other prompt recommendations?

Error when running demo.py

Thank you so much for your wonderful paper! I have really enjoyed it!
But when I run python demo.py and print the score and the similarity map, the following output appears and no picture is shown:

/share/home/project/Segment-Any-Anomaly/GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
/share/home/anaconda3/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
build_sam_vit_h
wide_resnet50_2
/share/home/anaconda3/lib/python3.10/site-packages/transformers/modeling_utils.py:763: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/share/home/anaconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
score is: [[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]

similarity map is: [[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]

It seems that score, appendix = model(image) has some problem.
Thank you very much for your time and expertise. I greatly appreciate any insights or suggestions

About classification

Thank you so much for your wonderful paper! I have really enjoyed it!
I have a few questions regarding SAA:

  1. What is the model's classification result for abnormal/normal over the entire picture?
  2. What is the cross-domain effect of classification?
  3. I noticed that adding a classification loss can lead to overfitting and non-convergence during fine-tuning. Could you please share any approaches you have used to address this problem?

Thank you very much for your time and expertise. I greatly appreciate any insights or suggestions.
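On the first question, a common convention in anomaly detection (not something the paper specifies) is to reduce the pixel-level anomaly map to a single image-level score via a max or top-k mean, then threshold it. A minimal sketch, where the function names, top_k, and threshold are illustrative assumptions:

```python
import numpy as np

def image_level_score(anomaly_map, top_k=100):
    # Reduce a pixel-level anomaly map to one scalar: the mean of the
    # top-k highest pixel scores (k and the reduction are assumptions).
    flat = np.sort(anomaly_map.ravel())[::-1]
    return float(flat[:top_k].mean())

def classify(anomaly_map, threshold=0.5):
    # Flag the image as anomalous when the score crosses the threshold.
    return image_level_score(anomaly_map) >= threshold

normal = np.zeros((64, 64))
defective = np.zeros((64, 64))
defective[:10, :10] = 1.0  # a 100-pixel synthetic "defect"
```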

custom dataset

How can I run the repo on a custom dataset to get segmentations image by image?
How can I fine-tune using a custom dataset?

Reproduce results on MVTec-AD

Thank you for your great work! I tried to reproduce the results on MVTec-AD following the instructions in the README, but the pixel F1 score on MVTec-AD is 37.65 rather than the 39.40 reported in the paper. Do I need to tune some hyper-parameters to obtain the same result? Thank you!

Domain Knowledge

Hey! You've mentioned "domain expert knowledge" quite a few times in the paper, but never said where exactly it came from or how you filtered down to these prompts. Would you mind shedding some light on this part?

Failed to load custom C++ ops. Running on CPU mode Only!

Hello Dr. Cao, after setting up the environment on Linux following your instructions, I got "Failed to load custom C++ ops. Running on CPU mode Only!", as well as requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')). I couldn't find any related solution online. Have you encountered this problem as well? If not, could you describe your hardware environment?
