
InCTRL (CVPR 2024)

Official PyTorch implementation of "Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts".

Overview

In this work, we propose to train a Generalist Anomaly Detection (GAD) model that uses few-shot normal images as sample prompts for AD on diverse datasets on the fly. To this end, we introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL. It is trained on an auxiliary dataset to discriminate anomalies from normal samples based on a holistic evaluation of the residuals between query images and few-shot normal sample prompts. Regardless of the dataset, larger residuals are expected for anomalies than for normal samples by the very definition of anomaly, enabling InCTRL to generalize across different domains without further training. Comprehensive experiments on nine AD datasets are performed to establish a GAD benchmark that encapsulates the detection of industrial defect anomalies, medical anomalies, and semantic anomalies in both one-vs-all and multi-class settings, on which InCTRL is the best performer and significantly outperforms state-of-the-art competing methods.
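The idea of scoring queries by their residuals against few-shot normal prompts can be illustrated with a minimal sketch. This is a conceptual toy only, not the actual InCTRL model (which learns the residual evaluation end-to-end on auxiliary data); the function name and the cosine-distance choice are ours:

```python
import torch
import torch.nn.functional as F

def residual_anomaly_score(query_feat, prompt_feats):
    """Toy in-context residual score: the query's distance to its closest
    few-shot normal prompt in feature space. By the definition of anomaly,
    anomalies should leave larger residuals than normal samples,
    whatever the dataset."""
    q = F.normalize(query_feat, dim=0)    # (D,) query feature
    p = F.normalize(prompt_feats, dim=1)  # (K, D) few-shot prompt features
    sims = p @ q                          # cosine similarity to each prompt
    return (1.0 - sims.max()).item()      # residual to the nearest prompt
```

A query identical to one of the prompts scores near zero; the further the query drifts from all prompts, the larger the score.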


Setup

  • python >= 3.10.11
  • torch >= 1.13.0
  • torchvision >= 0.14.0
  • scipy >= 1.10.1
  • scikit-image >= 0.21.0
  • numpy >= 1.24.3
  • tqdm >= 4.64.0

Device

Single NVIDIA GeForce RTX 3090

Run

Step 1. Download the anomaly detection datasets (ELPV, SDD, AITEX, VisA, MVTec AD, BrainMRI, HeadCT, MNIST, CIFAR-10) and convert them to the MVTec AD format (the convert script).

The dataset folder structure should look like:

DATA_PATH/
    subset_1/
        train/
            good/
        test/
            good/
            defect_class_1/
            defect_class_2/
            defect_class_3/
            ...
    ...
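After conversion, a quick sanity check of the layout can save debugging time later. A minimal sketch (the function is ours, not part of the repo's scripts):

```python
import os

def check_mvtec_layout(data_path):
    """Report subsets under data_path that deviate from the MVTec AD
    layout shown above (train/good, test/good, test/<defect_class>)."""
    problems = []
    for subset in sorted(os.listdir(data_path)):
        root = os.path.join(data_path, subset)
        if not os.path.isdir(root):
            continue
        # Every subset needs a train/good and a test/good folder.
        for required in (("train", "good"), ("test", "good")):
            if not os.path.isdir(os.path.join(root, *required)):
                problems.append("%s: missing %s" % (subset, "/".join(required)))
        # test/ should also hold at least one defect class besides good/.
        test_dir = os.path.join(root, "test")
        defects = ([d for d in os.listdir(test_dir) if d != "good"]
                   if os.path.isdir(test_dir) else [])
        if not defects:
            problems.append("%s: no defect classes under test/" % subset)
    return problems
```

An empty return value means every subset matches the expected structure.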

Step 2. Generate the training/test JSON files for the anomaly detection datasets (the generate script).

The json folder structure should look like:

JSON_PATH/
    dataset_1/
        subset_1/
            subset_1_train_normal.json
            subset_1_train_outlier.json
            subset_1_val_normal.json
            subset_1_val_outlier.json
        subset_2/
        subset_3/
        ...
    ...
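The generate script is the authoritative source for the exact JSON schema. Purely as an illustration of what such a script does, here is a sketch that assumes each file holds a list of records with "image_path" and "label" keys (0 for normal, 1 for outlier); that record format is an assumption on our part, since the real schema is not shown here:

```python
import json
import os

def write_split_jsons(subset_dir, out_dir):
    """Write the four per-subset json files shown above for one
    MVTec-format subset. Assumes list-of-record json files with
    "image_path" and "label" keys; the repo's generate script may
    use a different schema."""
    subset = os.path.basename(os.path.normpath(subset_dir))
    os.makedirs(out_dir, exist_ok=True)

    def collect(split, normal):
        records = []
        split_dir = os.path.join(subset_dir, split)
        if not os.path.isdir(split_dir):
            return records
        for cls in sorted(os.listdir(split_dir)):
            if (cls == "good") != normal:
                continue  # good/ for normal records, defect dirs for outliers
            cls_dir = os.path.join(split_dir, cls)
            for name in sorted(os.listdir(cls_dir)):
                records.append({"image_path": os.path.join(cls_dir, name),
                                "label": 0 if normal else 1})
        return records

    splits = {"train_normal": collect("train", True),
              "train_outlier": collect("train", False),
              "val_normal": collect("test", True),
              "val_outlier": collect("test", False)}
    for tag, records in splits.items():
        with open(os.path.join(out_dir, "%s_%s.json" % (subset, tag)), "w") as f:
            json.dump(records, f, indent=2)
```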

Step 3. Download the Few-shot Normal Samples for Inference on Google Drive

Step 4. Download the Pre-trained Models on Google Drive

Step 5. Quick Start

Change TEST.CHECKPOINT_FILE_PATH in the config to the path of the pre-trained model and run

python test.py --val_normal_json_path $normal-json-files-for-testing --val_outlier_json_path $abnormal-json-files-for-testing --category $dataset-class-name --few_shot_dir $path-to-few-shot-samples

For example, to run on the candle category of VisA with k=2:

python test.py --val_normal_json_path /AD_json/visa/candle_val_normal.json --val_outlier_json_path /AD_json/visa/candle_val_outlier.json --category candle --few_shot_dir /fs_samples/visa/2/

Training

python main.py --normal_json_path $normal-json-files-for-training --outlier_json_path $abnormal-json-files-for-training --val_normal_json_path $normal-json-files-for-testing --val_outlier_json_path $abnormal-json-files-for-testing

Implementation of WinCLIP

WinCLIP is one of the main competing methods to ours, but its official implementation is not publicly available. We successfully reproduced the results of WinCLIP based on extensive communication with its authors and used our implementation for the experiments in the paper. Our implementation has been released at WinCLIP.

Citation

@inproceedings{zhu2024toward,
  title={Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts},
  author={Zhu, Jiawen and Pang, Guansong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}


Issues

The performance of using 4-shot or 8-shot on the Visa dataset is similar to that of 2-shot

Hello, I validated the 8-shot performance using the provided pre-trained model and few-shot samples, and the results were similar to 2-shot, not as high as reported in the paper. I did the following: (1) set TEST.CHECKPOINT_FILE_PATH to checkpoints/8/checkpoint.pyth; (2) changed /fs_samples/visa/2/ in the provided test command to /fs_samples/visa/8/. The results are as follows:
[screenshot of results]
Did I miss any operational steps?

License

Hi InCTRL team, @Diana1026 and @GuansongPang,

I read the paper with great interest and would like to investigate your model and possibly put it to use. However, there is no license for the code in this repository.

Do you intend to specify a license? And if so, when would you do it?

Thank you in advance for your work and best regards!

reproduce the code

Hello, could you please publish the training and testing process in detail, as well as the organization of the files and the associated JSON files? As it stands, the code is quite difficult to reproduce.

Lower performance for VisA dataset validation

Hello, I downloaded the provided model and few-shot normal samples as described in the README, and tested the candle data from the VisA test set with 2 shots, with the test set split by "1cls.csv". The result I get is AUC-ROC: 0.8773, AUC-PR: 0.8693, which is clearly lower than the results described in the paper (AUROC: 0.916, AUPRC: 0.920).

I cannot find where the problem is; could you give me some suggestions for what to check?

the training process

Hello, your paper has inspired me a lot, and I would like to reproduce the code. When executing python main.py --normal_json_path $normal-json-files-for-training --outlier_json_path $abnormal-json-files-for-training --val_normal_json_path $normal-json-files-for-testing --val_outlier_json_path $abnormal-json-files-for-testing during training, does it require running one JSON file for each category in each dataset?

Few-shot Normal Images for Inference.

@Diana1026
Many thanks for your awesome work. I am currently following the anomaly detection protocol you defined.

However, per the public link on your GitHub page, it seems you only released .pt files for the few-shot normal samples, which carry no information (e.g., image names) showing which specific samples from each dataset were used to create them. In addition, the few-shot normal sample folder does not appear to contain .pt files for all nine datasets reported in your work. I would greatly appreciate it if you could share the names of the few-shot samples used in the inference.

I was trying to contact you via email, but my Outlook email app says [email protected] is not a valid email address.

How is the x.pt file for python test.py --few_shot_dir generated?

Thank you for the great work. I have a question: I am using my own few-shot normal samples to verify defect detection, and the --few_shot_dir parameter of test.py requires a .pt file. I don't know how to convert normal samples to .pt:

few_shot_path = os.path.join(cfg.few_shot_dir, cfg.category+".pt")
normal_list = torch.load(few_shot_path)

Please help me out. Thanks.
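One plausible way to build such a file, assuming (as the loading code quoted above suggests) that each <category>.pt simply stores a list of preprocessed image tensors. The exact contents of the official .pt files are undocumented, so treat this as a sketch:

```python
import torch

def save_few_shot_prompts(image_tensors, out_path):
    """Save few-shot normal samples so that torch.load(out_path) returns
    a list, matching normal_list = torch.load(few_shot_path) above.
    image_tensors: iterable of preprocessed (C, H, W) image tensors.
    The list-of-tensors layout is an assumption, not the confirmed
    format of the official release."""
    torch.save(list(image_tensors), out_path)

# Hypothetical usage for a VisA category named "candle":
# save_few_shot_prompts(tensors, "/fs_samples/visa/2/candle.pt")
```

The tensors should be preprocessed with the same transforms the test pipeline applies to query images.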

Guidance on Training and Testing with Custom Dataset Similar to MVTec Format

Hello,

I am currently working on a project where I need to train and test a model using my custom dataset, which is structured similarly to the MVTec dataset format. I've been trying to adapt the workflow and methodologies used for the MVTec dataset to fit my dataset's requirements but have encountered some challenges, particularly in generating the custom_dataset.pt file.

Could anyone provide some insights or a step-by-step guide on how to:

Adapt the existing training and testing pipeline for a custom dataset that aligns with the MVTec format? Are there specific parameters or configurations that need to be adjusted in the code to accommodate the differences in the dataset?

Generate the few_shot.pt file for my dataset. What is the process or script used to create this file from the dataset? Are there specific requirements for the dataset structure or format to successfully generate this file?

For context, my dataset contains images and annotations that mirror the structure used in the MVTec dataset, including similar categories and anomaly types. My goal is to leverage the existing frameworks and tools used for MVTec to achieve comparable performance on my dataset.

I appreciate any advice, scripts, or documentation that could help me navigate these challenges. Thank you in advance for your time and assistance.

Best regards,

The Step 2 Google Drive link

Hello, for Step 2, "Download the Few-shot Normal Samples for Inference on [Google Drive]", where can I get the link? Thanks.
