
ddad's Issues

results

Do I have to create the results folder myself?
After my run finished, I could not find the result images.
Class: toothbrush w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=4
Detecting Anomalies...
AUROC: (88.9,83.4)
/home/anywhere3090l/Desktop/henry/DDAD-main/metrics.py:96: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
df = pd.concat([df, pd.DataFrame({"pro": mean(pros), "fpr": fpr, "threshold": th}, index=[0])], ignore_index=True)
PRO: 48.6

How to set w_DA in Domain Adaptation?

I am training models on MVTec AD myself.
How should w_DA be set when fine-tuning the feature extractor?
Should I fix w_DA at 3 for all categories, or set w_DA = w, with w being the best setting for each category?

Finetuning hyperparameter w

Hello, thank you for your excellent work! I wanted to ask how you obtained the values of the hyperparameter w for MVTec AD and VisA. I was not able to find them in the paper; sorry if I have missed something.

Can I write my file paths in Windows style?

data:
  DA_batch_size: 30
  batch_size: 30
  category: grid
  data_dir: /mnt/d/download2/aaa/Dynamic-noise-AD-master/MVTec/
  image_size: 256
  imput_channel: 4
  manualseed: -1
  mask: true
  name: MVTec
metrics:
  image_level_AUROC: true
  image_level_F1Score: true
  pixel_level_AUROC: true
  pixel_level_F1Score: true
  pro: true
threshold:
  manual_image: null
  manual_pixel: null
  method: adaptive
model:
  DA_epochs: 1 # nr. of fine tune epochs for fe
  DA_fine_tune: 1
  DA_learning_rate: 1e-4
  DA_rnd_step: true # pick noising level for DA according to uniform distribution
  dynamic_steps: true # Dynamic implicit conditioning
  KNN_metric: l1
  anomap_excluded_layers: # excluded feature layers for anomaly map creation
    - 0
  anomap_weighting: 0.85 # weight for latent anomaly map
  attn_reso:
    - 32
    - 16
    - 8
    - 4
  beta_end: 0.0195
  beta_start: 0.0015
  channel_mults:
    - 1
    - 2
    - 2
    - 4
    - 4
  checkpoint_dir: /mnt/d/download2/aaa/Dynamic-noise-AD-master/checkpoints/MVTec/
  checkpoint_epochs: 300
  checkpoint_name: weights
  consistency_decoder: 0 # consistency decoder for better image quality at the cost of additional runtime
  device: cuda
  distance_metric_eval: combined
  downscale_first: 1 # noiseless scaling
  ema: true
  ema_rate: 0.999
  epochs: 9
  eta: 0 # 0 corresponds to DDIM sampling and 1 to DDPM
  eta2: 4 # DDAD conditioning
  exp_name: default
  fe_backbone: resnet34
  head_channel: -1
  knn_k: 20
  latent: true
  latent_backbone: VAE
  latent_size: 32
  learning_rate: 1e-4
  multi_gpu: false
  n_head: 8
  noise: Gaussian
  noise_sampling: 0 # noise image or not
  num_workers: 30
  optimizer: AdamW
  save_model: true
  schedule: linear
  seed: 42
  selected_features: # selected layer for KNN search
    - 1
  skip: 8 # steps to skip during inference
  skip_DA: 8 # steps to skip during domain adaptation
  test_trajectoy_steps: 80 # maximum noising level
  test_trajectoy_steps_DA: 80 # maximum noising level for domain adaptation
  trajectory_steps: 1000
  unet_channel: 192
  visual_all: true # additional visual output of heatmaps
  weight_decay: 0.01
The above is my config.

But it keeps failing to detect anything:
(AI) PS D:\download2\aaa\Dynamic-noise-AD-master> python main.py
2024-07-15 10:15:07.106301: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-07-15 10:15:07.624565: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
Num params: 281088004
Current device is cuda
Traceback (most recent call last):
File "D:\download2\aaa\Dynamic-noise-AD-master\main.py", line 158, in
execute_main_test()
File "D:\download2\aaa\Dynamic-noise-AD-master\main.py", line 154, in execute_main_test
train(args)
File "D:\download2\aaa\Dynamic-noise-AD-master\main.py", line 74, in train
trainer(unet, constants_dict, ema_helper, config)
File "D:\download2\aaa\Dynamic-noise-AD-master\train.py", line 31, in trainer
trainloader = torch.utils.data.DataLoader(
File "C:\Users\user.conda\envs\AI\lib\site-packages\torch\utils\data\dataloader.py", line 350, in init
sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
File "C:\Users\user.conda\envs\AI\lib\site-packages\torch\utils\data\sampler.py", line 143, in init
raise ValueError(f"num_samples should be a positive integer value, but got num_samples={self.num_samples}")
ValueError: num_samples should be a positive integer value, but got num_samples=0
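
num_samples=0 means the DataLoader received an empty dataset, i.e. the training glob matched no files. One thing to note from the config above: data_dir is a WSL-style /mnt/d/... path, but the command runs in Windows PowerShell, where that path does not exist. A quick sanity check (a sketch; the path, category, and extension are taken from the config above and may need adjusting):

    import os
    from glob import glob

    data_dir = "/mnt/d/download2/aaa/Dynamic-noise-AD-master/MVTec/"  # WSL-style path from the config
    pattern = os.path.join(data_dir, "grid", "train", "good", "*.png")
    print(len(glob(pattern)), "files matched for", pattern)
    # On native Windows this should be a drive path instead, e.g. D:/download2/aaa/...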

Question about the precision

I noticed that in UNetModel.forward(), the data type is explicitly set to torch.float32.
Is there any reason for this?
Does it mean that float32 is necessary for the UNet and that mixed precision is not suitable for this model?

Regarding the Execution Results of MVTecAD Bottle Evaluation.

Thank you for sharing such a wonderful piece of work.

I tried executing the evaluation after training MVTecAD bottle on Colab Pro+, and the following log was output:

Sample :  0  predicted as:  0  label is:  1 
Sample :  10  predicted as:  0  label is:  1 
Sample :  15  predicted as:  0  label is:  1 
Sample :  49  predicted as:  0  label is:  1 
Sample :  50  predicted as:  0  label is:  1 
Sample :  51  predicted as:  0  label is:  1 
Sample :  52  predicted as:  0  label is:  1 
Sample :  53  predicted as:  0  label is:  1 
Sample :  55  predicted as:  0  label is:  1 
Sample :  58  predicted as:  0  label is:  1 
Sample :  59  predicted as:  0  label is:  1 
Sample :  60  predicted as:  0  label is:  1 
Sample :  61  predicted as:  0  label is:  1 
Sample :  62  predicted as:  0  label is:  1 

AUROC: 1.0
AUROC pixel level: 0.9292386174201965 
PRO: 0.7771466867947617
threshold:  0.15196724

The AUROC pixel level result is lower than the value mentioned in the Readme. Upon checking the images in the results, 13 out of 83 seemed to be misclassified.

If there are any specific points or precautions to consider for reproduction, please let me know.
[Attached images: bottle sample 0 and its heatmap]

I cannot reproduce your results using the uploaded checkpoints

I am trying to reproduce the results using the uploaded checkpoints for hazelnut and screw, and I cannot get the same scores.

For hazelnut the output is as follows:
Class: hazelnut w: 5 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Detecting Anomalies...
AUROC: (99.4,98.3)
PRO: 87.1

For screw, it is:
Class: screw w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Detecting Anomalies...
AUROC: (96.5,99.3)
PRO: 96.5

It seems that Pixel_AUROC and PRO are similar to your results, but Image_AUROC is very different.
Is there something wrong?
I just ran the code with the checkpoints. My Python is 3.10 and torch is 2.0.1.

Question about the initialization of "ch" in UNetModel.__init__()

I am reading the code of UNetModel in unet.py.
Lines 278-281 in UNetModel.__init__():

    ch = int(channel_mults[0] * base_channels)
    self.down = nn.ModuleList(
        [TimestepEmbedSequential(nn.Conv2d(self.in_channels, base_channels, 3, padding=1))]
    )

That means the number of output channels of the first element in self.down is "base_channels".
On the other hand, according to lines 288-293, this number should be "ch".
Of course, if channel_mults[0] is 1, this causes no problem. But for img_size 512, channel_mults[0] is 0.5.
I have not run the code to test it, but I think something might be wrong here.
Should line 278 be modified to

    ch = int(base_channels)

or should channel_mults[0] be set to 1 for all values of img_size?
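
A tiny repro of the suspected mismatch (a sketch; base_channels is an illustrative value, and the 512-px channel_mults follow this issue, not necessarily the repo's exact config):

    import torch
    import torch.nn as nn

    base_channels = 192
    channel_mults = (0.5, 1, 1, 2, 2, 4, 4)  # channel_mults[0] = 0.5 for img_size 512, per the issue

    ch = int(channel_mults[0] * base_channels)              # 96: what the following blocks expect
    first_conv = nn.Conv2d(3, base_channels, 3, padding=1)  # but the first conv emits base_channels
    out = first_conv(torch.randn(1, 3, 512, 512))
    print(out.shape[1], "output channels vs expected ch =", ch)  # 192 vs 96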

Error in the eval stage on the VisA dataset

I only changed the config and dataset.py.
config:
data:
  name: VisA_dataset # MVTec #MTD #VisA
  data_dir: /home/anywhere3090l/Desktop/henry/ddadvisa/VisA_dataset # MVTec #VisA #MTD
  category: pcb4 # ['carpet', 'bottle', 'hazelnut', 'leather', 'cable', 'capsule', 'grid', 'pill', 'transistor', 'metal_nut', 'screw', 'toothbrush', 'zipper', 'tile', 'wood']
  # ['candle', 'capsules', 'cashew', 'chewinggum', 'fryum', 'macaroni1', 'macaroni2', 'pcb1', 'pcb2', 'pcb3', 'pcb4', 'pipe_fryum']
  image_size: 256
  batch_size: 8 # 32 for DDAD and 16 for DDADS
  DA_batch_size: 32 # 16 for MVTec and [macaroni2, pcb1] in VisA, and 32 for other categories in VisA
  test_batch_size: 32 # 16 for MVTec, 32 for VisA
  mask: True
  imput_channel: 3

model:
  DDADS: False
  checkpoint_dir: /home/anywhere3090l/Desktop/henry/ddadvisa/checkpoints/VisA # MTD #MVTec #VisA
  checkpoint_name: weights
  exp_name: default
  feature_extractor: wide_resnet101_2 # wide_resnet50_2 #resnet50
  learning_rate: 3e-4
  weight_decay: 0.05
  epochs: 1
  load_chp: 750 # Epoch from which the checkpoint will be loaded. A checkpoint is saved every 250 epochs. Try loading 750 or 1000 epochs for VisA and 1000-1500-2000 for MVTec.
  DA_epochs: 3 # Number of epochs for domain adaptation.
  DA_chp: 8
  v: 7 # 1 for MVTec and cashew in VisA, and 7 for VisA (1.5 for cashew). Control parameter for pixel-wise vs. feature-wise comparison: v * D_p + D_f
  w: 8 # Conditioning parameter. The higher the value, the more the model is conditioned on the target image. Fine-tuning this parameter results in better performance.
  w_DA: 3 # Conditioning parameter for domain adaptation. The higher the value, the more the model is conditioned on the target image.
  DLlambda: 0.01 # 0.1 for MVTec and 0.01 for VisA
  trajectory_steps: 1000
  test_trajectoy_steps: 250 # Starting point of the denoising trajectory.
  test_trajectoy_steps_DA: 250 # Starting point of the denoising trajectory for domain adaptation.
  skip: 25 # Number of steps to skip along the denoising trajectory.
  skip_DA: 25
  eta: 1 # Stochasticity parameter for the denoising process.
  beta_start: 0.0001
  beta_end: 0.02
  device: 'cuda' # <"cpu", "gpu", "tpu", "ipu">
  save_model: True
  num_workers: 2
  seed: 42

metrics:
  auroc: True
  pro: True
  misclassifications: False
  visualisation: False
dataset:
import os
from glob import glob
from pathlib import Path
import shutil
import numpy as np
import csv
import torch
import torch.utils.data
from PIL import Image
from torchvision import transforms
import torch.nn.functional as F
import torchvision.datasets as datasets
from torchvision.datasets import CIFAR10

class Dataset_maker(torch.utils.data.Dataset):
    def __init__(self, root, category, config, is_train=True):
        self.image_transform = transforms.Compose(
            [
                transforms.Resize((config.data.image_size, config.data.image_size)),
                transforms.ToTensor(), # Scales data into [0,1]
                transforms.Lambda(lambda t: (t * 2) - 1) # Scale between [-1, 1]
            ]
        )
        self.config = config
        self.mask_transform = transforms.Compose(
            [
                transforms.Resize((config.data.image_size, config.data.image_size)),
                transforms.ToTensor(), # Scales data into [0,1]
            ]
        )
        if is_train:
            if category:
                self.image_files = glob(
                    os.path.join(root, category, "train", "good", "*.JPG")
                )
            else:
                self.image_files = glob(
                    os.path.join(root, "train", "good", "*.JPG")
                )
        else:
            if category:
                self.image_files = glob(os.path.join(root, category, "test", "*", "*.png"))
            else:
                self.image_files = glob(os.path.join(root, "test", "*", "*.png"))
        self.is_train = is_train

    def __getitem__(self, index):
        image_file = self.image_files[index]
        image = Image.open(image_file)
        image = self.image_transform(image)
        if(image.shape[0] == 1):
            image = image.expand(3, self.config.data.image_size, self.config.data.image_size)
        if self.is_train:
            label = 'good'
            return image, label
        else:
            if self.config.data.mask:
                if os.path.dirname(image_file).endswith("good"):
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'good'
                else:
                    if self.config.data.name == 'MVTec':
                        target = Image.open(
                            image_file.replace("/test/", "/ground_truth/").replace(
                                ".png", "_mask.png"
                            )
                        )
                    else:
                        target = Image.open(
                            image_file.replace("/test/", "/ground_truth/"))
                    target = self.mask_transform(target)
                    label = 'defective'
            else:
                if os.path.dirname(image_file).endswith("good"):
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'good'
                else:
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'defective'

            return image, target, label

    def __len__(self):
        return len(self.image_files)

(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python main.py --eval True
Evaluating...
++++++++++testloader++++++++++ <torch.utils.data.dataloader.DataLoader object at 0x7fb37972da50>
++++++++++test_dataset++++++++++ <dataset.Dataset_maker object at 0x7fb37ab03b90>
Traceback (most recent call last):
File "/home/anywhere3090l/Desktop/henry/ddadvisa/main.py", line 90, in
test(args)
File "/home/anywhere3090l/Desktop/henry/ddadvisa/main.py", line 36, in test
evaluate(unet, config)
File "/home/anywhere3090l/Desktop/henry/ddadvisa/test.py", line 82, in evaluate
threshold = metric(labels_list, predictions, anomaly_map_list, gt_list, config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anywhere3090l/Desktop/henry/ddadvisa/metrics.py", line 18, in metric
pro = compute_pro(gt_list, anomaly_map_list, num_th = 200)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anywhere3090l/Desktop/henry/ddadvisa/metrics.py", line 75, in compute_pro
results_embeddings = amaps[1]
~~~~~^^^
IndexError: list index out of range
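
One way to narrow this down (a sketch; root and category are taken from the config above): compute_pro fails when the list of anomaly maps is nearly empty, which points at the test glob matching few or no files. Note that VisA test images are .JPG while the test glob in the dataset code above looks for *.png:

    import os
    from glob import glob

    root = "/home/anywhere3090l/Desktop/henry/ddadvisa/VisA_dataset"
    print(len(glob(os.path.join(root, "pcb4", "test", "*", "*.png"))), "matched *.png")
    print(len(glob(os.path.join(root, "pcb4", "test", "*", "*.JPG"))), "matched *.JPG")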

Finetuning & result

Hi. While reading your paper I really enjoyed your great idea of applying a diffusion model with domain adaptation to anomaly detection. I followed the sample with the MVTec dataset, but it did not give me the scores reported in the paper. (AUROC: (86.0,96.3), 90.8)

I changed the 'batch_size' parameter to 16 and did not change any other parameters in the code. Is this a normal result or not? Can you guide me on how to get a better result?

I am using Ubuntu

But running python3 main.py --domain_adaptation True crashes immediately.

detection issue using the checkpoint files you provided

Hello, I encountered the following issue while using the checkpoint files you provided for testing. Could you point out my mistake or suggest a solution? Thank you very much.

Class: leather w: 8 v: 1 load_chp: 2000/data.pkl feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Detecting Anomalies...
Traceback (most recent call last):
File "main.py", line 96, in
detection(config)
File "main.py", line 36, in detection
checkpoint = torch.load(os.path.join(os.getcwd(), config.model.checkpoint_dir, config.data.category, str(config.model.load_chp)))
File "/home/dell/anaconda3/envs/diffusion2.0/lib/python3.8/site-packages/torch/serialization.py", line 815, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/dell/anaconda3/envs/diffusion2.0/lib/python3.8/site-packages/torch/serialization.py", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.
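
One possible cause worth checking (a sketch, prompted by the load_chp: 2000/data.pkl in the log above): PyTorch's zip-based checkpoints contain a data.pkl that cannot be unpickled on its own, which raises exactly this persistent_load error. torch.load should be pointed at the checkpoint archive itself:

    import os
    import torch

    # Load the checkpoint archive (the "2000" file), not the data.pkl inside it.
    checkpoint = torch.load(
        os.path.join(os.getcwd(), "checkpoints/MVTec", "leather", "2000"),
        map_location="cpu",
    )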

Can I use my own dataset?

I want to apply this paper to my own dataset, but right now I only have normal samples plus a few anomalous images with ground truth. If I package it in the same layout as the MVTec dataset, would that be feasible?
"C:\Users\user\OneDrive\桌面\Dynamic-noise-AD-master\MVTec\capsule\train\good"
"C:\Users\user\OneDrive\桌面\Dynamic-noise-AD-master\MVTec\capsule\test\good"
"C:\Users\user\OneDrive\桌面\Dynamic-noise-AD-master\MVTec\capsule\test\poke"
"C:\Users\user\OneDrive\桌面\Dynamic-noise-AD-master\MVTec\capsule\ground_truth"

I deliberately named the MVTec and capsule folders the same as in the MVTec dataset.

Data read error during testing

FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\Code\Anomaly_Detection\DDAD-main\dataset.py", line 67, in __getitem__
target = Image.open(
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\PIL\Image.py", line 3227, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/MVTec\bottle\test\broken_large\000_mask.png'

When I run the test, the above error occurs. There is no 000_mask.png under the MVTec\bottle\test\broken_large\ path.
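
A likely culprit (a sketch, not from the thread): on Windows, image_file contains backslashes, so the replace("/test/", "/ground_truth/") in dataset.py never matches, while the ".png" -> "_mask.png" replace still fires; that produces exactly the path in the error. Normalizing separators first sidesteps it:

    image_file = r"datasets/MVTec\bottle\test\broken_large\000.png"
    mask_file = (
        image_file.replace("\\", "/")
                  .replace("/test/", "/ground_truth/")
                  .replace(".png", "_mask.png")
    )
    print(mask_file)  # datasets/MVTec/bottle/ground_truth/broken_large/000_mask.png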

Why does nothing happen in the eval stage?

(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python main.py --train True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Training...
Num params: 32952707
/home/anywhere3090l/Desktop/henry/ddadvisa/VisA
Epoch 0 | Loss: 196538.21875
(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python main.py --domain_adaptation True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Domain Adaptation...
/home/anywhere3090l/Desktop/henry/ddadvisa/VisA
Epoch 0 | Loss: 0.02777327597141266
(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python main.py --eval True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16

train

Hello, when I train a class I only get the 250…3000 weight files, but you provided feat0 etc. How were those produced?

problem about loading checkpoints

Very nice work. But something went wrong when I loaded the provided checkpoint to evaluate and test the model.

Class: hazelnut w: 8 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Detecting Anomalies...
Traceback (most recent call last):
File "D:\code\DDAD-main\main.py", line 96, in
detection(config)
File "D:\code\DDAD-main\main.py", line 36, in detection
checkpoint = torch.load(os.path.join(os.getcwd(), config.model.checkpoint_dir, config.data.category, str(config.model.load_chp)))
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\serialization.py", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\serialization.py", line 271, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\serialization.py", line 252, in init
super().init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'D:\code\DDAD-main\checkpoints/MVTec\hazelnut\2000'

I made sure the path to the file was correct. I noticed the downloaded checkpoint is a .zip file, such as checkpoints/MVTec/hazelnut/2000.zip. Is that right? I tried extracting the zip file, but that did not work either. How should I load the checkpoint?

The checkpoint looks like this:
[screenshot of the checkpoint files]
My Python is 3.10 and PyTorch is 2.0. Also, when I execute python main.py --eval True, there is no --eval in args, so I changed it to python main.py --detection True.
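
One thing worth checking (a sketch; the path follows this thread): PyTorch's current serialization format is itself a zip archive, so the downloaded 2000.zip may simply be the checkpoint under a different name, and torch.load can read it without extracting it:

    import os
    import zipfile
    import torch

    path = os.path.join("checkpoints", "MVTec", "hazelnut", "2000")
    if not os.path.exists(path):
        path += ".zip"  # try the name the file was downloaded under
    print("zip-format checkpoint:", zipfile.is_zipfile(path))
    checkpoint = torch.load(path, map_location="cpu")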

Invalid load key

Hi,
I'm trying to run the code for fine tuning using the command in the Readme.md but I have the following error:

Traceback (most recent call last):
File "/content/DDAD/main.py", line 91, in
finetuning(config)
File "/content/DDAD/main.py", line 46, in finetuning
checkpoint = torch.load(os.path.join(os.getcwd(), config.model.checkpoint_dir, config.data.category, str(config.model.load_chp)))
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 815, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x9e'.

Searching in several places, I found that this might be due to corrupted checkpoint files downloaded from Google Drive.
Does someone have the same issue?
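
A quick integrity check (a sketch; the path is illustrative): a valid zip-format checkpoint starts with the zip magic bytes, whereas a failed Google Drive download is often an HTML page:

    path = "checkpoints/MVTec/capsule/2000"  # illustrative path
    with open(path, "rb") as f:
        head = f.read(4)
    print(head)  # b'PK\x03\x04' -> zip-based torch archive; b'<htm' or b'<!DO' -> HTML error page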

About the meaning of "n" in "DDAD-S-n"

Hello, I'd like to ask about the meaning of the 10 in DDAD-S-10 in your paper. The paper describes it as "n refers to the number of denoising iterations", but isn't the number of denoising iterations 1000? I don't see a setting of 10 in the code; or do I iterate the training process 10 times to get it?

Why nothing happens when running eval

~/diffusion/DDAD$ python main.py --eval True
Class: screw w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16

This seems to be because --eval is unknown in parse_args():

    def parse_args():
        cmdline_parser = argparse.ArgumentParser('DDAD')
        cmdline_parser.add_argument('-cfg', '--config',
                                    default=os.path.join(os.path.dirname(os.path.abspath(__file__)), 'config.yaml'),
                                    help='config file')
        cmdline_parser.add_argument('--train', default=False, help='Train the diffusion model')
        cmdline_parser.add_argument('--detection', default=False, help='Detection anomalies')
        cmdline_parser.add_argument('--domain_adaptation', default=False, help='Domain adaptation')
        args, unknowns = cmdline_parser.parse_known_args()
        return args
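
Since parse_known_args() silently drops unknown flags, one fix (a sketch, not from the repo) is to register the missing flag inside parse_args(), or simply run --detection instead:

    # Inside parse_args(), alongside the other flags:
    cmdline_parser.add_argument('--eval', default=False, help='Evaluate the model')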

Time of training

Hi,

Thanks for releasing the code! I am using an RTX A6000 to run main.py for the "Hazelnut" class of the MVTec dataset. It takes over 10 hours to finish training. I do not have much experience with training diffusion models: is it common for training to take this long, and is it necessary to train for so many epochs to achieve competitive performance on the MVTec dataset?

Thank you very much :) !

Some questions about the training time and the final results

First, thank you for your excellent work. I ran into some problems when reproducing the results:

  1. The training time is extremely long. I used three RTX 3090s to train the model on the MVTec dataset, and it took almost 2 days to finish training all classes.
  2. There is a large gap between the reported results and the ones I reproduced, e.g. for class carpet:
    AUROC: 0.6737560033798218
    AUROC pixel level: 0.8384530544281006
    threshold: 0.59561723
    I kept almost all parameters in config.yaml unchanged except for the batch size. I wonder whether my settings match the ones you used and, if not, how I should change them to get similar results.

Request for Fine-Tuning Parameters in MVTec's config.yaml

Hello, thank you for releasing this awesome work.
Regarding the MVTec results reported in the paper, could you please provide the fine-tuning parameters such as 'load_chp', 'DA_epochs', 'w', and so on?
I've encountered lower results for certain classes; I think it might be due to my parameter settings.

If it's not possible, that's okay too. Thank you very much.

Some questions about training hyperparameters

Why is the number of pre-training epochs for the UNet set to such a large value (> 1000), such as 1500, 2000, or 3000? My own training set is relatively large, so pre-training with your settings would take a lot of time. Also, how should the number of epochs for fine-tuning the feature extractor be set? The training loss, including the pre-training loss, does not provide useful information for judging the effectiveness of the model during training. Thank you.

The AUROC value of the paper cannot be achieved using official checkpoints.

Hello, I used the official checkpoint 2500 for carpet in MVTec AD, but the image-level AUROC cannot reach the 99.3% reported in the paper.
Below is my experimental setup.
[screenshot of the experimental setup]

python main.py --detection True
Class: carpet w: 0 v: 1 load_chp: 2500 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=16
Detecting Anomalies...
AUROC: (97.6,98.8)
PRO: 93.8

I found that w=0 gives my highest image-level AUROC (97.6).
Is there something I missed? Why can't I reach the value from the paper?

Question about making the heat map

In the function anomaly_map.heat_map(), the anomaly map is calculated by combining the feature and pixel distances as follows:

    anomaly_map += f_d + config.model.v * (torch.max(f_d) / torch.max(i_d)) * i_d

If f_d and i_d contain more than one sample, then torch.max(f_d) and torch.max(i_d) give the maximum over all samples. Is something wrong there? I think the maximum should be taken sample-wise, so it should be:

    anomaly_map += f_d + config.model.v * (torch.amax(f_d, (1,2,3), True) / torch.amax(i_d, (1,2,3), True)) * i_d

Do I misunderstand it?
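
A tiny illustration of the difference (shapes are illustrative, not DDAD's actual tensors):

    import torch

    f_d = torch.rand(4, 1, 8, 8)  # per-sample feature distances
    i_d = torch.rand(4, 1, 8, 8)  # per-sample pixel distances

    batch_scale = torch.max(f_d) / torch.max(i_d)  # one scalar shared by all 4 samples
    per_sample = torch.amax(f_d, (1, 2, 3), True) / torch.amax(i_d, (1, 2, 3), True)
    print(batch_scale.shape, per_sample.shape)  # torch.Size([]) torch.Size([4, 1, 1, 1])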

problem about feat0

How do I save feat0? Does it mean that fine-tuning is not required for categories with FE_epoch = 0?
In feature_extractor.py, the model "feat" is saved starting from 1. Does this mean that I should set DA_chp=1?
[screenshot]
I can hardly reproduce the "carpet" experiment with w=0, load_chp=2500, and DA_chp=1.
[screenshots of the results]
I am curious at which step I went wrong. Looking forward to your reply.

Why am I not really training?

as follows:
(base) PS E:\DDAD-main> python main.py --train True
Class: fabric w: 2 v: 1 load_chp: 2500 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 , config.data.test_batch_size=8
Training...
Num params: 32952707
(base) PS E:\DDAD-main>

And then it stopped and waited for my next command.
I found that when it reaches the line "for step, batch in enumerate(trainloader):" it does not execute the loop body; it just skips it and the whole program stops.
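
Two quick checks worth adding before the loop (a sketch, not from the thread): if the glob in dataset.py matched nothing, or matched fewer images than the batch size while incomplete batches are dropped, enumerate(trainloader) yields zero batches and the run ends silently. Note also that "fabric" is not a standard MVTec category name:

    # Inside train.py, just before the training loop:
    print("images found:", len(trainloader.dataset))
    print("batches per epoch:", len(trainloader))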

Simple version of patchify()

Sorry, this is not an issue, just a comment. It seems that anomaly_map.patchify() can be simplified as follows:

def patchify_new(features):
    """Use AvgPool3d to patchify "features"
    """  
    patch_size = 3
    padding = (patch_size - 1) //2
    return torch.nn.AvgPool3d(
        kernel_size=(1, patch_size, patch_size),
        stride=1,
        padding=(0, padding,padding))(features.unsqueeze(1)).squeeze(1)

Of course, this function cannot return the spatial information.
Please feel free to check and use it or just ignore it.
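
For what it's worth, the pooled result can be cross-checked against plain 2-D average pooling (a sketch using the patchify_new() above):

    import torch
    import torch.nn.functional as F

    feats = torch.randn(2, 64, 16, 16)
    same = torch.allclose(patchify_new(feats),
                          F.avg_pool2d(feats, kernel_size=3, stride=1, padding=1))
    print(same)  # True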
