
adversarial-attacks-pytorch's Introduction

Adversarial-Attacks-PyTorch


Torchattacks is a PyTorch library that provides adversarial attacks to generate adversarial examples.

It provides a PyTorch-like interface and functions that make it easier for PyTorch users to implement adversarial attacks.

import torchattacks
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
# If inputs were normalized, then
# atk.set_normalization_used(mean=[...], std=[...])
adv_images = atk(images, labels)

Additional Recommended Packages.

Citation. If you use this package, please cite the following BibTeX entry (Google Scholar):

@article{kim2020torchattacks,
  title={Torchattacks: A pytorch repository for adversarial attacks},
  author={Kim, Hoki},
  journal={arXiv preprint arXiv:2010.01950},
  year={2020}
}

🔨 Requirements and Installation

Requirements

  • PyTorch version >=1.4.0
  • Python version >=3.6

Installation

#  pip
pip install torchattacks

#  source
pip install git+https://github.com/Harry24k/adversarial-attacks-pytorch.git

#  git clone
git clone https://github.com/Harry24k/adversarial-attacks-pytorch.git
cd adversarial-attacks-pytorch/
pip install -e .

🚀 Getting Started

Precautions

  • All models should return ONLY ONE vector of shape (N, C), where N is the number of inputs and C is the number of classes. Since most models in torchvision.models already return a single (N, C) vector, torchattacks supports only this form of output; please check the shape of your model's output carefully.
  • Inputs should lie in the range [0, 1]. Because a clipping operation is always applied after the perturbation, the original inputs must be in [0, 1], which is the usual setting in the vision domain.
  • Set torch.backends.cudnn.deterministic = True to get the same adversarial examples with a fixed random seed. Some operations are non-deterministic with float tensors on GPU [discuss]; if you want identical results for identical inputs, enable this flag [ref]. A minimal sketch of these precautions follows below.
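
A minimal sketch of these precautions, assuming a CIFAR-10 classifier named model that was trained on inputs normalized with the statistics shown here (all names and values in this sketch are illustrative, not taken from the repository):

import torch
import torchvision
import torchvision.transforms as transforms
import torchattacks

torch.backends.cudnn.deterministic = True  # reproducible adversarial examples

transform = transforms.Compose([transforms.ToTensor()])  # keep inputs in [0, 1]; no Normalize here
testset = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
loader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False)

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
# If the model was trained on normalized inputs, register the statistics so the
# attack normalizes internally instead of receiving normalized tensors.
atk.set_normalization_used(mean=[0.4914, 0.4822, 0.4465], std=[0.2023, 0.1994, 0.2010])

images, labels = next(iter(loader))
adv_images = atk(images, labels)  # adversarial examples, still in [0, 1]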

Demos

  • Targeted mode

    • Random target label
      # random labels as target labels.
      atk.set_mode_targeted_random()
    • Least likely label
      # labels with the k-th smallest probability as target labels.
      atk.set_mode_targeted_least_likely(kth_min)
    • By custom function
      # labels obtained by mapping function as target labels.
      # e.g., shift all labels one class to the right: 1=>2, 2=>3, ..., 9=>0
      atk.set_mode_targeted_by_function(target_map_function=lambda images, labels:(labels+1)%10)
    • By label
      atk.set_mode_targeted_by_label(quiet=True)
      # shift all labels one class to the right: 1=>2, 2=>3, ..., 9=>0
      target_labels = (labels + 1) % 10
      adv_images = atk(images, target_labels)
    • Return to default
      atk.set_mode_default()
  • Save adversarial images

    # Save
    atk.save(data_loader, save_path="./data.pt", verbose=True)
    
    # Load
    adv_loader = atk.load(load_path="./data.pt")
  • Training/Eval during attack

    # For RNN-based models, gradients cannot be computed in eval mode,
    # so the model must be switched to training mode during the attack.
    atk.set_model_training_mode(model_training=False, batchnorm_training=False, dropout_training=False)
  • Make a set of attacks (a robust-accuracy evaluation sketch follows this list)

    • Strong attacks
      atk1 = torchattacks.FGSM(model, eps=8/255)
      atk2 = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
      atk = torchattacks.MultiAttack([atk1, atk2])
    • Binary search for CW
      atk1 = torchattacks.CW(model, c=0.1, steps=1000, lr=0.01)
      atk2 = torchattacks.CW(model, c=1, steps=1000, lr=0.01)
      atk = torchattacks.MultiAttack([atk1, atk2])
    • Random restarts
      atk1 = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
      atk2 = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
      atk = torchattacks.MultiAttack([atk1, atk2])
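
As referenced above, a robust-accuracy evaluation sketch for such a MultiAttack (not taken from the repository; model and test_loader are assumed to exist, with inputs in [0, 1]):

import torch
import torchattacks

atk1 = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
atk2 = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
atk = torchattacks.MultiAttack([atk1, atk2])

correct, total = 0, 0
model.eval()
for images, labels in test_loader:
    adv_images = atk(images, labels)
    with torch.no_grad():
        preds = model(adv_images).argmax(dim=1)
    correct += (preds.cpu() == labels).sum().item()
    total += labels.size(0)
print(f"Robust accuracy: {100.0 * correct / total:.2f}%")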

📃 Supported Attacks

The distance measure is shown in parentheses.

Name (Distance) | Paper | Remark
FGSM (Linf) | Explaining and harnessing adversarial examples (Goodfellow et al., 2014) |
BIM (Linf) | Adversarial Examples in the Physical World (Kurakin et al., 2016) | Basic iterative method, i.e. iterative FGSM
CW (L2) | Towards Evaluating the Robustness of Neural Networks (Carlini et al., 2016) |
RFGSM (Linf) | Ensemble Adversarial Training: Attacks and Defenses (Tramèr et al., 2017) | Random initialization + FGSM
PGD (Linf) | Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al., 2017) | Projected gradient method
PGDL2 (L2) | Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al., 2017) | Projected gradient method
MIFGSM (Linf) | Boosting Adversarial Attacks with Momentum (Dong et al., 2017) | 😍 Contributors: zhuangzi926, huitailangyz
TPGD (Linf) | Theoretically Principled Trade-off between Robustness and Accuracy (Zhang et al., 2019) |
EOTPGD (Linf) | Comment on "Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network" (Zimmermann, 2019) | EOT + PGD
APGD (Linf, L2) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (Croce et al., 2020) |
APGDT (Linf, L2) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (Croce et al., 2020) | Targeted APGD
FAB (Linf, L2, L1) | Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack (Croce et al., 2019) |
Square (Linf, L2) | Square Attack: a query-efficient black-box adversarial attack via random search (Andriushchenko et al., 2019) |
AutoAttack (Linf, L2) | Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (Croce et al., 2020) | APGD + APGDT + FAB + Square
DeepFool (L2) | DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks (Moosavi-Dezfooli et al., 2016) |
OnePixel (L0) | One pixel attack for fooling deep neural networks (Su et al., 2019) |
SparseFool (L0) | SparseFool: a few pixels make a big difference (Modas et al., 2019) |
DIFGSM (Linf) | Improving Transferability of Adversarial Examples with Input Diversity (Xie et al., 2019) | 😍 Contributor: taobai
TIFGSM (Linf) | Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks (Dong et al., 2019) | 😍 Contributor: taobai
NIFGSM (Linf) | Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks (Lin et al., 2019) | 😍 Contributor: Zhijin-Ge
SINIFGSM (Linf) | Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks (Lin et al., 2019) | 😍 Contributor: Zhijin-Ge
VMIFGSM (Linf) | Enhancing the Transferability of Adversarial Attacks through Variance Tuning (Wang et al., 2021) | 😍 Contributor: Zhijin-Ge
VNIFGSM (Linf) | Enhancing the Transferability of Adversarial Attacks through Variance Tuning (Wang et al., 2021) | 😍 Contributor: Zhijin-Ge
Jitter (Linf) | Exploring Misclassifications of Robust Neural Networks to Enhance Adversarial Attacks (Schwinn et al., 2021) |
Pixle (L0) | Pixle: a fast and effective black-box attack based on rearranging pixels (Pomponi et al., 2022) |
LGV (Linf, L2, L1, L0) | LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity (Gubri et al., 2022) | 😍 Contributor: Martin Gubri
SPSA (Linf) | Adversarial Risk and the Dangers of Evaluating Against Weak Attacks (Uesato et al., 2018) | 😍 Contributor: Riko Naka
JSMA (L0) | The Limitations of Deep Learning in Adversarial Settings (Papernot et al., 2016) | 😍 Contributor: Riko Naka
EADL1 (L1) | EAD: Elastic-Net Attacks to Deep Neural Networks (Chen et al., 2018) | 😍 Contributor: Riko Naka
EADEN (L1, L2) | EAD: Elastic-Net Attacks to Deep Neural Networks (Chen et al., 2018) | 😍 Contributor: Riko Naka
PIFGSM (PIM) (Linf) | Patch-wise Attack for Fooling Deep Neural Network (Gao et al., 2020) | 😍 Contributor: Riko Naka
PIFGSM++ (PIM++) (Linf) | Patch-wise++ Perturbation for Adversarial Targeted Attacks (Gao et al., 2021) | 😍 Contributor: Riko Naka

📊 Performance Comparison

For comparison, the most cited and most recently updated packages were selected:

  • Foolbox: 611 citations and last update 2023.10.
  • ART: 467 citations and last update 2023.10.

Robust accuracy against each attack and elapsed time on the first 50 images of CIFAR10. For L2 attacks, the average L2 distances between adversarial images and the original images are recorded. All experiments were done on GeForce RTX 2080. For the latest version, please refer to here (code, nbviewer).

Attack | Package | Standard | Wong2020Fast | Rice2020Overfitting | Remark
FGSM (Linf) | Torchattacks | 34% (54ms) | 48% (5ms) | 62% (82ms) |
FGSM (Linf) | Foolbox* | 34% (15ms) | 48% (8ms) | 62% (30ms) |
FGSM (Linf) | ART | 34% (214ms) | 48% (59ms) | 62% (768ms) |
PGD (Linf) | Torchattacks | 0% (174ms) | 44% (52ms) | 58% (1348ms) | 👑 Fastest
PGD (Linf) | Foolbox* | 0% (354ms) | 44% (56ms) | 58% (1856ms) |
PGD (Linf) | ART | 0% (1384ms) | 44% (437ms) | 58% (4704ms) |
CW† (L2) | Torchattacks | 0% / 0.40 (2596ms) | 14% / 0.61 (3795ms) | 22% / 0.56 (43484ms) | 👑 Highest success rate, 👑 Fastest
CW† (L2) | Foolbox* | 0% / 0.40 (2668ms) | 32% / 0.41 (3928ms) | 34% / 0.43 (44418ms) |
CW† (L2) | ART | 0% / 0.59 (196738ms) | 24% / 0.70 (66067ms) | 26% / 0.65 (694972ms) |
PGD (L2) | Torchattacks | 0% / 0.41 (184ms) | 68% / 0.5 (52ms) | 70% / 0.5 (1377ms) | 👑 Fastest
PGD (L2) | Foolbox* | 0% / 0.41 (396ms) | 68% / 0.5 (57ms) | 70% / 0.5 (1968ms) |
PGD (L2) | ART | 0% / 0.40 (1364ms) | 68% / 0.5 (429ms) | 70% / 0.5 (4777ms) |

* Note that Foolbox returns accuracy and adversarial images simultaneously, thus the actual time for generating adversarial images might be shorter than the records.

Because the binary search over the constant c can be time-consuming, torchattacks supports MultiAttack for grid-searching c.

Additionally, I also recommend a recently proposed package, rai-toolbox.

Attack | Package | Time/step (accuracy)
FGSM (Linf) | rai-toolbox | 58 ms (0%)
FGSM (Linf) | Torchattacks | 81 ms (0%)
FGSM (Linf) | Foolbox | 105 ms (0%)
FGSM (Linf) | ART | 83 ms (0%)
PGD (Linf) | rai-toolbox | 58 ms (44%)
PGD (Linf) | Torchattacks | 79 ms (44%)
PGD (Linf) | Foolbox | 82 ms (44%)
PGD (Linf) | ART | 90 ms (44%)
PGD (L2) | rai-toolbox | 58 ms (70%)
PGD (L2) | Torchattacks | 81 ms (70%)
PGD (L2) | Foolbox | 82 ms (70%)
PGD (L2) | ART | 89 ms (70%)

The rai-toolbox takes a unique approach to gradient-based perturbations: they are implemented in terms of parameter-transforming optimizers and perturbation models. This enables users to implement diverse algorithms (like universal perturbations and concept probing with sparse gradients) using the same paradigm as a standard PGD attack.

adversarial-attacks-pytorch's People

Contributors

buhua-liu, framartin, freed-wu, harry24k, hkunzhe, ignaciogavier, jaryp, jatanloya, khalooei, lukasstruppek, noppelmax, rikonaka, tao-bai, yijiangpang, zhijin-ge, zhuangzi926


adversarial-attacks-pytorch's Issues

Supporting nn.BCEWithLogitsLoss() ?

I tried a multi-label classification problem with torchattacks.FGSM(net, eps=0.1).

I don't know if this makes sense, but would it be good to add support for different types of loss functions, such as nn.BCEWithLogitsLoss()?
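
One possible workaround while such support is not built in: a hand-rolled FGSM step driven by a multi-label loss, written in plain PyTorch (this is a sketch, not part of torchattacks; model, images in [0, 1], and multi-hot targets are assumptions):

import torch
import torch.nn as nn

def fgsm_bce(model, images, targets, eps=0.1):
    # Single-step FGSM using nn.BCEWithLogitsLoss instead of CrossEntropyLoss.
    images = images.clone().detach().requires_grad_(True)
    loss = nn.BCEWithLogitsLoss()(model(images), targets.float())
    grad = torch.autograd.grad(loss, images)[0]
    adv = images + eps * grad.sign()        # ascend the multi-label loss
    return torch.clamp(adv, 0, 1).detach()  # keep the [0, 1] domain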

Device mismatch problem

Hello Harry,
Thanks for your code.
Here's the situation I met:
My server has two GPUs, cuda:0 and cuda:1. However, if I send my model to cuda:1, the attack moves it to cuda:0 instead.

Code is here:
torchattacks/attack.py line 20
self.device = torch.device("cuda" if next(model.parameters()).is_cuda else "cpu")

Could it be okay to change it to:
self.device = next(model.parameters()).device

I tested the code on torch==1.4.0.
This change keeps my model and my datasets on the same device.
Thanks a lot.

Ruqi Bai

AutoAttack

Hi, I feel confused about MultiAttack and targeted APGD (APGDT). (1) For standard AutoAttack, are all images attacked by the four attacks in sequence, with the average robust accuracy calculated at the end? (2) For targeted APGDT, how is the target label found? I see that the number of target labels equals the total number of classes minus one for CIFAR-10. My question concerns this line:

y_target = output.sort(dim=1)[1][:, -self.target_class]

If I want to use it for ImageNet, should I set n_target_classes = 999?

Looking forward to your help! Thanks!

A problem with the torchattacks package

Hello, I tried to apply this package to my model and it reported the error below. Could you give me some suggestions to fix it? Thank you for your time.

  File "white_box_attack.py", line 1070, in <module>
    adv_images = atk(images, labels)
  File "/home/wuman/.local/lib/python3.6/site-packages/torchattacks/attack.py", line 249, in __call__
    images = self.forward(*input, **kwargs)
  File "/home/wuman/.local/lib/python3.6/site-packages/torchattacks/attacks/pgd.py", line 62, in forward
    retain_graph=False, create_graph=False)[0]
  File "/home/wuman/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 225, in grad
    inputs, allow_unused, accumulate_grad=False)
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

Question about perturbation PGD

Hello,

Firstly thanks for your great work!

I have a question about eps in atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4).

The documentation says: eps (float): maximum perturbation. (DEFAULT: 0.3).

Does that mean eps is the maximal L-infinity norm of adv - img? I've run some experiments with PGD, and the L-infinity norm I calculated significantly exceeded the eps value.

So I would like to know: did I make a mistake in my calculation, or does eps (the maximal perturbation) mean something different?
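
A quick sanity check (a sketch, not from the repository): the L-infinity distance should be measured against the exact [0, 1] tensors passed to the attack, before any re-normalization, otherwise it can appear to exceed eps. Here model, images, and labels are assumed:

import torchattacks

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
adv_images = atk(images, labels)
linf = (adv_images.cpu() - images.cpu()).abs().flatten(1).max(dim=1).values  # per-sample L-inf distance
print(linf.max().item(), "should not exceed", 8/255)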

No module named 'scipy._lib.six'

When importing torchattacks with import torchattacks, I got an error stating No module named 'scipy._lib.six' with a freshly installed scipy 1.6.3.
Installing scipy 1.4 resolves the problem.

AutoAttack not working properly

It returns an error during the computation of a pairwise distance, because the batch size of the adversarial images isn't the same as that of the original input. I am using a batch size of 64. It seems to work only with a batch size of 1.

/opt/conda/lib/python3.7/site-packages/torchattacks/attack.py in __call__(self, *input, **kwargs)
    321             self.model.eval()
    322 
--> 323         images = self.forward(*input, **kwargs)
    324 
    325         if given_training:

/opt/conda/lib/python3.7/site-packages/torchattacks/attacks/autoattack.py in forward(self, images, labels)
     79         images = images.clone().detach().to(self.device)
     80         labels = labels.clone().detach().to(self.device)
---> 81         adv_images = self.autoattack(images, labels)
     82 
     83         return adv_images

/opt/conda/lib/python3.7/site-packages/torchattacks/attack.py in __call__(self, *input, **kwargs)
    321             self.model.eval()
    322 
--> 323         images = self.forward(*input, **kwargs)
    324 
    325         if given_training:

/opt/conda/lib/python3.7/site-packages/torchattacks/attacks/multiattack.py in forward(self, images, labels)
     49 
     50         for _, attack in enumerate(self.attacks):
---> 51             adv_images = attack(images[fails], labels[fails])
     52 
     53             outputs = self.model(adv_images)

/opt/conda/lib/python3.7/site-packages/torchattacks/attack.py in __call__(self, *input, **kwargs)
    321             self.model.eval()
    322 
--> 323         images = self.forward(*input, **kwargs)
    324 
    325         if given_training:

/opt/conda/lib/python3.7/site-packages/torchattacks/attacks/apgd.py in forward(self, images, labels)
     59         images = images.clone().detach().to(self.device)
     60         labels = labels.clone().detach().to(self.device)
---> 61         _, adv_images = self.perturb(images, labels, cheap=True)
     62 
     63         return adv_images

/opt/conda/lib/python3.7/site-packages/torchattacks/attacks/apgd.py in perturb(self, x_in, y_in, best_loss, cheap)
    240                     if ind_to_fool.numel() != 0:
    241                         x_to_fool, y_to_fool = x[ind_to_fool].clone(), y[ind_to_fool].clone()
--> 242                         best_curr, acc_curr, loss_curr, adv_curr = self.attack_single_run(x_to_fool, y_to_fool)
    243                         ind_curr = (acc_curr == 0).nonzero().squeeze()
    244                         #

/opt/conda/lib/python3.7/site-packages/torchattacks/attacks/apgd.py in attack_single_run(self, x_in, y_in)
    111         for _ in range(self.eot_iter):
    112             with torch.enable_grad():
--> 113                 logits = self.model(x_adv) # 1 forward pass (eot_iter = 1)
    114                 loss_indiv = criterion_indiv(logits, y)
    115                 loss = loss_indiv.sum()

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/tmp/ipykernel_33/3929415858.py in forward(self, img1)
     49         emb2 = self.get_embedding(self.img2)
     50 
---> 51         dist = F.pairwise_distance(emb1, emb2, keepdim=True)
     52         sim = 1 - F.cosine_similarity(emb1, emb2, dim=1).unsqueeze(dim=1)
     53 

RuntimeError: The size of tensor a (26) must match the size of tensor b (64) at non-singleton dimension 0

Regarding the attack mode in the Attack class.

Hi Harry, I see the code below and feel a little confused.
When performing untargeted attacks, self._targeted = 1:

cost = self._targeted*loss(outputs, labels)
grad = torch.autograd.grad(cost, images,retain_graph=False, create_graph=False)[0]
adv_images = images + self.eps*grad.sign()

Wouldn't it be easier to understand to set self._targeted = -1 for untargeted attacks
and modify the corresponding lines?

cost = self._targeted*loss(outputs, labels)
grad = torch.autograd.grad(cost, images,retain_graph=False, create_graph=False)[0]
adv_images = images - self.eps*grad.sign()

How to Change Distance Measure in FGSM

Hi Harry,

I have run FGSM with the default distance measure, Linf. I see there is a PGDL2 class; however, I couldn't find an FGSML2. How can I switch FGSM to the L2 norm?
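
The attack list above has no FGSM-L2 class; a single-step L2 attack (often called FGM) can be sketched in plain PyTorch as follows, with model, images in [0, 1], and labels assumed:

import torch
import torch.nn as nn

def fgm_l2(model, images, labels, eps=1.0):
    images = images.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    # Normalize the gradient to unit L2 norm per sample, then scale by eps.
    norm = grad.flatten(1).norm(p=2, dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
    adv = images + eps * grad / norm
    return torch.clamp(adv, 0, 1).detach()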

Pytorch moved `zero_gradients` out of gradcheck (apparently?)

Installing torchattacks from scratch in a fresh HPC environment, I hit the error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.conda/envs/stabn/lib/python3.8/site-packages/torchattacks/__init__.py", line 17, in <module>
    from .attacks.fab import FAB
  File ".../.conda/envs/stabn/lib/python3.8/site-packages/torchattacks/attacks/fab.py", line 12, in <module>
    from torch.autograd.gradcheck import zero_gradients
ImportError: cannot import name 'zero_gradients' from 'torch.autograd.gradcheck' (.../.conda/envs/stabn/lib/python3.8/site-packages/torch/autograd/gradcheck.py)

Notably, when investigating the gradcheck.py file, zero_gradients is no longer defined in it, and a significant refactor was mentioned in the release notes for PyTorch 1.9, published 17 days ago.

I think I can just use the LTS release which seems not to have this change, but I think it's notable enough to raise as an issue.
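
A commonly used workaround (a sketch, not an official fix) is to re-define the removed helper and patch it into torch.autograd.gradcheck before importing torchattacks, or to use a torch/torchattacks combination that no longer needs it:

import torch
import torch.autograd.gradcheck as gradcheck

def zero_gradients(x):
    # Same behavior as the helper removed in PyTorch 1.9.
    if isinstance(x, torch.Tensor):
        if x.grad is not None:
            x.grad.detach_()
            x.grad.zero_()
    elif isinstance(x, (tuple, list)):
        for elem in x:
            zero_gradients(elem)

gradcheck.zero_gradients = zero_gradients  # patch before importing torchattacks
import torchattacks  # noqa: E402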

Using the Carlini-Wagner L_inf condition

Thank you for the wonderful code. Going through the repo, I realized there is no implementation of the Carlini-Wagner L_inf attack. I was hoping to use the repo and also include the L_inf condition. Can you please help me a bit? A few pointers on how to code it would be great.

Advice for calculating the time in demo code

Firstly, thanks for the nice contributions.

I have a bit of advice for your demo code:

https://github.com/Harry24k/adversarial-attacks-pytorch/blob/master/demos/White%20Box%20Attack%20(ImageNet).ipynb

The third part:3. Adversarial Attack
in the loop
"
for images, labels in data_loader:
start = time.time()
……
"
I think 'start' should be outside of the loop. I know the demo only have an image in the dataset, but when I apply it to other common datasets with batch size larger than 1, it may lead to some confusion.
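
The suggested change, sketched against the demo's structure (data_loader and atk follow the demo's names and are assumptions here):

import time

start = time.time()  # moved outside the loop
for images, labels in data_loader:
    adv_images = atk(images, labels)
    # ... evaluation ...
print(f"Total attack time: {time.time() - start:.1f}s")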

Question about normalization during the inference (attack) phase

Hi author(s),

I have a question about the normalize operation with torchattacks.

During the training phase, I use the normalize operation on the training data, for example,


transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

Therefore, I need to perform the normalize operation during the inference (test, attack) phase.
However, in torchattacks, the input data must be in the range of [0, 1].
So how to deal with this problem?

Best wishes,
Gavin
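
One way to reconcile the two requirements (a sketch, assuming a CIFAR-10 model named model trained with the statistics shown above): drop Normalize from the test transform so inputs stay in [0, 1], and register the statistics with the attack so normalization happens inside the attack call.

import torchvision.transforms as transforms
import torchattacks

transform_test = transforms.Compose([transforms.ToTensor()])  # no Normalize here

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
atk.set_normalization_used(mean=(0.4914, 0.4822, 0.4465), std=(0.2023, 0.1994, 0.2010))
# adv_images = atk(images, labels)  # images in [0, 1]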

Be careful with training mode

When the training mode of the given model is changed after initializing the Attack object, in some use cases the training mode will be changed inadvertently. See the following toy example:

import torchattacks
# initialize model
print(model.training)  # model.training = True
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
model.eval()  # model.training = False
adversarial_images = atk(images, labels)
print(model.training)  # model.training = True

As the documentation says:

It temporarily changes the model’s training mode to test by .eval() only during an attack process.

The current implementation sets the training attribute based on the given model when the Attack object is initialized, rather than on the model's current state. I think it would be better to add this to the precautions, or this PR will fix the issue.

Error running the CIFAR-10 demo ipynb

Hello,

I am running the CIFAR-10 demo and I get this error once I load the saved model.pth checkpoint.

RuntimeError Traceback (most recent call last)
in
----> 1 model.load_state_dict(torch.load("/home/jovyan/.cache/torch/checkpoints/resnext50_32x4d-7cdf4587.pth"))
2 model = model.eval()

/srv/conda/envs/notebook/lib/python3.7/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
828 if len(error_msgs) > 0:
829 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 830 self.__class__.__name__, "\n\t".join(error_msgs)))
831 return _IncompatibleKeys(missing_keys, unexpected_keys)
832

RuntimeError: Error(s) in loading state_dict for Target:
Missing key(s) in state_dict: "conv_layer.0.weight", "conv_layer.0.bias", "conv_layer.1.weight", "conv_layer.1.bias", "conv_layer.4.weight", "conv_layer.4.bias", "conv_layer.5.weight", "conv_layer.5.bias", "conv_layer.7.weight", "conv_layer.7.bias", "conv_layer.8.weight", "conv_layer.8.bias", "conv_layer.11.weight", "conv_layer.11.bias", "conv_layer.12.weight", "conv_layer.12.bias", "conv_layer.14.weight", "conv_layer.14.bias", "conv_layer.15.weight", "conv_layer.15.bias", "conv_layer.18.weight", "conv_layer.18.bias", "conv_layer.19.weight", "conv_layer.19.bias", "conv_layer.21.weight", "conv_layer.21.bias", "conv_layer.22.weight", "conv_layer.22.bias", "conv_layer.24.weight", "conv_layer.24.bias"

Thank you in advance :)

The attack accuracy on CIFAR-10

Hi! I trained ResNet18 on CIFAR-10 and used FGSM to generate adversarial images.
But I found that the accuracy didn't decrease significantly when eps was set to 2/255, 4/255, or 8/255.
The accuracy on clean images is 93%. With FGSM, the accuracy is 55.8% for eps=2/255, 55.3% for eps=4/255, and 53.5% for eps=8/255.
I think the accuracy should drop drastically as eps increases, so this test result puzzles me a little.

Weird colors in the output of my attack

I am using the PyTorch CIFAR-10 dataset with the following transformations:

transform_train = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

The clean images look like this:
[Screen Shot 2020-11-09 at 5 26 46 PM]
The attacked images, with attack = CW(net, c=0.0004, kappa=0, steps=10, lr=0.01), look like this:
[Screen Shot 2020-11-09 at 5 27 06 PM]

From the original paper, we expect the images to show little to no visible difference, but I am seeing a big color offset. Do you have any hint on what I could be doing wrong in my code?

Thank you for the library, it is of GREAT HELP.

How can I attack an image that has been normalized?

How can I add a perturbation to an image that is normalized by the dataloader? Many models were trained with normalization, so I must keep this preprocessing. I have seen the comment in the code:

images: :math:(N, C, H, W) where N = number of batches, C = number of channels, H = height and W = width. It must have a range [0, 1].
Does that mean normalization is not supported yet?

Top-K Attack?

Initially searching through the code and documentation, it's not clear to me if a top-k attack (i.e., the true class should not occur within the top-k predictions) is implemented.
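
A top-k variant is not listed as a separate attack class, but top-k robustness of any attack's output can at least be measured with a small sketch like this (model, adv_images, and labels are assumed):

import torch

def topk_success_rate(model, adv_images, labels, k=5):
    with torch.no_grad():
        topk = model(adv_images).topk(k, dim=1).indices      # (N, k) predicted classes
    labels = labels.to(topk.device)
    still_in_topk = (topk == labels.view(-1, 1)).any(dim=1)  # is the true label still in the top-k?
    return (~still_in_topk).float().mean().item()            # fraction pushed out of the top-k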

Cannot save adversarial examples when using MultiAttack

Hi,

I encountered errors when using MultiAttack.
Here is my code segment:

# cifar100_eval_loader = ... (initialize my dataloader)
eps = 8 / 255
alpha = 2 / 255
atk1 = TIFGSM(srcnet, eps=eps, alpha=alpha, steps=40)
atk2 = DIFGSM(srcnet, eps=eps, alpha=alpha, steps=40)
atk = MultiAttack([atk1, atk2])
atk.set_return_type('int')  # Save as integer.
atk.save(data_loader=cifar100_eval_loader, save_path=save_path, verbose=True)

However, the following error message occurs:

- Save complete!
Traceback (most recent call last):
  File "/tmp2/attack/src/adv_attack.py", line 162, in <module>
    atk.save(data_loader=cifar100_eval_loader, save_path=save_path, verbose=True)
  File "/home/tsunghan/miniconda3/envs/SPML/lib/python3.9/site-packages/torchattacks/attacks/multiattack.py", line 105, in save
    rob_acc, l2, elapsed_time = super().save(data_loader, save_path, verbose, return_verbose)
TypeError: cannot unpack non-iterable NoneType object

I wonder whether the MultiAttack.save() method contains a bug?
Thanks!

What does w.detach_() do in cw.py?

Hi, I'm a newbie just started to study adversarial examples.

I'm curious what w.detach_() (line 76) does in torchattacks/attacks/cw.py.

I thought torch should record all operations applied to w.

Reason for cloning the images and labels?

Hello,
thank you for the library.

I was wondering why you clone and detach the images as the first step of many attacks.
Cloning doubles the amount of memory used on the GPU, which reduces speed.
Wouldn't zeroing the gradient for the images be sufficient?

Use SciPy >= 1.8.0

It is recommended to use SciPy >= 1.8.0 to be compatible with other algorithm libraries. Lower versions of SciPy cause the following statement to report an error.

#D:\anaconda3\envs\nn\Lib\site-packages\torchattacks\attacks\_differential_evolution.py
from scipy.optimize.optimize import _status_message

==> from scipy.optimize._optimize import _status_message
ART has the same problem and is being fixed.

CW Attack not working as desired

I am using the targeted CW attack on a convolutional neural network trained on FashionMNIST. However, I do not see any change in accuracy after the CW attack. I have tried c = 1, 10, and 100, different learning rates such as 0.001 and 0.0001, and different numbers of steps, such as 100 and 1000.
However, when I use targeted FGSM and PGD attacks, those work fine.

I need your support to figure out why the CW attack isn't working.
The file with my code is attached.
FMNIST_CNN.txt

Question about adversarial training

If we specify the attack used in adversarial training outside the training loop, as in the MNIST adversarial training demo, will the attack's model parameters be updated along with the training?
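
The Attack object keeps a reference to the model, so in the usual pattern it attacks with the model's current weights even when constructed once outside the loop. A sketch of that pattern (assumptions: model, train_loader, optimizer, and criterion already exist):

import torchattacks

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7)  # built once, outside the loop

for images, labels in train_loader:
    adv_images = atk(images, labels)            # uses the model's current parameters
    labels = labels.to(adv_images.device)
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(adv_images), labels)
    loss.backward()
    optimizer.step()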

Waiting for the update of JSMA attack method

Hello, I'm a junior student and I have recently been learning about adversarial examples. I am very lucky to have found this project; it helps me a lot.
I am just wondering whether this project will support the JSMA attack method.
Waiting for the update of the JSMA attack method.

Can the attacked model be trained with image augmentation?

Hi Harry,

Thanks for your awesome lib. When I use your code, I find that the attack module requires the input images to be in the range of [0, 1]. Does that mean the model to be attacked has to be trained with inputs in the range of [0, 1]? What if I have a model that is trained with augmented images? Is there a way to include the image transformation in the attack process?

Best,
Hao

small implementation error in DeepFool attack

I am quite new to adversarial attacks, so the current behaviour might be intended.

I currently use your DeepFool implementation as an estimate for the distance to the decision boundary of a classifier, as suggested in the paper: https://arxiv.org/abs/2002.01810v1

If the model already predicts a wrong label for a given input sample, then the perturbation returned by the DeepFool attack is the original image, whereas I think that just returning torch.zeros_like(image) would be formally correct.

Changing this would not only correct the estimate of the distance to the decision boundary, but also avoid doubling the final adversarial sample (image + perturbation) when the classifier misclassifies from the start.
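
A workaround sketch on the caller's side (not a library change): treat the perturbation as zero when the clean prediction is already wrong. Here model, a single (C, H, W) image tensor in [0, 1], and an integer label are assumed, all on the same device:

import torch
import torchattacks

atk = torchattacks.DeepFool(model, steps=50)

def boundary_distance_estimate(model, image, label):
    with torch.no_grad():
        already_wrong = model(image.unsqueeze(0)).argmax(dim=1).item() != label
    if already_wrong:
        return torch.zeros_like(image)                 # zero perturbation for misclassified inputs
    adv = atk(image.unsqueeze(0), torch.tensor([label]))
    return adv.squeeze(0).to(image.device) - image     # DeepFool perturbation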

CIFAR-10 attacks question

Standard accuracy: without adversarial training the accuracy is 40%; with it, only 30%.

# !/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author zengxiaohui
# Datatime:8/26/2021 8:56 AM
# @File:test_FGSM
import torch
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim
import torch.nn as nn
from tqdm import tqdm

from python_developer_tools.cv.utils.torch_utils import init_seeds
from python_developer_tools.cv.train.对抗训练.adversarialattackspytorchmaster.torchattacks import *

transform = transforms.Compose(
    [transforms.ToTensor(),# ToTensor : [0, 255] -> [0, 1]
     # transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
     ])

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def shufflenet_v2_x0_5(nc, pretrained):
    model_ft = torchvision.models.shufflenet_v2_x0_5(pretrained=pretrained)
    num_ftrs = model_ft.fc.in_features
    model_ft.fc = nn.Linear(num_ftrs, nc)
    return model_ft


if __name__ == '__main__':
    root_dir = "/home/zengxh/datasets"
    # os.environ['CUDA_VISIBLE_DEVICES'] = '1'
    epochs = 50
    batch_size = 1024
    num_workers = 8
    classes = 10

    init_seeds(1024)

    trainset = torchvision.datasets.CIFAR10(root=root_dir, train=True, download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=num_workers,
                                              pin_memory=True)

    testset = torchvision.datasets.CIFAR10(root=root_dir, train=False, download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=num_workers)

    model = shufflenet_v2_x0_5(classes, True)
    model.cuda()
    model.train()

    criterion = nn.CrossEntropyLoss()
    # SGD with momentum
    optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

    atks = [
        FGSM(model, eps=8 / 255),
        BIM(model, eps=8 / 255, alpha=2 / 255, steps=100),
        RFGSM(model, eps=8 / 255, alpha=2 / 255, steps=100),
        CW(model, c=1, lr=0.01, steps=100, kappa=0),
        PGD(model, eps=8 / 255, alpha=2 / 225, steps=100, random_start=True),
        PGDL2(model, eps=1, alpha=0.2, steps=100),
        EOTPGD(model, eps=8 / 255, alpha=2 / 255, steps=100, eot_iter=2),
        FFGSM(model, eps=8 / 255, alpha=10 / 255),
        TPGD(model, eps=8 / 255, alpha=2 / 255, steps=100),
        MIFGSM(model, eps=8 / 255, alpha=2 / 255, steps=100, decay=0.1),
        VANILA(model),
        GN(model, sigma=0.1),
        APGD(model, eps=8 / 255, steps=100, eot_iter=1, n_restarts=1, loss='ce'),
        APGD(model, eps=8 / 255, steps=100, eot_iter=1, n_restarts=1, loss='dlr'),
        APGDT(model, eps=8 / 255, steps=100, eot_iter=1, n_restarts=1),
        FAB(model, eps=8 / 255, steps=100, n_classes=10, n_restarts=1, targeted=False),
        FAB(model, eps=8 / 255, steps=100, n_classes=10, n_restarts=1, targeted=True),
        Square(model, eps=8 / 255, n_queries=5000, n_restarts=1, loss='ce'),
        AutoAttack(model, eps=8 / 255, n_classes=10, version='standard'),
        OnePixel(model, pixels=5, inf_batch=50),
        DeepFool(model, steps=100),
        DIFGSM(model, eps=8 / 255, alpha=2 / 255, steps=100, diversity_prob=0.5, resize_rate=0.9)
    ]

    bestatk = None
    bestRobustAcc = 0
    for atk in atks:
        print("-" * 70)
        print(atk)
        correct = 0
        model.eval()
        for j, (images, labels) in tqdm(enumerate(trainloader)):
            adv_images = atk(images, labels)
            outputs = model(adv_images.cuda())
            _, predicted = torch.max(outputs.data, 1)
            correct += (predicted.cpu() == labels).sum()
        bestRobustAcc_now = correct / len(trainset)
        print('Robust Accuracy: %.4f %%' % (bestRobustAcc_now))
        if bestRobustAcc < bestRobustAcc_now:
            bestatk = atk
            bestRobustAcc = bestRobustAcc_now

    for epoch in range(epochs):
        train_loss = 0.0
        for i, (inputs, labels) in tqdm(enumerate(trainloader)):
            inputs = atk(inputs, labels).cuda()
            labels = labels.cuda()

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward
            outputs = model(inputs)
            # loss
            loss = criterion(outputs, labels)
            # backward
            loss.backward()
            # update weights
            optimizer.step()

            # print statistics
            train_loss += loss

        scheduler.step()
        print('%d/%d loss: %.3f' % (epochs, epoch + 1, train_loss / len(trainset)))

    # Standard Accuracy
    correct = 0
    model.eval()
    for j, (images, labels) in tqdm(enumerate(testloader)):
        outputs = model(images.cuda())
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted.cpu() == labels).sum()
    print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / len(testset)))

    # Robust Accuracy
    correct = 0
    model.eval()
    atk.set_training_mode(training=False)
    for j, (images, labels) in tqdm(enumerate(testloader)):
        images = atk(images, labels).cuda()
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted.cpu() == labels).sum()
    print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / len(testset)))

Cannot set requires_grad in deepfool

The DeepFool method fails due to the error below:

  File "/dccstor/ddig/jbtang/tools/anaconda3/envs/python3/lib/python3.6/site-packages/torchattacks/attacks/deepfool.py", line 29, in forward
    image.requires_grad = True
RuntimeError: you can only change requires_grad flags of leaf variables.
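
The usual fix for this error (a general PyTorch pattern, sketched here rather than a quote of the library's code) is to make the tensor a fresh leaf before enabling gradients:

import torch

image = image.clone().detach()  # a fresh leaf tensor with no autograd history
image.requires_grad_(True)      # now setting requires_grad is allowed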

normalize question

Hi, I have a question about normalization. Your demo White Box Attack (ImageNet).ipynb uses the Normalize class you designed. Is it the same as transforms.Normalize?

If they are different, what are their differences?
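
In essence, such a Normalize layer applies the same per-channel operation as transforms.Normalize, but inside the model, so the attack can still work on [0, 1] inputs. A sketch of such a layer (details may differ from the demo's exact code):

import torch
import torch.nn as nn

class Normalize(nn.Module):
    def __init__(self, mean, std):
        super().__init__()
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std

# model = nn.Sequential(Normalize(mean, std), base_model)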

Device error

Hi Harry

I just tried the demo code; however, it pops up the following error.

torchattacks/attack.py", line 26, in init
self.device = next(model.parameters()).device
StopIteration:
