rakutentech / stadv

Spatially Transformed Adversarial Examples with TensorFlow

License: MIT License

stadv's Introduction

stAdv: Spatially Transformed Adversarial Examples with TensorFlow

Deep neural networks have been shown to be vulnerable to adversarial examples: very small perturbations of the input having a dramatic impact on the predictions. In this package, we provide a TensorFlow implementation for a new type of adversarial attack based on local geometric transformations: Spatially Transformed Adversarial Examples (stAdv).

Our implementation follows the procedure from the original paper:

Spatially Transformed Adversarial Examples
Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, Dawn Song
ICLR 2018 (conference track), arXiv:1801.02612

If you use this code, please cite the following paper for which this implementation was originally made:

Robustness of Rotation-Equivariant Networks to Adversarial Perturbations
Beranger Dumont, Simona Maggio, Pablo Montalvo
ICML 2018 Workshop on "Towards learning with limited labels: Equivariance, Invariance, and Beyond", arXiv:1802.06627

Installation

First, make sure you have installed TensorFlow (CPU or GPU version).

Then, to install the stadv package, simply run

$ pip install stadv

Usage

A typical use of this package is as follows:

1. Start with a trained network implemented in TensorFlow.
2. Insert the stadv.layers.flow_st layer in the graph immediately after the input layer. This layer perturbs the input images with local, differentiable geometric transformations parameterized by input flow tensors.
3. At the end of the graph, after computing the logits, insert the computation of an adversarial loss (to fool the network) and of a flow loss (to enforce local smoothness), e.g. using stadv.losses.adv_loss and stadv.losses.flow_loss, respectively. Define the final loss to be optimized as a combination of the two.
4. Find the flows that minimize this loss, e.g. by using the L-BFGS-B optimizer provided in stadv.optimization.lbfgs.
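
To make the idea of a flow-parameterized perturbation more concrete, here is a minimal NumPy sketch of bilinear resampling driven by a flow field. It is not the code of stadv.layers.flow_st: the function name warp_with_flow, the single-image (H, W) layout, the ordering of the two flow components, and the border clipping are all assumptions made for illustration.

```python
import numpy as np


def warp_with_flow(image, flow):
    """Bilinearly resample a single-channel image of shape (H, W).

    flow has shape (2, H, W). Assumed convention (for this sketch only):
    output pixel (y, x) reads the input at (y + flow[0], x + flow[1]).
    Requires H >= 2 and W >= 2.
    """
    H, W = image.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Continuous sampling locations, kept inside the image.
    sy = np.clip(ys + flow[0], 0, H - 1)
    sx = np.clip(xs + flow[1], 0, W - 1)
    # Top-left integer corner, chosen so that the +1 corner stays in range.
    y0 = np.clip(np.floor(sy).astype(int), 0, H - 2)
    x0 = np.clip(np.floor(sx).astype(int), 0, W - 2)
    y1, x1 = y0 + 1, x0 + 1
    # Bilinear weights: distance to the opposite corner along each axis.
    wy1, wx1 = sy - y0, sx - x0
    wy0, wx0 = 1.0 - wy1, 1.0 - wx1
    return (wy0 * wx0 * image[y0, x0] + wy0 * wx1 * image[y0, x1]
            + wy1 * wx0 * image[y1, x0] + wy1 * wx1 * image[y1, x1])


image = np.arange(16, dtype=np.float64).reshape(4, 4)

# A zero flow leaves the image unchanged.
identity = warp_with_flow(image, np.zeros((2, 4, 4)))

# Shift the sampling grid half a pixel along the x axis.
half_shift = np.zeros((2, 4, 4))
half_shift[1] = 0.5
shifted = warp_with_flow(image, half_shift)
```

Because every operation is a weighted sum of pixel values, the output is differentiable with respect to the flow almost everywhere, which is what allows the flows to be optimized against a loss.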

An end-to-end example use of the library is provided in the notebook demo/simple_mnist.ipynb.

Documentation

The documentation of the API is available at http://stadv.readthedocs.io/en/latest/stadv.html.

Testing

You can run all unit tests with

$ make init
$ make test

stadv's Issues

Incorrect implementation of flow_st()

In my opinion, lines 114 - 117 in layers.py should be:

wa = (1. - (sampling_grid_x - x0)) * (1. - (sampling_grid_y - y0))
wb = (1. - (sampling_grid_x - x0)) * (1. - (y1 - sampling_grid_y))
wc = (1. - (x1 - sampling_grid_x)) * (1. - (sampling_grid_y - y0))
wd = (1. - (x1 - sampling_grid_x)) * (1. - (y1 - sampling_grid_y))

instead of:

wa = (x1 - sampling_grid_x) * (y1 - sampling_grid_y)
wb = (x1 - sampling_grid_x) * (sampling_grid_y - y0)
wc = (sampling_grid_x - x0) * (y1 - sampling_grid_y)
wd = (sampling_grid_x - x0) * (sampling_grid_y - y0)

according to eq. (1) in Xiao et al.

Am I missing something?
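
One way to check this numerically is the NumPy sketch below. It assumes that the two corners are exactly one pixel apart (x1 = x0 + 1 and y1 = y0 + 1), in which case 1 - (x - x0) equals x1 - x and the two sets of weights are algebraically identical. The sampled coordinates are made up for the test. If layers.py clips x0 or x1 at the image border so that they coincide, the two forms could differ there. The code quoted in this issue does not show whether it does.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sampling coordinates strictly inside a 5x5 image.
x = rng.uniform(0.0, 4.0, size=1000)
y = rng.uniform(0.0, 4.0, size=1000)
x0, y0 = np.floor(x), np.floor(y)
x1, y1 = x0 + 1.0, y0 + 1.0  # assumption: corners are one pixel apart

# Weights as proposed in this issue.
proposed = np.stack([
    (1. - (x - x0)) * (1. - (y - y0)),
    (1. - (x - x0)) * (1. - (y1 - y)),
    (1. - (x1 - x)) * (1. - (y - y0)),
    (1. - (x1 - x)) * (1. - (y1 - y)),
])

# Weights as quoted from layers.py.
current = np.stack([
    (x1 - x) * (y1 - y),
    (x1 - x) * (y - y0),
    (x - x0) * (y1 - y),
    (x - x0) * (y - y0),
])

max_diff = np.abs(proposed - current).max()  # rounding error only
weights_sum = current.sum(axis=0)            # bilinear weights sum to 1
```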

My question about L_adv

The picture above is from losses.py. My understanding is that our goal is to maximize the distance between the logit of the target class and the logits of the other classes, so I think it should be L_adv_2 - L_adv_1 instead of L_adv_1 - L_adv_2.
Am I missing something?
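
The sign convention can be explored with a small NumPy sketch. Since the picture is not available here, the sketch assumes that L_adv_1 is the largest logit among the non-target classes and L_adv_2 is the logit of the target class. The helper name targeted_margin is hypothetical. Under that assumption the quantity is a loss to be minimized, so a smaller (more negative) value means the target class is winning by a larger margin.

```python
import numpy as np


def targeted_margin(logits, target):
    """Largest non-target logit minus the target logit, per example."""
    logits = np.asarray(logits, dtype=np.float64)
    rows = np.arange(logits.shape[0])
    target_logit = logits[rows, target]
    others = logits.copy()
    others[rows, target] = -np.inf  # mask out the target class
    return others.max(axis=1) - target_logit


# The same logits evaluated against two different targets.
logits = np.array([[2.0, 5.0, 1.0],
                   [2.0, 5.0, 1.0]])
targets = np.array([1, 0])
margins = targeted_margin(logits, targets)
# margins[0] is negative: class 1 is already predicted, the attack succeeded.
# margins[1] is positive: class 0 is not predicted yet, so minimizing helps.
```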

Gradients return NaN values for flow loss

When running the simple example below, gradient_val in lbfgs starts to contain only NaN values after a certain number of iterations. This causes the L-BFGS solver to terminate with the message "ABNORMAL_TERMINATION_IN_LNSRCH" and to output a NaN loss.

import random

import numpy as np
import stadv
import tensorflow as tf

random.seed(0)
np.random.seed(0)

num_classes = 10
batch_size = 7
C = 1
H = 5
W = 5
tau_val = 0.05


def sample_net(x):
    left_ones = tf.ones((batch_size, H, 1, W))
    right_ones = tf.ones((batch_size, H, C, num_classes))

    bilinear_sum = tf.squeeze(
        tf.reduce_sum(
            tf.matmul(tf.matmul(left_ones, x), right_ones),
            1
        )
    )

    return bilinear_sum


test_images = np.random.random_sample((batch_size, H, W, C)).astype(np.float32)
target_labels = np.random.randint(0, num_classes, batch_size)

flows_x0 = np.random.random_sample((batch_size, 2, H, W))

images = tf.placeholder(tf.float32, shape=[None, H, W, C], name='images')
targets = tf.placeholder(tf.int64, shape=[None], name='targets')
flows = tf.placeholder(tf.float32, shape=[None, 2, H, W], name='flows')
tau = tf.placeholder_with_default(
    tf.constant(tau_val, dtype=tf.float32), shape=[], name='tau'
)

perturbed_images = stadv.layers.flow_st(images, flows, data_format='NHWC')
logits = sample_net(perturbed_images)

loss_adv = stadv.losses.adv_loss(logits, targets)
loss_flow = stadv.losses.flow_loss(flows)
loss = loss_adv + tau * loss_flow

with tf.Session() as sess:
    tf.global_variables_initializer().run()

    tf_results = stadv.optimization.lbfgs(
        loss,
        flows,
        flows_x0=flows_x0,
        feed_dict={images: test_images, targets: target_labels},
        sess=sess
    )

print(tf_results['loss'])
print(tf_results['info'])

Using the TensorFlow Debugger I was able to pinpoint the problem to the tf.sqrt of the flow_loss. This can be verified by setting tau_val = 0 (essentially disabling the flow_loss), which leads to convergence and a loss of 0.

Do you know how to fix this problem?
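
A likely mechanism, sketched here in NumPy rather than TensorFlow: the derivative of sqrt(s) is 1 / (2 sqrt(s)), which is infinite at s = 0, so the gradient of a Euclidean norm becomes 0 / 0 whenever two neighbouring flow vectors are identical. A common workaround is to add a small constant inside the square root. The helper norm_and_grad and the value eps=1e-8 are assumptions for illustration, not the library's own fix.

```python
import numpy as np


def norm_and_grad(d, eps=0.0):
    """Return sqrt(sum(d**2) + eps) and its analytic gradient w.r.t. d."""
    d = np.asarray(d, dtype=np.float64)
    norm = np.sqrt(np.sum(d ** 2) + eps)
    with np.errstate(divide="ignore", invalid="ignore"):
        grad = d / norm  # 0 / 0 when d is all zeros and eps == 0
    return norm, grad


# Difference between two neighbouring pixels that share the same flow.
zero_diff = np.zeros(2)

_, grad_plain = norm_and_grad(zero_diff)           # NaN gradient
_, grad_safe = norm_and_grad(zero_diff, eps=1e-8)  # finite gradient
```

With the constant in place the loss is shifted by a negligible amount, while the gradient stays finite everywhere.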

Testing in batch doesn't give adversarial examples

Hello Berangerd, I tried testing in batches as you suggested. However, the predicted labels for the perturbed images don't match the randomly selected targets. Could you please guide me on how to effectively generate spatial adversarial examples in batches?

Thank you!

How to gather slices for single channel images?

Hi Berangerd,

Thanks for releasing the code for spatial transformation.
I have a question about the lines in layers.py that get the pixel values at the corner coordinates, for example:

Ia = tf.gather_nd(images, tf.stack([b, y0, x0], 3), name='Ia')

The shape of the indices will be [B, H, W, 3]. For color images with shape [B, H, W, 3], after gathering slices, it gets the output with shape [B, H, W, 3].

But for single-channel images with shape [B, H, W, 1], I do not understand how tf.gather_nd can gather the slices given indices of shape [B, H, W, 3] and produce an output of shape [B, H, W, 1].

Could you please guide me in understanding how this works?
Thanks for your help!
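
The shapes can be checked with a NumPy analogue of the gather. In tf.gather_nd, the last dimension of the indices tensor (3 here) is the number of leading dimensions of images being indexed (batch, row, column), not the number of channels, so each index triple selects a whole channel vector of length C. The sketch below uses NumPy advanced indexing, which follows the same rule. The sizes and random indices are made up.

```python
import numpy as np

B, H, W = 2, 3, 4
rng = np.random.default_rng(0)

# One (batch, row, column) index triple per output pixel.
b = np.broadcast_to(np.arange(B)[:, None, None], (B, H, W))
y0 = rng.integers(0, H, size=(B, H, W))
x0 = rng.integers(0, W, size=(B, H, W))

shapes = {}
for C in (1, 3):
    images = rng.random((B, H, W, C))
    # NumPy analogue of tf.gather_nd(images, tf.stack([b, y0, x0], 3)):
    # the first three axes are indexed, the channel axis is kept whole.
    Ia = images[b, y0, x0]
    shapes[C] = Ia.shape
```

The output shape is the index shape without its last dimension, [B, H, W], followed by the remaining dimensions of images, [C], for any number of channels.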

LBFGS learning rate

Hi,

I was trying to reproduce your results in PyTorch and I am struggling to get proper results with L-BFGS. It is not clear to me how you set the learning rate for L-BFGS, or which learning rate was used to produce the results in the paper. It is not mentioned in the paper, nor could I figure it out from the code. Could you please help me with that?
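
For context, SciPy's L-BFGS-B routine has no learning-rate parameter: the step length at each iteration is chosen by a line search. The sketch below minimizes a toy quadratic that stands in for the attack loss. It assumes, without confirmation from the source, that stadv.optimization.lbfgs relies on a SciPy-style L-BFGS-B solver. The function loss_and_grad is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def loss_and_grad(z):
    """Toy quadratic standing in for the attack loss.

    Returns the loss value and its gradient, as expected with jac=True.
    """
    residual = z - 3.0
    return float(np.sum(residual ** 2)), 2.0 * residual


# No learning rate is passed: the line search picks each step length.
result = minimize(loss_and_grad, x0=np.zeros(4), jac=True, method="L-BFGS-B")
```

In PyTorch, torch.optim.LBFGS does expose an lr argument and only performs a line search when line_search_fn="strong_wolfe" is set, which may explain differences in behaviour between the two.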

Incorrect implementation of adv_loss()

In my opinion, line 115 in losses.py should be:

return tf.maximum(L_adv_1 - L_adv_2, kappa, name='L_adv')

instead of:

return tf.maximum(L_adv_1 - L_adv_2, - kappa, name='L_adv')

according to eq. (3) in Xiao et al.

Am I missing something?
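
The effect of the two clamps can be compared numerically. The sketch below assumes that L_adv_1 - L_adv_2 is the largest non-target logit minus the target logit, and uses a made-up kappa of 1. The helper clamp_margin is hypothetical and simply mirrors tf.maximum.

```python
import numpy as np


def clamp_margin(margin, floor):
    """Elementwise max(margin, floor), mirroring tf.maximum."""
    return np.maximum(margin, floor)


kappa = 1.0  # hypothetical confidence value
margins = np.array([-3.0, -0.5, 0.5, 3.0])  # L_adv_1 - L_adv_2

with_pos_kappa = clamp_margin(margins, kappa)   # floor at +kappa
with_neg_kappa = clamp_margin(margins, -kappa)  # floor at -kappa
```

With a floor of +kappa, the loss is already flat at a margin of 0.5, where the target class is not yet predicted, so it provides no gradient there. With a floor of -kappa, the loss keeps decreasing until the target logit leads by kappa. For kappa = 0 the two forms coincide.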

Test the demo in batch

Hi Berangerd,
Can this demo be tested in batches? :)

A new bug

When I ran the program to calculate the attack success rate (ASR), I ran into a new problem.

Model A correctly classifies 9913 clean images, and I planned to attack all of them to calculate the ASR. But the program stopped while attacking the 2039th clean image, and it looked like a memory leak.
So I finalized the graph to check the program.

The picture above shows that tf.gradients() builds a new node at every iteration, and eventually there are too many nodes to run.
What do you think about it?

About a problem of the demo

I replaced the model in your demo with model A from the paper's experiments and tested it on a single image. However, the adversarial example produced is very different from the original image. The value of tau is 0.05. What should I do? Thank you so much!

How to change the dimension while generating the flows?

What are the suitable values for the flows variable? We were trying to change the dimension parameter from 2 to 1. However this seems incorrect.

Please help, thanks!