koukyosyumei / aijack
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
License: Apache License 2.0
Hi,
Thanks for uploading the code. I was trying to execute this with test data and was using the poison attack example notebook. I am getting this error: ValueError: zero-dimensional arrays cannot be concatenated in line 118 in poison_attack.py. Any suggestions?
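For readers hitting the same message, here is a minimal sketch of how NumPy produces it and two common workarounds. It does not diagnose line 118 of poison_attack.py; the `scalars` list is an assumed stand-in for whatever zero-dimensional values reach `np.concatenate` there.

```python
import numpy as np

# Zero-dimensional arrays (shape ()) stand in for the values being joined.
scalars = [np.array(1.0), np.array(2.0)]

message = ""
try:
    np.concatenate(scalars)  # raises: 0-d arrays have no axis to join along
except ValueError as err:
    message = str(err)
print(message)

# Workaround 1: promote each value to 1-d before concatenating.
joined = np.concatenate([np.atleast_1d(s) for s in scalars])

# Workaround 2: stack the scalars along a new axis instead.
stacked = np.stack(scalars)

print(joined.shape, stacked.shape)
```

If the inputs come from indexing a label array with a single integer, checking their `.shape` before the concatenation is usually the quickest way to confirm this is the cause.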
I tried using the DPSGD implementation in aijack to train models on different datasets and found that training takes less time than without differential privacy. Does aijack use some tool to accelerate model training, or is there a problem with my code setup? I would appreciate it very much if you could answer my question.
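One way to check whether such a timing difference is real is to time both training steps under the same conditions, with a few warm-up iterations excluded. The helper below is a hypothetical sketch, not part of aijack; `step_fn` is assumed to be any zero-argument callable that runs one training step.

```python
import time


def mean_step_time(step_fn, num_steps=20, warmup=3):
    """Return the mean wall-clock seconds per call of step_fn (hypothetical helper)."""
    # Warm-up calls are excluded so one-time setup cost does not skew the mean.
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    return (time.perf_counter() - start) / num_steps


# Placeholder workload; replace with one DP and one non-DP training step.
def dummy_step():
    sum(i * i for i in range(1000))


print(mean_step_time(dummy_step, num_steps=5, warmup=1))
```

Comparing the two means with identical batch size, epoch count and device helps rule out differences in the setup itself.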
Using type() instead of isinstance() for a typecheck.
if type(paillier_array) == list:
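A small sketch of the difference: `type(x) == list` rejects subclasses of `list`, while `isinstance(x, list)` accepts them. The `EncryptedList` class and the two helper functions are hypothetical names used only for illustration; they are not part of aijack.

```python
class EncryptedList(list):
    """Hypothetical list subclass, used only to illustrate the difference."""


def is_list_by_type(obj):
    # Exact type comparison: subclasses of list do not match.
    return type(obj) == list


def is_list_by_isinstance(obj):
    # isinstance() also accepts subclasses of list.
    return isinstance(obj, list)


values = EncryptedList([1, 2, 3])
print(is_list_by_type(values))        # False
print(is_list_by_isinstance(values))  # True
```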
In the section Supported Algorithms, inside the first row of the table, FedMD should be changed to FedKD.
Unused variable 'i'
for i, data in enumerate(dataloader, 0):
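If the index is never used in the loop body, one option is to iterate over the loader directly. The sketch below uses a plain list as a stand-in for `dataloader` (an assumption, so that it runs without PyTorch).

```python
# Stand-in for a DataLoader: any iterable of (inputs, labels) batches works.
dataloader = [([0.1, 0.2], [0, 1]), ([0.3, 0.4], [1, 0])]

# Original pattern: the index `i` is assigned but never read.
with_index = []
for i, data in enumerate(dataloader, 0):
    with_index.append(data)

# Equivalent loop without the unused index.
without_index = []
for data in dataloader:
    without_index.append(data)

print(with_index == without_index)  # True
```

If `enumerate()` has to stay for some other reason, naming the index `_` is the usual way to mark it as intentionally unused.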
I want to create a virtual environment in anaconda on windows to install aijack, how can I do it?
Hi,
After executing this:
!pip install git+https://github.com/Koukyosyumei/AIJack
I faced this error:
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Can you help me?
Function parameter 'orders' should be passed by const reference.
std::vector<double> orders,
Lambda may not be necessary
np.vectorize(lambda x: pk.encrypt(x))(grad.detach().numpy())
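A sketch of the simplification: a bound method is already a callable, so it can be passed to `np.vectorize` directly. `FakePublicKey` is a hypothetical stand-in for the real key object, and a plain NumPy array replaces `grad.detach().numpy()` so that the example runs without PyTorch.

```python
import numpy as np


class FakePublicKey:
    """Hypothetical stand-in for a public key; not aijack's class."""

    def encrypt(self, x):
        # Placeholder "encryption" so the example runs: just shift the value.
        return x + 1.0


pk = FakePublicKey()
grad = np.array([0.5, -1.0, 2.0])

# Original form: the lambda only forwards its argument.
with_lambda = np.vectorize(lambda x: pk.encrypt(x))(grad)

# Simplified form: pass the bound method itself.
without_lambda = np.vectorize(pk.encrypt)(grad)

print(np.array_equal(with_lambda, without_lambda))
```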
The installation fails most of the time. Please help.
Error:
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting aijack
Downloading aijack-0.0.1a1.tar.gz (127 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.5/127.5 KB 1.9 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: aijack
error: subprocess-exited-with-error
× Building wheel for aijack (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for aijack (pyproject.toml) ... error
ERROR: Failed building wheel for aijack
Failed to build aijack
ERROR: Could not build wheels for aijack, which is required to install pyproject.toml-based projects
Hi developer.
I am running FedAVG with use_gradient=False in FedAVGAPI (i.e. updating the global model with the clients' parameters) and initializing a FedAVGServer object with a list of clients' IDs (according to the comment in the source code, client can be assigned clients' IDs). This triggers a bug in receive_local_parameters() of FedAVGServer, lines 96-98:
def receive_local_parameters(self):
    """Receive local parameters"""
    self.uploaded_parameters = [c.upload_parameters() for c in self.clients]
The function above only supports client objects (for c in self.clients, where c is a client object). The same bug will happen in receive_local_gradients(), lines 90-94, even though I haven't tried it with clients' IDs.
As a temporary fix, I modified the source code in FedAVGAPI, lines 67-81, from:
def run(self):
    self.server.force_send_model_state_dict = True
    self.server.distribute()
    self.server.force_send_model_state_dict = False
    for i in range(self.num_communication):
        self.local_train(i)
        self.server.receive(use_gradients=self.use_gradients)
        if self.use_gradients:
            self.server.update_from_gradients()
        else:
            self.server.update_from_parameters()
        self.server.distribute()
        self.custom_action(self)
to
def run(self):
    self.server.force_send_model_state_dict = True
    self.server.distribute()
    self.server.force_send_model_state_dict = False
    for i in range(self.num_communication):
        self.local_train(i)
        ##############
        # For FedAVGServer:
        # reassigned server.clients with a list of client objects, instead of a list of IDs
        if not self.use_gradients:
            self.server.clients = self.clients
        ##############
        self.server.receive(use_gradients=self.use_gradients)
        if self.use_gradients:
            self.server.update_from_gradients()
        else:
            self.server.update_from_parameters()
        self.server.distribute()
        self.custom_action(self)
With this change, the initialization of FedAVGServer.clients seems a little redundant, because FedAVGAPI will reinitialize the whole list of clients in the FedAVGServer object.
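An alternative to overwriting the server's client list would be to resolve IDs to client objects at the point of use. The sketch below is purely hypothetical: `DummyClient`, `resolve_clients` and the `registry` dict are illustrative names, not aijack's API, and only the `upload_parameters()` call is taken from the code quoted above.

```python
class DummyClient:
    """Hypothetical client that only mimics the upload_parameters() call."""

    def __init__(self, user_id, params):
        self.user_id = user_id
        self.params = params

    def upload_parameters(self):
        return self.params


def resolve_clients(clients_or_ids, registry):
    """Return client objects, looking up any entry that is an ID in registry."""
    resolved = []
    for c in clients_or_ids:
        if hasattr(c, "upload_parameters"):
            resolved.append(c)            # already a client object
        else:
            resolved.append(registry[c])  # treat the entry as a client ID
    return resolved


clients = [DummyClient(0, [1.0]), DummyClient(1, [2.0])]
registry = {c.user_id: c for c in clients}

# Works whether the server was given IDs or objects.
uploaded = [c.upload_parameters() for c in resolve_clients([0, 1], registry)]
print(uploaded)
```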
Hi, I am using the below code to try AIJack.
import copy
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from numpy import e
from matplotlib import pyplot as plt
import torch.optim as optim
from tqdm.notebook import tqdm
from aijack.collaborative.fedavg import FedAVGAPI, FedAVGClient, FedAVGServer
from aijack.attack.inversion import GradientInversionAttackServerManager
from torch.utils.data import DataLoader, TensorDataset
from aijack.utils import NumpyDataset
import warnings

warnings.filterwarnings("ignore")


class LeNet(nn.Module):
    def __init__(self, channel=3, hideen=768, num_classes=10):
        super(LeNet, self).__init__()
        act = nn.Sigmoid
        self.body = nn.Sequential(
            nn.Conv2d(channel, 12, kernel_size=5, padding=5 // 2, stride=2),
            nn.BatchNorm2d(12),
            act(),
            nn.Conv2d(12, 12, kernel_size=5, padding=5 // 2, stride=2),
            nn.BatchNorm2d(12),
            act(),
            nn.Conv2d(12, 12, kernel_size=5, padding=5 // 2, stride=1),
            nn.BatchNorm2d(12),
            act(),
        )
        self.fc = nn.Sequential(nn.Linear(hideen, num_classes))

    def forward(self, x):
        out = self.body(x)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out


def prepare_dataloader(path="MNIST/.", batch_size=64, shuffle=True):
    at_t_dataset_train = torchvision.datasets.MNIST(
        root=path, train=True, download=True
    )
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
    )
    dataset = NumpyDataset(
        at_t_dataset_train.train_data.numpy(),
        at_t_dataset_train.train_labels.numpy(),
        transform=transform,
    )
    dataloader = torch.utils.data.DataLoader(
        dataset, batch_size=batch_size, shuffle=shuffle, num_workers=0
    )
    return dataloader


device = torch.device("cpu")
dataloader = prepare_dataloader()
for data in dataloader:
    xs, ys = data[0], data[1]
    break
x = xs[:1]
y = ys[:1]
fig = plt.figure(figsize=(1, 1))
plt.axis("off")
plt.imshow(x.detach().numpy()[0][0], cmap="gray")
plt.show()

batch_size = 11
x_batch = xs[:batch_size]
y_batch = ys[:batch_size]
fig = plt.figure(figsize=(3, 2))
for bi in range(batch_size):
    ax = fig.add_subplot(1, batch_size, bi + 1)
    ax.imshow(x_batch[bi].detach().numpy()[0], cmap="gray")
    ax.axis("off")
plt.tight_layout()
plt.show()

torch.manual_seed(7777)
shape_img = (28, 28)
num_classes = 10
channel = 1
hidden = 588
num_seeds = 5
criterion = nn.CrossEntropyLoss()

from aijack.attack.inversion import GradientInversion_Attack

# torch.cuda.empty_cache()
net = LeNet(channel=channel, hideen=hidden, num_classes=num_classes).to(device)
pred = net(x_batch.to(device))
loss = criterion(pred, y_batch.to(device))
received_gradients = torch.autograd.grad(loss, net.parameters())
received_gradients = [cg.detach() for cg in received_gradients]
gradinversion = GradientInversion_Attack(
    net,
    (1, 28, 28),
    num_iteration=10,
    lr=1e2,
    log_interval=0,
    optimizer_class=torch.optim.SGD,
    distancename="l2",
    optimize_label=False,
    bn_reg_layers=[net.body[1], net.body[4], net.body[7]],
    group_num=5,
    tv_reg_coef=0.00,
    l2_reg_coef=0.0001,
    bn_reg_coef=0.001,
    gc_reg_coef=0.001,
)
result = gradinversion.group_attack(received_gradients, batch_size=batch_size)

fig = plt.figure(figsize=(30, 20))
for bid in range(batch_size):
    ax1 = fig.add_subplot(1, batch_size, bid + 1)
    ax1.imshow(
        (sum(result[0]) / len(result[0])).detach().cpu().numpy()[bid][0],
        cmap="gray",
    )
    ax1.axis("off")
plt.tight_layout()
plt.show()
It throws the error below. But when I set batch_size to any value less than or equal to 10, I don't get this error. Can anyone tell me what's wrong with this?
RuntimeError Traceback (most recent call last)
Cell In[26], line 28
9 received_gradients = [cg.detach() for cg in received_gradients]
11 gradinversion = GradientInversion_Attack(
12 net,
13 (1, 28, 28),
(...)
25 gc_reg_coef=0.001,
26 )
---> 28 result = gradinversion.group_attack(received_gradients, batch_size=batch_size)
31 fig = plt.figure(figsize=(30, 20))
32 for bid in range(batch_size):
File ~/dynamofl/venv/lib/python3.8/site-packages/aijack/attack/inversion/gradientinversion.py:414, in GradientInversion_Attack.group_attack(self, received_gradients, batch_size)
411 group_optimizer = []
413 for _ in range(self.group_num):
--> 414 fake_x, fake_label, optimizer = _setup_attack(
415 self.x_shape,
416 self.y_shape,
417 self.optimizer_class,
418 self.optimize_label,
419 self.pos_of_final_fc_layer,
420 self.device,
421 received_gradients,
...
---> 53 fake_label = fake_label.reshape(batch_size)
54 fake_label = fake_label.to(device)
55 return fake_label
RuntimeError: shape '[11]' is invalid for input of size 10
We would like to support MPI.
Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
sphinx-apidoc -f -o ./docs/source ./src/aijack
Hi developers!
When I'm installing aijack:
apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"
pip install aijack
The last line, pip install aijack, doesn't work for me (on a Linux server and my Windows 11 PC), but Colab works pretty fine, I've tried several models :)
Can I get any help please 🤖
Function parameter 'orders' should be passed by const reference.
std::vector<int> orders,
I followed the tutorial in the documentation to run the program, but all four methods in Reconstruct Single Data hit the same error.
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
As described above, this error occurs after the local iteration is complete. I then checked the code and found the following at line 306 of GradientInversion_Attack:
distance.backward(retain_graph=False)
I don't know if retain_graph=False is the core cause of this bug, but I really don't have a better solution right now.
Hi, thank you very much for your work. I am mostly interested in gradient inversion attacks, especially the work from the paper "See through Gradients: Image Batch Recovery via GradInversion". Following your documentation, I got good results with a smaller batch size on the MNIST dataset, but when I switched to RGB image datasets like CIFAR-100, the results were very unsatisfactory. Have you successfully reproduced the results in the paper, or do you have any suggestions for the hyperparameter settings? I'd be grateful if you could reply as soon as possible.
Describe the bug
A clear and concise description of what the bug is.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
Smartphone (please complete the following information):
Additional context
Add any other context about the problem here.
Hi developer.
When using GradientInversion_Attack.group_attack(), my GPU memory usage keeps accumulating (11441MiB) until my program is killed.
Based on my understanding, the function should release allocated GPU memory once the attacker finished one attack.
Is there any way to prevent this issue?
Hi,
In the example_poison_attack code this line:
xc_attacked, log = attacker.attack(xc, 1, X_valid, y_valid_, num_iterations=200)
Can you please explain why we pass 1 instead of passing the actual value of yc?
I'm trying to use your membership inference attack to evaluate my own model, but I started by running the code you provided to become familiar with every line.
I got weird results (I did not change anything in your code):
train_accuracy: 0.996
test_accuracy: 0.5535 (very low accuracy for CIFAR-10)
and
overall auc is 0.519427 (a random result, which is not expected for a model trained without DP-SGD)
Can you please explain what is wrong?
Building wheels for collected packages: aijack
Building wheel for aijack (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for aijack (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [358 lines of output]
ERROR: Failed building wheel for aijack
Failed to build aijack
ERROR: Could not build wheels for aijack, which is required to install pyproject.toml-based projects
We are trying to execute the example_poison_attack.ipynb code, and we faced this error:
FileNotFoundError Traceback (most recent call last)
in ()
----> 1 X_train = np.load("data/X_train.npy", allow_pickle=True)
2 y_train = np.load("data/y_train.npy", allow_pickle=True)
3 X_valid = np.load("data/X_valid.npy", allow_pickle=True)
4 y_valid = np.load("data/y_valid.npy", allow_pickle=True)
/usr/local/lib/python3.7/dist-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
415 own_fid = False
416 else:
--> 417 fid = stack.enter_context(open(os_fspath(file), "rb"))
418 own_fid = True
419
FileNotFoundError: [Errno 2] No such file or directory: 'data/X_train.npy'
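Before calling `np.load`, it can help to confirm that the four files named in the traceback exist relative to the notebook's working directory. The helper below is a hypothetical sketch using only the standard library; `find_missing` is not part of aijack or NumPy.

```python
from pathlib import Path


def find_missing(data_dir, names):
    """Return the file names from `names` that are not present in data_dir."""
    base = Path(data_dir)
    return [name for name in names if not (base / name).is_file()]


# File names taken from the traceback above.
expected = ["X_train.npy", "y_train.npy", "X_valid.npy", "y_valid.npy"]
missing = find_missing("data", expected)

if missing:
    print(f"Missing under {Path('data').resolve()}: {missing}")
else:
    print("All data files found.")
```

Printing the resolved path makes it clear which directory the notebook is actually reading from, which is often the source of this error on Colab.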