
fair_flearn's Introduction

Fair Resource Allocation in Federated Learning

This repository contains the code and experiments for the paper:

Fair Resource Allocation in Federated Learning

ICLR '20

Preparation

Download Dependencies

pip3 install -r requirements.txt

Generate Datasets

See the README files in separate data/$dataset folders for instructions on preprocessing and/or sampling each dataset.

For example,

under fair_flearn/data/fmnist, we clearly describe how to generate and preprocess the Fashion MNIST dataset.

In order to run the following demo on the Vehicle dataset, please go to fair_flearn/data/vehicle, download, and generate the Vehicle dataset following the README file under that directory.

Get Started

Example: the Vehicle dataset

[We provide a quick demo on the Vehicle dataset here. You don't need to change any default parameters in any of the scripts.]

First specify GPU ids (we can just use CPUs for Vehicle with a linear SVM)

export CUDA_VISIBLE_DEVICES=

Then go to the fair_flearn directory, and start running:

bash run.sh $dataset $method $data_partition_seed $q $sampling_device_method | tee $log

For Vehicle, $dataset is vehicle, $data_partition_seed can be set to 1, and $q is 0 for FedAvg and 5 for q-FedAvg (the proposed objective). For sampling with weights proportional to the number of data points, $sampling_device_method is 2; for uniform sampling (one of the baselines), $sampling_device_method is 1. The exact command lines are as follows.

(1) Experiments to verify the fairness of the q-FFL objective, and compare with uniform sampling schemes:

mkdir log_vehicle
bash run.sh vehicle qffedavg 1 0 2 | tee log_vehicle/ffedavg_run1_q0
bash run.sh vehicle qffedavg 1 5 2 | tee log_vehicle/ffedavg_run1_q5
bash run.sh vehicle qffedavg 1 0 1 | tee log_vehicle/fedavg_uniform_run1

Plot to reproduce the results in the manuscript:

(we use seaborn to draw the fitted curves of the accuracy distributions)

pip install seaborn
python plot_fairness.py

We can then compare the generated fairness_vehicle.pdf with Figure 1 (the Vehicle subfigure) and Figure 2 (the Vehicle subfigure) in the paper to validate reproducibility. Note that the accuracy distributions reported (both in figures and tables) are averaged across 5 different train/test/validation data partitions with data partition seeds 1, 2, 3, 4, and 5.
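The averaging over partition seeds described above can be sketched as follows. This is a minimal illustration, not the repository's plotting code: the function name `summarize_runs` and its input layout (one array of per-device accuracies per seed) are assumptions, and the factor of 10000 simply converts a variance of accuracies in [0, 1] into squared percentage points.

```python
import numpy as np

def summarize_runs(accuracies_per_seed):
    """Average per-seed statistics of per-device test accuracy.

    accuracies_per_seed: list of 1-D sequences, one per data partition
    seed, each holding the final test accuracy of every device in [0, 1].
    (Hypothetical input layout, chosen for illustration.)
    """
    means, variances = [], []
    for acc in accuracies_per_seed:
        acc = np.asarray(acc, dtype=float)
        means.append(acc.mean())
        # Unweighted variance across devices, in squared percentage points.
        variances.append(np.var(acc) * 10000)
    # Report the average of each statistic across the partition seeds.
    return {
        "mean_accuracy": float(np.mean(means)),
        "variance": float(np.mean(variances)),
    }
```

Calling it with five arrays, one per seed from 1 to 5, gives a single mean accuracy and a single variance of the kind that would go into a results table.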

(2) Experiments to demonstrate the communication-efficiency of the proposed method q-FedAvg:

bash run.sh vehicle qffedsgd 1 5 2 | tee log_vehicle/ffedsgd_run1_q5

Plot to reproduce the results in the paper:

python plot_efficiency.py

We can then compare the generated efficiency_qffedavg.pdf with Figure 3 (the Vehicle subfigure) to verify reproducibility.

Run on other datasets

  • First, configure run.sh with the hyper-parameters (e.g., batch size, learning rate) reported in the manuscript (Appendix B.2.3).
  • If you would like to run on Sent140, you also need to download a pre-trained embedding file using the following commands (this may take 3-5 minutes):
cd fair_flearn/flearn/models/sent140
bash get_embs.sh
  • We use different models for different datasets, so you need to change the model name specified by --model. The model associated with each dataset is defined in fair_flearn/flearn/models/$dataset/$model.py. For instance, to run on the Shakespeare dataset, look under fair_flearn/flearn/models/shakespeare/ to find the model name, which is stacked_lstm, and pass it as --model='stacked_lstm'.
  • You also need to specify the total number of communication rounds using --num_rounds. The suggested numbers of rounds, based on our previous experiments, are:
Vehicle: default
synthetic: 20000
sent140: 200
shakespeare: 80
fashion mnist: 6000
adult: 600

For fairness and efficiency experiments, we use four datasets: Vehicle, Synthetic, Sent140, and Shakespeare. $method can be chosen from [qffedavg, qffedsgd]. $sampling is 2 (devices are sampled with weights proportional to their number of local data points).

mkdir log_$dataset
bash run.sh $dataset $method $seed $q $sampling | tee log_$dataset/${method}_run${seed}_q${q}

In particular, $dataset can be chosen from [vehicle, synthetic, sent140, shakespeare], in accordance with the data directory names under the fair_flearn/data/ folder.
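To launch the same configuration over several partition seeds, the command template above can be expanded programmatically. The helper below is a hypothetical convenience (the name `build_runs` and the log-file naming are assumptions based on the template, not part of the repository); it only builds the argument lists and log paths and does not execute anything.

```python
def build_runs(dataset, method, seeds, q, sampling=2):
    """Return (argv, log_path) pairs for run.sh, one per partition seed.

    Mirrors: bash run.sh $dataset $method $seed $q $sampling
    with logs under log_$dataset/. Names are illustrative.
    """
    runs = []
    for seed in seeds:
        argv = ["bash", "run.sh", dataset, method,
                str(seed), str(q), str(sampling)]
        log_path = f"log_{dataset}/{method}_run{seed}_q{q}"
        runs.append((argv, log_path))
    return runs
```

Each `argv` can then be passed to `subprocess.run` with its output redirected to the matching `log_path`, after the `log_$dataset` directory has been created.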

Compare with AFL. We compare with the AFL baseline using the two datasets (sampled Fashion MNIST and Adult) following the AFL paper.

  • Generate data. (data generation process is as described above)
  • Specify parameters. method should be set to afl in order to run the AFL algorithm. data_partition_seed should be set to 0, so that the datasets are not randomly partitioned into train/test/validation splits. This allows us to use the same standard public test set as in the AFL paper. track_individual_accuracy should be set to 1. Here is an example run.sh for the Adult dataset:
python3  -u main.py --dataset=$1 --optimizer=$2  \
            --learning_rate=0.1 \
            --learning_rate_lambda=0.1 \
            --num_rounds=600 \
            --eval_every=1 \
            --clients_per_round=2 \
            --batch_size=10 \
            --q=$4 \
            --model='lr' \
            --sampling=$5  \
            --num_epochs=1 \
            --data_partition_seed=$3 \
            --log_interval=100 \
            --static_step_size=0 \
            --track_individual_accuracy=1 \
            --output="./log_$1/$2_samp$5_run$3_q$4"

And then run:

bash run.sh adult qffedsgd 0 5 2 | tee log_adult/qffedsgd_q5
bash run.sh adult afl 0 0 2 | tee log_adult/afl
  • You can find the accuracy numbers in the log files log_adult/qffedsgd_q5 and log_adult/afl, respectively.

References

See our Fair Federated Learning manuscript for more details as well as all references.


fair_flearn's Issues

qffedavg.py doesn't match with Alg 2.

Hello,

I am not sure about this, but it seems that Alg. 2 (in the paper) does not exactly match the implementation in [flearn/trainers/qffedavg.py](https://github.com/litian96/fair_flearn/blob/master/flearn/trainers/qffedavg.py). Specifically, in Line 62 you have

hs.append(self.q * np.float_power(loss+1e-10, (self.q-1)) * norm_grad(grads) + (1.0/self.learning_rate) * np.float_power(loss+1e-10, self.q))

While Alg. 2 suggests that you should have

hs.append(self.q * np.float_power(loss+1e-10, (self.q-1)) * np.power(norm_grad(grads), 2) + (1.0/self.learning_rate) * np.float_power(loss+1e-10, self.q))
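The second expression, the form this issue attributes to Alg. 2, can be written as a standalone sketch. The names `squared_l2_norm`, `h_alg2`, and `lipschitz` are hypothetical; `lipschitz` stands in for `1.0/self.learning_rate` in the quoted line. Whether the repository's `norm_grad` already returns a squared norm is not visible on this page; if it does, the two quoted expressions would differ in their treatment of the norm.

```python
import numpy as np

def squared_l2_norm(arrays):
    """Squared L2 norm of a list of parameter arrays (hypothetical helper)."""
    return float(sum(np.sum(np.square(a)) for a in arrays))

def h_alg2(loss, delta, q, lipschitz):
    """h = q * F^(q-1) * ||delta||^2 + L * F^q, as described in the issue.

    loss: local loss F_k(w^t); delta: list of arrays for the local update;
    lipschitz: the constant L (1 / learning_rate in the quoted code).
    """
    loss = loss + 1e-10  # same numerical guard as the quoted line
    return (q * np.float_power(loss, q - 1) * squared_l2_norm(delta)
            + lipschitz * np.float_power(loss, q))
```

With `q = 0` the first term vanishes and `h` reduces to `lipschitz`, which gives a quick sanity check against the FedAvg-style case.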

Variance or weighted variance to measure the fairness?

I see you have the fairness definition according to the variance of accuracies across m devices.

However, the variance you compared

variance[i] = np.var(accuracies[i]) * 10000

assumes that all devices are weighted equally.

My question is,
instead of using the standard variance, why not calculate the variance in a weighted fashion?

A revised variance, weighted either by the sample distribution or by the q-federated weights, makes much more sense to me.
Is that true?
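For concreteness, the weighted variant the question has in mind could look like the sketch below. The function name `weighted_variance` and the choice of weights (for example, each device's number of samples) are assumptions for illustration; this is not how the repository computes its reported variance.

```python
import numpy as np

def weighted_variance(values, weights):
    """Variance of `values` under normalized non-negative `weights`."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Weighted mean, then weighted mean of squared deviations from it.
    mean = np.average(values, weights=weights)
    return float(np.average((values - mean) ** 2, weights=weights))
```

With equal weights this reduces to `np.var(values)`, so the unweighted line quoted above is the special case in which every device counts the same regardless of its data size.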

The experiment (1) output files are empty

Hello, thanks for sharing your wonderful codes.
My host runs Windows. I tried to reproduce experiment (1) from the README (Experiments to verify the fairness of the q-FFL objective, and compare with uniform sampling schemes). However, after I execute the run.sh file, the output is empty. I checked my execution steps and have found no problem so far. I hope you can help me identify the cause.
Thanks again!

fig2

The names of the output files are: fedavg_uniform_run1, ffedavg_run1_q0, ffedavg_run1_q5
See the figure below:

fig1

Theoretical analysis of q-FFL

Dear author
I have a question about the theoretical analysis of q-FFL that I may be misunderstanding. In the first equation in the proof of Lemma 7, can you explain the inequality? I do not see how it holds once the identical terms are cancelled from both sides. Please see the attachment for clarification of my question.
Screen Shot 2020-07-10 at 4 11 36 PM

I have the same question for Lemma 8. How does the inequality hold in the marked section of the image? To me it seems to contradict the inequality condition of Lemma 7.
Screen Shot 2020-07-10 at 4 11 55 PM

An issue about the function flearn.utils.model_utils.project

Dear author, sorry for bothering you again.
When I use optimizer 'afl', I notice the function 'flearn.utils.model_utils.project' sometimes returns a result not in the probability simplex.
image
Then I read the article "Projection onto the probability simplex: An efficient algorithm with a simple proof, and an application" and compare the algorithm with your code. I think 'np.asarray(u)[:i]' should be replaced by 'np.asarray(u)[:i+1]' in the red line.
image
And after this change, it projects a result in the probability simplex.
image
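For reference, the sort-based projection from the cited article can be sketched as below. This is an independent illustration, not the repository's `project` function; the name `project_to_simplex` is hypothetical. The point relevant to the proposed fix is that the cumulative sum used for the j-th test must include the j-th largest entry, which with a 0-based index `i` corresponds to the slice `u[:i+1]`.

```python
import numpy as np

def project_to_simplex(y):
    """Euclidean projection of a vector onto the probability simplex."""
    y = np.asarray(y, dtype=float)
    u = np.sort(y)[::-1]              # entries in decreasing order
    css = np.cumsum(u)                # css[j-1] = u_1 + ... + u_j (inclusive)
    j = np.arange(1, len(y) + 1)
    # rho is the largest j with u_j + (1 - sum_{i<=j} u_i) / j > 0.
    cond = u + (1.0 - css) / j > 0
    rho = j[cond][-1]
    lam = (1.0 - css[rho - 1]) / rho  # common shift applied to all entries
    return np.maximum(y + lam, 0.0)
```

The result is non-negative and sums to 1 (up to floating-point error), which is the property the screenshots in this issue report as violated before the change.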

q-FedSGD update

Dear author, thanks for sharing your wonderful codes.

In Algorithm 1 of the ICLR paper, the server updates w by (sum of deltas) / (sum of hs). As I understand from the paper, the h_k^t's are upper bounds of the gradient on each device. So why is the summation not taken over (delta_k^t / h_k^t)? What is the rationale for aggregating all the upper bounds and using the sum as the denominator?
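The two aggregation rules being contrasted can be written side by side. This is only a sketch of the update as described in this issue, with hypothetical function names; it does not settle which rule is justified, it just makes the difference concrete.

```python
import numpy as np

def update_sum_over_sum(w, deltas, hs):
    """w <- w - (sum_k delta_k) / (sum_k h_k), the rule described above."""
    return w - np.sum(deltas, axis=0) / np.sum(hs)

def update_per_device(w, deltas, hs):
    """w <- w - sum_k (delta_k / h_k), the alternative the issue asks about."""
    return w - sum(d / h for d, h in zip(deltas, hs))
```

With a single device the two rules coincide; with several devices the first uses one shared step size of 1 / sum(hs) for every delta, while the second scales each delta by its own 1 / h_k.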

Why is the argument 'replacement' set to False by default when sampling?

Dear author, thanks for sharing your wonderful codes.
I'm curious why the argument 'replacement' is set to 'False' by default when sampling clients with probabilities pk. I've noticed that it is usually set to 'True' in other papers, such as "FEDERATED LEARNING'S BLESSING: FEDAVG HAS LINEAR SPEEDUP" and "On the convergence of Fedavg on non-i.i.d. data".
image
I want to know whether it matters in experiments.
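The practical difference between the two settings can be seen with NumPy's weighted sampling. The helper name `sample_clients` and the example probabilities are assumptions for illustration and are not taken from the repository.

```python
import numpy as np

def sample_clients(num_clients, pk, clients_per_round, replace, seed=0):
    """Sample client indices with probabilities pk (must sum to 1)."""
    rng = np.random.default_rng(seed)
    return rng.choice(num_clients, size=clients_per_round,
                      replace=replace, p=pk)

pk = [0.7, 0.1, 0.1, 0.1]
without = sample_clients(4, pk, 3, replace=False)  # three distinct clients
with_rep = sample_clients(4, pk, 3, replace=True)  # duplicates are possible
```

Without replacement every selected client is distinct, so a client with a large pk contributes at most once per round; with replacement the same client may be drawn several times, which matches the independent-draw assumption used in some convergence analyses.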

A question about the AFL algorithm

Hello, thanks for sharing your wonderful codes.
Recently I have been trying to reproduce the results in the paper Fair Resource Allocation in Federated Learning. While plotting the right-hand panel of Figure 9 (q-FFL is more efficient than AFL), I failed to get a plot similar to the one given. No matter how I tune the parameters, the lowest accuracy among all devices remains at nearly 20% for the AFL algorithm even after 80 rounds of training. Could you please tell me how to set the two parameters $\gamma_w$ and $\gamma_\lambda$?
