adap/flower
Flower: A Friendly Federated Learning Framework
Home Page: https://flower.ai
License: Apache License 2.0
The Flower docs contain only a general overview of the available baselines. It would be great to document individual baselines more thoroughly and have detailed descriptions on how to execute/reproduce them.
Should we consider using a different theme structure for the menu on the website? Something like:
Quickstart
|-- Keras TensorFlow
|-- PyTorch
Installation
API
Examples
Cloud Usage
And maybe a different Sphinx theme? I found this: https://sphinx-themes.org/
Flower provides a few popular FL algorithms out-of-the-box. Those implementations along with their configuration parameters should be documented to help users understand what's already available.
Currently, examples are located in src/flwr_example/.... The extras required by those examples are mentioned in the general pyproject.toml. This structure makes it difficult to "just copy and paste" examples.
A better approach would be to move examples into a top-level examples directory and treat every example as a standalone project (i.e., give each example its own pyproject.toml).
Document the general architecture of Flower:
Hello!
I'm trying to use the example you provide (quickstart and tensorflow). I manage to train models, but the clients can't stop themselves without crashing.
I just run run_server.sh and run_clients.sh in separate terminals and see the clients download data and train their models. After training, the server evaluates the model and stops itself properly.
At this moment, the clients crash by raising an exception with this message:
Traceback (most recent call last):
File "client.py", line 114, in <module>
main()
File "client.py", line 110, in main
fl.client.start_keras_client(args.server_address, client)
File "/usr/local/lib/python3.7/dist-packages/flwr/client/app.py", line 47, in start_keras_client
start_client(server_address, flower_client)
File "/usr/local/lib/python3.7/dist-packages/flwr/client/app.py", line 35, in start_client
server_message = receive()
File "/usr/local/lib/python3.7/dist-packages/flwr/client/grpc_client/connection.py", line 59, in <lambda>
receive: Callable[[], ServerMessage] = lambda: next(server_message_iterator)
File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 416, in __next__
return self._next()
File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 706, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "{"created":"@1601383903.613729459","description":"Error received from peer ipv6:[::]:8080","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}"
rnd might be mistaken for random, so we should perhaps rename it to fl_round (other suggestions are welcome).
Note: this would be an incompatible change.
To support more heterogeneous environments/setups it would be great to have a JavaScript/TypeScript Client SDK. The task shouldn't be too hard, as gRPC-Web has improved substantially. We are happy to support anyone who wants to tackle this issue and ideally provide an example with, e.g., TensorFlow Lite.
A potential use-case could be improving an image classification model as described in the TF Lite docs.
Define and document the release process.
Topics include:
Rename the class Setting to Baseline.
This will make the code easier to understand in various places.
Upgrade all torch dependencies to 1.6 (torchvision 0.7), which hopefully improves PyTorch-related mypy type checks.
Python packages are currently located under src, along with ProtoBuf definitions in src/proto.
To have a clean structure for upcoming SDKs in other languages (Java, Swift, C++, ...), the Python packages should be moved under src/py. The resulting structure would enable other languages to be placed under src in a clean way:
src/
cc/
proto/
py/
swift/
...
Once all issues required for beta status are resolved, update the PyPI classifier to 4 - Beta.
Flower offers both server-side and client-side evaluation, but the ways to use it are not documented yet.
Create a Java Client SDK to make it easier to start with Java/Android.
C++ is one of the most defining programming languages of our time. It is used in many critical applications and is the go-to language for performance-sensitive domains such as robotics or automotive. Federated Learning can enable entirely new platforms in these domains, and we thus want to support C++ by providing a Flower C++ SDK. Flower communicates between the server and the client using gRPC. At the moment, every C++ user needs to build their own integration with the gRPC message protocol to run Flower.
The C++ SDK needs to serialize model parameters (and other values that get communicated between client and server) in a way that can be deserialized by Python on the server side. ProtoBuf makes this easy for most values, but it might be helpful to build a small proof of concept for serializing/deserializing the model parameters. Flower represents model parameters as a list of byte arrays (think: the parameters of each layer in a neural network can be serialized to a single byte array). A PoC would then serialize these parameters in C++ and deserialize them in Python (and vice versa):
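A minimal sketch of the Python half of such a PoC, assuming a simple length-prefixed framing. The actual Flower protocol uses ProtoBuf for framing, and serialize_parameters/deserialize_parameters are hypothetical names; a C++ PoC would mirror the same byte layout:

```python
import struct
from typing import List

def serialize_parameters(layers: List[bytes]) -> bytes:
    """Frame each layer's byte array with a 4-byte big-endian length prefix."""
    out = bytearray()
    for layer in layers:
        out += struct.pack("!I", len(layer))  # length prefix
        out += layer                          # raw layer bytes
    return bytes(out)

def deserialize_parameters(data: bytes) -> List[bytes]:
    """Inverse of serialize_parameters: read back length-prefixed byte arrays."""
    layers: List[bytes] = []
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from("!I", data, offset)
        offset += 4
        layers.append(data[offset : offset + length])
        offset += length
    return layers

# Round-trip two fake "layers" represented as raw bytes
params = [b"\x00\x01\x02\x03", b"\xff" * 8]
assert deserialize_parameters(serialize_parameters(params)) == params
```

The fixed-endianness prefix is what makes the format language-neutral: C++ can write the same frames with htonl and read them back regardless of platform byte order.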
The full SDK implementation requires the following tasks:
Currently, the federated datasets for the baselines are generated on demand in flwr_experimental/baseline/dataset and cached afterwards. The cache is used if present.
We would like to move the dataset code into a different repository, named e.g. federated-datasets, and host the generated datasets for everyone to load using a PyPI package which downloads and caches the datasets. Especially in the case of bigger datasets, each client in a federated baseline experiment wouldn't have to download the whole (original) dataset, so this would be a significant improvement.
It would be great if someone wants to tackle this task.
The mypy configuration in mypy.ini uses several flags to increase strictness.
To simplify this setup and automatically opt in to upcoming strictness flags in future mypy versions, it would be preferable to replace those individual flags with the single strict = True setting.
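The change would look roughly like the following; the exact flag list in Flower's mypy.ini may differ, so this is only an illustration of the before/after shape:

```ini
; Before: individual strictness flags, maintained by hand
[mypy]
disallow_untyped_defs = True
disallow_incomplete_defs = True
warn_return_any = True

; After: one setting that also opts in to future strictness flags
[mypy]
strict = True
```

Note that strict = True enables the whole strictness bundle at once, so the switch may surface new type errors that the hand-picked flags did not cover.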
Create a new baseline for MNIST and FedAvg, as described in McMahan et al., 2017 (https://arxiv.org/abs/1602.05629):
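The aggregation rule at the heart of FedAvg (a weighted average of client weights by local example count, Algorithm 1 in McMahan et al., 2017) can be sketched in plain Python. The names below are illustrative, not the Flower API:

```python
from typing import List, Tuple

Weights = List[List[float]]  # one flat list of floats per layer

def federated_average(results: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its number of examples."""
    total_examples = sum(num_examples for _, num_examples in results)
    num_layers = len(results[0][0])
    averaged: Weights = []
    for layer_idx in range(num_layers):
        layer_size = len(results[0][0][layer_idx])
        # Each entry is sum_k(n_k * w_k) / sum_k(n_k)
        layer = [
            sum(w[layer_idx][i] * n for w, n in results) / total_examples
            for i in range(layer_size)
        ]
        averaged.append(layer)
    return averaged

# Two clients: one with 10 examples, one with 30
client_a = ([[0.0, 0.0]], 10)
client_b = ([[4.0, 8.0]], 30)
print(federated_average([client_a, client_b]))  # [[3.0, 6.0]]
```

The baseline itself would apply this rule to real model layers each round; the sketch only pins down the arithmetic the MNIST experiment needs to reproduce.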
The Flower docs contain only a general overview of the available examples. It would be beneficial (especially for first-time users) to have more detailed documentation for individual examples.
Currently, flower.dev only shows the latest documentation. Documentation for older versions of Flower should remain available (users need to be able to switch between the documentation for different versions).
Flower enables developers and researchers to implement custom federated learning algorithms using the Strategy interface. This interface should be documented with explanations of how to use it and a working example of a custom strategy implementation.
Hi everyone,
Currently I am working on a school project about federated learning and came across your framework during exploratory analysis. My project should utilize federated learning in this manner: I have an aggregation server (let's say in a cloud). I want this server to provide a model to my 2 Raspberry Pis. These two RPis would then train the model on local data for x epochs and send the trained models/gradients back to the global server. On this server, the results would be federated-averaged and a new model would be sent to the Pis. Is such a workflow possible with your framework? If so, could you give me a hint?
Thank you,
Best regards
The Flower codebase itself is Python 3.9 ready. Some dependencies are however not yet Python 3.9 compatible, so we need to wait until those dependencies are ready.
Is it possible to provide some documentation/tutorials on how to change/customize the training strategy? Thanks!
The Flower docs would benefit from having a general "Federated Learning 101" which presents the basic ideas, concepts, and terminologies around federated learning.
Create an advanced tutorial for a use-case realised with TensorFlow.
A common baseline for FL is based on the Shakespeare dataset (e.g., McMahan et al., 2017).
Create an advanced tutorial for a use-case realised with PyTorch.
To support more heterogeneous environments/setups it would be great to have a Flutter Client SDK. We are happy to support anyone who wants to tackle this issue and ideally provide an example with, e.g., TensorFlow Lite.
A potential use-case could be improving an image classification model as described in the TF Lite docs.
Currently, when running the baseline, we have to create a .flower_ops file to set various configs. We would like to remove the need for the config file, as most of the settings in there could be automated.
The file contains:
Path configuration
[paths]
wheel_dir = ~/some/path/flower/dist/
wheel_filename = flower-0.3.0-py3-none-any.whl
which could be obtained automatically by a lookup in the dist directory.
AWS configs
[aws]
image_id = ami-123456789
key_name = flower
subnet_id = subnet-123456
security_group_ids = sg-123456789
logserver_s3_bucket = my_s3_bucket_name
Here, an image named flower should be created/downloaded/locally made available.
SSH configs
[ssh]
private_key = ~/.ssh/my_private_key
Here we could default to a key named flower.
Finally, all of these are ideas/suggestions. If someone wants to tackle this, we would be happy.
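As a sketch of the first automation (the [paths] section), the dist-directory lookup could look like this. find_wheel is a hypothetical helper, and the glob pattern assumes wheels are named like the flower-0.3.0-py3-none-any.whl example above:

```python
from pathlib import Path

def find_wheel(dist_dir: str = "dist") -> Path:
    """Pick the most recently built wheel from the dist directory.

    Replaces the manual wheel_dir/wheel_filename entries in .flower_ops.
    """
    wheels = sorted(
        Path(dist_dir).glob("flower-*.whl"),
        key=lambda p: p.stat().st_mtime,  # newest build wins
    )
    if not wheels:
        raise FileNotFoundError(
            f"No wheel found in {dist_dir}; run the build first"
        )
    return wheels[-1]
```

Sorting by modification time means a fresh build is picked up automatically without editing any config file.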
Docformatter automatically formats docstrings to follow a subset of the PEP 257 conventions. Add docformatter to the Flower project.
Federated averaging is mentioned in the paper, but not much is said about secure aggregation. Which privacy-preserving and FL techniques are implemented?
Can the framework be deployed across institutions with different infrastructure? For example, in healthcare institutions where the edge devices are inside the institution's firewall.
Create a convenience script which configures a fresh Ubuntu system so that a default coding environment is created. The purpose is to ease the onboarding process for new contributors.
Alternative ideas to evaluate:
Secure aggregation [Bonawitz et al., 2017] is an important element of many FL workloads. We need to add a way to use secure aggregation within different strategies, ideally in a modular way (i.e., there is one secure aggregation implementation which can easily be plugged into different strategy implementations).
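To illustrate the core idea such a module would implement, here is a toy, insecure sketch of pairwise masking in plain Python. A real protocol (Bonawitz et al., 2017) derives the pair seeds via key agreement and handles client dropouts; all names and the framing here are illustrative only:

```python
import random
from typing import Dict, FrozenSet, List

def mask_update(client_id: int, update: List[int], clients: List[int],
                pair_seeds: Dict[FrozenSet[int], int],
                modulus: int = 2**32) -> List[int]:
    """Mask a client's update with pairwise random masks that cancel in the sum.

    For each pair (i, j), both clients derive the same mask from a shared
    seed; the lower id adds it and the higher id subtracts it. The server
    sees only masked vectors, yet their sum equals the sum of the true
    updates (mod `modulus`).
    """
    masked = list(update)
    for peer in clients:
        if peer == client_id:
            continue
        rng = random.Random(pair_seeds[frozenset((client_id, peer))])
        for k in range(len(masked)):
            m = rng.randrange(modulus)
            delta = m if client_id < peer else -m
            masked[k] = (masked[k] + delta) % modulus
    return masked

# Three clients with toy integer updates
updates = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
clients = list(updates)
seeds = {frozenset(p): random.randrange(2**32)
         for p in [(0, 1), (0, 2), (1, 2)]}
masked = [mask_update(c, updates[c], clients, seeds) for c in clients]
total = [sum(col) % 2**32 for col in zip(*masked)]
print(total)  # [9, 12] — the sum of the unmasked updates
```

The cancellation property is what makes the mechanism pluggable: a strategy only needs the aggregate, so masking can sit between clients and any aggregation rule.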
Model persistence is currently only possible via custom strategy implementations. There should be a better way to save model checkpoints periodically, ideally in a modular way which is reusable across different strategy implementations.
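One possible shape for such a modular mechanism, sketched in plain Python with hypothetical names (not the Flower API): a wrapper that adds periodic checkpointing to any aggregation function, so no strategy needs to change:

```python
import pickle
from pathlib import Path
from typing import Any, Callable, Optional

def with_checkpointing(
    aggregate_fn: Callable[[int, Any], Optional[Any]],
    checkpoint_dir: str = "checkpoints",
    every_n_rounds: int = 5,
) -> Callable[[int, Any], Optional[Any]]:
    """Wrap `aggregate_fn(round, results) -> weights` with periodic saving."""
    Path(checkpoint_dir).mkdir(parents=True, exist_ok=True)

    def wrapped(rnd: int, results: Any) -> Optional[Any]:
        weights = aggregate_fn(rnd, results)
        if weights is not None and rnd % every_n_rounds == 0:
            # Persist the global model after the aggregation step
            with open(Path(checkpoint_dir) / f"round-{rnd}.pkl", "wb") as f:
                pickle.dump(weights, f)
        return weights

    return wrapped
```

Because the wrapper only depends on the (round, results) -> weights shape, the same checkpointing code could be reused across different strategy implementations, which is the modularity the issue asks for.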
I am interested in doing async FL using Flower. However, no async strategy is provided by Flower.
The Flower paper indicates that to switch to another strategy, we just need to implement a new Strategy. However, I think server.py is intrinsically synchronous and not suitable for asynchronous strategies. In other words, to do asynchronous training, we need to change server.py.
Consider the code block in server.py:fit (I just show the relevant lines):
def fit(self, num_rounds: int) -> History:
    # ...
    for current_round in range(1, num_rounds + 1):
        # Train model and replace previous global model
        weights_prime = self.fit_round(rnd=current_round)
        if weights_prime is not None:
            self.weights = weights_prime
    # ...

def fit_round(self, rnd: int) -> Optional[Weights]:
    # ...
    results, failures = fit_clients(client_instructions)
    return self.strategy.on_aggregate_fit(rnd, results, failures)

def fit_clients(client_instructions):
    """Refine weights concurrently on all selected clients."""
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(fit_client, c, ins) for c, ins in client_instructions
        ]
        concurrent.futures.wait(futures)
    results: List[Tuple[ClientProxy, FitRes]] = []
    failures: List[BaseException] = []
    for future in futures:
        failure = future.exception()
        # ...
    return results, failures
fit_round doesn't update the model until fit_clients has collected all results/failures. However, in async FL, the model should be updated whenever the server receives a computation result from a client (reference: Asynchronous Federated Optimization by Xie et al.). So we need to change server.py to do async FL.
Am I missing something? Is there a way to do async FL without changing server.py, only by implementing a new Strategy? Any help would be appreciated.
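The control-flow change the question is pointing at can be sketched with as_completed instead of wait: the global model is mixed with each client update as it arrives, in the spirit of FedAsync (Xie et al.). This is an illustrative sketch with hypothetical names, not Flower's server:

```python
import concurrent.futures
from typing import Callable, List, Tuple

ClientFn = Callable[[List[float]], Tuple[List[float], int]]

def fit_clients_async(
    client_fns: List[ClientFn],
    weights: List[float],
    mixing: float = 0.5,
) -> List[float]:
    """Update the global model as each client result arrives.

    Unlike `concurrent.futures.wait` (which blocks until all clients
    finish), `as_completed` yields each future as soon as it is done,
    so the server can mix in every update immediately.
    """
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(fn, list(weights)) for fn in client_fns]
        for future in concurrent.futures.as_completed(futures):
            if future.exception() is not None:
                continue  # treat as a failure, keep current weights
            client_weights, _num_examples = future.result()
            # Mix the incoming update into the global model right away
            weights = [
                (1 - mixing) * w + mixing * cw
                for w, cw in zip(weights, client_weights)
            ]
    return weights
```

This supports the point in the question: because the update happens inside the result loop, the change lives in the server's control flow, not in the Strategy's aggregation callback.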
Upgrade:
I cannot run
$ ./src/py/flwr_example/quickstart_pytorch/run-server.sh
/usr/bin/python3: Error while finding module specification for 'flwr_example.quickstart_pytorch.server' (ModuleNotFoundError: No module named 'flwr_example.quickstart_pytorch')
But I can run the other example
$ ./src/py/flwr_example/pytorch/run-server.sh
Update Poetry version to 1.1.4
Hi,
I am currently trying out the Flower framework with PyTorch.
I am very surprised by how well it works.
One thing is still unclear to me: after the federated learning process is over, I would like to save the new global model parameters on the clients, after the server distributes them to all clients.
How is that possible, or where would I implement this, if it isn't already done?
And why are min_fit_clients and min_eval_clients in fedavg.py set to 2 and not 1? Is there a special reason?
Greetings
Patrick