KotlinSyft

KotlinSyft makes it easy for you to train PySyft models and run inference with them on Android devices. This allows you to use training data located directly on the device itself, without sending a user's data to a central server. This approach is known as federated learning.

  • ⚙️ Training and inference of any PySyft model written in PyTorch or TensorFlow
  • 👤 Allows all data to stay on the user's device
  • ⚡ Support for full multi-threading / background service execution
  • 🔑 Support for JWT authentication to protect models from Sybil attacks
  • 👍 A set of built-in best practices to prevent apps from overusing device resources
    • 🔌 Charge detection to allow background training only when the device is connected to a charger
    • 💤 Sleep and wake detection so that the app does not occupy resources when the user starts using the device
    • 💸 Wifi and metered network detection to ensure that model updates do not use up the available data quota
    • 🔕 All of these smart defaults are easily overridable
  • 🎓 Support for both reactive and callback patterns so you have your freedom of choice (in progress)
  • 🔒 Support for secure multi-party computation and secure aggregation protocols using peer-to-peer WebRTC connections (in progress)
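
The overridable smart defaults listed above can be pictured as a small policy check. The following is a minimal sketch in plain Kotlin, not the library's API: `DeviceState`, `TrainingPolicy`, and `canTrain` are hypothetical names introduced only for illustration, and a real app would fill `DeviceState` from Android system services.

```kotlin
// Hypothetical snapshot of device conditions; not part of the KotlinSyft API.
data class DeviceState(
    val isCharging: Boolean,
    val isScreenOn: Boolean,
    val isOnMeteredNetwork: Boolean
)

// Hypothetical policy: every requirement is enabled by default and can be switched off.
data class TrainingPolicy(
    val requireCharging: Boolean = true,
    val requireIdle: Boolean = true,
    val requireUnmeteredNetwork: Boolean = true
)

// Returns true only when every enabled requirement is satisfied.
fun canTrain(state: DeviceState, policy: TrainingPolicy = TrainingPolicy()): Boolean {
    if (policy.requireCharging && !state.isCharging) return false
    if (policy.requireIdle && state.isScreenOn) return false
    if (policy.requireUnmeteredNetwork && state.isOnMeteredNetwork) return false
    return true
}
```

Keeping each requirement as a separate flag mirrors the idea that a developer can disable one default (for example, charging) without giving up the others.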

A variety of additional privacy-preserving protections may also be applied, including differential privacy, multi-party computation, and secure aggregation.

OpenMined set out to build the world's first open-source ecosystem for federated learning on web and mobile. KotlinSyft is a part of this ecosystem, responsible for bringing secure federated learning to Android devices. You may also train models on iOS devices using SwiftSyft or in web browsers using syft.js.

If you want to know how scalable federated systems are built, Towards Federated Learning at Scale is a fantastic introduction!

Installation

KotlinSyft is available on Maven and JCenter. To add the library as a dependency in your Android project, use one of the following methods:

  1. Maven snippet:

    <dependency>
      <groupId>org.openmined.kotlinsyft</groupId>
      <artifactId>syft</artifactId>
      <version>0.1.3</version>
      <type>pom</type>
    </dependency>

  2. Gradle dependency:

    implementation 'org.openmined.kotlinsyft:syft:0.1.3'

Quick Start

As a developer, there are a few steps to building your own secure federated learning system on the OpenMined infrastructure:

  1. 🤖 Generate your secure ML model using PySyft. By design, PySyft is built upon PyTorch and TensorFlow, so you don't need to learn a new ML framework. You will also need to write a training plan (training code the worker runs) and an averaging plan (code that PyGrid runs to average the model diff).
  2. 🌎 Host your model and plans on PyGrid, which will deal with all the federated learning components of your pipeline. You will need to set up a PyGrid server somewhere; please see the PyGrid installation instructions on how to do this.
  3. 🎉 Start training on the device!

📓 The entire workflow and process is described in greater detail in our project roadmap.

You can use KotlinSyft as a front-end or as a background service. The following is a quick start usage example:

    val userId = "my Id"

    // Optional: Make an http request to your server to get an authentication token
    val authToken = apiClient.requestToken("https://www.mywebsite.com/request-token/$userId")

    // The config defines all the adjustable properties of the syft worker.
    // The URL must not include a connection protocol such as https/wss; the worker assigns it itself.
    // `this` supplies the context. It can be an activity context, a service context, or an application context.
    val config = SyftConfiguration.builder(this, "www.mypygrid-url.com").build()

    // Initiate Syft worker to handle all your jobs
    val syftWorker = Syft.getInstance(authToken, config)

    // Create a new Job
    val newJob = syftWorker.newJob("mnist", "1.0.0")

    // Define training procedure for the job
    val jobStatusSubscriber = object : JobStatusSubscriber() {
        override fun onReady(
            model: SyftModel,
            plans: ConcurrentHashMap<String, Plan>,
            clientConfig: ClientConfig
        ) {
            // This function is called when KotlinSyft has downloaded the plans and protocols from PyGrid
            // You are ready to train your model on your data
            // param model stores the model weights given by PyGrid
            // param plans is a HashMap of all the planIDs and their plans.
            // ClientConfig has hyper parameters like batchsize, learning rate, number of steps, etc

            // Plans are accessible by their plan Id used while hosting it on PyGrid.
            // eventually you would be able to use plan name here
            val plan = plans["plan name"]

            repeat(clientConfig.properties.maxUpdates) { step ->

                // get relevant hyperparams from ClientConfig.planArgs
                // All planArgs are strings; it is up to the user to deserialize them into the correct type
                val batchSize = (clientConfig.planArgs["batch_size"]
                                 ?: error("batch_size doesn't exist")).toInt()
                val batchIValue = IValue.from(
                    Tensor.fromBlob(longArrayOf(batchSize.toLong()), longArrayOf(1))
                )
                val lr = IValue.from(
                    Tensor.fromBlob(
                        floatArrayOf(
                            (clientConfig.planArgs["lr"] ?: error("lr doesn't exist")).toFloat()
                        ),
                        longArrayOf(1)
                    )
                )
                // your custom implementation to read a data batch from your data
                val batchData = dataRepository.loadDataBatch(batchSize)
                // get model weights and return if not set already
                val modelParams = model.getParamArray() ?: return
                val paramIValue = IValue.listFrom(*modelParams)
                // plan.execute runs a single gradient step and returns the output as PyTorch IValue
                // plans[...] returns a nullable Plan, hence the safe call
                val output = plan?.execute(
                    batchData.first,
                    batchData.second,
                    batchIValue,
                    lr,
                    paramIValue
                )?.toTuple()
                // The output is a tuple with outputs defined by the pysyft plan along with all the model params
                output?.let { outputResult ->
                    val paramSize = model.modelState!!.syftTensors.size
                    // The model params are always appended at the end of the output tuple
                    val beginIndex = outputResult.size - paramSize
                    val updatedParams =
                            outputResult.slice(beginIndex until outputResult.size)
                    // update your model. You can perform any arbitrary computation and checkpoint creation with these model weights
                    model.updateModel(updatedParams.map { it.toTensor() })
                    // get the required loss, accuracy, etc values just like you do in Pytorch Android
                    val accuracy = outputResult[0].toTensor().dataAsFloatArray.last()
                }
            }
            // Once training finishes generate the model diff
            val diff = newJob.createDiff()
            // Report the diff to PyGrid and finish the cycle
            newJob.report(diff)
        }

        override fun onRejected() {
            // Implement this function to define what your worker does when it is rejected from the cycle
        }

        override fun onError(throwable: Throwable) {
            // Implement this function to handle errors during job execution
        }
    }

    // Start your job
    newJob.start(jobStatusSubscriber)

    // Voila! You are done.
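
Two steps in the example above are plain data handling and can be sketched without the library: turning the string-valued plan arguments into typed values, and separating the updated model parameters from the rest of the plan output. This is a hedged sketch under stated assumptions: `HyperParams`, `parseHyperParams`, and `splitPlanOutput` are hypothetical helpers, not KotlinSyft functions, and the output is modelled as a generic `List` rather than PyTorch `IValue`s.

```kotlin
// Hypothetical holder for the typed hyperparameters used in the quick start.
data class HyperParams(val batchSize: Int, val learningRate: Float)

// Converts string-valued plan arguments into typed values, failing loudly on
// missing or malformed entries. The keys mirror the quick start ("batch_size", "lr").
fun parseHyperParams(planArgs: Map<String, String>): HyperParams {
    val batchSize = planArgs["batch_size"]?.toIntOrNull()
        ?: error("batch_size is missing or not an integer")
    val learningRate = planArgs["lr"]?.toFloatOrNull()
        ?: error("lr is missing or not a number")
    return HyperParams(batchSize, learningRate)
}

// Splits a plan's output into (plan-defined outputs, updated parameters), assuming
// the parameters occupy the last `paramCount` positions as the quick start describes.
fun <T> splitPlanOutput(output: List<T>, paramCount: Int): Pair<List<T>, List<T>> {
    require(paramCount in 0..output.size) { "paramCount out of range" }
    val beginIndex = output.size - paramCount
    return output.take(beginIndex) to output.drop(beginIndex)
}
```

Parsing once, before the training loop, avoids repeating the same string conversion on every step.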

Running the Demo App

The demo app fetches the plans, protocols, and model weights from a PyGrid server hosted locally. The plans are then deserialized and executed using libtorch.

Follow these steps to set up an environment to run the demo app:

  • Clone the PyGrid repository and change into its directory. PyGrid does not have official releases at the moment, so please use this commit:

    git clone https://github.com/OpenMined/PyGrid
    cd PyGrid
    git checkout 0e93aa645a63a02f45ae72b4ff3106c6402dbadf

  • Follow "PyGrid: getting started" to run a local instance of a PyGrid Node.

  • Install PySyft at commit 9d4f8e3ebecc4a00428607403832c5628753f1fc in a virtual environment:

    git clone https://github.com/OpenMined/PySyft
    cd PySyft
    git checkout 9d4f8e3ebecc4a00428607403832c5628753f1fc
    virtualenv -p python3 venv
    source venv/bin/activate
    make venv

  • From the PySyft folder, start Jupyter Notebook:

    jupyter notebook

  • Open a browser and navigate to localhost:8888. You should be able to see the PySyft files.
  • In the Jupyter Notebook, navigate to examples/tutorials/model-centric-fl.
  • Run the notebook Part 01 - Create Plan.ipynb. It should host the model on PyGrid.
  • Optionally, run the notebook Part 02 - Execute Plan.ipynb. This will train the model on the Python worker of PySyft.
  • The Android app connects to the PyGrid instance on your computer through your router (the easier approach).
  • Get the IP address of your computer by running ip address show | grep "inet " | grep -v 127.0.0.1 if using Linux/Mac. The steps differ on Windows. Alternatively, if you want to run the demo app in the emulator, use 10.0.2.2 as the IP address.
  • Use this IP address and the port (default: 5000) in the login screen to supply the PyGrid server URL, e.g., 10.0.2.2:5000

Built on KotlinSyft

Wearable UI

Federated-Wearables is a demo app for cross-device federated learning over wearables. The smartwatch app offloads the collected data to the paired phone app via Bluetooth for federated training and receives the updated model to run inference in real time.

Contributing

  1. Star, fork, and clone the repo
  2. Open Android Studio and import project
  3. Do your work.
  4. Push to your fork
  5. Submit a PR to OpenMined/KotlinSyft

Read the contribution guide as a good starting place. We also welcome you to our Slack for queries about the library and about contributing in general. The Slack channel #lib_kotlin_syft is specific to KotlinSyft development, while #lib_syft_mobile is meant for both the Android and iOS teams. See you there!

Contributors

These people were an integral part of the effort to bring KotlinSyft to fruition and of its active development.

  • varun khare: 💻 ⚠️ 📖 🎨 🚇
  • Jose A. Corbacho: 💻 ⚠️ 🎨 📖 🚇
  • Ravikant Singh: 💻 📖
  • Saksham Rastogi: 📖
  • Patrick Cason: 📖 💼
  • Mohammed Galalen: 📖 ⚠️
  • Erik Ziegler: 🐛
  • Pengyuan Zhou: ✅ 🚇

License

Apache License 2.0

kotlinsyft's Issues

Document Protobuf serialization

The files in https://github.com/OpenMined/KotlinSyft/tree/dev/syftlib/src/main/java/org/openmined/syft/proto are all essentially Kotlin wrappers around the Protobuf-generated classes in syft-proto.

We need inline comments here to make the functionality of each class easier to understand.

Auto-update version for publishing library

We currently use a static version string in publish.gradle. It would be better to bump the version number automatically using nebula release or another suitable tool.

Ideally, the version number should be extracted from git tags, which would also determine the stable/rc category.
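
One way to picture deriving the version and its category from a tag is a small parser. This is only an illustrative sketch: the tag format (`v0.1.3`, `0.2.0-rc.1`) is an assumption, `ReleaseType`, `Version`, and `versionFromTag` are hypothetical names, and a release plugin such as nebula release would handle this differently.

```kotlin
// Hypothetical release categories; the issue only mentions stable and rc.
enum class ReleaseType { STABLE, RELEASE_CANDIDATE }

data class Version(val name: String, val type: ReleaseType)

// Assumed tag format: optional "v", three numeric parts, optional "-rc" suffix.
val TAG_PATTERN = Regex("""v?(\d+\.\d+\.\d+)(-rc\.?\d+)?""")

// Parses tags such as "v0.1.3" or "0.2.0-rc.1"; returns null for anything else.
fun versionFromTag(tag: String): Version? {
    val match = TAG_PATTERN.matchEntire(tag.trim()) ?: return null
    // groupValues holds an empty string for an optional group that did not match.
    val rcSuffix = match.groupValues[2]
    val type = if (rcSuffix.isEmpty()) ReleaseType.STABLE else ReleaseType.RELEASE_CANDIDATE
    return Version(match.groupValues[1] + rcSuffix, type)
}
```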

Add support for background task scheduling in Android

In order to properly execute training plans, we must do so in a background task. This allows for training to take place without a visual API (as a library of another app), and do so separate from the main thread.

Execute training of plan

We should be able to run a demo training cycle without taking the model params from PyGrid.

Auto initialise networking clients

Currently, SocketClient and HttpClient are injected into the Syft object. Instead, they should be created automatically by the Syft constructor so that user code aligns with the roadmap.

Create the syft configuration class

This class would provide the base methods and configuration for downloading, training, and uploading, so that developers using KotlinSyft adhere to best practices and do not drain the device owner's battery or run up data charges.

Open both the reactive observable and callbacks to jobs

Currently we have a JobStatusSubscriber class that we want the user to implement. If the user's code follows a reactive pattern, they may want to use the observable directly rather than the callbacks. We should therefore support both, as Retrofit does.

Create a demo app utilising background services

Currently, all training occurs in the foreground. To implement the sleep/wake detection API, we need a service running the training job.

The background service must be capable of instantiating at least two threads:

  • one for network requests and keeping the socket connection open
  • one for training
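
The two-thread requirement above can be sketched with standard JVM executors. This is a minimal illustration rather than the library's implementation: `JobThreads` and the thread names are hypothetical, and an actual Android service would also need lifecycle handling that is omitted here.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Hypothetical holder for the two dedicated threads the issue calls for.
class JobThreads : AutoCloseable {
    // Handles network requests and keeps the socket connection open.
    val network: ExecutorService =
        Executors.newSingleThreadExecutor { r -> Thread(r, "syft-network") }

    // Runs training steps so that they never block networking.
    val compute: ExecutorService =
        Executors.newSingleThreadExecutor { r -> Thread(r, "syft-compute") }

    // Stops accepting new work; tasks already submitted are allowed to finish.
    override fun close() {
        network.shutdown()
        compute.shutdown()
    }
}
```

Using one single-thread executor per role keeps tasks of the same kind ordered while letting training and networking proceed in parallel.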

Use Connectable observable for signalling client

A better way is to give the user access to the observable and let them subscribe to it as their use case requires. This will also be a more reactive approach for users of the library.

Additionally, this will allow us to have parallel subscriptions to the socket channel. However, we would need to decide when to dispose of the connection, because some subscribers would be UI-only subscribers that should have less say in whether the connection to PyGrid stays open. A simple RxJava refCount therefore won't work for us.

Write unit tests for the syftlib

With the new subscriber pattern being used everywhere, the old tests are no longer valid, and we need to rewrite the tests for the library.
Tests for the following classes need to be written for this issue to be finished:

  • JobStatusSubscriber

  • SocketClient

  • WebRTC Client

  • WebSocketClient

  • DeviceMonitor

  • Plan

  • SyftJob

  • ResponseRequestType

Document SyftJob.kt

SyftJob.kt is responsible for running the training cycles.

  • Special emphasis must be given to the usage of the networking and compute schedulers and how they affect threading capabilities
  • The downloaders need to be described
  • The main workflow of how the job status subscriber calls the relevant user-defined functions
  • When the request key gets initialized

Implement sleep/wake detection demo in Android

While we want to allow the end-user developer integrating KotlinSyft into their application the ability to choose to execute models while the user is asleep, we don't want to force them into this paradigm.

This issue includes developing a basic "asleep or awake" algorithm that a developer can choose to use. It's worth noting that the default option for a syft client would be to "enable" this as a requirement for training. It's up to the developer to state that they want this option "disabled".

Add support for charge detection and wifi detection in Android

While we want to allow the end-user developer integrating KotlinSyft into their application the ability to choose to execute models while the user is charging their phone, we don't want to force them into this paradigm.

This issue includes developing charge detection that a developer can choose to use. It's worth noting that the default option for a syft client would be to "enable" this as a requirement for training. It's up to the developer to state that they want this option "disabled".

Add bandwidth and Internet connectivity test in Android

We need to have some sort of way to run a basic bandwidth and Internet connectivity test in Android so that we may submit these values to PyGrid. This allows PyGrid to properly select candidates for pooling based on internet connection speed. This does not check for wifi connectivity. This will be included in a separate issue.

We must determine the average ping, upload speed, and download speed of the device and report these values to PyGrid.

Execute plans in Android

This epic issue is somewhat self-explanatory, but in theory, we need to be able to execute a PySyft plan. This should ideally only be done after the API has been finalized in the Android worker (#23).

  • Implement Retrofit HTTP #46
  • Download plan via HTTP
  • Generate TorchScript file from plan
  • Run the TorchScript in compute thread #69
  • Submit the diff to PyGrid

Implement Protobuf classes in Android

We need to add the following Protobuf classes to KotlinSyft as they are completed:

  • Plan
  • State
  • Operation
  • Placeholder
  • TorchParameter
  • Protocol
  • PromiseTensor

Stories associated to this epic:

  • Infrastructure #43

Document SocketClient.kt

SocketClient.kt is responsible for providing Retrofit-like observable socket connections. While Retrofit handles HTTP network calls, the socket client listens to socket messages and provides the relevant subscriber.

  • The focus should be on how the socket client listens to the communication channel and creates an observable that listens only to the correct response category via API endpoints.

  • A description of the processor is also needed.

Implement Signaling client for WebRTC

For now we will use the native socket API for the signaling client. If the need arises for complex handling of app life cycles, we might later use external libraries.

Migrate Kotlin serialization to Gson

Kotlin serialization is still under heavy development and needs a few more API calls for easy usage. For now we will be shifting to Gson for handling JSON objects

Exponential backoff for socket connections

Currently we are reconnecting the socket as soon as it fails. This is far from ideal since there can be a problem on the other side and by hitting it constantly we would be making things worse.

Implement an exponential backoff strategy to avoid this problem.
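
As a rough sketch of what such a strategy computes, the function below returns the wait time before each reconnection attempt. It is illustrative only: `backoffDelayMs` and its default values are assumptions, not part of KotlinSyft, and production implementations commonly add random jitter on top.

```kotlin
import kotlin.math.min

// Delay before the given retry attempt (0-based): doubles each time, capped at maxDelayMs.
// Parameter names and defaults are illustrative assumptions.
fun backoffDelayMs(attempt: Int, baseDelayMs: Long = 1_000, maxDelayMs: Long = 60_000): Long {
    require(attempt >= 0) { "attempt must be non-negative" }
    // Cap the shift so that the multiplication stays within Long for reasonable base delays.
    val factor = 1L shl min(attempt, 30)
    return min(baseDelayMs * factor, maxDelayMs)
}
```

With the defaults, the delays grow as 1 s, 2 s, 4 s, 8 s, and so on until they reach the 60 s cap, so a failing server is contacted less and less often.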

Document setup for running demo app

To run the demo app we need to set up PyGrid and PySyft. The following steps should be present in the Readme.md:

If you see a socket error, you also need to follow these steps to set up PyGrid (bear with me for the long procedure):

  1. clone the repo https://github.com/OpenMined/PyGrid
  2. cd PyGrid
  3. install docker https://github.com/OpenMined/PyGrid/#getting-started
  4. install docker-compose
  5. run docker-compose up
  6. clone the repo https://github.com/OpenMined/PySyft
  7. cd PySyft
  8. virtualenv -p python3 venv
  9. make venv
  10. make notebook
  11. go to http://localhost:8888/
  12. in the Jupyter notebook, navigate to examples/experimental/FL Training Plan
  13. Run the notebooks createplan. It should save three files in the FL Training Plan folder
  14. Run the notebook Host Plan. PyGrid is now set up and the model is hosted on it.
  15. The Android app connects to your PC's localhost via the router (easier approach)
  16. Get the IP address of your PC by running ip address show | grep "inet " | grep -v 127.0.0.1 if using Linux. The steps differ on Windows.
  17. Fill the IP address here

Originally posted by @vkkhare in #61 (comment)

Document plan execution

The PR #68 implements the Plan.execute function which takes the training data, model params and client config as parameters.

Add a stopping method that stops the training process in Android

We need to have a stopping method that will terminate the current job in question. Reasons for stopping training could be any of the following:

  • The user wanted to... like they clicked a "stop" button
  • The plan has an error and can't execute
  • The user started using their device again
  • The device loses wifi
  • The device loses active charging
  • Or perhaps most importantly... if the model isn't really going anywhere (the error rate isn't going down)

At this point, we should stop and notify the user with some sort of message.
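
The last reason in the list, a model that is no longer improving, is the only one that needs a numeric rule. A minimal sketch of such a rule follows; `hasPlateaued`, `patience`, and `minDelta` are hypothetical names and defaults chosen for illustration, not an existing KotlinSyft feature.

```kotlin
// Returns true when none of the most recent `patience` losses improves on the
// best earlier loss by at least `minDelta`. Names and defaults are illustrative.
fun hasPlateaued(losses: List<Float>, patience: Int = 3, minDelta: Float = 1e-4f): Boolean {
    require(patience > 0) { "patience must be positive" }
    // Not enough history yet to compare recent losses against earlier ones.
    if (losses.size <= patience) return false
    val bestBefore = losses.dropLast(patience).fold(Float.MAX_VALUE) { best, loss -> minOf(best, loss) }
    return losses.takeLast(patience).all { it > bestBefore - minDelta }
}
```

A job could call this after each reported loss and stop, with a message to the user, once it returns true.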
