openmined / syft.js

The official Syft worker for Web and Node, built in Javascript

License: Apache License 2.0

JavaScript 96.81% Shell 0.11% Python 3.08%

deep-learning hacktoberfest javascript privacy syft typescript

syft.js's Introduction

OpenMined Web Monorepo

Welcome to the OpenMined web monorepo, the home of all of OpenMined's many websites. Below are some basic instructions for getting this repository running on your machine.

Support

If you're looking for support with the courses, please go to the Courses Discussion Board. If you've found a bug, or have a suggestion for an improvement to the Courses site or any of our other websites, please file an issue here.

Contributing

We are currently accepting only bug fixes from our community. If you're interested in working on these sites regularly as part of a team, please DM @Patrick Cason on Slack with your resume and qualifications.

Local Setup

Make sure that you have Node, NPM, and Yarn installed on your machine.
Install NX, our monorepo management framework.
From this point forward, you will run all commands in the root folder. Start by running yarn install to install all dependencies.
Run one of the commands below, depending on what you're trying to do. Note that the third word in each command corresponds to the app in question. For instance, yarn start courses will run the courses app, located at apps/courses.

Courses

The OpenMined Courses website is where we host our educational material. The site is a React.js web application running on a Firebase backend, with Jest for testing, Cypress for end-to-end testing, and Sanity.io as the content management system (CMS).

yarn start courses - Runs the courses site with hot reloading for development purposes.
yarn lint courses - Runs the linter for the courses site
yarn test courses - Runs the test suite for the courses site
yarn build courses - Builds the courses site
yarn build courses --prod - Builds a production version of the courses site
yarn analyze courses - Analyzes the file sizes and distribution of a built version of the courses site

Courses E2E Testing

The OpenMined Courses website uses Cypress for end-to-end testing. You have access to the following commands:

yarn e2e courses-e2e - Runs all the end-to-end tests for the Courses website
yarn lint courses-e2e - Runs the linter for the courses end-to-end app

Firebase API

Firebase is the primary backend for all of OpenMined's websites. If you want to test any functions or security rules before pushing them live, you may do so using the emulator suite.

yarn test firebase-api - Runs all the tests for our Firebase backend

Sanity CMS

Sanity is the primary CMS for all of OpenMined's websites. You must have a user account to change any actual values. However, if you want to run it on your machine, you have access to the following commands:

yarn start sanity-api - Runs the Sanity CMS with hot reloading for development purposes.
yarn lint sanity-api - Runs the linter for the Sanity CMS
yarn test sanity-api - Runs the test suite for the Sanity CMS
yarn build sanity-api - Builds the Sanity CMS
yarn build sanity-api --prod - Builds a production version of the Sanity CMS
yarn analyze sanity-api - Analyzes the file sizes and distribution of a built version of the Sanity CMS

syft.js's Issues

Improve build system

We need the syft.js build system to produce a compiled package that is usable after an NPM install. Currently the package is published here: https://www.npmjs.com/package/syft.js. We need to be able to include it in the following formats:

Methods of importing

ES6/7 import:

import syft from 'syft.js';

Require:

const syft = require('syft.js');

Script include:

<script src="/node_modules/syft.js/lib/index.js"></script>

Referencing in Javascript

const mySyft = new syft(url);

Removal of first tensor throws error

Whenever we remove the first tensor to be added, it returns an error. We should be able to remove any tensor in any order, regardless of how many are present.

Scaffold basic proposed worker API in syft.js

The following is the proposed worker API in syft.js: https://github.com/OpenMined/Roadmap/blob/master/web_and_mobile_team/projects/federated_learning.md#4-execute

Migrate syft.js from Travis to Github Actions

We're switching from using Travis to Github actions (OpenMined/PySyft#3013). Please migrate this project to do so as well. It's important to also ensure that pull requests cannot be merged without an all-green CI report.

Allow for training state to be persisted to temporary storage in the event of a failure in syft.js

In the event that training is interrupted on a device or computer, we need to have a "saved state" of the training process that allows us to resume. Such interruptions could be:

An error in the training
User decides to intentionally pause training
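One way to picture the "saved state" is a small checkpoint helper that writes the current position and weights to storage and reads them back on resume. This is only a sketch: the function names, the storage key, and the checkpoint fields (epoch, batch, weights) are assumptions and not part of syft.js. The storage object is anything with getItem/setItem/removeItem, such as window.localStorage in a browser; a Map-backed stand-in is used here so the example runs anywhere.

```javascript
// Hypothetical checkpoint helper; none of these names come from syft.js.
// Stand-in for window.localStorage so the sketch also runs in Node.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => {
      data.set(key, String(value));
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}

const CHECKPOINT_KEY = 'syft-training-checkpoint'; // assumed key name

// Save enough information to resume: where we were and the current weights.
function saveCheckpoint(storage, { epoch, batch, weights }) {
  storage.setItem(CHECKPOINT_KEY, JSON.stringify({ epoch, batch, weights }));
}

// Return the saved state, or null when there is nothing to resume from.
function loadCheckpoint(storage) {
  const raw = storage.getItem(CHECKPOINT_KEY);
  return raw === null ? null : JSON.parse(raw);
}

// Clear the checkpoint once training finishes normally.
function clearCheckpoint(storage) {
  storage.removeItem(CHECKPOINT_KEY);
}

const storage = createMemoryStorage();
saveCheckpoint(storage, { epoch: 2, batch: 17, weights: [[0.1, 0.2], [0.3]] });
const resumed = loadCheckpoint(storage);
console.log(resumed.epoch, resumed.batch);
```

Real model weights are likely too large for a JSON string in localStorage, so a production version would probably need a different storage backend.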

Add bandwidth and Internet connectivity test in syft.js

We need to have some sort of way to run a basic bandwidth and Internet connectivity test in syft.js so that we may submit these values to PyGrid. This allows PyGrid to properly select candidates for pooling based on internet connection speed.

We must determine the average ping, upload speed, and download speed of the device and report these values to PyGrid.
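As a rough sketch of the arithmetic involved, the three values can be derived from raw timing samples as follows. The function names and the shape of the report object are assumptions; the actual measurement (sending pings, downloading and uploading test payloads) and the format PyGrid expects are not covered here.

```javascript
// Hypothetical helpers; names and report shape are assumptions.

// Arithmetic mean of a list of numbers.
function average(values) {
  if (values.length === 0) throw new Error('no samples');
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// Megabits per second for `bytes` transferred in `elapsedMs` milliseconds.
function toMbps(bytes, elapsedMs) {
  if (elapsedMs <= 0) throw new Error('elapsed time must be positive');
  return (bytes * 8) / 1e6 / (elapsedMs / 1000);
}

// Combine raw measurements into the three values the issue asks for.
function buildReport({ pingsMs, download, upload }) {
  return {
    ping: average(pingsMs),
    download: toMbps(download.bytes, download.elapsedMs),
    upload: toMbps(upload.bytes, upload.elapsedMs),
  };
}

const report = buildReport({
  pingsMs: [20, 30, 40],
  download: { bytes: 1000000, elapsedMs: 1000 },
  upload: { bytes: 500000, elapsedMs: 2000 },
});
console.log(report); // { ping: 30, download: 8, upload: 2 }
```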

Clean up MNIST example

Description

The MNIST example currently in the project could use some minor improvements. We need to do the following:

Remove references to grid.js
Remove references to "with-grid" and make everything labeled "MNIST"
Remove the current description and replace it with something describing the demo:

This is a demonstration of how to use syft.js with PyGrid to train a plan on local data in the browser.

Are you interested in working on this improvement yourself?

No, this is for @vvmnnnkv

Additional Context

None

Travis Support for Unit Tests

In Issue #11, we created a basic unit testing suite. In this issue, we want to support automated unit testing using Travis (with integration to Github) so that all submitted pull requests get tested automatically.

This should also include a Travis build badge on the Readme.

Add a description to the repository

Suggestions

Description: Private Deep Learning in JavaScript
Topics: Deep Learning; Privacy
Website: https://openmined.org

Convert Logger class to be a singleton

Right now, we have the Logger class as a class that you can create, but then you have to pass it around the library, or instantiate a new version each time. It would be a lot easier to convert this to be a singleton class. This would allow us to "instantiate" the Logger class multiple times, but only have one true reference to the first one we created. This also has a huge added benefit of not having to pass the original Logger instance (located in the syft.js file) around to each of the other files... annoying!

This is a really good first issue, any takers?
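For anyone picking this up, here is a minimal sketch of the singleton pattern described above. The constructor arguments (system, verbose) and the log method are assumptions for illustration; the real Logger class may look different.

```javascript
// Minimal singleton sketch; the real Logger's arguments and methods may differ.
class Logger {
  constructor(system = 'syft.js', verbose = false) {
    // Later `new Logger()` calls return the first instance unchanged.
    if (Logger.instance) {
      return Logger.instance;
    }

    this.system = system;
    this.verbose = verbose;
    Logger.instance = this;
  }

  // Build the log line and print it only in verbose mode.
  log(message) {
    const line = `${this.system}: ${message}`;
    if (this.verbose) console.log(line);
    return line;
  }
}

const first = new Logger('syft.js', true);
const second = new Logger(); // arguments are ignored, `first` is returned
console.log(first === second); // true
```

With this in place, any file can call new Logger() and get the instance that was configured first, so nothing needs to be passed around.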

Migrate syft.js to use Protobuf classes

We need to add the following Protobuf classes to syft.js as they are completed:

Deserialize from protobuf:

Serialize to protobuf:

Change all instances of Logger through the app to be Singleton model

We don't need to pass the Logger class as an argument to functions anywhere. Simply create it using new Logger() wherever needed. Change this in all associated files that use the Logger class.

Migrate all Serde tests to use live data from grid.js seed generator

The syft.js project could use a lot of help refining our current test suite.

Currently, all of our tests for Serde in Javascript run based on strings that we occasionally copy and paste from PySyft. However, these strings are now incredibly out of date and we're finding a lot of problems as the PySyft team changes things periodically. We need to figure out some sort of way to utilize the grid.js seed file generator (maybe by building one of our own in syft.js?) to automate the creation of these Serde simplified strings. In theory, this would notify us every time syft.js becomes out of date with PySyft... aka, insanely helpful.

This issue is a bit non-descript, so if you're interested in helping to write tests, please contact me (@cereallarceny) on Slack and I'll get you started!

Create an auto-generated API reference documentation

Where?

The entire project

Who?

Developers looking to implement the library in their application.

What?

We want to provide automatically generated API reference documentation so that developers using our library in their applications can read all publicly available methods and their signatures. This documentation is purely meant to be functional. It does not need to look pretty, but it does need to be very organized. Here is an example of what you should aim to reproduce.

Additional Context

Generate this file to the root of the project located at /API-REFERENCE.md
Set it to run the regeneration function on every commit and add it to the current commit
Add a reference to this file from the Usage section of the main readme

[syft.js] Force all insecure websocket connections to be secure

We need to ensure that all connections to a grid server are made via wss and NEVER ws. This could be as simple as including a string check against the URL supplied by the app.
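A string check of that kind could look like the sketch below. The function name ensureSecureUrl is hypothetical, and upgrading ws:// to wss:// (rather than rejecting it outright) is one possible policy, not a decision made in this issue.

```javascript
// Hypothetical helper: upgrade ws:// to wss:// and reject anything else
// that is not already wss://.
function ensureSecureUrl(url) {
  if (/^wss:\/\//i.test(url)) return url;
  if (/^ws:\/\//i.test(url)) return url.replace(/^ws:\/\//i, 'wss://');
  throw new Error(`Not a WebSocket URL: ${url}`);
}

console.log(ensureSecureUrl('ws://localhost:3000')); // wss://localhost:3000
console.log(ensureSecureUrl('wss://grid.example.org')); // unchanged
```

If silently upgrading is considered too permissive, the second branch can throw instead.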

Create syft.js tensor using PySyft over Web Sockets

This project combines #6 and #7. #6 is about being able to translate JSON messages into tensors. #7 is about being able to receive JSON messages from PySyft.

In this issue, we want to glue #6 and #7 together by being able to receive a JSON message from PySyft and initialize a tensor locally (logging it to the console).

Demo Website

In a folder "demo", create a demo website which imports syft.js and executes a function contained within syft.js. This website obviously won't do much yet (except import a library) but the idea is that it will eventually grow into our Federated Learning App.

Test all major features

Description

We want to test all major features of the federated learning process in this repo. This includes:

Type of Test

Unit test (e.g. checking a loop, method, or function is working as intended)
Integration test (e.g. checking if a certain group or set of functionality is working as intended)
Regression test (e.g. checking if by adding or removing a module of code allows other systems to continue to function as intended)
Stress test (e.g. checking to see how well a system performs under various situations, including heavy usage)
Performance test (e.g. checking to see how efficient a system is at performing the intended task)
Other...

Expected Behavior

We expect all the above tests to pass.

Additional Context

None

Optional - test syft.js with react native

Tensorflow.js is available for React Native:
https://blog.tensorflow.org/2020/02/tensorflowjs-for-react-native-is-here.html
It would be interesting to check if syft.js works in such a setting.

Execute protocols in syft.js

After plan execution has been finished in syft.js (#89), we'll need to implement protocols, as per the Protocol refactor (OpenMined/PySyft#2903).

Include Tensorflow.js dependency the right way

Problem

syft.js demo app throws several warnings during startup:

webgl backend was already registered. Reusing existing backend factory.
cpu backend was already registered. Reusing existing backend factory.
Platform browser has already been set. Overwriting the platform with [object Object].
syft.js/src/_helpers.js 33:6-23
"export 'hasOwnProperty' (imported as 'tf') was not found in '@tensorflow/tfjs'

Unit tests display lots of warnings like these:

    console.warn node_modules/@tensorflow/tfjs-core/dist/log.js:26
      
      ============================
      Hi there 👋. Looks like you are running TensorFlow.js in Node.js. To speed things up dramatically, install our node backend, which binds to TensorFlow C++, by running npm i @tensorflow/tfjs-node, or npm i @tensorflow/tfjs-node-gpu if you have CUDA. Then call require('@tensorflow/tfjs-node'); (-gpu suffix for CUDA) at the start of your program. Visit https://github.com/tensorflow/tfjs-node for more details.
      ============================

Expected behavior

1 - It seems that the TFJS library is being initialized multiple times. We prefer to produce a bundle where the library is NOT included in syft.js itself. Instead, it is expected to be loaded separately (like in the demo app).

2 - It would be nice to suppress these warnings, or use tfjs-node specifically for unit tests to avoid them.

Remote Addition

In Issue #9, we implemented functionality that allowed PySyft to send a tensor to syft.js, which is saved in a local dictionary called _objects. In this issue, we want PySyft to be able to send JSON to add two tensors together (which have previously been sent to syft.js).

This JSON command is sent automatically when two PySyft tensors are added together.

import torch

from syft.core.hooks import TorchHook
from syft.core.workers import WebSocketWorker

hook = TorchHook(local_worker=WebSocketWorker(id=0, port=1111, verbose=True))
remote_client = WebSocketWorker(hook=hook,id=2, port=1112, is_pointer=True, verbose=True)
hook.local_worker.add_worker(remote_client)
x = torch.FloatTensor([1,2,3,4,5]).send(remote_client)
x2 = torch.FloatTensor([1,2,3,4,4]).send(remote_client)
y = x + x2

.send is the functionality that already works as of Issue #9.

This command should add x and x2 together when the "y = x + x2" command is executed (which sends a JSON command to syft.js).

Add a stopping method that stops the training process in syft.js

We need to have a stopping method that will terminate the current job in question. Reasons for stopping training could be any of the following:

The user wanted to... like they clicked a "stop" button
The plan has an error and can't execute
The user started using their device again
The device loses wifi
The device loses active charging
Or perhaps most importantly... if the model isn't really going anywhere (the error rate isn't going down)

At this point, we should stop and notify the user with some sort of message.
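To make the idea concrete, here is a minimal sketch of a job that can be stopped with a reason and that notifies listeners when it happens. The Job class, its onStop and run methods, and the reason strings are all hypothetical; the real job API in syft.js may be shaped differently.

```javascript
// Hypothetical Job sketch; not the syft.js API.
class Job {
  constructor() {
    this.stopped = false;
    this.stopReason = null;
    this.listeners = [];
  }

  // Register a callback that receives the reason when the job stops.
  onStop(listener) {
    this.listeners.push(listener);
  }

  // Stop the job once; repeated calls are ignored.
  stop(reason) {
    if (this.stopped) return;
    this.stopped = true;
    this.stopReason = reason;
    this.listeners.forEach((listener) => listener(reason));
  }

  // Run steps in order until they are exhausted or stop() has been called.
  // Returns the number of steps that were executed.
  run(steps) {
    let completed = 0;
    for (const step of steps) {
      if (this.stopped) break;
      step(this);
      completed += 1;
    }
    return completed;
  }
}

const job = new Job();
const messages = [];
job.onStop((reason) => messages.push(`Training stopped: ${reason}`));

const completed = job.run([
  () => {}, // a normal training step
  (current) => current.stop('device lost wifi'), // a condition check fails
  () => {
    throw new Error('should not run');
  },
]);
console.log(completed, messages);
```

Each of the reasons listed above (stop button, plan error, device in use, lost wifi, lost charging, stalled error rate) would call stop with its own reason, and the listener is where the user-facing message gets shown.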

Security vulnerability in rollup-plugin-node-builtins 🛠️

Summary

The library rollup-plugin-node-builtins has two security vulnerabilities that are fixed upstream

Other details

Let's find something that's maintained

Write inline documentation for src/object-registry.js

Where?

src/object-registry.js

Who?

Any developer who wants to work on syft.js should be able to read inline documentation for this file, as well as any file in the project.

What?

We need to go "line-by-line" and create inline documentation for this file. Please try to use proper punctuation and English grammar to the best of your ability.

Additional Context

None

Migrate EventObserver, Logger, Serde, and constants to a different repo

These classes and constants are going to be used a lot in both syft.js as well as grid.js. Let's migrate these out to a different repository.

Test syft.js main file

The syft.js project could use a lot of help refining our current test suite.

We need to test the syft.js file as it currently stands. This issue is a bit non-descript, so if you're interested in helping to write tests, please contact me (@cereallarceny) on Slack and I'll get you started!

Write inline documentation for src/job.js

Where?

src/job.js

Who?

Any developer who wants to work on syft.js should be able to read inline documentation for this file, as well as any file in the project.

What?

We need to go "line-by-line" and create inline documentation for this file. Please try to use proper punctuation and English grammar to the best of your ability.

Additional Context

None

Don't send WEBRTC_PEER_LEFT twice

I believe we're already sending this message from grid.js. We should only send it once - I would imagine that grid.js is the best place to send this from, rather than from the client.

Write inline documentation for src/grid-api-client.js

Where?

src/grid-api-client.js

Who?

Any developer who wants to work on syft.js should be able to read inline documentation for this file, as well as any file in the project.

What?

We need to go "line-by-line" and create inline documentation for this file. Please try to use proper punctuation and English grammar to the best of your ability.

Additional Context

None

Connecting PySyft's WebSocketWorker to syft.js

In the PySyft Examples folder, you can see a demo where Python uses WebSockets to send and receive JSON commands.

In this project, we want to add the ability to receive JSON commands from a PySyft client and simply log them out to the screen.

Add minification and source mapping to the build system

Title is self-explanatory.

Create a better Readme

Based on this readme template, we should improve our readmes across all OpenMined projects.

More specifically, you should fill out the template at the minimum.

Don't worry about the logo, I'll get this to you.
Change all badges to reflect your repo, include other badges as desired, but use those at the minimum. You can generate more here: https://shields.io/
Change the title
Write a detailed description of what your library intends to accomplish. I would also advise that you provide links to the following papers: https://ai.googleblog.com/2017/04/federated-learning-collaborative.html, https://arxiv.org/pdf/1902.01046.pdf, https://research.google/pubs/pub47246/. I would also explain that the system is driven by developing a model in PySyft, hosting it in PyGrid, and then downloading it using a worker library. Be sure to also link to the other worker libraries that aren't yours, so we can cross-promote our work!
Fill out a list of features that your library supports. A suggested list includes: PySyft plan execution, optional third-party JWT authentication, wifi detection, charge detection, sleep/wake detection, protocols for secure aggregation (mark this as "in progress"), and a list of environments this library is expected to work in. That's a short list, you can add or remove what you please.
Installation section should be updated and specify the appropriate package manager (with a link to our deployment page on that package manager)
Usage section should be comprised of the implementation code for the MNIST example. Make sure to clean these up first. I've created an issue for this elsewhere - do that one first.
Fill out some basic contributing information to tell people how to run your library locally, what the local development instructions are, etc.
Fill out the list of contributors from All Contributors. Build this into your workflow and expect to use their Github issue commands in the future to make adding people to the readme easier.

Peer connections not closed

When called, the removePeer function only closes the data channel.

Shouldn't the peer connection also get closed when the worker leaves or is removed?
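A sketch of what closing both could look like is below. The shape of the peers collection (a Map of workerId to { channel, connection }) and the function signature are assumptions, not the existing removePeer code. In a browser, channel would be an RTCDataChannel and connection an RTCPeerConnection, both of which have a close() method; plain stand-in objects are used here so the example runs outside a browser.

```javascript
// Sketch only; the `peers` map shape ({ channel, connection }) is assumed.
function removePeer(peers, workerId) {
  const peer = peers.get(workerId);
  if (!peer) return false;

  if (peer.channel) peer.channel.close(); // RTCDataChannel.close()
  if (peer.connection) peer.connection.close(); // RTCPeerConnection.close()

  peers.delete(workerId);
  return true;
}

// Stand-ins that record which close() calls were made.
const closed = [];
const peers = new Map([
  [
    'worker-1',
    {
      channel: { close: () => closed.push('channel') },
      connection: { close: () => closed.push('connection') },
    },
  ],
]);

const removed = removePeer(peers, 'worker-1');
console.log(removed, closed);
```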

Create an example with a Webpack build

We need to create an identical example to the simple-example folder inside of examples. This will include the ability to load the library in via a build system like Webpack rather than a <script> tag. This shows a "real-world" version of how the library would be used.

Save Tensors in a dict locally

In Issue #8, we built functionality that allowed us to send a tensor from PySyft to syft.js and print it to the console. In this issue, we want to instead save that tensor to a dictionary with a unique ID (specified by the "id" attribute sent with the tensor). This dictionary should be called _objects.
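A minimal sketch of such a dictionary follows. Only the name _objects and the "id" attribute come from this issue; the helper function names are hypothetical. Removal is written so that tensors can be removed in any order.

```javascript
// Sketch of a local tensor store keyed by the "id" sent with each tensor.
const _objects = {};

// Store the received tensor message under its id and return that id.
function saveTensor(message) {
  _objects[message.id] = message;
  return message.id;
}

// Look up a stored tensor; returns undefined when the id is unknown.
function getTensor(id) {
  return _objects[id];
}

// Remove a tensor by id, regardless of insertion order.
function removeTensor(id) {
  if (!Object.prototype.hasOwnProperty.call(_objects, id)) return false;
  delete _objects[id];
  return true;
}

saveTensor({ id: 1476041248, data: [1.0, 2.0, 3.0] });
saveTensor({ id: 42, data: [4.0, 5.0] });

// Removing the first tensor that was added leaves the second one intact.
const removedFirst = removeTensor(1476041248);
console.log(removedFirst, getTensor(42).data);
```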

Bring ESLint and testing suite to pass

We currently have a failing testing suite because of the addition of ESLint to our codebase. Please fix all ESLint related issues and do your best to bring our codebase up to speed with tests.

Add Typescript definitions

https://stackoverflow.com/questions/53710368/how-do-i-add-typescript-types-to-a-javascript-module-without-switching-to-typesc

We're not planning on writing syft.js in Typescript, but it would be great to have Typescript definitions so that others may use this project within a TS environment, or allowing for better IDE integration with typing. Adding support for Typescript definitions here should also include tests to be written to prove that data being provided is of a certain type and nothing else. 😄

Add eslint support

We need help adding ESLint to syft.js so that we can more readily detect style issues in our code. Ideally, all configs would be set to error so that we have failing tests until the code is properly written.

This issue is a bit non-descript, so if you're interested in helping to configure ESLint properly, please contact me (@cereallarceny) on Slack and I'll get you started!

Make connection speed test optional

Reference this PyGrid issue: OpenMined/PyGrid-deprecated---see-PySyft-#557

Only run the speed test if the requires_speed_test field from the authentication response is true.

Do this: OpenMined/PyGrid-deprecated---see-PySyft-#557 (comment)
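In code, the check could be as small as the sketch below. Only the requires_speed_test field name comes from this issue; the function name, the way the speed test is passed in, and the shape of its result are assumptions.

```javascript
// Sketch: run the (possibly expensive) speed test only when the
// authentication response asks for it.
function connectionInfo(authResponse, runSpeedTest) {
  if (authResponse.requires_speed_test === true) {
    return runSpeedTest();
  }
  return null; // speed test skipped
}

// Fake speed test that counts how often it is called.
let calls = 0;
const fakeSpeedTest = () => {
  calls += 1;
  return { ping: 30, download: 8, upload: 2 };
};

const skipped = connectionInfo({ requires_speed_test: false }, fakeSpeedTest);
const measured = connectionInfo({ requires_speed_test: true }, fakeSpeedTest);
console.log(skipped, measured, calls);
```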

Add gpu.js as a dependency

In the long run, we will very likely want to use a separate tensor library for all our javascript tensor operation needs. The first one we'd like to evaluate is gpu.js (http://gpu.rocks/). So, to start, this issue is to import that library in as a dependency.

Implement the command translation layer inside of syft.js

The Threepio project is set to replace the convertToTF() function in syft.js. This function currently provides only a rudimentary, and very naive, conversion of commands from PyTorch to TFJS. Threepio will focus on a 3-way translation for the majority of the commands in PyTorch, TensorFlow, and TFJS.

Once Threepio is in a good state, we'll want to include this library inside of syft.js to allow for plan command conversion on the fly and in any direction we choose.

The following issues must be completed first:

[syft.js] Handle large messages in data channel

A data channel is limited to a certain message size (depending on the browser), with a safe maximum of around 16 KB.
Update webrtc.js's sendMessage to handle large binary messages, such as protobuf blobs.

This involves:

Split large messages into parts
Basic protocol for split data, like header and number of parts (+ we don't need to split small messages)
Control of bufferedAmount
Concat on the other side and error checking

Example:
https://webrtc.github.io/samples/src/content/datachannel/datatransfer/
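The split and concat steps above could be sketched roughly like this. The header layout (a 4-byte part index followed by a 4-byte part count) and the function names are assumptions, not an agreed protocol. The sketch does not tag parts with a message id, so it assumes one message is in flight at a time, and it leaves out the bufferedAmount control, which needs a live RTCDataChannel.

```javascript
// Sketch of splitting and rejoining a binary message; layout is assumed.
const MAX_CHUNK_SIZE = 16 * 1024; // "around 16kb" from the issue
const HEADER_SIZE = 8; // assumed: uint32 part index + uint32 part count

// Split `bytes` (a Uint8Array) into parts no larger than `maxSize`.
// A small message becomes a single part.
function splitMessage(bytes, maxSize = MAX_CHUNK_SIZE) {
  const payloadSize = maxSize - HEADER_SIZE;
  const total = Math.max(1, Math.ceil(bytes.length / payloadSize));
  const parts = [];

  for (let index = 0; index < total; index += 1) {
    const payload = bytes.subarray(index * payloadSize, (index + 1) * payloadSize);
    const part = new Uint8Array(HEADER_SIZE + payload.length);
    const view = new DataView(part.buffer);
    view.setUint32(0, index);
    view.setUint32(4, total);
    part.set(payload, HEADER_SIZE);
    parts.push(part);
  }

  return parts;
}

// Rebuild the original message from its parts, in any arrival order.
function joinMessage(parts) {
  if (parts.length === 0) throw new Error('no parts received');

  const total = new DataView(parts[0].buffer, parts[0].byteOffset).getUint32(4);
  if (parts.length !== total) {
    throw new Error(`expected ${total} parts, got ${parts.length}`);
  }

  const ordered = new Array(total);
  let length = 0;
  for (const part of parts) {
    const index = new DataView(part.buffer, part.byteOffset).getUint32(0);
    if (index >= total || ordered[index]) {
      throw new Error(`bad or duplicate part ${index}`);
    }
    ordered[index] = part.subarray(HEADER_SIZE);
    length += part.length - HEADER_SIZE;
  }

  const message = new Uint8Array(length);
  let offset = 0;
  for (const payload of ordered) {
    message.set(payload, offset);
    offset += payload.length;
  }
  return message;
}

const original = Uint8Array.from({ length: 40000 }, (_, i) => i % 251);
const parts = splitMessage(original);
const rebuilt = joinMessage(parts.slice().reverse());
console.log(parts.length, rebuilt.length);
```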

Deserialize JSON string into new tensor class

If you have a JavaScript variable:

message = {"torch_type": "torch.FloatTensor", "data": [1.0, 2.0, 3.0, 4.0, 5.0], "id": 1476041248, "owners": [0], "is_pointer": false}

We want a new javascript function called "receive_tensor" which will convert that JSON message into this object with all the attributes contained in the dictionary.

This object should have exactly 1 method "add" which can add two of these objects together. For example...

var x = receive_tensor(message)
var y = x.add(x)
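One possible shape for this function is sketched below. It assumes that "add" means element-wise addition of the data arrays, and it leaves the id of the result as null because the issue does not say how new ids should be assigned.

```javascript
// Sketch of receive_tensor; element-wise addition is an assumption.
function receive_tensor(message) {
  // Copy every attribute of the message onto the new object.
  const tensor = { ...message, data: message.data.slice() };

  // The single method: add two of these objects together.
  tensor.add = function add(other) {
    if (other.data.length !== tensor.data.length) {
      throw new Error('Tensors must have the same length');
    }
    const data = tensor.data.map((value, i) => value + other.data[i]);
    // How the result's id is chosen is not specified, so it is null here.
    return receive_tensor({ ...message, data, id: null });
  };

  return tensor;
}

const message = {
  torch_type: 'torch.FloatTensor',
  data: [1.0, 2.0, 3.0, 4.0, 5.0],
  id: 1476041248,
  owners: [0],
  is_pointer: false,
};

const x = receive_tensor(message);
const y = x.add(x);
console.log(y.data); // [ 2, 4, 6, 8, 10 ]
```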

Testing sockets

We really need some test coverage around our WebSocket code. Currently, this has tests written for it, but they're definitely out of date. Let's get 100% coverage across the board and then move on to testing our WebRTC integration.

Unit Testing Suite

Create an initial unit testing suite, create a demo unit test, and add the command to run unit tests to the Readme.md

Testing WebRTC

We cannot complete this until we have fully tested Sockets - reference #47

Either way, we need 100% test coverage on WebRTC. This is notoriously difficult to test as it relies on passing tests from our Socket integration as well. They're co-dependent. Separately, testing WebRTC relies on realistic mocking of a WebRTC environment... which is... hard.

Here's some next steps on reading materials:

Execute plans in syft.js

As of now, syft.js is entirely protocol based. Now that we have a new roadmap, we must shift focus to the execution of training plans. This will require initially translating a training plan's list of operations into operations that can be executed in TFJS. This work will be done for us by using Threepio.

Currently, the code we have in syft.js actually runs plans somewhat well, but they're always wrapped in a protocol. This old plan execution code will need to be migrated over after we develop the new API structure in syft.js (#87).

Configure Babel for Jest to work with ES6

Configure the .babelrc file so that Jest will work with ES6 syntax.
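A common setup, shown here as a sketch in the JavaScript config form (babel.config.js), is to use @babel/preset-env targeting the Node version that runs Jest. This assumes Babel 7 with @babel/preset-env installed as a dev dependency; the same presets entry can be written as JSON in .babelrc.

```javascript
// babel.config.js (sketch; assumes @babel/preset-env is installed)
module.exports = {
  presets: [
    // Compile ES6+ syntax for the Node version that is running Jest.
    ['@babel/preset-env', { targets: { node: 'current' } }],
  ],
};
```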

Change all instances of instanceId to be workerId

Self-explanatory
