DEPA for Training

DEPA for Training is a techno-legal framework that enables privacy-preserving sharing of bulk, de-identified datasets for large-scale analytics and training. This repository contains a reference implementation of Confidential Clean Rooms (CCR), which, together with the Contract Service, forms the basis of this framework. The reference implementation is provided on an as-is basis. It is a work in progress and should not be used in production.

Getting Started

GitHub Codespaces

The simplest way to set up a development environment is to use GitHub Codespaces. The repository includes a devcontainer.json, which customizes your codespace to install all required dependencies. Please ensure you allocate at least 64 GB of disk space to your codespace. Then run the following command in the codespace to update submodules.

git submodule update --init --recursive
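
To confirm that the submodules were actually checked out, one option is to inspect the output of git submodule status, which prefixes each uninitialized submodule with a - character. The sketch below is only an illustration: the helper name uninitialized_submodules is hypothetical and not part of this repository. It reads status lines on standard input and prints the paths that still need initializing.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Reads `git submodule status` lines on stdin and prints the path of every
# submodule whose line starts with "-", which git uses to mark a submodule
# that has not been initialized.
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Typical use from the root of the repository:
#   git submodule status --recursive | uninitialized_submodules
```

If the helper prints nothing, every submodule reported by git has been initialized.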

Local Development Environment

Alternatively, you can build and develop locally in a Linux environment (we have tested with Ubuntu 20.04 and 22.04), or Windows with WSL 2. Install the following dependencies.

  • docker and docker-compose. After installing docker, add your user to the docker group using sudo usermod -aG docker $USER, then log out and back in so that the new group membership takes effect.
  • make (install using sudo apt-get install make)
  • Python 3.6.9 and pip
  • Python wheel package (install using pip install wheel)
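
Before building, it can help to check that the tools listed above are on your PATH. The following is a minimal sketch, not part of the repository: the function name missing_tools is hypothetical, and the command names python3 and pip are assumptions about how Python is installed on your system. It only checks for presence and does not verify versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Prints the name of each tool in the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools from the dependency list; python3 and pip are assumed command names.
missing="$(missing_tools docker docker-compose make python3 pip)"
if [ -n "$missing" ]; then
    echo "Missing tools:"
    echo "$missing"
else
    echo "All listed tools were found on PATH."
fi
```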

Clone this repo as follows.

git clone --recursive https://github.com/iSPIRT/depa-training

Build CCR containers

To build your own CCR container images, use the following command from the root of the repository.

./ci/build.sh

This script builds the following containers.

  • depa-training: Container with the core CCR logic for joining datasets and running differentially private training.
  • depa-training-encfs: Container for loading encrypted data into the CCR.

Alternatively, you can use pre-built container images from the ispirt repository by setting the following environment variable.

export CONTAINER_REGISTRY=ispirt

Scenarios

This repository contains two samples that illustrate the kinds of scenarios DEPA for Training can support.

Follow these links to build and deploy these scenarios.

Contributing

This project welcomes feedback and contributions. Before you start, please take a moment to review our Contribution Guidelines. These guidelines provide information on how to contribute, set up your development environment, and submit your changes.

We look forward to your contributions and appreciate your efforts in making DEPA for Training better for everyone.

depa-training's Issues

Support deployments on AWS

Support deployments on AWS using Nitro Enclaves. The main challenge to solve is the ability to share large encrypted training datasets in Nitro Enclaves.
