Ektelo

Ektelo is an operator-based framework for implementing privacy algorithms. It was first presented at SIGMOD 2018; in the documentation below, that publication is referred to as the "Ektelo paper."

Licensed under Apache License, Version 2.0.

Overview

Architecture

There are two complementary objectives of the Ektelo project:

  1. Isolate private interactions with data in a compact and secure kernel.
  2. Modularize privacy-related algorithms into operators, which promote code reuse and assist in keeping the kernel compact and secure.

The layout of the Ektelo repository reflects these goals. Code that is intended to run on a private server is found in the module ektelo/private, while non-private, client code is located in the module ektelo/client. We assume that the kernel will be set up on a private server by an entity with access to the unaltered, private data. Along with the kernel, a kernel service responsible for servicing client requests must also be set up on the server. On the client side, a privacy engineer creates a protected data source, which mediates all interactions with the kernel via communication with the kernel service.

Ektelo is designed to support interactive data queries from the privacy engineer to the kernel. To do so, a separate kernel instance is instantiated with a specific privacy budget for every user. At the kernel, the total privacy expenditure is tracked for each query according to Algorithm 6 in the Ektelo paper. User queries are serviced until the budget has been exceeded. At that point, a BudgetExceeded error is sent back to the user.
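The budget behavior described above can be illustrated with a minimal sketch. This is not Ektelo's kernel and does not reproduce Algorithm 6 from the Ektelo paper; it assumes plain sequential composition (each query's epsilon is added to a running total) and refuses any query that would push the total past the budget. The class BudgetedKernel and its methods are hypothetical names chosen for illustration; only the BudgetExceeded name comes from the text above.

```python
import random


class BudgetExceeded(Exception):
    """Raised when a query would push total spending past the budget."""


class BudgetedKernel:
    """Hypothetical stand-in for a per-user kernel (not Ektelo's API)."""

    def __init__(self, data, budget):
        self._data = list(data)   # private data never leaves this object
        self._budget = budget     # total epsilon granted to this user
        self._spent = 0.0         # running total under sequential composition

    def remaining(self):
        return self._budget - self._spent

    def noisy_count(self, epsilon, rng):
        """Answer a counting query with Laplace noise of scale 1/epsilon."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self._spent + epsilon > self._budget:
            raise BudgetExceeded("query would exceed the privacy budget")
        self._spent += epsilon
        # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        return len(self._data) + noise


if __name__ == "__main__":
    kernel = BudgetedKernel(data=[3, 1, 4, 1, 5], budget=1.0)
    rng = random.Random()
    print(kernel.noisy_count(0.5, rng))
    print(kernel.noisy_count(0.5, rng))
    try:
        kernel.noisy_count(0.5, rng)
    except BudgetExceeded as err:
        print("refused:", err)
```

The check happens before any noise is drawn, so a refused query consumes no budget and reveals nothing about the data. Python's random module is used only to keep the sketch self-contained; it is not suitable for real privacy guarantees.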

Examples

  1. File examples/cdf_estimator.py provides an example of the entire Ektelo workflow. This example aligns with Algorithm 1 from the Ektelo paper.
  2. File examples/standalone_plan.py provides an example of a previously published algorithm expressed as an Ektelo plan consisting of a sequence of Ektelo operators. The algorithm in this case is MWEM (Hardt et al. "A Simple and Practical Algorithm for Differentially Private Data Release." NIPS 2012). Note that this example excludes the layer that manages the interaction between client code and the protected kernel. While removing this layer makes it easier to trace the plan, it also removes the privacy protection (i.e., the variable R corresponds to the input dataset, so adding print(R) would result in full disclosure of the "private" input). We imagine that writing Ektelo plans in this "stripped down" form may be useful for privacy researchers who are designing new algorithms and only executing on non-sensitive inputs.
  3. File examples/private_plan.py is the same as the previous example (standalone_plan.py) except that it includes the layer that manages client-kernel interaction. In this example, any interactions with the private data are mediated by the kernel, which will ensure protection. In particular, the R variable is now a ProtectedDataSource and invoking a method on R will trigger an interaction with the kernel. This example illustrates how a complex differentially private algorithm can be executed via client calls to the protected kernel.
  4. File examples/budget_exceeded.py provides an example of a client-kernel interaction that produces a BudgetExceeded error.

Examples 2 and 3 above illustrate the MWEM algorithm written as an Ektelo plan. Other algorithms from the literature have also been written as plans in two places: plans/standalone.py and plans/private.py. The standalone plans exclude the client-kernel layer (similar to example 2 above) and the private plans include it (similar to example 3 above).
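To make the idea of a plan as a sequence of operators concrete, here is a minimal standalone sketch in the spirit of example 2. It does not use Ektelo's operator classes, and it is far simpler than MWEM. All function names are hypothetical. The sketch assumes records are integers in the range 0 to domain_size - 1 and that neighboring datasets differ by adding or removing one record, so the histogram has L1 sensitivity 1.

```python
import random


def vectorize(records, domain_size):
    """Transformation step: turn integer records into a histogram."""
    hist = [0] * domain_size
    for r in records:
        hist[r] += 1
    return hist


def laplace_measure(hist, epsilon, rng):
    """Measurement step: add Laplace noise of scale 1/epsilon to each count.

    Assumes L1 sensitivity 1 (add/remove one record).
    """
    # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    return [c + rng.expovariate(epsilon) - rng.expovariate(epsilon)
            for c in hist]


def nonneg_inference(noisy):
    """Inference step: clip negative estimates to zero (post-processing)."""
    return [max(0.0, v) for v in noisy]


def simple_plan(records, domain_size, epsilon, seed=None):
    """Hypothetical plan: transform, then measure, then infer."""
    rng = random.Random(seed)
    x = vectorize(records, domain_size)
    y = laplace_measure(x, epsilon, rng)
    return nonneg_inference(y)


if __name__ == "__main__":
    print(simple_plan([0, 1, 1, 3], domain_size=4, epsilon=1.0))
```

As in the standalone examples, nothing here protects the input: records is an ordinary list that the caller can inspect directly, so a plan written this way should only be run on non-sensitive data.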

Setup

Example Environment

export EKTELO_HOME=$HOME/Documents/ektelo
export EKTELO_DATA=/tmp/ektelo
export PYTHON_HOME=$HOME/virtualenvs/PyEktelo
export PYTHONPATH=$PYTHONPATH:$EKTELO_HOME
export EKTELO_LOG_PATH=$HOME/logs
export EKTELO_LOG_LEVEL=DEBUG

System dependencies

Various system-level packages are necessary to meet the requirements for third-party Python modules installed during initialization. The dependencies vary by platform. It is strongly recommended to use Python 3.6 or higher.

Ubuntu 16.04 Packages

sudo apt-get install gfortran liblapack-dev libblas-dev
sudo apt-get install libpq-dev python3-dev libncurses5-dev swig glpk

OSX packages

brew install swig

Initialization

Be sure to set up the environment (described above) first. You will also need the system packages listed under System dependencies; the apt-get commands given there should work on Debian-based systems.

Next, create a virtual environment for python by entering the commands below.

mkdir $EKTELO_LOG_PATH
python3 -m venv $PYTHON_HOME
source $PYTHON_HOME/bin/activate
cd $EKTELO_HOME
pip install -r resources/requirements.txt

Note: We recommend installing the Python modules at the versions specified in resources/requirements.txt. However, if you are running a Python version newer than 3.6, you may need to raise the module versions as well. This can be accomplished by replacing == with >= in the requirements file.
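The == to >= replacement mentioned in the note can be done by hand or with a short script. The sketch below is one possible way to do it; the helper name relax_pins is hypothetical, and the package names and versions in the sample are made up for illustration rather than taken from resources/requirements.txt. Comment lines and blank lines are left untouched.

```python
def relax_pins(requirements_text):
    """Replace exact pins (==) with minimum-version constraints (>=)."""
    out = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments; rewrite only the first '==' per line.
        if stripped and not stripped.startswith("#"):
            line = line.replace("==", ">=", 1)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Made-up sample content, not the project's actual requirements.
    sample = "# pinned versions\nexamplepkg==1.2.3\notherpkg==0.9"
    print(relax_pins(sample))
```

To apply it to the real file, read resources/requirements.txt, pass its contents through relax_pins, and write the result to a new file so the original pins remain available for comparison.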

The data must be downloaded into the $EKTELO_DATA folder.

mkdir -p $EKTELO_DATA
curl https://www.dpcomp.org/data/cps.csv > $EKTELO_DATA/cps.csv
curl https://www.dpcomp.org/data/stroke.csv > $EKTELO_DATA/stroke.csv

Finally, with the virtual environment activated, compile the C libraries as follows.

cd $EKTELO_HOME/ektelo/algorithm
./setup.sh

Session

Once initialization has been run, the virtual environment can be restored with the following command.

source $PYTHON_HOME/bin/activate

Testing

Execute the following in the base of the repository.

cd $EKTELO_HOME
nosetests

To run a specific test class (in this case, TestData):

nosetests test.unit.test_data:TestData
