docs.smartnoise.org's Introduction

OpenDP

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

License: MIT

The OpenDP Library is a modular collection of statistical algorithms that adhere to the definition of differential privacy. It can be used to build applications of privacy-preserving computations, using a number of different models of privacy. OpenDP is implemented in Rust, with bindings for easy use from Python and R.

The architecture of the OpenDP Library is based on a conceptual framework for expressing privacy-aware computations. This framework is described in the paper A Programming Framework for OpenDP.

The OpenDP Library is part of the larger OpenDP Project, a community effort to build trustworthy, open source software tools for analysis of private data. (For simplicity in these docs, when we refer to “OpenDP,” we mean just the library, not the entire project.)

Status

OpenDP is under development, and we expect to release new versions frequently, incorporating feedback and code contributions from the OpenDP Community. It's a work in progress, but it can already be used to build some applications and to prototype contributions that will expand its functionality. We welcome you to try it and look forward to feedback on the library! However, please be aware of the following limitations:

OpenDP, like all real-world software, has both known and unknown issues. If you intend to use OpenDP for a privacy-critical application, you should evaluate the impact of these issues on your use case.

More details can be found in the Limitations section of the User Guide.

Installation

Install OpenDP for Python with pip (the package installer for Python):

$ pip install opendp
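
A quick way to confirm that the install worked is to ask Python which version of the package it can see. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration and is not part of OpenDP.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of the OpenDP API.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("opendp")
    if version:
        print(f"opendp version: {version}")
    else:
        print("opendp is not installed in this environment")
```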

Install OpenDP for R from an R session:

install.packages("opendp", repos = "https://opendp.r-universe.dev")

More information can be found in the Getting Started section of the User Guide.

Documentation

The full documentation for OpenDP is located at https://docs.opendp.org. Here are some helpful entry points:

Getting Help

If you're having problems using OpenDP, or want to submit feedback, please reach out! Here are some ways to contact us:

Contributing

OpenDP is a community effort, and we welcome your contributions to its development! If you'd like to participate, please contact us! We also have a contribution process section in the Contributor Guide.

docs.smartnoise.org's Issues

Start "Discussions" on opendp/opendp

Notes regarding creating a GitHub Discussions Forum on the opendp/opendp repository.

  • Not ideal in that it's at the repository rather than org level.
  • The preference is to only have a single forum for the opendp org
  • Given team resources, it's more manageable to keep everything within GitHub

Tasks:

  • Add GitHub Discussion + initial topics
    • OpenDP Library
    • Differential Privacy (Theory/Practice)
    • SmartNoise Library, SQL, synthesizers
    • Getting Involved
  • Update repository README page to link to the discussion. (Also remove Gitter links when appropriate)
    • docs.opendp.org - pull request #39
    • smartnoise-sdk
    • smartnoise-core
    • smartnoise-core-python
    • smartnoise-samples
    • opendp
    • dpcreator
  • Gitter
    • Link to the new discussions forum

Switch Slack orgs

  • Switch all channels to the OpenDP workspace from the TwoRavens workspace.

ability to preview HTML from pull requests online

@raprasad and I talked about how it would be nice to be able to preview the HTML from pull requests online while reviewing them. This would save reviewers the trouble of building the branch locally.

I asked about this in Write the Docs Slack ( https://www.writethedocs.org/slack/ ) today and was told that Netlify is a good option and the free tier might be enough for us.

Here's a pretty compelling gif from https://www.netlify.com/blog/2016/07/20/introducing-deploy-previews-in-netlify/

More details at https://www.netlify.com/products/deploy-previews/

The Toolsday podcast is always saying good things about Netlify, most recently at https://spec.fm/podcasts/toolsday/wKcuYe6l

Convert this repo to SmartNoise docs only

  • The main OpenDP library docs will live in opendp/opendp and be built via CI
  • "Downsize" these docs to only the SmartNoise section
  • Point to the docs.smartnoise.org url
  • Add a link from docs.opendp.org to docs.smartnoise.org - see opendp/opendp#240
  • Rename repository to smartnoise-docs docs.smartnoise.org

Add Announcements section to GH Discussions

Include:

  • SmartNoise: Mayana intro presentation for DAO (check with Mayana/Scott)
  • Ethan/Mayana: Classroom lesson on DP
  • Danny: DV Community presentation
  • Community Meeting Dates
  • OpenDP Fellow program (has started)
    • Check with Annie re: listing projects/names
  • Theory/Practice
    • Programming Framework paper;
    • Ask Andy re: slides

  • OpenDP library (check with Andy/Mike)

Announce

  • mailing list
  • tweet

Resources page in User Guide

Today @raprasad brought up to me and @anniewu332 that it might be nice to list some resources on differential privacy (not specific to OpenDP). For now, Raman and I are thinking there could be a "Resources" page in the User Guide.

Please feel free to edit the following list:

builds from GitHub Actions don't fail on warnings

First, the reason to fail on warnings is that otherwise various content can go missing without anyone realizing it. Tables are especially prone to this. In #32 we noticed that two autogenerated pages were empty.

Local builds (make html) already fail on warnings.

Builds from GitHub Actions (make versions) don't.

Note that even with builds failing on warnings we still have the problem of not knowing whether pull requests will break the build. #24 is about having an HTML preview of pull requests so that one can visually inspect changes. I suppose we could roll the question "does the PR break the build?" into that issue. Or perhaps we could have a separate GitHub Actions job just to know if the build is broken or not (without an HTML preview)?

For now, by switching the build to fail on warnings we'll at least prevent oddness on the live site after a bad pull request is merged. It's a step forward.
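
As a rough illustration of what "fail on warnings" means for a plain Sphinx build, the sketch below assembles a `sphinx-build` command with `-W` (the same flag the local `make html` output in this tracker shows). The function names, the `source` and `build/html` paths, and the use of `--keep-going` are assumptions for the example; how `make versions` passes flags through is not shown here.

```python
import subprocess
from typing import List


def sphinx_build_command(source: str, output: str, fail_on_warnings: bool = True) -> List[str]:
    """Assemble a sphinx-build command line (hypothetical helper).

    -W turns warnings into errors; --keep-going lets Sphinx report every
    warning before exiting with a non-zero status.
    """
    cmd = ["sphinx-build", "-b", "html"]
    if fail_on_warnings:
        cmd += ["-W", "--keep-going"]
    cmd += [source, output]
    return cmd


def run_build(source: str = "source", output: str = "build/html") -> int:
    """Run the build and return its exit code (non-zero if any warning occurred)."""
    return subprocess.run(sphinx_build_command(source, output)).returncode
```

A CI job could call `run_build()` and fail the workflow whenever the return value is non-zero.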

Document vetting process for OpenDP library contributions

This week we had a meeting to discuss the vetting process for contributing to the OpenDP library. Of particular concern is the contribution of new algorithms, which would need to be vetted by someone with the appropriate expertise in mathematics and proofs.

We decided to review the contribution guidelines for a few open source projects that we feel are similar in spirit and that also need extended vetting, not just of code but of mathematical formulas. Specifically, we had in mind libraries that implement cryptography.

Using a list from Wikipedia as a starting point, I looked at three cryptography libraries I'd heard of and one (Crypto++) that caught my eye.

To summarize my findings:

  • From what I can tell, Bouncy Castle doesn't strongly encourage contributions. (I couldn't find a contributing guide on the Bouncy Castle website.)
  • Both OpenSSL and NSS encourage contribution but don't put particular emphasis on cryptographic code being different than other parts of the code. (See the OpenSSL contributing guide and the NSS code review checklist.)
  • Crypto++ at least indicates that contributions of features and enhancements can be "time consuming because algorithms and their test cases need to be reviewed and merged." (See "Source Code and Contributions" on the Crypto++ website.)

(If we'd like to review additional libraries, I started a spreadsheet for the ones above.)

In the course of this review, I came across FIPS 140 which is an official form of vetting from the U.S. government for cryptography. I don't believe we are looking for anything this formal.

Like Crypto++, OpenDP should communicate that contributing algorithms will get extra scrutiny. Other projects like OpenSSL don't seem to put emphasis on this point but it doesn't appear to be necessary.

convert opendp.io emails to opendp.org

Move emails as shown:
https://docs.google.com/spreadsheets/d/1YnOtvQkql4880kk2jhFXlW5isj3NXcqiLEs90RAJ8Is/edit#gid=0

(admin task, doesn't actually belong to this repo)

Does the pull request build?

Distinct from "ability to preview HTML from pull requests online" (#24), we should be able to know ahead of time whether a pull request is going to build. That is, do all the requirements install? Does make html work?

Currently reviewers of pull requests do these checks manually, which is tedious.

I would think we'd be able to tweak our existing GitHub Actions script added in #9 to run the build on all branches but only push to GitHub Pages if it's the "latest" branch? That is, if the pull request has been merged or if the default branch has been updated directly.

I'm open to other ideas, of course!
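
To make the "build everywhere, publish only from the default branch" idea concrete, here is a minimal sketch of the decision a workflow step could make. It relies on the `GITHUB_EVENT_NAME` and `GITHUB_REF` environment variables that GitHub Actions sets; the function name `should_deploy` and the assumption that the default branch is called "main" are illustrative only.

```python
import os


def should_deploy(event_name: str, ref: str, default_branch: str = "main") -> bool:
    """Return True only for pushes to the default branch (hypothetical helper).

    Pull requests and pushes to other branches still get built,
    but nothing is published to GitHub Pages for them.
    """
    return event_name == "push" and ref == f"refs/heads/{default_branch}"


if __name__ == "__main__":
    # GitHub Actions sets these variables for every workflow run.
    event = os.environ.get("GITHUB_EVENT_NAME", "")
    ref = os.environ.get("GITHUB_REF", "")
    print("build and deploy" if should_deploy(event, ref) else "build only")
```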

Cannot build docs: AttributeError: module 'opendp.smartnoise.synthesizers' has no attribute 'QUAILSynthesizer'

This relates to the automated building of Python docs added in pull request #13.

I'm not sure where the QUAILSynthesizer error comes from. I just did a fresh install and here are the versions I'm using:

Successfully installed Jinja2-3.0.1 MarkupSafe-2.0.1 Pygments-2.9.0 alabaster-0.7.12 antlr4-python3-runtime-4.8 babel-2.9.1 beautifulsoup4-4.9.3 certifi-2021.5.30 chardet-4.0.0 docutils-0.16 greenlet-1.1.0 idna-2.10 imagesize-1.2.0 isodate-0.6.0 msrest-0.6.21 numpy-1.20.3 oauthlib-3.1.1 opendp-smartnoise-0.1.4 opendp-smartnoise-core-0.2.2 packaging-20.9 pandas-1.2.4 pandasql-0.7.3 patsy-0.5.1 protobuf-3.17.3 pydata-sphinx-theme-0.6.3 pyparsing-2.4.7 python-dateutil-2.8.1 pytz-2021.1 pyyaml-5.4.1 requests-2.25.1 requests-oauthlib-1.3.0 scipy-1.6.3 six-1.16.0 snowballstemmer-2.1.0 soupsieve-2.2.1 sphinx-3.5.2 sphinx-multiversion-0.2.4 sphinxcontrib-applehelp-1.0.2 sphinxcontrib-devhelp-1.0.2 sphinxcontrib-htmlhelp-2.0.0 sphinxcontrib-jsmath-1.0.1 sphinxcontrib-qthelp-1.0.3 sphinxcontrib-serializinghtml-1.1.5 sqlalchemy-1.4.17 statsmodels-0.12.2 urllib3-1.26.5

I poked around in https://github.com/opendp/smartnoise-sdk/commits/main and https://pypi.org/project/opendp-smartnoise/#history but I'm not sure what changed.

Here's the full output of the error:

(venv) HMDC-beamish:opendp-documentation pdurbin$ make html
sphinx-build -W -D 'html_sidebars.**'=search-field.html,sidebar-nav-bs.html source build/html
Running Sphinx v3.5.2
*****************************************
/private/tmp/sdafssda/opendp-documentation/source/..
/private/tmp/sdafssda
/private/tmp/sdafssda/opendp-documentation/venv/bin
/usr/local/Cellar/[email protected]/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python39.zip
/usr/local/Cellar/[email protected]/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9
/usr/local/Cellar/[email protected]/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload
/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages
*****************************************
ModuleSpec(name='opendp.smartnoise', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10e6dffd0>, origin='/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/opendp/smartnoise/__init__.py', submodule_search_locations=['/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/opendp/smartnoise'])
*****************************************
making output directory... done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 28 source files that are out of date
updating environment: [new config] 28 added, 0 changed, 0 removed
reading sources... [ 50%] smartnoise/api-reference/opendp.smartnoise.core.compon
reading sources... [ 64%] smartnoise/api-reference/opendp.smartnoise.synthesizer
reading sources... [ 67%] smartnoise/api-reference/opendp.smartnoise.synthesizer
reading sources... [100%] user/related-projects

Warning, treated as error:
autodoc: failed to import class 'QUAILSynthesizer' from module 'opendp.smartnoise.synthesizers'; the following exception was raised:
Traceback (most recent call last):
  File "/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/sphinx/util/inspect.py", line 393, in safe_getattr
    return getattr(obj, name, *defargs)
AttributeError: module 'opendp.smartnoise.synthesizers' has no attribute 'QUAILSynthesizer'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 111, in import_object
    obj = attrgetter(obj, mangled_name)
  File "/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 320, in get_attr
    return autodoc_attrgetter(self.env.app, obj, name, *defargs)
  File "/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 2604, in autodoc_attrgetter
    return safe_getattr(obj, name, *defargs)
  File "/private/tmp/sdafssda/opendp-documentation/venv/lib/python3.9/site-packages/sphinx/util/inspect.py", line 409, in safe_getattr
    raise AttributeError(name) from exc
AttributeError: QUAILSynthesizer

make: *** [html] Error 2
(venv) HMDC-beamish:opendp-documentation pdurbin$ 
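
One way to narrow down an error like this, independent of Sphinx, is to import the module directly and check which of the names the docs expect are actually present in the installed release. The helper below is only a sketch; `missing_attributes` is a made-up name, not something autodoc or SmartNoise provides.

```python
import importlib
from typing import Iterable, List


def missing_attributes(module_name: str, names: Iterable[str]) -> List[str]:
    """Import `module_name` and return the names it does not define.

    Hypothetical diagnostic helper: autodoc raises the AttributeError above
    when a documented name is absent from the installed module.
    """
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]
```

In the environment from the log above, calling `missing_attributes("opendp.smartnoise.synthesizers", ["QUAILSynthesizer"])` would be expected to return the class name, which would confirm that the installed package and the docs source disagree.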

"hidden features" documentation

  • List of features in the code such as other ways to input datasets, extra arguments that allow re-sampling with amplification, etc.
  • This can be a non-exhaustive list to start

Support docs for previous versions (and "latest", "stable", etc.) and languages other than English

It's fairly common for software projects to provide documentation for previous versions like this:

Likewise, often you can see the most recent unreleased version of the docs at "master" (indicating the default branch):

It's also nice to have "stable" indicate the most recent release:

Finally, it's nice to have translations available (if you have translators):

All of this makes me think that we should add "en" and "main" (for our default branch) for now like this:

That way, we have a namespace in which to grow for versions and languages.

My thinking is heavily influenced by Read the Docs, which has a nice write up about versions at https://docs.readthedocs.io/en/stable/versions.html and languages at https://docs.readthedocs.io/en/stable/guides/manage-translations.html

Read the Docs also has a nice picker that lets you switch between versions and languages:

[Screenshot: Read the Docs version and language picker]

For now, I don't think we should worry about languages apart from using "en" to give us room in the URL for other languages. I did play a bit with sphinx-intl but actually getting it all set up seems like overkill right now.

For versions I've been playing with sphinx-multiversion and it seems promising.
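
The language-then-version namespace proposed above can be sketched as a small URL builder. This is purely illustrative: the function name `docs_url` is hypothetical, and the base URL is borrowed from the docs.opendp.org links that appear elsewhere in this tracker.

```python
def docs_url(base: str, language: str = "en", version: str = "main", page: str = "") -> str:
    """Build a docs URL of the form <base>/<language>/<version>/<page>.

    Hypothetical helper showing how "en" and "main" leave room for
    other languages and versions later on.
    """
    parts = [base.rstrip("/"), language, version]
    if page:
        parts.append(page.lstrip("/"))
        return "/".join(parts)
    # No page given: return the root of that language/version with a trailing slash.
    return "/".join(parts) + "/"


if __name__ == "__main__":
    print(docs_url("https://docs.opendp.org"))
    print(docs_url("https://docs.opendp.org", version="stable", page="index.html"))
```

Adding a new language or a tagged release then only changes one path segment, so existing links keep working.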

Contact page, GitHub Discussions vs Google Group

We now have a new "Contact" page at https://docs.opendp.org/en/latest/contact/index.html that looks like this:

[Screenshot: the new Contact page]

@raprasad and I discussed briefly if we should go ahead and add the "opendp-community" Google Group to this page. This: https://groups.google.com/a/g.harvard.edu/g/opendp-community

The question I have is how to explain the difference between GitHub Discussions and Google Groups. When should one use one or the other? Does it matter?

For now I wrote, "A great way to get in contact with the OpenDP community is GitHub Discussions."

A simple fix would be to mention both, as in, "A great way to get in contact with the OpenDP community is GitHub Discussions or the opendp-community Google Group."

Alternatively, we could try to explain when it's best to post to one or the other.

I should also note the following:

  • Once a security email (and process) has been set up in #6 we should add the address to this Contact page. (For now, the Contact page suggests sending security issues to the "info" address.)
  • We don't currently suggest contacting us through GitHub Issues, but we could.
  • https://groups.google.com/a/g.harvard.edu/g/opendp-community says, "If you wish to send a private message to the OpenDP Executive Committee, please email [email protected] instead." We could add this to the Contact page as well.

Thoughts and suggestions are certainly welcome! 😄

Broken link

The link to docs.opendp.org on this page is broken

unable to preview changes

When I introduced the sphinx-multiversion extension in e2f5e5f as part of #15, it had the unintended consequence of preventing us from previewing edits that haven't been committed yet.

That is to say, the extension works by checking out each branch or tag so you only see what's been committed, not what you just edited.
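
One possible workaround, sketched below, is to fall back to a plain `sphinx-build` whenever the working tree has uncommitted edits, and to use `sphinx-multiversion` only when everything is committed. This is an assumption about how the problem could be handled, not what the repository does; the helper names and the `source` and `build/html` paths are illustrative.

```python
import subprocess
from typing import List


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` reported at least one changed path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def choose_build_command(porcelain_output: str, source: str = "source",
                         output: str = "build/html") -> List[str]:
    """Pick a builder (hypothetical helper).

    sphinx-multiversion only sees committed content, so uncommitted edits
    are previewed with a plain sphinx-build of the working tree instead.
    """
    if has_uncommitted_changes(porcelain_output):
        return ["sphinx-build", "-W", source, output]
    return ["sphinx-multiversion", source, output]


def git_status() -> str:
    """Return the machine-readable status of the current repository."""
    result = subprocess.run(["git", "status", "--porcelain"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Inside a checkout, `choose_build_command(git_status())` returns the command to run for the current state of the working tree.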
