
binderhub's Introduction


What is BinderHub?

BinderHub lets you BUILD and REGISTER a Docker image from a Git repository, then CONNECT it with JupyterHub, creating a public IP address through which users can interact with the code and environment in a live JupyterHub instance. You can select a specific branch name, commit, or tag to serve.

BinderHub ties together:

  • JupyterHub, which provides a scalable system for authenticating users and spawning single-user Jupyter Notebook servers, and
  • Repo2Docker, which generates a Docker image from a Git repository hosted online.

BinderHub is built with Python, Kubernetes, Tornado, npm, webpack, and Sphinx.

Documentation

For more information about the architecture, use, and setup of BinderHub, see the BinderHub documentation.

Contributing

To contribute to the BinderHub project you can work on:

To see how to build the documentation, edit the user interface, or modify the code, see the contribution guide.

Installation

BinderHub is based on Python 3. It is currently only kept up to date on GitHub, but it can be installed from there using pip:

pip install git+https://github.com/jupyterhub/binderhub

See the BinderHub documentation for a detailed guide on setting up your own BinderHub server.

Why BinderHub?

Collections of Jupyter notebooks are becoming more common in scientific research and data science. The ability to serve these collections on demand enhances the usefulness of these notebooks.

Who is BinderHub for?

  • Users who want to easily interact with computational environments that others have created.
  • Authors who want to create links that allow users to immediately interact with a computational environment that they specify.
  • Deployers who want to create their own BinderHub to run on whatever hardware they choose.

License

See the LICENSE file in this repository.

binderhub's People

Contributors

betatim, bitnik, bl-aire, captainsafia, carreau, choldgraf, consideratio, damli40, dependabot[bot], georgianaelena, gladysnalvarte, hugokerstens, hydrosquall, jupyterhub-bot, kpaschen, manics, mariusvniekerk, miniland1333, minrk, mriduls, nuest, pablobernabeu, pre-commit-ci[bot], sblack-usu, sgaist, sgibson91, u10313335, willingc, xhochy, yuvipanda


binderhub's Issues

URL Type: Zip files

This is a future feature idea. Some of the sphinx-gallery devs said that it'd be useful if Binder supported zipfiles that basically had the full contents of a github repository.

So you could give binder a URL to a zip file, and the inside of that file is the contents of a github repository. Binder would unzip first, and then treat the contents the same way as any other repo. Doesn't seem too tough to implement but I wonder if there would be security issues?
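On the security question, one well-known risk with user-supplied archives is an entry whose path escapes the extraction directory. The following is a minimal sketch, not BinderHub code: `safe_extract` is a hypothetical helper name, it assumes Python 3.9+ and that the zip file has already been downloaded into memory, and it does not address other concerns such as archive size limits.

```python
import io
import zipfile
from pathlib import Path


def safe_extract(zip_bytes: bytes, dest: Path) -> list[str]:
    """Extract an archive into dest, refusing entries that would land outside it.

    Hypothetical helper for illustration only; size limits are not handled.
    """
    dest = dest.resolve()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.infolist():
            target = (dest / member.filename).resolve()
            # Reject absolute paths and ".." components that leave dest.
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {member.filename}")
        # Extract only after every entry has passed the check.
        archive.extractall(dest)
        return archive.namelist()
```

Once extracted, the directory could be handed to the same build step as a cloned repository.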

cc @Titan-C in case he has any thoughts!

Add support for RISE display

It would be really cool if users could build repositories that were meant to be viewed with RISE. This way people would click a link and they'd be taken to an interactive RISE view automatically, which could be used to step people through material, tell stories in a linear fashion, etc.

This should be tagged as a future feature request but I think it'd be pretty cool if we could implement this such that we use RISE when giving talks about Binder in the future!

Federation support

Me and @fperez came up with a pretty good (and really simple) federation scheme for binders late evening this friday.

I'll document that here in a day or two!

Repository qualities to test + incorporating tests in general

Before Binder 2.0 we should test it out on a few specific repositories to make sure some common use cases work as expected. What should we try?

Now:

  1. Python + requirements.txt
  2. Python + environment.yml
  3. Python + Dockerfile
  4. Python + Pre-binder-2.0 Dockerfile

Future:

  1. R + ???
  2. Python + interactivity/jupyter widgets?
  3. ???

Milestones for summer conferences

Hey folks - @yuvipanda and I met to discuss some milestones we should hit before a few key summer conferences. The first is JuliaCon, where Fernando is giving a talk, and the second is JupyterCon, where we're all giving talks of one kind or another.

I've added two lists to our project page here: https://github.com/jupyterhub/binderhub/projects/1

I've tried to break them down like this:

Task name | (Opt or Req) [ INITIALS ]

Where opt = optional but preferred, and req = required for that conference. The initials correspond to C=chris, Y=yuvi. Feel free to convert any of those into issues or add your initials so that we can start tackling them!

Add UI to generate badges

Right now people don't know how to generate badges unless they've been on gitter! Make a fairly small UI for this.

Add a README

What is this thing, what is the scope of it, why do we want it?

Also how to run it.

Building JavaScript dependencies

We also need to support dependencies for JavaScript (maybe including things that require a function call after installing, e.g. some of the Jupyter extensions?). Discuss implementations etc. here!

Create example binder repos

We need to do at least two things:

  1. Make sure all the old binder examples point to our new examples that cover the same kinds of material
  2. Cover some use-cases many people will have, using each of the languages / build workflows that we support.

Here's where new examples will be located:

https://github.com/binder-examples

Old repos

  • requirements
  • conda_environment
  • remote_storage
  • dockerfile
  • dockerfile_two

New repos

  • legacy dockerfile (using the andrewosh image)
  • from jupyter stacks dockerfile
  • from your own dockerfile

Use cases

  • Installing latex

New use cases for which we should have repos:

image

ToDo for beta.mybinder.org

ToDo before beta

Chris

  • Build the Binder diagram
  • Finalize new text for main page (under 'how it works', since it is slightly different)
  • Make favicon work
  • Figure out how to get images inheriting from 'binder-base' to work. Figure out what is actually in that image, and if we can find the source somewhere
  • Test a lot of binder repositories to make sure we're fully compatible
  • Blog post for the release (#50)

Yuvi

  • Rebuild the UI landing page using HTML/CSS
  • Get an IP address for beta.mybinder.org
  • Fix the progress / logging UX to be understandable
  • Not doing for now: Set up Prometheus / Grafana for metrics on this (assuming this isn't a lot of work, otherwise we move it to later, but I think it'd be good to start collecting data abt users/downtime ASAP)
  • Make sure that currently existing 'launch binder' buttons work
  • Figure out a permalink solution v2
  • Support non-master refs
  • Scale up the beta mybinder deployment (maybe autoscale?)

Min

  • Add support for anaconda yml installs

Information to track from incoming users

We should decide on what pieces of information we'd like to know about where users come from, what is the thing that brought them to SG, etc.

Some specific ideas:

  • Referring location (github.com / personal website / copypaste / sphinx-gallery)
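As a rough illustration of the referring-location idea, the sketch below buckets a request by its HTTP `Referer` header. The function name and category labels are made up for this example. It also assumes a missing header means a copy-pasted or directly typed link, which is only a heuristic. A sphinx-gallery origin cannot be recognised from the host name alone, so it is not handled here.

```python
from typing import Optional
from urllib.parse import urlparse


def classify_referrer(referer: Optional[str]) -> str:
    """Return a coarse category for where a launch request came from.

    Category names are hypothetical labels chosen for this sketch.
    """
    if not referer:
        # No Referer header: assumed to be a copy-pasted or typed link.
        return "copypaste"
    host = (urlparse(referer).hostname or "").lower()
    if host == "github.com" or host.endswith(".github.com"):
        return "github"
    # Anything else is treated as a personal or third-party website.
    return "website"
```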

Add support for building from environment.yml

What can I do to help get conda working? Supporting environment.yml is super important for lots of scientific cases where requirements.txt is a non-starter and building from a Dockerfile can take hours. But for now, I don't know what repo I should look at or how to get testing.

Increase bandwidth for notebook data transfer

This will be particularly useful for more data-heavy interactive stuff in the notebook, e.g. ipywidgets. I think this is the relevant config parameter:

c.NotebookApp.iopub_data_rate_limit

Support building arbitrary Dockerfiles

Gotta do it! Lots of people have pretty custom stacks...

Not exactly sure how to restrict the docker images we'll run, or why exactly we should. Should do a proper, full on thorough examination of the security properties...

Blog post

Probably going to riff off of Andrew's original blog post:

https://elifesciences.org/labs/a7d53a88/toward-publishing-reproducible-computation-with-binder

Once we announce the beta we should partially do so with a blog post on the jupyter website, twitter, etc. Here are some things to mention:

  • Principles behind the changes
  • Technical backend differences
  • What's different between current and previous from a user perspective
    • New URL structure
    • The initial environment is different
    • Be more specific / complete about your requirements
    • No more manual building, it gets automatically rebuilt if someone clicks a link
    • This means that if you push to master it'll now automatically get rebuilt
    • However now you can point it to a specific hash / tag / branch
    • It should be much faster after the initial build
  • Public grafana dashboard
  • Hardware restrictions for users
  • Link to documentation for new deployments
  • Future development stuff
    • R support

Suggestions from other folks? @yuvipanda @willingc @minrk ?

Write documentation on getting set up for development

This needs the following pre-requisites:

  1. A Kubernetes cluster that is running and we have API access to
  2. A JupyterHub that's configured properly to run a tmpnb type setup
  3. A Docker Registry to push images to

We should lay out how to set all these up, a sample builder_config.py file and instructions on dev setup

Binderfile config file

At the Jupyter sprint we discussed setting up a "meta-config" file called a Binderfile. I think we should implement this sooner rather than later, as it will take care of many user concerns (e.g. running shell commands) and will be helpful for extending binder functionality (e.g., telling Binder to start in JupyterLab mode).

General behavior

  • multiple file types supported in a hierarchical fashion (e.g., reqs.txt, env.yml etc)
  • for these, if one is found, then it triggers a build and anything lower in the hierarchy doesn't happen
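The "first match wins" behaviour described above could look roughly like the sketch below. The priority order in `BUILD_FILE_PRIORITY` is an assumption made for illustration (the issue does not fix an order), and `pick_build_file` is a hypothetical name.

```python
from typing import Iterable, Optional

# Hypothetical priority order, highest first; the real order is undecided.
BUILD_FILE_PRIORITY = ["Dockerfile", "environment.yml", "requirements.txt"]


def pick_build_file(repo_files: Iterable[str]) -> Optional[str]:
    """Return the highest-priority build file present, or None if there is none."""
    present = set(repo_files)
    for name in BUILD_FILE_PRIORITY:
        if name in present:
            # The first hit triggers the build; lower entries are ignored.
            return name
    return None
```

With this order, a repository containing both `environment.yml` and `requirements.txt` would be built from `environment.yml` only.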

Specific to binderfile

  • if you want MORE than one of these to be triggered, you need a binder.yml file. This has key: value pairs.
    • the key is the name of one of the files that could be in the hierarchy (e.g. pip, conda)
    • the value is a list of items, one per line, that mimics exactly the content that would have been inside the file if it were a standalone file instead of being inside binder.yml
  • In this case, each key will be triggered

Example

inside binder.yml:

image: <path-to-image>

pip:
    package1
    package2==2.0
conda:
    package3
bash:
    touch myfile.txt
    python myfile.py
    nbconfig etc etc

Outstanding questions

  • Should each section name be exactly the same as the name of the file if it were standalone? E.g., use environment.yml: and requirements.txt: above instead of conda: and pip:, respectively.
  • How do we support a value that points to a file of the same nature (e.g., ./docs/requirements.txt)?

Links

Original etherpad

Change the favicon for fail

What do people think about making the favicon change if the build fails? Might be a useful feature since building can take a while sometimes and you'd get a little visual cue in your tab that way...

BUILDER: Environment variables

We should let users define environment variables w/ their repositories.

  • Should this be its own file, or should we just ask people to use a binder.yml file in this case?

API Proposal

This needs a versioned, simple RESTful API that allows for a wide variety of use cases.

Some of the use cases it should cater for:

  1. Working as an on-demand backend for ipywidgets and related toolkits that want to talk to a kernel
  2. mybinder.org like use-cases with various levels of authentication

We should collect other use cases too before finalizing the API design.

Some API design guidelines I like (and should re-read before doing the design):

  1. Google's API Design guidelines https://cloud.google.com/apis/design/
  2. Heroku's API Design guidelines https://geemus.gitbooks.io/http-api-design/content/en/

Berkeley: Mechanical engineering workshop

@aculich could you note a few thoughts / pros and cons about your use of Binder in the event we recently spoke about? Stuff like something that would have made the use of binder more simple, something you expected it to do that it didn't do, that kind of thing. I'd like to keep track of the experience folks have!

Stable URL scheme for badges + links to launch binder

This is different from #13 since it focuses solely on what links people will use when clicking 'launch binder' links, rather than an HTTP API that code can also call. Requires different design and guarantees.

We also need to have a compatibility layer so current launch binder links continue to work.

Current URL structure

Only supported URL structure now is:

/repo/<username>/<repo>/<path-to-start>

While a great start, it has some extensibility problems:

  1. Assumes it is using GitHub, and doesn't provide an easy way to use other providers.
  2. Unclear how it can handle arbitrary git URLs (that aren't hosted in a 'provider' as such)
  3. Not specified how you can provide other arbitrary run-time parameters in jupyterhubs that support it (such as memory limits, extra data to be provisioned, etc)

Whatever we do, we'll make sure these links continue to work for the foreseeable future. We should find and send PRs to people tho to change them.
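For the compatibility layer, the existing links first need to be split into their parts. This is a minimal sketch, assuming the start path is optional and may itself contain slashes; `LegacyLaunch` and `parse_legacy_path` are hypothetical names, not existing BinderHub code.

```python
from typing import NamedTuple


class LegacyLaunch(NamedTuple):
    username: str
    repo: str
    start_path: str  # empty string when the link has no start path


def parse_legacy_path(path: str) -> LegacyLaunch:
    """Split /repo/<username>/<repo>/<path-to-start> into its components."""
    parts = path.strip("/").split("/", 3)
    if len(parts) < 3 or parts[0] != "repo" or not parts[1] or not parts[2]:
        raise ValueError(f"not a legacy launch path: {path!r}")
    # Everything after the repo name is treated as the path to start in.
    start_path = parts[3] if len(parts) == 4 else ""
    return LegacyLaunch(parts[1], parts[2], start_path)
```

The result implicitly refers to a GitHub repository, which is the first extensibility problem listed above.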

Proposed new URL structure

/v2/repo/<git-clone-url>/ref/<ref>/?<runtime-params>

where:

  1. git-clone-url is a URL-encoded URL pointing to a git repository (such as a GitHub URL). We will interpret this pretty generously.
  2. ref points to a commit hash, branch or tag. If it's a branch or tag it'll be redirected to a commit hash - permanently for a tag and temporarily for a branch.
  3. runtime-params is query parameters used for all runtime parameters - including the path to launch, and in the future other parameters too (extra data to load, RAM requirements, etc). These will be formally defined too.

This separates the build parameters (git url + ref) from the runtime parameters. Also establishes that the canonical URL is the one with the commit hash, rather than branch or tag info.
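To make the proposed structure concrete, here is a small sketch that assembles such a link with the standard library. The function names are hypothetical, and mapping "permanently" and "temporarily" to HTTP 301 and 302 is an assumption rather than something the proposal specifies.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_v2_path(git_clone_url: str, ref: str, **runtime_params: str) -> str:
    """Assemble /v2/repo/<git-clone-url>/ref/<ref>/?<runtime-params>."""
    # safe="" also percent-encodes "/" so the clone URL stays one path segment.
    encoded_repo = quote(git_clone_url, safe="")
    encoded_ref = quote(ref, safe="")
    path = f"/v2/repo/{encoded_repo}/ref/{encoded_ref}/"
    if runtime_params:
        path += "?" + urlencode(runtime_params)
    return path


def redirect_status_for(ref_kind: str) -> Optional[int]:
    """Assumed mapping: tags redirect permanently, branches temporarily."""
    # A commit hash is already canonical, so no redirect is needed.
    return {"tag": 301, "branch": 302}.get(ref_kind)
```

For example, `build_v2_path("https://github.com/jupyterhub/binderhub", "master", path="index.ipynb")` keeps the build parameters in the path and the runtime parameter in the query string.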

Possible modifications:

  1. Version this API too, with a /v2/ prefix. (Decided to actually do this.)
  2. Special case 'path' runtime parameter, include that in the path. Not sure how necessary that is.

Set up doctr w/ the jupyterhub account to auto-build the docs

I just realized that the docs that are being served on the website are really outdated. I've updated them with a push to gh-pages, but we should get doctr set up to auto-build with travis so we don't need to think about this. I could set it up myself but then it'd be tied to my github account, I think it'd be better if the jupyterhub account were building. What do folks think?

Building R dependencies

We should figure out a way to handle R dependencies. Here are some thoughts from Carl on this:


R mechanism for listing dependencies:

In an R package, the DESCRIPTION file plays the role of a requirements.txt in stating the dependencies, the minimal version needed, and where to get them (e.g. CRAN or an additional CRAN-type repo like Bioconductor).

This approach does not accommodate installing something that is not the most recent version of a package. (CRAN archives old sources, but because, unlike python or ruby gems distribution, CRAN is designed to provide binaries & you can't guarantee binaries build for an old /archived source, the default install does not immediately support installing archived packages).

If you just have a list of packages you want, I recommend something along the lines of what we do with rocker, e.g.

install2.r `cat deps.txt`

Where deps.txt is just a list of package names you want to install. If these come from multiple repos (cran & bioconductor), just list those as arguments to -r:

install2.r -r "https://cran.rstudio.com" -r "https://bioconductor.org/packages/release" `cat deps.txt`

If you want to install the same version each time, just use an MRAN snapshot of the appropriate date.

This installs everything from source of course, and assumes you have the system dependencies (like openssl, or libxml2) installed already, but you probably do since python is the same way. Compiling a full R stack from scratch can take some time; you could save effort by building on any of the existing Rocker stack images (using an R version tag as appropriate).

Alternative approaches which I don't recommend for general use:

  • packrat is a rather heavyweight solution. Packrat is designed to lock you into a particular version of every package. The only way to guarantee you can install the right version of some remote source (could be from GitHub, from CRAN, etc.) is to make a local copy of that source, and this is exactly what packrat does. While it creates a manifest listing dependencies, versions, and sources that looks kinda like a requirements.txt file, it is not intended that you generate such a manifest by hand, and it only supports the notion of == version, which means it is going to be a nuisance to maintain if you are regularly trying to provide access to an R image with updated software.

  • https://github.com/mangothecat/pkgsnap is a lighter-weight approach, which generates a .csv file of your current library. Again, it is generated from your current install rather than by writing out a list by hand, and it has no notion of >= dependency, but it tracks the name, version, and source of installation.

These are great for an individual user who wants to lock their library at the current state, which may include an arbitrary mix of up-to-date and not up-to-date packages.


Make this work with minikube

We should possibly allow configuring this to not need an image registry, for the (common?) special case of just having one node (such as minikube). It'll just check local docker instead of registry.

This might eventually allow us to have a single node non-kubernetes setup if needed.

ToDo for Binder v0.1beta

After many months of hard work it is time to launch the new Binder backend! Let's use this issue to coordinate the last few steps we need to take. @yuvipanda and @freeman-lab correct me if I'm wrong, but here's the list as I see it:

ToDo

Tech

  • Set slowspawn timeout to 0 so that pre-built images will launch faster (@yuvipanda)
  • Error pages: incl kubernetes full error + check for other miscellaneous errors to catch #91 (@minrk)
  • tmpnb auth bug causing people to have to clear their cache (@minrk or @yuvipanda)
  • Figure out a UI to create the badges for newly-built binders ( #122 )
  • Figure out the URL for the badges ( #122 )
  • Make sure that the badge SVG will still link properly so images don't break ( #122 )

Documentation

Misc

Release

  • Point mybinder.org to the new Binder deployment (@yuvipanda)
  • Send out a blog post talking about changes etc (@choldgraf) (handled in another repo)

Non blocking

  • Delete stale users
  • UI improvements so it doesn't look like a form (maybe discuss w/ Granger UI team)
  • Get the Binder logo on the notebook pages (@choldgraf)

Update

Here's a whiteboard from JupyterCon:

503 Error After Building

A repository (GitHub link, mybinder link) that was launching successfully as recently as Sunday, July 16th at approximately 5:00 PM (Pacific) started failing to launch by that following morning, Monday, July 17th at approximately 10:00 AM. It was successfully built from the cache, but the new tab that normally contains the environment instead returned a 503 server error. The repo was originally built roughly one month ago (using beta.mybinder.org).

Thinking it might be a caching issue, I forced a rebuild by adjusting some non-critical whitespace. Again, it was successfully built and pushed, but again I got the 503 error. I reproduced this issue with a different repo that had previously deployed successfully.

Any help would be greatly appreciated. This is a great product, and I love using it to share Jupyter notebooks!

Update binderhub throughout repo

Now that we've updated the name we should go through and decide what other things we want to change from builderhub to binderhub.
