Giter Club home page Giter Club logo

jupyterhub-on-k8s's Introduction

JupyterHub on K8s

Overview

This repository is intended to provide an introduction to deploying JupyterHub on Kubernetes, particularly based on bare Ubuntu servers. In the ./binderhub subdirectory is a guide to deploying BinderHub (which includes JupyterHub as a back end service).

Prerequisite

  • At least 2 Ubuntu VMs (1 head node, 1 worker node). Head node must be externally accessible, preferably via port 80 which is where we will expose the JupyterHub service. Nodes can have different base OS, but bash scripts will need to be tweaked.

Create a K8s cluster

Run Head node script, this will install any prerequisite packages, Docker, Kubernetes 1.20.0, Helm, NFS.

$ bash runOnHeadNode.sh

Initialise the cluster, set up networking via Flannel by default, and enable bash autocompletion for kubectl. Calico can be used for networking if you do not require MetalLB, or if your cloud provider can provide a load balancer. Here Flannel is used due to current issues with Calico + MetalLB.

$ bash setupCluster.sh

Run worker node script on each worker node, this will install any prerequisite packages, Docker, Kubernetes 1.20.0, NFS.

$ bash runOnWorkerNode.sh

Run the kubeadm join command on each node, this can be found in the output of the cluster initialisation either on your terminal or will be saved in clusterInit.out.
From this point, you could jump straight to setting up the JupyterHub service and use the fallback options in the jhub config file.

Optional: Import your cluster into Rancher if you are using it to centrally manage your clusters (ssh -L 8443:localhost:443 -Nfl ubuntu 130.246.212.36).

(Optional) Monitoring the cluster

This is not essential to the functionality of JupyterHub, but a nice to have.

  1. Create a monitoring namespace
$ kubectl create namespace monitoring
  1. Deploy the prometheus stack (prometheus, grafana, node exporters, and CRDs) https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
$ helm install prometheus-stack -n monitoring prometheus-community/kube-prometheus-stack -f prometheus/prometheus-values.yaml
  1. Set up an ssh tunnel (if needed) to view the grafana dashboard by looking at the port number for the grafana service with something like:
$ ssh -L 30290:localhost:30290 -Nfl rjoshi [external facing IP of the head node]
  1. Login to Grafana with admin/prom-operator (edit yaml to change this)

Deploy NFS for JupyterHub storage (hub + users)

Add helm repo for NFSProvisioner

$ helm repo add kvaps https://kvaps.github.io/charts
$ helm repo update

Deprecated

Helm stable repo to access NFS provisioner helm chart

$ helm repo add stable https://kubernetes-charts.storage.googleapis.com/
$ helm repo update 

Create K8s namespace nfsprovisioner and deploy NFS via helm chart with the following command. This will also set the default storage class to be NFS.

$ kubectl create namespace nfsprovisioner
$ helm install nfs stable/nfs-server-provisioner --namespace nfsprovisioner --set=storageClass.defaultClass=true

It is recommended to have the NFSProvisioner backed up with some persistence storage (this will hold the mapping between the volumes provisioned and the claims/pod/services they are consumed by). Example of how to do this with a hostPath volume below. Note:

  1. Create the path prefix on one of your worker nodes mentioned in nfs-pv.yaml under spec.hostPath.path
  2. The spec.capacity.storage in nfs-pv.yaml should match the persistence.size in nfsprovisioner.yaml
  3. The name of the node where the hostPath is created should be mentioned in nfsprovisioner.yaml under nodeSelector.kubernetes.io/hostname
$ kubectl apply -f nfs-pv.yaml -n nfsprovisioner
$ helm install nfs kvaps/nfs-server-provisioner -n nfsprovisioner --version 1.3.0 -f nfsprovisioner.yaml 

Add a PostGresDB for the hub service

This is technically optional. Add helm charts, create kubernetes namespace, and install a PostGresDB via helm chart with

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm repo update
$ helm upgrade --install pgdatabase --namespace pgdatabase bitnami/postgresql \
--set postgresqlDatabase=jhubdb \
--set postgresqlPassword=postgres

JupyterHub

See Helm chart to JupyterHub App version mapping here: https://jupyterhub.github.io/helm-chart/#jupyterhub

The JupyterHub config file can be used to customise your deployment by overriding defaults in the base helm chart. A sample jupyterhub config file is jupyterhub-config.yaml and it can be tweaked as per your requirements. Create proxy secret token and hub's cookie secret with openssl rand -hex 32

As before we will create a kubernetes namespace for the application ie jhub, add the helm repo, and install Jhub helm chart

$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update
$ helm upgrade --install jhub jupyterhub/jupyterhub --namespace jhub --version=0.11.0 --values jupyterhub-config.yaml

Load Balancer for bare metal cluster

Here we use Metallb 0.9.3. This is required for bare metal clusters or if your cloud provider doesnt provide a load balancer for K8s clusters.
Prerequisite: enable strict ARP mode by editing the kube-proxy configmap, search for IPVS, and set strictARP to true.

$ kubectl edit configmap -n kube-system kube-proxy

Apply namespace yaml.

$ kubectl apply -f metallb/namespace.yaml

Apply metallb yaml.

$ kubectl apply -f metallb/metallb.yaml

Create secret on first install only

$ kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

MetalLB will be idle until you apply the config map that lists the addresses/address pools available. (edit this based on the IPs you have available!).

$ kubectl apply -f metallb/metal_config.yaml

NGINX reverse proxy on the head node

Create logs dir, copy in the config file and reload NGINX

$ sudo mkdir /root/logs
$ sudo cp nginx/default /etc/nginx/sites-available/default
$ sudo nginx -t
$ sudo systemctl reload nginx

You should be able to view the JupyterHub service at http://<IP>/hub/login/

jupyterhub-on-k8s's People

Stargazers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.