Skip to content

Latest commit

 

History

History
220 lines (155 loc) · 7.4 KB

README.md

File metadata and controls

220 lines (155 loc) · 7.4 KB

jupyterhub

hosted jupyter notebooks for VDSG's data science café

installation

Our setup is based on deployment of jupyterhub-deploy-docker.

Prerequisites

Docker

This deployment uses Docker, via Docker Compose, for all the things. Docker Engine 1.12.0 or higher is required.

  1. Use Docker's installation instructions to set up Docker for your environment.

  2. To verify your docker installation, whether running docker as a local installation or using docker-machine, enter these commands:

    docker version
    docker ps

HTTPS and SSL/TLS certificate

This deployment configures JupyterHub to use HTTPS. You must provide a certificate and key file in the JupyterHub configuration. To configure:

  1. Obtain the domain name that you wish to use for JupyterHub, for example, myfavoritesite.com or jupiterplanet.org.

  2. If you do not have an existing certificate and key, you can:

  3. A certificate renewal requires rebuild of the containers. Additionally, cert files need to be copied from the letsencrypt folder:

To renew a certificate

# get the new certificate via certbot

cp -RLr /etc/letsencrypt/live/vdsg.at/. /home/foo/projects/jupyterhub/certificates/

docker-compose stop
cd /home/foo/projects/jupyterhub
make build
docker-compose up -d

Authenticator setup

This deployment uses GitHub OAuth to authenticate users.

It requires that you create and register a GitHub OAuth application by filling out a form on the GitHub site:

GitHub OAuth application form

In this form, you will specify the OAuth application's callback URL in this format: https://<myhost.mydomain>/hub/oauth_callback.

After you submit the GitHub form, GitHub registers your OAuth application and assigns a unique Client ID and Client Secret. The Client Secret should be kept private.

At JupyterHub's runtime, you must pass the GitHub OAuth Client ID, Client Secret and OAuth callback url. You can do this by either:

  • setting the GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET, and OAUTH_CALLBACK_URL environment variables when you run the JupyterHub container, or

  • add them to an oauth.env file in the secrets directory of this repository. You may need to create both the secrets directory and the oauth.env file. For example, add the following lines in the oauth.env file:

    oauth.env file

    GITHUB_CLIENT_ID=<github_client_id>
    GITHUB_CLIENT_SECRET=<github_client_secret>
    OAUTH_CALLBACK_URL=https://<myhost.mydomain>/hub/oauth_callback
    

    Note: The oauth.env file is a special file that Docker Compose uses to lookup environment variables. If you choose to place the GitHub OAuth application settings in this file, you should make sure that the file remains private (be careful to not commit the oauth.env file with these secrets to source control).

Build the JupyterHub Docker image

Finish configuring JupyterHub and then build the hub's Docker image. (We'll build the Jupyter Notebook image in the next section.)

  1. Configure userlist: Create a userlist file of authorized JupyterHub users. The list should contain GitHub usernames, and this file should designate at least one admin user. For instance, the example file below contains three users, jtyberg, jenny, and guido, and one designated administrator, jtyberg:

    userlist file

    geoheil  admin
    maksmitk admin
    

    The admin user will have the ability to add more users through JupyterHub's admin console.

  2. Use docker-compose to build the JupyterHub Docker image on the active Docker machine host by running the make build command:

    make build

Spawner: Prepare the Jupyter Notebook Image

You can configure JupyterHub to spawn Notebook servers from any Docker image, as long as the image's ENTRYPOINT and/or CMD starts a single-user instance of Jupyter Notebook server that is compatible with JupyterHub.

To specify which Notebook image to spawn for users, you set the value of the
DOCKER_NOTEBOOK_IMAGE environment variable to the desired container image. You can set this variable in the .env file, or alternatively, you can override the value in this file by setting DOCKER_NOTEBOOK_IMAGE in the environment where you launch JupyterHub.

Whether you build a custom Notebook image or pull an image from a public or private Docker registry, the image must reside on the host.

If the Notebook image does not exist on host, Docker will attempt to pull the image the first time a user attempts to start his or her server. In such cases, JupyterHub may timeout if the image being pulled is large, so it is better to pull the image to the host before running JupyterHub.

This deployment defaults to the jupyter/scipy-notebook Notebook image, which is built from the scipy-notebook Docker stacks. (Note that the Docker stacks *-notebook images tagged 2d878db5cbff include the start-singleuser.sh script required to start a single-user instance of the Notebook server that is compatible with JupyterHub).

You can pull the image using the following command:

make notebook_image

Run JupyterHub

Run the JupyterHub container on the host.

To run the JupyterHub container in detached mode:

docker-compose up -d

Once the container is running, you should be able to access the JupyterHub console at

file

https://myhost.mydomain

To bring down the JupyterHub container and delete all of a user's notebooks:

docker-compose down

FAQ

How can I backup a user's notebook directory?

There are multiple ways to backup and restore data in Docker containers.

Suppose you have the following running containers:

    docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"

    CONTAINER ID        IMAGE                    NAMES
    bc02dd6bb91b        jupyter/minimal-notebook jupyter-jtyberg
    7b48a0b33389        jupyterhub               jupyterhub

In this deployment, the user's notebook directories (/home/jovyan/work) are backed by Docker volumes.

    docker inspect -f '{{ .Mounts }}' jupyter-jtyberg

    [{jtyberg /var/lib/docker/volumes/jtyberg/_data /home/jovyan/work local rw true rprivate}]

We can backup the user's notebook directory by running a separate container that mounts the user's volume and creates a tarball of the directory.

docker run --rm \
  -u root \
  -v /tmp:/backups \
  -v jtyberg:/notebooks \
  jupyter/minimal-notebook \
  tar cvf /backups/jtyberg-backup.tar /notebooks

The above command creates a tarball in the /tmp directory on the host.