Merge pull request #136 from GoogleCloudPlatform/jupyterhub-tpu
Adding TPU option to Jupyterhub
chiayi authored Feb 2, 2024
2 parents 5644d55 + f4eac4e commit c872044
Showing 3 changed files with 53 additions and 3 deletions.
2 changes: 1 addition & 1 deletion jupyter-on-gke/README.md
@@ -120,7 +120,7 @@ Continue to Step 3 of [below](#if-auth-is-enabled).
passwords can be found in the [Jupyter
settings](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/jupyter-on-gke/jupyter_config/config-selfauth.yaml) file.
4. Select profile and open a Jupyter Notebook
4. Select profile and open a Jupyter Notebook. For more information on the profiles, visit [this page](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/jupyter-on-gke/profiles.md)
## Persistent Storage
32 changes: 32 additions & 0 deletions jupyter-on-gke/jupyter_config/config-selfauth.yaml
@@ -150,6 +150,38 @@ singleuser:
            node_selector:
              iam.gke.io/gke-metadata-server-enabled: "true"
  default: true
- display_name: "TPU"
  description: "Creates TPU VMs as the compute for notebook execution. Will only work if TPU is enabled."
  profile_options:
    storage:
      display_name: "Storage"
      choices:
        DefaultStorage:
          display_name: "DefaultStorage"
          kubespawner_override:
          default: true
        GCSFuse:
          display_name: "GCSFuse"
          kubespawner_override:
            volume_mounts:
            - name: gcs-fuse-csi-ephemeral
              mountPath: /home/jovyan
            volumes:
            - name: gcs-fuse-csi-ephemeral
              csi:
                driver: gcsfuse.csi.storage.gke.io
                volumeAttributes:
                  bucketName: gcsfuse-{username}
                  mountOptions: "uid=1000,gid=100,o=noexec,implicit-dirs,dir-mode=777,file-mode=777"
            node_selector:
              iam.gke.io/gke-metadata-server-enabled: "true"
  kubespawner_override:
    image: jupyter/tensorflow-notebook:python-3.10
    extra_resource_limits:
      google.com/tpu: "4"
    node_selector:
      cloud.google.com/gke-tpu-accelerator: "tpu-v4-podslice"
      cloud.google.com/gke-tpu-topology: "2x2x1"
- display_name: "GPU T4"
  description: "Creates GPU VMs (T4) as the compute for notebook execution"
  profile_options:
22 changes: 20 additions & 2 deletions jupyter-on-gke/profiles.md
@@ -10,9 +10,11 @@ As the description for each profiles explains, each profiles uses a different re

1. First profile uses CPUs and uses the image: `jupyter/tensorflow-notebook` with tag `python-3.10`

2. Second profile uses 2 T4 GPUs and the image: `jupyter/tensorflow-notebook:python-3.10` with tag `python-3.10` [^1]
2. Second profile uses TPUs and will only work when TPU is enabled on the cluster. It uses the image `jupyter/tensorflow-notebook` with tag `python-3.10`

3. Third profile uses 2 A100 GPUs and the image: `jupyter/tensorflow-notebook:python-3.10` with tag `python-3.10` [^1]
3. Third profile uses 2 T4 GPUs and the image `jupyter/tensorflow-notebook` with tag `python-3.10` [^1]

4. Fourth profile uses 2 A100 GPUs and the image `jupyter/tensorflow-notebook` with tag `python-3.10` [^1]

## Editing profiles

@@ -110,6 +112,22 @@ The possible GPUs are:
7. nvidia-a100-80gb
8. nvidia-l4

### TPUs

The JupyterHub profiles include a TPU option. It uses a [nodeSelector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) with the node labels `cloud.google.com/gke-tpu-accelerator` and `cloud.google.com/gke-tpu-topology` to select the TPU nodes.

We add the following settings to `kubespawner_override`:

```yaml
extra_resource_limits:
  google.com/tpu: "4"
node_selector:
  cloud.google.com/gke-tpu-accelerator: "tpu-v4-podslice"
  cloud.google.com/gke-tpu-topology: "2x2x1"
```

Currently, the template only provisions a `ct4p-hightpu-4t` TPU machine with topology `2x2x1`.
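To illustrate how the three TPU values relate, here is a minimal Python sketch that builds the same override as a dictionary. The helper names `tpu_chip_count` and `tpu_kubespawner_override` are hypothetical and not part of this repository. The sketch assumes a single-host slice, where the `google.com/tpu` limit equals the product of the topology dimensions (as it does for `2x2x1`); multi-host slices would need a per-node chip count instead.

```python
from math import prod


def tpu_chip_count(topology: str) -> int:
    """Multiply the dimensions of a topology string such as "2x2x1"."""
    return prod(int(dim) for dim in topology.split("x"))


def tpu_kubespawner_override(accelerator: str, topology: str) -> dict:
    """Build the TPU-related keys of a kubespawner_override (hypothetical helper).

    Assumes a single-host slice, so the pod requests every chip in the topology.
    """
    return {
        # Resource limits are expressed as strings, matching the YAML above.
        "extra_resource_limits": {"google.com/tpu": str(tpu_chip_count(topology))},
        # Both node labels must match for the pod to land on a TPU node.
        "node_selector": {
            "cloud.google.com/gke-tpu-accelerator": accelerator,
            "cloud.google.com/gke-tpu-topology": topology,
        },
    }


override = tpu_kubespawner_override("tpu-v4-podslice", "2x2x1")
print(override)
```

Keeping the chip count derived from the topology avoids the two values drifting apart when a profile is edited by hand, though it only holds under the single-host assumption stated above.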

### Example profile

Example of a profile that overrides the default values: