Fixed broken link in TPU_GUIDE
ryanaoleary committed Dec 1, 2023
1 parent da9ea67 commit b8392c8
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ray-on-gke/TPU_guide.md
@@ -109,6 +109,6 @@ init_jax_from_ray(num_workers=2)

### TPU Multi-Host Workloads

-When initializing multi-host TPUs, the environment variables can be set using a mutating admission webhook. The webhook can be deployed following the instructions in the [README](https://github.com/GoogleCloudPlatform/ai-on-gke/tree/kuberay-tpu-env-injector/ray-on-gke/user/kuberay-tpu-webhook#readme).
+When initializing multi-host TPUs, the environment variables can be set using a mutating admission webhook. The webhook can be deployed following the instructions in the [README](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/ray-on-gke/kuberay-tpu-webhook#readme).

A caveat when running multiple TPU pod slices of the same topology and type with Ray is that a single Ray worker group may be scheduled across multiple pod slices. This goes against the assumptions of the webhook and would lead to pod-to-pod communication occurring over DCN rather than the high-bandwidth ICI mesh.
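
The caveat above can be illustrated with a small check. This is only a sketch: it assumes you have already collected, by some means not shown here, which pod slice each worker pod landed on. The function name, the group names, and the slice identifiers are all hypothetical and are not part of the webhook or of Ray.

```python
from collections import defaultdict


def groups_spanning_slices(pod_placements):
    """Return worker groups whose pods landed on more than one TPU pod slice.

    pod_placements: iterable of (worker_group, pod_name, slice_id) tuples.
    All three values are hypothetical labels supplied by the caller.
    """
    slices_by_group = defaultdict(set)
    for worker_group, _pod_name, slice_id in pod_placements:
        slices_by_group[worker_group].add(slice_id)
    # Keep only groups that touch two or more slices; their pods would
    # have to communicate over DCN instead of ICI.
    return {
        group: sorted(slices)
        for group, slices in slices_by_group.items()
        if len(slices) > 1
    }


# Hypothetical placement data: group "a" is split across two slices.
placements = [
    ("tpu-group-a", "worker-0", "slice-1"),
    ("tpu-group-a", "worker-1", "slice-2"),
    ("tpu-group-b", "worker-0", "slice-3"),
    ("tpu-group-b", "worker-1", "slice-3"),
]
print(groups_spanning_slices(placements))
# {'tpu-group-a': ['slice-1', 'slice-2']}
```

A non-empty result would indicate a worker group that violates the one-group-per-slice assumption described above.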
