Train TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure

wbuchwalter/tensorflow-k8s-azure

⚠️ This repository is deprecated! Go to Azure/kubeflow-labs instead ⚠️

Train TensorFlow Models at Scale with Kubernetes on Azure

Prerequisites

  1. A valid Microsoft Azure subscription that allows you to create an ACS cluster
  2. Docker client installed: Installing Docker
  3. Azure CLI (2.0) installed: Installing the Azure CLI 2.0 | Microsoft Docs
  4. Git CLI installed: Installing Git CLI
  5. kubectl installed: Installing Kubectl
  6. Helm installed: Installing Helm CLI (Note: on Windows, you can extract the tar file using a tool like 7-Zip.)
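
To confirm that the command-line tools listed above are available before starting, a quick check along these lines may help. This is a minimal sketch, not part of the workshop: the function names `check_tool` and `check_prerequisites` are hypothetical, and the executable names (`docker`, `az`, `git`, `kubectl`, `helm`) are assumed to be the default names each installer puts on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: report whether a single CLI is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check every CLI from the prerequisites list.
# Assumption: these are the default executable names; adjust if yours differ.
check_prerequisites() {
    status=0
    for tool in docker az git kubectl helm; do
        check_tool "$tool" || status=1
    done
    return "$status"
}

check_prerequisites || echo "Install the missing tools before continuing."
```

The script only checks that each executable can be found. It does not verify versions or that you are logged in to Azure.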

Clone this repository somewhere so you can easily access the different source files:

git clone https://github.com/wbuchwalter/tensorflow-k8s-azure

Content Summary

| | Module | Description |
|---|---|---|
| 0 | Introduction | Introduction to this workshop. Motivations and goals. |
| 1 | Docker | Docker and containers 101. |
| 2 | Kubernetes | Overview of important Kubernetes concepts. |
| 3 | Helm | Introduction to Helm. |
| 4 | GPUs | How to use GPUs with Kubernetes. |
| 5 | TFJob | How to use tensorflow/k8s and TFJob to deploy a simple TensorFlow training. |
| 6 | Distributed TensorFlow | Going distributed with TFJob. |
| 7 | Hyperparameters Sweep with Helm | Using Helm to deploy a large number of training jobs that test different hypotheses, then monitoring and comparing them. |
| 8 | Going Further | Links and resources to go further: autoscaling, distributed storage. |
| 9 | Jupyter Notebooks | Easily deploy a Jupyter Notebook instance on Kubernetes. |
