By Chansung Park and Sayak Paul
This repository shows various ways of deploying a vision model (TensorFlow) from 🤗 Transformers using the TensorFlow ecosystem. In particular, we use TensorFlow Serving (local deployment), Vertex AI (serverless deployment), and Kubernetes with GKE (more controlled deployment), with both TensorFlow Serving and ONNX.
For this project, we use Google Cloud Platform and its managed services, Vertex AI and GKE.
- **Local TensorFlow Serving** | Blog post from 🤗
  - We cover how to locally deploy a Vision Transformer (ViT) model from 🤗 Transformers with TensorFlow Serving.
  - With this, you will be able to serve your own machine learning models in a standalone Python application (see the sketch below).
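Below is a minimal sketch of this workflow, assuming a standard ViT checkpoint and TF Serving's default REST port. The model name (`vit`), the `pixel_values` input key, and the input shape are illustrative assumptions; verify them against your own export (e.g. with `saved_model_cli show`).

```python
# A minimal sketch of the local workflow, not the repository's exact code.
import json

import numpy as np
import requests
from transformers import TFViTForImageClassification

# Export the model in the SavedModel format TF Serving expects
# (written under vit/saved_model/1).
model = TFViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
model.save_pretrained("vit", saved_model=True)

# Serve it locally, e.g. with the official Docker image:
#   docker run -p 8501:8501 \
#     --mount type=bind,source=$PWD/vit/saved_model,target=/models/vit \
#     -e MODEL_NAME=vit -t tensorflow/serving

# Query the REST endpoint. ViT expects channels-first pixel values;
# the random batch below stands in for real preprocessed images.
batch = np.random.rand(1, 3, 224, 224).astype("float32")
payload = json.dumps({"instances": batch.tolist()})
response = requests.post("http://localhost:8501/v1/models/vit:predict", data=payload)
print(response.json())
```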
- **TensorFlow Serving on Kubernetes (GKE)** | Blog post from 🤗
  - We cover how to build a custom TensorFlow Serving Docker image with a Vision Transformer (ViT) model from 🤗 Transformers, provision a Google Kubernetes Engine (GKE) cluster, and deploy the Docker image to that cluster.
  - In particular, we cover Kubernetes-specific topics such as creating the Deployment/Service/HPA objects needed to deploy the Docker image to the nodes (VMs) in a scalable manner and expose it as a service to clients.
  - With this, you will be able to serve and scale your own machine learning models according to the CPU utilization of the deployment as a whole.
  - We also provide utilities to perform load tests with Locust, along with a notebook for visualizing the results (a minimal locustfile sketch is shown below). Refer here for more details.
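As a rough illustration of the load-testing setup, here is a minimal locustfile sketch. The host, model name, endpoint path, and payload shape are assumptions rather than the repository's exact configuration.

```python
# locustfile.py: a minimal Locust sketch for load-testing the TF Serving
# REST endpoint exposed by the Kubernetes Service. Adapt the model name,
# path, and payload to your deployment.
import json

import numpy as np
from locust import HttpUser, between, task

# Stand-in for a real preprocessed image batch.
PAYLOAD = json.dumps(
    {"instances": np.random.rand(1, 3, 224, 224).astype("float32").tolist()}
)


class TFServingUser(HttpUser):
    wait_time = between(1, 2)  # each simulated user waits 1-2 s between requests

    @task
    def predict(self):
        self.client.post(
            "/v1/models/vit:predict",
            data=PAYLOAD,
            headers={"Content-Type": "application/json"},
        )


# Run against the Service's external IP, e.g.:
#   locust -f locustfile.py --host http://<EXTERNAL_IP>:8501
```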
- **ONNX on Kubernetes (GKE)**
  - The workflow here is similar to the one above, but we use an ONNX-optimized version of the ViT model.
  - ONNX is particularly useful when you're deploying models on x86 CPUs.
  - This workflow doesn't require you to build any custom TF Serving image.
  - One important thing to keep in mind is to generate the ONNX model on the same machine type as the deployment hardware. This means if you're going to use the `n1-standard-8` machine type for deployment, generate the ONNX model on an `n1-standard-8` machine as well to ensure the ONNX optimizations are relevant (see the conversion sketch below).
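To illustrate the conversion step, here is a minimal sketch using `tf2onnx` and `onnxruntime`. The opset, the `pixel_values` input name, and the shapes are assumptions to verify against your own model.

```python
# A minimal sketch: convert the TF ViT model to ONNX with tf2onnx and run it
# with ONNX Runtime on the CPU. Do this on the same machine type you will
# deploy on so the ONNX optimizations match the target hardware.
import numpy as np
import onnxruntime as ort
import tensorflow as tf
import tf2onnx
from transformers import TFViTForImageClassification

model = TFViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

# ViT expects channels-first pixel values: (batch, 3, 224, 224).
input_signature = [tf.TensorSpec([None, 3, 224, 224], tf.float32, name="pixel_values")]
tf2onnx.convert.from_keras(
    model, input_signature=input_signature, opset=13, output_path="vit.onnx"
)

# Run inference with the CPU execution provider.
session = ort.InferenceSession("vit.onnx", providers=["CPUExecutionProvider"])
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in input
outputs = session.run(None, {"pixel_values": batch})
print(outputs[0].shape)  # logits, e.g. (1, 1000) for the ImageNet-1k checkpoint
```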
- **Vertex AI Prediction** | Blog post from 🤗
  - We cover how to deploy a Vision Transformer (ViT) model from 🤗 Transformers to Google Cloud's fully managed machine learning deployment service, Vertex AI Prediction.
  - Under the hood, Vertex AI Prediction leverages all the technologies from GKE, TensorFlow Serving, and more.
  - That means you can deploy and scale machine learning models without needing to build a custom Docker image, write Kubernetes-specific manifests, or set up model monitoring capabilities yourself.
  - With this, you will be able to serve and scale your own machine learning models by calling various APIs from the `google-cloud-aiplatform` SDK to interact with Vertex AI (see the sketch below).
  - We provide utilities to perform load tests with Locust. Refer here for more details.
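A minimal sketch of this flow with the `google-cloud-aiplatform` SDK follows. The project ID, bucket path, and prebuilt serving container URI are placeholders, not values from this repository.

```python
# A minimal sketch of deploying to Vertex AI Prediction. Project, bucket,
# and container image are placeholders; substitute your own values.
import numpy as np
from google.cloud import aiplatform

aiplatform.init(project="your-gcp-project", location="us-central1")

# Upload the SavedModel artifact with a prebuilt TF prediction container.
model = aiplatform.Model.upload(
    display_name="vit",
    artifact_uri="gs://your-bucket/vit/saved_model/1",
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-11:latest",
)

# Deploy to an endpoint; Vertex AI autoscales between the replica bounds.
endpoint = model.deploy(
    machine_type="n1-standard-8",
    min_replica_count=1,
    max_replica_count=3,
)

# Query the endpoint with instances matching the model's serving signature.
instances = np.random.rand(1, 3, 224, 224).tolist()  # stand-in preprocessed batch
prediction = endpoint.predict(instances=instances)
print(prediction.predictions)
```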
- **Vertex AI Prediction (w/ optimized TFRT)**
  - TBD
  - Learn more about the optimized TensorFlow runtime (TFRT) here.
We're thankful to the ML Developer Programs team at Google for providing GCP support.