You can host your trained machine learning models on AI Platform to serve predictions (inference) on new examples.
Currently, AI Platform can serve models generated by scikit-learn, XGBoost, PyTorch, and TensorFlow.
Deploying a model on AI Platform takes two general steps:
- Create a model resource: A model resource is like a folder that holds all the different versions of a model.
gcloud ai-platform models create $MODEL_NAME \
--regions $REGION
- Create a model version: Once a model resource exists, upload the trained model object to AI Platform as a version of that model.
gcloud ai-platform versions create $VERSION_NAME \
--model $MODEL_NAME \
--origin $MODEL_DIR \
--runtime-version=1.15 \
--framework $FRAMEWORK \
--python-version=3.7
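- Custom Prediction Routine: Alternatively, create the version with your own prediction code by passing a code package and a predictor class to the beta command.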
gcloud components install beta
gcloud beta ai-platform versions create $VERSION_NAME \
--model $MODEL_NAME \
--origin $MODEL_DIR \
--runtime-version=1.15 \
--python-version=3.7 \
--package-uris=$CUSTOM_CODE_PATH \
--prediction-class=$PREDICTOR_CLASS
More information here
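The `--prediction-class` flag above names a Python class shipped in the package passed to `--package-uris`. A minimal sketch of such a class is shown here, assuming the predictor interface of a `predict` method plus a `from_path` class method. The class name `MyPredictor` and the file name `model.pkl` are hypothetical, and the loaded model is assumed to expose a scikit-learn-style `predict` method:

```python
import os
import pickle


class MyPredictor(object):
    """Hypothetical predictor for a Custom Prediction Routine."""

    def __init__(self, model):
        # Keep the loaded model so predict() can reuse it across requests.
        self._model = model

    def predict(self, instances, **kwargs):
        # `instances` is the decoded list of inputs from the request body.
        # Custom pre-processing would go here.
        outputs = self._model.predict(instances)
        # The return value must be JSON serializable, so convert
        # array-like results (for example NumPy arrays) to plain lists.
        return outputs.tolist() if hasattr(outputs, "tolist") else list(outputs)

    @classmethod
    def from_path(cls, model_dir):
        # `model_dir` holds the files copied from the --origin location.
        # "model.pkl" is an assumed file name for this sketch.
        with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
            model = pickle.load(f)
        return cls(model)
```

If this class lived in a module named `predictor.py` (also an assumption), `$PREDICTOR_CLASS` would be `predictor.MyPredictor`.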
After a model version is created, you can use it to make predictions through a simple Python API or with the gcloud command. The samples in this directory show the details of all the required steps.
gcloud ai-platform predict --model $MODEL_NAME \
--version $VERSION_NAME \
--json-instances $INPUT_DATA_FILE
More information here
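The Python route mentioned above can be sketched as follows, assuming the `google-api-python-client` package is installed and application default credentials are configured. The helper names `version_name` and `predict_json` are illustrative and not part of any library:

```python
def version_name(project, model, version=None):
    """Build the resource name of a model or of one of its versions."""
    name = "projects/{}/models/{}".format(project, model)
    if version is not None:
        name += "/versions/{}".format(version)
    return name


def predict_json(project, model, instances, version=None):
    """Send `instances` to a deployed model and return its predictions."""
    # Imported lazily so the module loads without the client library;
    # calling this function requires google-api-python-client.
    import googleapiclient.discovery

    service = googleapiclient.discovery.build("ml", "v1")
    response = service.projects().predict(
        name=version_name(project, model, version),
        body={"instances": instances},
    ).execute()

    # Surface service-side errors instead of returning a partial result.
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]
```

Omitting `version` sends the request to the model's default version. The shape of each item in `instances` depends on what the deployed model expects.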
- Locust: Easily run a load test on models deployed on Cloud AI Platform from GCE, GKE, or a local machine.
- Model warmup: Learn how to do model warmup in AI Platform Prediction.
For further information on how prediction on AI Platform works, please click here.