- Installation Targets and Prerequisites
- Standalone Kubeflow Pipelines V1 with Tekton Backend Deployment
- Standalone Kubeflow Pipelines V2 with Tekton Backend Deployment
- Standalone Kubeflow Pipelines with Openshift Pipelines Backend Deployment
- Kubeflow installation including Kubeflow Pipelines with Tekton Backend
- Upgrade to Multi-User KFP-Tekton on Kubeflow
- Troubleshooting
A Kubernetes cluster v1.25 that has at least 8 vCPU and 16 GB memory.
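A minimal sketch for checking the cluster version before installing. It assumes v1.25 is treated as a lower bound, which the text above does not state explicitly. The helper name `version_at_least` is hypothetical, and the comparison relies on `sort -V` (available in GNU coreutils).

```shell
# Return 0 if version $1 is at least version $2 (for example "v1.27.3" vs "v1.25").
# Hypothetical helper, not part of kfp-tekton.
version_at_least() {
  local have="${1#v}" want="${2#v}"
  # sort -V orders version strings; if "want" sorts first (or ties), "have" is new enough.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Usage against a live cluster (assumes kubectl and jq are installed):
#   server=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
#   version_at_least "$server" "v1.25" && echo "cluster version OK"
```

The node CPU and memory requirement still has to be checked separately, for example with `kubectl describe nodes`.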
- Create an IBM Cloud cluster or, if you already have one, follow the initial setup for an existing cluster
- Important: Configure the IKS cluster with IBM Cloud Group ID Storage Setup
Depending on your situation, you can choose between two approaches to set up the pipeline engine on OpenShift:
- Using OpenShift Pipelines (built on Tekton), follow the Standalone Kubeflow Pipelines with Openshift Pipelines Backend Deployment
- Using Tekton on OpenShift, follow the Standalone Kubeflow Pipelines with Tekton Backend Deployment to install the Kubeflow Pipelines stack. Note that the current Tekton open source deployment for OpenShift doesn't work out of the box, so we strongly recommend deploying with OpenShift Pipelines (see above) if you want to run Kubeflow Pipelines on OpenShift.
Visit Kubeflow Installation for setting up the preferred environment to deploy Kubeflow.
If you want to deploy locally on KIND, run the kubectl kustomize command below:
kubectl apply -k https://github.com/kubeflow/kfp-tekton//manifests/kustomize/env/platform-agnostic-kind\?ref\=v1.8.1
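The kustomize commands in this guide all follow the same URL pattern and differ only in the environment name and the `ref` version. The sketch below builds that URL from its two parts; the function name `kfp_tekton_kustomize_url` is hypothetical. The backslashes before `?` and `=` in the command above presumably guard against shell expansion (for example in zsh), and quoting the whole URL has the same effect.

```shell
# Build the remote kustomize target for a given environment and git ref.
# Hypothetical helper; the URL pattern is copied from the commands in this guide.
kfp_tekton_kustomize_url() {
  local env_name="$1" ref="$2"
  printf 'https://github.com/kubeflow/kfp-tekton//manifests/kustomize/env/%s?ref=%s\n' \
    "$env_name" "$ref"
}

# Usage (quoting the URL avoids the need for backslash escapes):
#   kubectl apply -k "$(kfp_tekton_kustomize_url platform-agnostic-kind v1.8.1)"
```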
Each new KFP-Tekton version is based on a long-term support release of Tekton Pipelines and a major release of OpenShift Pipelines. The table below lists which Tekton and OpenShift Pipelines versions are compatible with each KFP-Tekton version.
KFP-Tekton Version | Tekton Pipeline Version | OpenShift Pipelines Version | Tekton Core API Version | KFP GRPC Gateway Version |
---|---|---|---|---|
1.5.x | 0.41.x | 1.9 | V1beta1 | 1.16.0 |
1.6.x | 0.44.x | 1.10 | V1beta1 | 1.16.0 |
1.7.x | 0.47.x | 1.11 | V1beta1 | 1.16.0 |
1.8.x | 0.50.x | 1.12 | V1 | 2.11.3 |
1.9.x | 0.53.x | 1.13 | V1 | 2.11.3 |
2.0.3 | 0.47.x | 1.11 | V1beta1 | 1.16.0 |
2.0.5 | 0.53.x | 1.13 | V1 | 1.16.0 |
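The table can also be encoded as a small lookup, which may help in install scripts that need to pick a matching Tekton release. This is only a sketch: the function name `tekton_for_kfp_tekton` is hypothetical, and the mapping simply restates the rows above.

```shell
# Map a KFP-Tekton version to the compatible Tekton Pipeline version
# according to the compatibility table. Hypothetical helper.
tekton_for_kfp_tekton() {
  case "$1" in
    1.5.*) echo "0.41.x" ;;
    1.6.*) echo "0.44.x" ;;
    1.7.*) echo "0.47.x" ;;
    1.8.*) echo "0.50.x" ;;
    1.9.*) echo "0.53.x" ;;
    2.0.3) echo "0.47.x" ;;
    2.0.5) echo "0.53.x" ;;
    *)     echo "unknown"; return 1 ;;  # version not listed in the table
  esac
}

# Usage:
#   tekton_for_kfp_tekton 1.9.2
```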
To install the standalone Kubeflow Pipelines V1 with Tekton, run the following steps:
-
Install Tekton v0.53.2 if you don't have Tekton pipelines on the cluster.
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.53.2/release.yaml
-
Enable necessary Tekton configurations for kfp-tekton
kubectl patch cm feature-flags -n tekton-pipelines \
    -p '{"data":{"running-in-environment-with-injected-sidecars": "false"}}'
kubectl patch cm config-defaults -n tekton-pipelines \
    -p '{"data":{"default-timeout-minutes": "0"}}'
-
Install the Kubeflow Pipelines with Tekton backend (kfp-tekton) v1.9.2 deployment:
kubectl apply -k https://github.com/kubeflow/kfp-tekton//manifests/kustomize/env/kfp-template\?ref\=v1.9.2
-
Then, if you want to expose the Kubeflow Pipelines endpoint outside the cluster, run the following command:
kubectl patch svc ml-pipeline-ui -n kubeflow -p '{"spec": {"type": "LoadBalancer"}}'
To get the Kubeflow Pipelines UI public endpoint using command line, run:
kubectl get svc ml-pipeline-ui -n kubeflow -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
-
(GPU worker nodes only) If your Kubernetes cluster has a mixture of CPU and GPU worker nodes, it's recommended to disable the Tekton default affinity assistant so that Tekton won't schedule too many CPU workloads on the GPU nodes.
kubectl patch cm feature-flags -n tekton-pipelines \
    -p '{"data":{"disable-affinity-assistant": "true"}}'
-
(OpenShift only) If you are running the standalone KFP-Tekton on OpenShift, apply the necessary security context constraint below
curl -L https://raw.githubusercontent.com/kubeflow/kfp-tekton/master/install/v1.9.2/kfp-tekton.yaml | yq 'del(.spec.template.spec.containers[].securityContext.runAsUser, .spec.template.spec.containers[].securityContext.runAsGroup)' | oc apply -f -
oc apply -k https://github.com/kubeflow/kfp-tekton//manifests/kustomize/third-party/openshift/standalone
oc adm policy add-scc-to-user anyuid -z tekton-pipelines-controller
oc adm policy add-scc-to-user anyuid -z tekton-pipelines-webhook
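The step above that prints the UI endpoint reads only the `ip` field of the load balancer status. Some load balancers publish a `hostname` instead, in which case that command prints nothing. The sketch below tries `ip` first and falls back to `hostname`; the function name `kfp_ui_endpoint` is hypothetical, and the Service name and default namespace are taken from the commands in this guide.

```shell
# Print the external address of the ml-pipeline-ui Service, preferring the IP
# and falling back to the hostname. Hypothetical helper; requires kubectl.
kfp_ui_endpoint() {
  local ns="${1:-kubeflow}" addr
  addr=$(kubectl get svc ml-pipeline-ui -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$addr" ]; then
    addr=$(kubectl get svc ml-pipeline-ui -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  fi
  # Succeed only if an address was found.
  [ -n "$addr" ] && echo "$addr"
}

# Usage:
#   kfp_ui_endpoint            # looks in the kubeflow namespace
#   kfp_ui_endpoint my-ns      # looks in another namespace
```

If both fields are empty, the load balancer has most likely not been provisioned yet, and the function returns a non-zero status.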
To install the standalone Kubeflow Pipelines V2 with Tekton, run the following steps:
-
Install Kubeflow Pipelines with Tekton backend (kfp-tekton) v2.0.5 along with Tekton v0.53.2:
kubectl apply -k https://github.com/kubeflow/kfp-tekton//manifests/kustomize/env/platform-agnostic-tekton\?ref\=v2.0.5
-
Then, if you want to expose the Kubeflow Pipelines endpoint outside the cluster, run the following command:
kubectl patch svc ml-pipeline-ui -n kubeflow -p '{"spec": {"type": "LoadBalancer"}}'
To get the Kubeflow Pipelines UI public endpoint using command line, run:
kubectl get svc ml-pipeline-ui -n kubeflow -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
-
(GPU worker nodes only) If your Kubernetes cluster has a mixture of CPU and GPU worker nodes, it's recommended to disable the Tekton default affinity assistant so that Tekton won't schedule too many CPU workloads on the GPU nodes.
kubectl patch cm feature-flags -n tekton-pipelines \
    -p '{"data":{"disable-affinity-assistant": "true"}}'
Now use the KFP V2 Python SDK to compile KFP-Tekton V2 pipelines, because KFP-Tekton shares the same pipeline spec as KFP starting from KFP V2.0.0.
pip install "kfp>=2.6.0" "kfp-kubernetes>=1.1.0"
To install the standalone Kubeflow Pipelines with OpenShift Pipelines, run the following steps:
- Install OpenShift Pipelines (v1.12) from the OpenShift OperatorHub.
-
Enable the OpenShift Pipelines configurations that kfp-tekton needs for high-performance pipelines:
oc patch cm feature-flags -n openshift-pipelines \
    -p '{"data":{"running-in-environment-with-injected-sidecars": "false"}}'
oc patch cm config-defaults -n openshift-pipelines \
    -p '{"data":{"default-timeout-minutes": "0"}}'
-
Install the Kubeflow Pipelines with OpenShift Pipelines backend (kfp-tekton) v1.8.1 deployment:
oc apply -k https://github.com/kubeflow/kfp-tekton//manifests/kustomize/env/kfp-template-openshift-pipelines\?ref\=v1.8.1
-
Then, if you want to expose the Kubeflow Pipelines endpoint outside the cluster, run the following command:
kubectl patch svc ml-pipeline-ui -n kubeflow -p '{"spec": {"type": "LoadBalancer"}}'
To get the Kubeflow Pipelines UI public endpoint using command line, run:
kubectl get svc ml-pipeline-ui -n kubeflow -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
-
(GPU worker nodes only) If your OpenShift cluster has a mixture of CPU and GPU worker nodes, it's recommended to disable the OpenShift Pipelines default affinity assistant so that OpenShift Pipelines won't schedule too many CPU workloads on the GPU nodes.
oc patch cm feature-flags -n openshift-pipelines \
    -p '{"data":{"disable-affinity-assistant": "true"}}'
Important: Please complete the prerequisites before proceeding with the following instructions.
-
Follow the Kubeflow install instructions to install the entire Kubeflow stack with kfp-tekton. Kubeflow v1.8.0 uses Tekton v0.47.5 and kfp-tekton v2.0.3 or v1.7.1.
-
Visit the KFP Tekton User Guide to start learning how to use Kubeflow Pipelines.
-
Visit KFP Tekton Admin Guide for how to configure kfp-tekton with different settings.
-
Starting from Kubeflow 1.3, both single-user and multi-user Kubeflow deployments use the multi-user mode of Kubeflow Pipelines to support authentication. If you haven't installed Kubeflow, follow the Kubeflow install instructions to install Kubeflow Pipelines with multi-user capabilities.
-
To upgrade to the multi-user version of KFP-Tekton, the custom task controllers, and the core Tekton controller, run:
kubectl apply -k manifests/kustomize/env/platform-agnostic-multi-user
If you only want to upgrade the core KFP-Tekton (without upgrading the custom task controllers or Tekton), run:
kubectl apply -k manifests/kustomize/env/plain-multi-user
-
(For IBM Cloud IKS users) If you accidentally deployed Kubeflow with IBM Cloud File Storage, run the commands below to remove the existing PVCs. These commands remove resources for a multi-user deployment, so you can ignore any missing PVC or rollout errors if you are running a single-user deployment.
kubectl delete pvc -n kubeflow katib-mysql metadata-mysql minio-pv-claim minio-pvc mysql-pv-claim
kubectl delete pvc -n istio-system authservice-pvc
kubectl rollout restart -n kubeflow deploy/mysql deploy/minio deploy/katib-mysql deploy/metadata-db
kubectl rollout restart -n istio-system statefulset/authservice
Then, redo the Kubeflow install section to redeploy Kubeflow with the appropriate storage setup, either for a Classic IBM Cloud Kubernetes cluster or for a vpc-gen2 IBM Cloud Kubernetes cluster.
-
If some components do not show up after you redeploy Kubeflow, the cause is an issue with the dynamically created webhooks. This issue will be fixed in the next release of KFP. Until then, delete the affected webhook configurations:
kubectl delete MutatingWebhookConfiguration cache-webhook-kubeflow katib-mutating-webhook-config