Installation

We recommend installing with Helm, as it allows a declarative approach to managing Kubernetes resources.

This guide assumes you are familiar with Helm.

Prerequisites

KFP-Operator

For a working installation, you need to install both the KFP-Operator and at least one provider (see Providers below).

Build and Install

Create a basic values.yaml file with the following content:

fullnameOverride: kfp-operator
manager:
  argo:
    serviceAccount:
      name: pipeline-runner
  configuration:
    defaultExperiment: Default

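Install the latest version of the operator. Helm requires a release name as the first argument; kfp-operator is used here as an example:

helm install kfp-operator oci://ghcr.io/kfp-operator/kfp-operator -f values.yaml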

You will also need to configure the service accounts and roles required by your chosen provider; see the section "Role-based access control (RBAC) for providers" below for reference.

Configuration Values

The following configuration options can be set to override the default values.yaml:

Parameter name | Description
containerRegistry | Container Registry base path for all container images
namespace.create | Create the namespace for the operator
namespace.name | Operator namespace name
manager.argo.containerDefaults | Container Spec defaults to be used for Argo workflow pods created by the operator
manager.argo.metadata | Container Metadata defaults to be used for Argo workflow pods created by the operator
manager.argo.ttlStrategy | TTL Strategy used for all Argo Workflows
manager.argo.stepTimeoutSeconds.compile | Timeout in seconds for compiler steps - defaults to 1800 (30m)
manager.argo.stepTimeoutSeconds.default | Default timeout in seconds for workflow steps - defaults to 300 (5m)
manager.argo.serviceAccount.name | The k8s service account used to run Argo workflows
manager.argo.serviceAccount.create | Create the Argo Workflows service account (or assume it has been created externally)
manager.argo.serviceAccount.metadata | Optional Argo Workflows service account default metadata
manager.metadata | Object Metadata for the manager’s pods
manager.rbac.create | Create roles and rolebindings for the operator
manager.serviceAccount.name | Manager service account’s name
manager.serviceAccount.create | Create the manager’s service account or expect it to be created externally
manager.replicas | Number of replicas for the manager deployment
manager.resources | Manager resources as per k8s documentation
manager.configuration | Manager configuration as defined in Configuration (note that you can omit compilerImage and kfpSdkImage when specifying containerRegistry as default values will be applied)
manager.monitoring.create | Create the manager’s monitoring resources
manager.monitoring.rbacSecured | Enable additional RBAC-based security
manager.monitoring.serviceMonitor.create | Create a ServiceMonitor for the Prometheus Operator
manager.monitoring.serviceMonitor.endpointConfiguration | Additional configuration to be used in the service monitor endpoint (path, port and scheme are provided)
manager.multiversion.enabled | Enable multiversion API. Should be used in production to allow version migration, disable for simplified installation
manager.webhookCertificates.provider | K8s conversion webhook TLS certificate provider - choose cert-manager for Helm to deploy certificates if cert-manager is available or custom otherwise (see below)
manager.webhookCertificates.secretName | Name of a K8s secret deployed into the operator namespace to secure the webhook endpoint with, required if the custom provider is chosen
manager.webhookCertificates.caBundle | CA bundle of the certificate authority that has signed the webhook’s certificate, required if the custom provider is chosen
manager.runcompletionWebhook.endpoints | Array of endpoints for the run completion event handlers to be called when a run completion event is passed
logging.verbosity | Logging verbosity for all components - see the logging documentation for valid values
statusFeedback.enabled | Whether run completion eventing and status update feedback loop should be installed - defaults to false
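
The dotted parameter names above correspond to nested keys in values.yaml. As a rough sketch of that mapping, the fragment below overrides a handful of the listed options. The specific values (timeouts, replica count, namespace name) are illustrative assumptions, not recommendations:

```yaml
# values.yaml (illustrative overrides; all values are examples only)
namespace:
  create: true
  name: kfp-operator-system   # assumed namespace name
manager:
  replicas: 2                 # example value
  argo:
    stepTimeoutSeconds:
      compile: 3600           # raises the compiler step timeout above the documented default of 1800
      default: 600            # raises the general step timeout above the documented default of 300
statusFeedback:
  enabled: true               # documented default is false
```

Any option not listed in the file keeps the chart's default, so the file only needs to contain the keys you want to change.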

Examples for these values can be found in the test configuration.
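
For the custom webhook certificate provider described in the table, a values fragment might look like the following. This is a sketch under assumptions: the secret name is hypothetical, and the CA bundle is left as a placeholder to be replaced with the bundle of the authority that signed the certificate.

```yaml
manager:
  webhookCertificates:
    provider: custom
    secretName: kfp-operator-webhook-tls   # hypothetical secret in the operator namespace
    caBundle: REPLACE_WITH_CA_BUNDLE       # placeholder, not a real value
```

If cert-manager is available in the cluster, setting provider to cert-manager instead should make secretName and caBundle unnecessary, since the table lists them as required only for the custom provider.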

Providers

Supported providers are:

  • Kubeflow Pipelines
  • Vertex AI

Install one or more by following these instructions. Please refer to the respective configuration section before proceeding.

Build and Install

Create a basic kfp.yaml values file with the following content:

provider:
  name: kfp-provider
  type: kfp
  executionMode: v1
  serviceAccount:
    name: kfp-operator-kfp
    create: false
  configuration:
    kfpNamespace: kubeflow
    restKfpApiUrl: http://ml-pipeline.kubeflow:8888
    grpcMetadataStoreAddress: metadata-grpc-service.kubeflow:8080
    grpcKfpApiAddress: ml-pipeline.kubeflow:8887
    defaultBeamArgs:
      - name: project
        value: ${DATAFLOW_PROJECT}
    pipelineRootStorage: ${PIPELINE_STORAGE}

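Install the latest version of the provider. Helm requires a release name as the first argument; kfp-provider is used here as an example:

helm install kfp-provider oci://ghcr.io/kfp-operator/provider -f kfp.yaml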

Configuration

The provider block contains the provider configuration used to create the corresponding Provider resources.

Parameter name | Description
name | Name given to this provider
type | Provider type (kfp or vai)
serviceAccount.name | Name of the service account to run provider-specific operations
serviceAccount.create | Create the service account (or assume it has been created externally)
serviceAccount.metadata | Optional service account default metadata
configuration | See Provider Configuration for all available providers and their respective configuration options

Example:

provider:
  name: kfp-provider
  type: kfp
  executionMode: v1
  serviceAccount:
    name: kfp-operator-kfp
    create: false
      ...
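
If the chart should create the provider's service account itself, serviceAccount.create can be set to true and serviceAccount.metadata used to attach metadata. The sketch below assumes that metadata accepts standard Kubernetes object metadata fields such as labels and annotations. The label and annotation shown are hypothetical, and the remaining provider keys are omitted for brevity.

```yaml
provider:
  name: kfp-provider
  type: kfp
  serviceAccount:
    name: kfp-operator-kfp
    create: true
    metadata:                                      # assumed to follow Kubernetes ObjectMeta
      labels:
        app.kubernetes.io/part-of: kfp-operator    # hypothetical label
      annotations:
        example.com/owner: ml-platform             # hypothetical annotation
```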

Role-based access control (RBAC) for providers

When using a provider, you should create the ServiceAccount, RoleBinding and ClusterRoleBinding resources that it requires.

For the Event Source Servers and the Controller to read Providers, their service accounts must have read permissions on Provider resources, e.g.:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kfp-operator-kfp-providers-viewer-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kfp-operator-providers-viewer-role
subjects:
- kind: ServiceAccount
  name: kfp-operator-kfp # Used by Event Source Server
  namespace: kfp-operator-system
- kind: ServiceAccount
  name: kfp-operator-controller-manager # Used by KFP Controller
  namespace: kfp-operator-system
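
For orientation, a read-only ClusterRole of the kind referenced by roleRef above would typically look like the sketch below. The role name is hypothetical (chosen to avoid clashing with a role of the referenced name). The API group pipelines.kubeflow.org and the resource name providers are assumptions that should be checked against the CRDs installed in your cluster.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-providers-viewer-role   # hypothetical name
rules:
- apiGroups:
  - pipelines.kubeflow.org              # assumed API group of the Provider CRD
  resources:
  - providers                           # assumed plural resource name
  verbs:
  - get
  - list
  - watch
```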

An example configuration for Providers is also provided below for reference:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kfp-operator-kfp-service-account
  namespace: kfp-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kfp-operator-kfp-runconfiguration-viewer-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kfp-operator-runconfiguration-viewer-role
subjects:
- kind: ServiceAccount
  name: kfp-operator-kfp-service-account
  namespace: kfp-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kfp-operator-kfp-run-viewer-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kfp-operator-run-viewer-role
subjects:
- kind: ServiceAccount
  name: kfp-operator-kfp-service-account
  namespace: kfp-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kfp-operator-provider-workflow-executor
  namespace: kfp-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kfp-operator-workflow-executor
subjects:
- kind: ServiceAccount
  name: kfp-operator-kfp-service-account
  namespace: kfp-namespace

KubeFlow completion eventing required RBACs

If using the KubeFlowProvider, you will also need a ClusterRole granting permission to interact with Argo Workflows for the eventing system:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kfp-operator-kfp-eventsource-server-role
rules:
- apiGroups:
  - argoproj.io
  resources:
  - workflows
  verbs:
  - get
  - list
  - patch
  - update
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kfp-operator-kfp-eventsource-server-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kfp-operator-kfp-eventsource-server-role
subjects:
- kind: ServiceAccount
  name: kfp-operator-kfp-service-account
  namespace: kfp-operator-namespace