DOCA Platform Framework

DPUCluster

DPUCluster is a Kubernetes custom resource (CRD) that manages the control plane of a DPU cluster in DPF. A DPUCluster can be backed by different implementations.

Two implementations are included in this repository:

  • Kamaji cluster manager - creates Kamaji TenantControlPlanes to back the DPUCluster

  • Static cluster manager - transforms an existing Kubernetes control plane into a DPUCluster control plane

DPUCluster Usage

DPUCluster is a user-facing API. How it is used depends on the implementation.

Using Static Cluster Manager

The static cluster manager controller must be enabled first. To enable it, add the staticClusterManager field to the DPFOperatorConfig CR:

YAML
apiVersion: operator.dpu.nvidia.com/v1alpha1
kind: DPFOperatorConfig
metadata:
  name: dpfoperatorconfig
  namespace: dpf-operator-system
spec:
  provisioningController:
    bfbPVCName: "bfb-pvc"
  staticClusterManager: {}

Then create a Secret that stores the kubeconfig of the existing Kubernetes control plane. The following example assumes the kubeconfig is located at ~/.kube/config:

TENANT_KUBE_CONFIG=$(base64 -w 0 < ~/.kube/config)

cat <<EOF | kubectl apply -f -
apiVersion: v1
data:
  admin.conf: ${TENANT_KUBE_CONFIG}
kind: Secret
metadata:
  name: dpu-cluster-1-admin-kubeconfig
  namespace: dpf-operator-system
type: Opaque
EOF
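
The encoding and manifest steps above can also be wrapped in a small helper, so that the rendered Secret can be inspected before it is applied. This is a minimal sketch, not part of DPF: the function name render_kubeconfig_secret is hypothetical, and it assumes a base64 utility is on the PATH. Newlines are stripped with tr instead of relying on the GNU-only -w 0 flag.

```shell
#!/bin/sh
# render_kubeconfig_secret NAME NAMESPACE KUBECONFIG_PATH
# Prints a Secret manifest that stores the kubeconfig under the admin.conf key.
# Hypothetical helper for illustration only; it is not shipped with DPF.
render_kubeconfig_secret() {
  name=$1
  namespace=$2
  kubeconfig_path=$3
  # Remove line wrapping so the encoded value fits on a single YAML line.
  encoded=$(base64 < "$kubeconfig_path" | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${name}
  namespace: ${namespace}
type: Opaque
data:
  admin.conf: ${encoded}
EOF
}

# Example usage (requires kubectl and access to the management cluster):
#   render_kubeconfig_secret dpu-cluster-1-admin-kubeconfig dpf-operator-system ~/.kube/config \
#     | kubectl apply -f -
```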

The DPUCluster then references this Secret by name:

YAML
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: DPUCluster
metadata:
  name: dpu-cluster-1
  namespace: dpf-operator-system
spec:
  ## type signals which controller implementation should take responsibility for the DPUCluster.
  type: static
  ## Max nodes is the maximum number of nodes supported by the DPUCluster implementation.
  maxNodes: 10
  ## Version is the version of the Kubernetes control plane.
  version: v1.30.2
  ## Kubeconfig is the name of a secret in the same namespace as the DPUCluster object.
  ## Note: This field is supplied by the user in the static cluster manager - but this may not be the case for other implementations.
  kubeconfig: dpu-cluster-1-admin-kubeconfig

Using Kamaji Cluster Manager

A DPUCluster backed by the Kamaji cluster manager looks like:

YAML
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: DPUCluster
metadata:
  name: dpu-cluster-1
  namespace: dpf-operator-system
spec:
  ## type signals which controller implementation should take responsibility for the DPUCluster.
  type: kamaji
  ## Max nodes is the maximum number of nodes supported by the DPUCluster implementation.
  maxNodes: 10
  ## Version is the version of the Kubernetes control plane.
  version: v1.30.2
  ## Cluster endpoint is supplied by the user and provides an IP and other details to make the APIServer available.
  clusterEndpoint:
    # deploy keepalived instances on the nodes that match the given nodeSelector.
    keepalived:
      # interface on which keepalived will listen. Should be the oob interface of the control plane node.
      interface: interface_one
      # vip is the Virtual IP reserved for the DPU Cluster load balancer. Must not be allocatable by DHCP.
      vip: dpucluster_vip
      # virtualRouterID must be in the range [1,255]. Make sure it does not collide with the ID used by any keepalived process already running on the host.
      virtualRouterID: 126
      # nodeSelector selects which nodes the keepalived pods will be scheduled to.
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
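
The virtualRouterID constraints noted in the comments above can be checked before the DPUCluster is created. The following is a rough sketch under stated assumptions: both function names are hypothetical, and the collision check only inspects a single keepalived configuration file passed by the caller (keepalived declares the ID with the virtual_router_id keyword). It does not detect keepalived instances configured elsewhere.

```shell
#!/bin/sh
# valid_virtual_router_id ID
# Succeeds only if ID is a decimal integer in the range [1,255].
valid_virtual_router_id() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;  # reject empty and non-numeric input
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 255 ]
}

# vrid_in_config ID FILE
# Succeeds if FILE already declares "virtual_router_id ID".
vrid_in_config() {
  grep -Eq "^[[:space:]]*virtual_router_id[[:space:]]+$1([[:space:]]|\$)" "$2"
}

# Example usage on a control plane node (the path is an assumption; adjust as needed):
#   valid_virtual_router_id 126 || echo "virtualRouterID out of range"
#   vrid_in_config 126 /etc/keepalived/keepalived.conf && echo "virtualRouterID already in use"
```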

DPUCluster Implementation

A DPUCluster implementation is a Kubernetes controller which operates on the DPF DPUCluster object. The DPUCluster implementation should:

  • Only operate on a DPUCluster which has a type it is responsible for

  • Be the only DPUCluster controller implementation in a cluster

  • Provide an admin Kubeconfig to a functioning Kubernetes cluster as a Kubernetes Secret

  • Ensure the name of that Secret is available in the .spec.kubeconfig of the DPUCluster object

The kubeconfig Secret provided by the DPUCluster implementation should have the following format:

YAML
apiVersion: v1
kind: Secret
metadata:
  name: dpu-cluster-1
  namespace: dpf-operator-system
type: Opaque
data:
  admin.conf: $KUBECONFIG_DATA
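
One way to sanity-check a Secret in this format is to decode the admin.conf value and confirm that it looks like a kubeconfig. The sketch below is illustrative only: the function names are hypothetical, and the check is a heuristic that assumes a YAML kubeconfig containing a top-level "kind: Config" line. It does not prove that the cluster is reachable.

```shell
#!/bin/sh
# decode_admin_conf
# Reads the base64 value of admin.conf on stdin and writes the kubeconfig to stdout.
decode_admin_conf() {
  base64 --decode
}

# looks_like_kubeconfig
# Reads the base64 value of admin.conf on stdin and succeeds if the decoded
# text contains a top-level "kind: Config" line (heuristic check only).
looks_like_kubeconfig() {
  decode_admin_conf 2>/dev/null | grep -q '^kind: Config'
}

# Example usage (requires kubectl; the Secret name follows the example above):
#   kubectl get secret dpu-cluster-1 -n dpf-operator-system \
#     -o jsonpath='{.data.admin\.conf}' | decode_admin_conf > /tmp/dpu-cluster-1.kubeconfig
#   kubectl --kubeconfig /tmp/dpu-cluster-1.kubeconfig get nodes
```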
