This document describes how to use DPUServiceChain in DPF.
Introduction
The purpose of DPUServiceChain is to let users define how traffic is steered on the DPU through DPUServiceInterfaces. The following controllers are used internally to achieve this.
- The user creates a DPUServiceChain; the servicechaincontroller consumes it on the host cluster.
- A ServiceChainSet is created on the DPU clusters.
- A ServiceChain is created for individual nodes based on nodeSelector.
- The SFC controller provisions the necessary network configuration on the DPU.
How to Use DPUServiceChain
DPUServiceChain example:
The following YAML manifests define a DPUServiceChain named example-chain, which refers to one DPUService named example-service and four DPUServiceInterfaces. The DPUServiceChain defines how traffic flows through those DPUServiceInterfaces.
DPUServiceInterfaces
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceInterface
metadata:
  name: p0
  namespace: dpf-operator-system
spec:
  template:
    spec:
      template:
        metadata:
          labels:
            uplink: "p0"
        spec:
          interfaceType: physical
          physical:
            interfaceName: p0
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceInterface
metadata:
  name: pf0hpf
  namespace: dpf-operator-system
spec:
  template:
    spec:
      nodeSelector:
        matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
              - "linux"
      template:
        metadata:
          labels:
            uplink: "pf0hpf"
        spec:
          interfaceType: pf
          pf:
            pfID: 0
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceInterface
metadata:
  name: eth1
  namespace: dpf-operator-system
spec:
  template:
    spec:
      template:
        metadata:
          labels:
            svc.dpu.nvidia.com/interface: "eth1"
            svc.dpu.nvidia.com/service: example-service
        spec:
          interfaceType: service
          service:
            serviceID: example-service
            network: mybrsfc
            interfaceName: eth1
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceInterface
metadata:
  name: eth2
  namespace: dpf-operator-system
spec:
  template:
    spec:
      template:
        metadata:
          labels:
            svc.dpu.nvidia.com/interface: "eth2"
            svc.dpu.nvidia.com/service: example-service
        spec:
          interfaceType: service
          service:
            serviceID: example-service
            network: mybrsfc
            interfaceName: eth2
DPUService
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUService
metadata:
  name: example-service
  namespace: dpf-operator-system
spec:
  serviceID: example-service
  interfaces:
    - eth1
    - eth2
  helmChart:
    source:
      repoURL: https://helm.ngc.nvidia.com/nvidia/doca
      version: 1.0.1
      chart: example-service
    values:
      resources:
        memory: 6Gi
DPUServiceChain
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceChain
metadata:
  name: example-chain
  namespace: dpf-operator-system
spec:
  template:
    spec:
      template:
        spec:
          switches:
            - ports:
                - serviceInterface:
                    matchLabels:
                      uplink: p0
                - serviceInterface:
                    matchLabels:
                      svc.dpu.nvidia.com/service: example-service
                      svc.dpu.nvidia.com/interface: eth1
            - ports:
                - serviceInterface:
                    matchLabels:
                      svc.dpu.nvidia.com/service: example-service
                      svc.dpu.nvidia.com/interface: eth2
                - serviceInterface:
                    matchLabels:
                      uplink: pf0hpf
              serviceMTU: 2000
Let's break it down step by step.
- There are four DPUServiceInterfaces
  - uplink port p0
  - uplink port pf0hpf on the host
  - service interface eth1
  - service interface eth2
- There is one DPUService
  - example-service, which has two interfaces, eth1 and eth2
- There is one DPUServiceChain
  - example-chain
    p0 --> eth1 --> eth2 --> pf0hpf
- On the switches object there is serviceMTU, which defines the MTU between services in the DPU cluster
  - The default value for this MTU is 1500
  - NOTE: This only affects services on the DPU; it should be aligned with the general MTU set for traffic in the network
  - NOTE: Changing this value restarts every pod that relates to this switch
  - NOTE: The value of serviceMTU cannot exceed the highspeedMTU value from the dpfOperatorConfig
- In the above example, traffic flows from uplink port p0 to the eth1 interface of example-service. From eth1 it goes to eth2 (the eth1->eth2 hop is handled by the service itself, not by the chain) and then to uplink port pf0hpf on the host.
Constraints
DPUServiceInterface and ServiceInterface Uniqueness
Each physical uplink port (e.g., p0) must be owned by exactly one DPUServiceInterface, and each matchLabels selector used in a DPUServiceChain must resolve to exactly one ServiceInterface per node. These constraints are not enforced at admission time — there are no validating webhooks that reject conflicting objects. Instead, violations are detected at runtime by the SFC controller and surface as persistent errors on the ServiceChain (and therefore the ServiceChainSet).
Why this matters
A physical OVS port can only be associated with a single ServiceInterface via its dpf-id external ID. If two DPUServiceInterface objects target the same physical port, only one will own the OVS port at any given time. The other's ServiceChain will fail to find its OVS interface, causing errors and flapping readiness. The SFC controller also requires that the matchLabels selector in each DPUServiceChain port resolves to exactly one ServiceInterface on each node. If the selector matches zero or more than one, reconciliation fails.
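To make the "exactly one match" requirement concrete, the fragments below contrast two port selectors, using the labels from the example manifests earlier in this document. This is a sketch that assumes eth1 and eth2 are the only ServiceInterfaces on the node carrying the svc.dpu.nvidia.com/service: example-service label; the fragments are illustrative selector snippets, not complete manifests.

```yaml
# Ambiguous selector: both eth1 and eth2 carry this label, so under the
# assumption above it matches two ServiceInterfaces on the node and
# reconciliation would fail.
serviceInterface:
  matchLabels:
    svc.dpu.nvidia.com/service: example-service
---
# Unambiguous selector: adding the interface label narrows the match
# to the single ServiceInterface for eth1.
serviceInterface:
  matchLabels:
    svc.dpu.nvidia.com/service: example-service
    svc.dpu.nvidia.com/interface: eth1
```

Including every label that distinguishes an interface keeps the selector from widening when more interfaces are added to the same service later.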
Error messages
When the matchLabels selector matches no ServiceInterface on a node:
no serviceInterface in namespace(<ns>) matching labels(map[<key>:<value>]) on node(<node>) found
When the matchLabels selector matches more than one ServiceInterface on a node:
expected only one serviceInterface in namespace(<ns>) to match labels(map[<key>:<value>]) on node(<node>). found <N>
When no OVS interface carries the dpf-id of the referenced ServiceInterface:
failed to find matching interface with external_ids: map[dpf-id:<namespace>/<serviceinterface-name>]
In each case, the ServiceChain (and therefore the ServiceChainSet) will remain Ready=False or flip between Ready and Pending.
Common causes
- Conflict with another service: if a DOCA service such as HBN already installs a DPUServiceInterface for a physical port (e.g., p0 with label uplink: p0), creating another DPUServiceInterface for the same physical port will cause a conflict. Even if the labels differ, both produce ServiceInterface objects on each node, but the OVS port can only carry one dpf-id. The chain referencing the non-owning ServiceInterface will fail with "failed to find matching interface". If the labels are identical, the matchLabels selector will match multiple ServiceInterface objects and fail with "expected only one serviceInterface".
- Stale objects: leftover ServiceInterface objects from a previous DPUServiceInterface deployment can satisfy the same selector, producing multiple matches on a node. Delete stale objects before re-deploying.
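As an illustration of the first cause, the manifest below sketches what a conflicting object could look like. It is an example of what not to apply: the name p0-extra and its label value are hypothetical, and it is assumed that a DPUServiceInterface for p0 (such as the one in the example above, or one installed by another service) already exists.

```yaml
# Hypothetical conflicting object; do not apply alongside an existing
# DPUServiceInterface for p0. The name and label value are illustrative.
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceInterface
metadata:
  name: p0-extra
  namespace: dpf-operator-system
spec:
  template:
    spec:
      template:
        metadata:
          labels:
            uplink: "p0-extra"   # differs from the existing label, yet still conflicts
        spec:
          interfaceType: physical
          physical:
            interfaceName: p0    # same physical port as the existing DPUServiceInterface
```

Because the labels differ, a chain selecting uplink: p0-extra would resolve to one ServiceInterface, but the OVS port can carry only one dpf-id, so one of the two chains would be expected to fail with the "failed to find matching interface" error described above.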
How to avoid these issues
- Ensure each physical port is targeted by at most one DPUServiceInterface.
- Verify that your matchLabels selector resolves to exactly one ServiceInterface on each target node.
- Check for stale ServiceInterface objects from previous deployments before applying a new DPUServiceChain.
Additional fields
DPUClusterSelector
The spec.dpuClusterSelector field is used to select which DPU clusters the chain configuration is applied to. It uses standard Kubernetes label selector syntax (matchLabels and matchExpressions) to match against DPUCluster labels. If not specified, the configuration is applied to all DPU clusters.
Both DPUServiceChain and DPUServiceInterface support dpuClusterSelector, allowing control over which clusters receive specific interface configurations.
Example:
spec:
  dpuClusterSelector:
    matchLabels:
      environment: production
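Since the field accepts standard Kubernetes label selector syntax, a set-based selector can be written with matchExpressions as well. The sketch below assumes DPUCluster objects labeled with an environment key; the staging value is hypothetical and only serves to show matching against more than one value.

```yaml
spec:
  dpuClusterSelector:
    matchExpressions:
      # Select DPU clusters whose "environment" label is one of the
      # listed values. The values are illustrative.
      - key: environment
        operator: In
        values:
          - production
          - staging
```

When matchLabels and matchExpressions are both set in a Kubernetes label selector, all requirements must be satisfied for a cluster to be selected.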