DOCA SDK Documentation

Container Deployment

Preparing the BlueField DPU

Set BlueField to DPU Mode

BlueField must run in DPU mode to use the DPL Runtime Service. For details on how to change modes, see BlueField Modes of Operation.

Determine Your BlueField Variant

Your BlueField may be installed in a host server, or it may be a standalone server.
If your BlueField is a standalone server, ignore the parts that mention the host server or SR-IOV.

You may still use Scalable Functions (SFs) if your BlueField is a standalone server.

Set Up DPU Management Access and Update BlueField-Bundle

These pages provide detailed information about DPU management access and software installation and updates:

Systems with a host server typically use RShim (i.e., the tmfifo_net0 interface).
Standalone systems must use the OOB interface for management access.

Port Configuration

Creating SR-IOV Virtual Functions (Host Server)

The first step to use SR-IOV is to create Virtual Functions (VFs) on the host server.

VFs can be created using the following sequence:

Bash
sudo -s  # enter sudo shell
echo 4 > /sys/class/net/eth2/device/sriov_numvfs
exit # exit sudo shell

Entering a sudo shell, rather than prefixing a single command with sudo, is necessary because sudo would otherwise apply only to the echo command and not to the shell that performs the redirection, so the redirection fails with "Permission denied".

This example creates 4 VFs under Physical Function eth2. Please adjust according to your needs.

If a PF already has VFs and you'd like to change the number of VFs, please set it to 0 before applying the new value.
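
The reset-then-set rule above can be wrapped in a small helper. This is a minimal sketch, not part of the DOCA tooling: the function name set_numvfs and its optional third argument (an alternative sysfs root, useful only for dry runs) are hypothetical, and eth2 is the example PF used above. Like the sequence above, it is meant to run from a root shell.

```shell
# Hypothetical helper (not part of DOCA): set the VF count of a PF,
# writing 0 first when the PF already has a different number of VFs.
# Usage: set_numvfs <pf_interface> <count> [sysfs_net_root]
set_numvfs() {
    local pf="$1" count="$2"
    local root="${3:-/sys/class/net}"     # override only for dry runs
    local attr="$root/$pf/device/sriov_numvfs"
    local current

    if [ ! -e "$attr" ]; then
        echo "no sriov_numvfs attribute at $attr" >&2
        return 1
    fi

    current=$(cat "$attr")
    if [ "$current" = "$count" ]; then
        return 0                          # already at the requested value
    fi
    if [ "$current" != "0" ]; then
        echo 0 > "$attr" || return 1      # clear the existing VFs first
    fi
    echo "$count" > "$attr"
}
```

For example, inside the sudo shell, set_numvfs eth2 4 leaves eth2 with 4 VFs regardless of how many it had before.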

Scalable Functions (DPU)

For more information, see the BlueField Scalable Function User Guide.

If you create SFs, refer to their representors in the configuration file.

Install the DPL Runtime Service on the DPU

Pulling the Container Resources and Scripts from NGC

Start by downloading and installing the ngc-cli tools.

Fetch the configuration files from NGC. This creates a directory named dpl_rt_service_<version>, e.g. dpl_rt_service_v1.0.0-doca2.10.0.

Commands:

Bash
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.58.0/files/ngccli_arm64.zip -O ngccli_arm64.zip
unzip ngccli_arm64.zip
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"
cd dpl_rt_service_v1.0.0-doca2.10.0
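
Because the directory name depends on the downloaded version, a script may prefer to discover it rather than hardcode it. The following is a minimal sketch under that assumption: the helper name find_dpl_dir is hypothetical, and it relies only on the dpl_rt_service_ prefix described above.

```shell
# Hypothetical helper: print the single dpl_rt_service_* directory found
# in <parent> (default: current directory). Fails if there are none or
# several, so that the caller never picks a version by accident.
find_dpl_dir() {
    local parent="${1:-.}" d
    local count=0 match=""
    for d in "$parent"/dpl_rt_service_*/; do
        [ -d "$d" ] || continue           # the glob matched nothing
        match="${d%/}"
        count=$((count + 1))
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one dpl_rt_service_* directory in $parent, found $count" >&2
        return 1
    fi
    printf '%s\n' "$match"
}
```

It can then replace the hardcoded name, for example cd "$(find_dpl_dir)".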

Running the Preparation Script

Inside the directory with the scripts and YAML files that you pulled with the ngc-cli tool, you'll find scripts/dpl_dpu_setup.sh.

Running this script on the DPU (requires sudo) enables the use of SR-IOV Virtual Function interfaces and creates the directory structure for the configuration files under /etc/dpl_rt_service. In addition, the script configures "hugepages" and runs the mlxconfig commands required by the DPL Runtime Service.

Run the following sequence of commands from the working directory you pulled with the ngc-cli tool:

Bash
chmod +x ./scripts/dpl_dpu_setup.sh
sudo ./scripts/dpl_dpu_setup.sh
sudo systemctl restart kubelet.service
sudo systemctl restart containerd.service

Restarting the services is necessary for the "hugepages" change to apply to them.
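
To confirm that huge pages are actually reserved, the kernel's counters can be read from /proc/meminfo. This is a minimal sketch: the helper name hugepages_total is hypothetical, and it assumes the setup script reserves pages of the default huge page size, since HugePages_Total reports only that size. The number of pages to expect is not stated here, so the sketch only prints the value.

```shell
# Hypothetical helper: print the HugePages_Total value from a
# meminfo-style file (defaults to /proc/meminfo).
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

Running hugepages_total on the DPU after the setup script should print a nonzero count if the reservation succeeded.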

The following firmware settings are set by the setup script:

  • FLEX_PARSER_PROFILE_ENABLE=4

  • PROG_PARSE_GRAPH=true

  • SRIOV_EN=1
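
These values can be read back with mlxconfig's query command. The sketch below is an illustration rather than part of the setup script: the helper name query_dpl_fw_settings is hypothetical, and the device argument is an assumption that must be replaced with the MST device path or PCI address of your BlueField. It typically needs to be run as root.

```shell
# Hypothetical helper: query the three firmware settings listed above.
# Usage: query_dpl_fw_settings <device>
# <device> is an assumption: use your BlueField's MST device or PCI address.
query_dpl_fw_settings() {
    local dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: query_dpl_fw_settings <device>" >&2
        return 2
    fi
    if ! command -v mlxconfig >/dev/null 2>&1; then
        echo "mlxconfig not found in PATH" >&2
        return 127
    fi
    mlxconfig -d "$dev" query \
        FLEX_PARSER_PROFILE_ENABLE PROG_PARSE_GRAPH SRIOV_EN
}
```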

Edit the Configuration Files

Modify your configuration files as described in Service Configuration.

Important: You must create at least one device configuration under /etc/dpl_rt_service/devices.d/. It is advisable to start from a copy of the file /etc/dpl_rt_service/devices.d/NAME.conf.template, for example:

Bash
sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
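
Before starting the service, it can be useful to check that at least one device configuration is in place. This is a minimal sketch: the helper name count_device_confs is hypothetical, and it assumes that active device configurations are the files ending in .conf, as in the 1000.conf example, while files ending in .conf.template are ignored.

```shell
# Hypothetical helper: count device configuration files in devices.d.
# Assumes active configs end in .conf; *.conf.template does not match.
count_device_confs() {
    local dir="${1:-/etc/dpl_rt_service/devices.d}" f n=0
    for f in "$dir"/*.conf; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

A result of 0 means that no device configuration has been created yet.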

Setting up the kubelet Pod

Now that everything is ready, copy the file configs/dpl_rt_service.yaml from the directory that you pulled with the ngc-cli tool into the directory /etc/kubelet.d.

Please allow a few minutes for the image to be pulled and the pod to start. You can check the progress with the command sudo journalctl -u kubelet --since -5m; scroll down to see the latest log lines.

Once the image is pulled, it is listed by the command sudo crictl images.

Once the pod is loaded, it is listed by the command sudo crictl pods.

When the DPL Runtime Service is successfully running inside the pod, you will find its log file at /var/log/doca/dpl_rt_service/dpl_rtd.log.
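
Since the pod can take a few minutes to come up, a script may want to wait for the log file instead of checking by hand. The following is a minimal sketch of such a polling loop: the helper name wait_for_file and the default timeout of 300 seconds are hypothetical choices, not values taken from the service.

```shell
# Hypothetical helper: wait until a file exists, polling once per second.
# Usage: wait_for_file <path> [timeout_seconds]
wait_for_file() {
    local path="$1" timeout="${2:-300}" waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

For example, wait_for_file /var/log/doca/dpl_rt_service/dpl_rtd.log returns successfully as soon as the log file appears. Depending on directory permissions, it may need to run as root.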

Recap, Full Command Sequence

Bash
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.58.0/files/ngccli_arm64.zip -O ngccli_arm64.zip
unzip ngccli_arm64.zip
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"
cd dpl_rt_service_v1.0.0-doca2.10.0
chmod +x ./scripts/dpl_dpu_setup.sh
sudo ./scripts/dpl_dpu_setup.sh
sudo systemctl restart kubelet.service
sudo systemctl restart containerd.service

sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
# Modify the configuration file /etc/dpl_rt_service/devices.d/1000.conf

sudo cp configs/dpl_rt_service.yaml /etc/kubelet.d/

The device ID and version numbers may differ in your case; adapt them as needed.

