DOCA SDK Documentation

DOCA Telemetry Utils

This page describes how to use the doca_telemetry_utils tool.

Supported hardware is NVIDIA® BlueField®-3, NVIDIA® ConnectX®-7, and above.

Introduction

The doca_telemetry_utils tool displays all available counters and generates counter IDs that can be used in other DOCA tools. It can also retrieve the counter type and settings from a given counter ID and report whether that counter is available on the target hardware.

This tool simplifies counter management, making it easier to identify, configure, and verify counter support for specific devices.

Because doca_telemetry_utils relies on the fwctl driver, probing hardware support fails if the tool runs at the same time as other tools or services that use this driver.

Prerequisites

To use doca_telemetry_utils, your system must meet the following baseline requirements:

  • Firmware: Version 28.43.1000 or later for ConnectX-7, 32.43.1000 or later for BlueField-3, and 40.43.1000 or later for ConnectX-8.

  • Driver: The fwctl driver must be fully installed and actively loaded on the system.
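The firmware requirement above can be checked in a script. The following is a minimal sketch, assuming the version string has the form <family>.<minor>.<sub> (for example, 28.43.1000) and that only the part after the family prefix needs to be compared against 43.1000. The helper name fw_meets_minimum is hypothetical, and sort -V assumes GNU coreutils.

```shell
# Hypothetical helper: succeed if a firmware version such as 28.43.1000
# is at least <family>.43.1000. Assumes a three-part dotted version string.
fw_meets_minimum() {
    version="$1"
    min_rest="43.1000"
    rest="${version#*.}"    # drop the device-family prefix (28, 32, or 40)
    # sort -V orders version strings; the minimum must sort first or be equal.
    lowest=$(printf '%s\n%s\n' "$min_rest" "$rest" | sort -V | head -n 1)
    [ "$lowest" = "$min_rest" ]
}
```

On Linux, the firmware-version field printed by ethtool -i <interface> is one possible source for the version string passed to this helper.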

Verifying the fwctl Driver

To verify that the fwctl driver is successfully loaded, check the device directories: 

$ ls /sys/class/fwctl/
$ ls /dev/fwctl

The expected output for a standard 2-port device is fwctl0 fwctl1.
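For automated checks, the same verification can be wrapped in a small function. This is a sketch only: the helper name fwctl_devices is hypothetical, and it assumes device entries are named fwctl<N>, as in the expected output above.

```shell
# Hypothetical helper: list fwctl device entries under a directory
# (defaults to /dev/fwctl) and fail if the directory is missing or empty.
fwctl_devices() {
    dir="${1:-/dev/fwctl}"
    [ -d "$dir" ] || return 1
    # Keep only entries that look like fwctl0, fwctl1, ...
    found=$(ls -A "$dir" | grep '^fwctl[0-9][0-9]*$' || true)
    [ -n "$found" ] || return 1
    printf '%s\n' "$found"
}
```

Calling fwctl_devices with no argument inspects /dev/fwctl; passing /sys/class/fwctl checks the sysfs view instead.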

Manually Loading the Driver

If the directories /sys/class/fwctl or /dev/fwctl do not exist or are empty, the module may be installed but inactive.

Check for the module's presence:

$ grep fwctl -R /lib/modules/$(uname -r)/

If the output confirms the presence of fwctl.ko and mlx5_fwctl.ko, manually load the module and verify its status:

$ sudo modprobe mlx5_fwctl
$ lsmod | grep fwctl
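The module check can also be scripted so that modprobe is attempted only when both modules are present. The sketch below uses a hypothetical helper name, fwctl_modules_installed, and assumes the module files may carry a compression suffix (for example, .ko.zst), as is common on some distributions.

```shell
# Hypothetical helper: succeed only if both fwctl.ko and mlx5_fwctl.ko
# (optionally compressed, e.g. .ko.zst) exist under the modules directory.
fwctl_modules_installed() {
    moddir="${1:-/lib/modules/$(uname -r)}"
    for mod in fwctl mlx5_fwctl; do
        # -name matches the whole file name, so "fwctl.ko*" does not
        # match mlx5_fwctl.ko.
        match=$(find "$moddir" -name "${mod}.ko*" 2>/dev/null | head -n 1 || true)
        [ -n "$match" ] || return 1
    done
    return 0
}
```

A possible usage is fwctl_modules_installed && sudo modprobe mlx5_fwctl.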

Reinstalling the DOCA Host Package

If the fwctl module cannot be found on the system, or if the modprobe command fails to load it, reinstall the DOCA Host package.

  1. Download the package (DOCA 3.3.0 example):

    $ wget https://www.mellanox.com/downloads/DOCA/DOCA_v3.3.0/host/doca-host_3.3.0-088000-26.01-ubuntu2204_amd64.deb

  2. Purge existing DOCA and OFED modules:

    $ for f in $( dpkg --list | grep doca | awk '{print $2}' ); do echo $f ; sudo apt remove --purge $f -y ; done
    $ for f in $( dpkg --list | grep mlnx | awk '{print $2}' ); do echo $f ; sudo apt remove --purge $f -y ; done
    $ for f in $( dpkg --list | grep dpdk | awk '{print $2}' ); do echo $f ; sudo apt remove --purge $f -y ; done
    $ for f in $( dpkg --list | grep ofed | awk '{print $2}' ); do echo $f ; sudo apt remove --purge $f -y ; done
    $ sudo /usr/sbin/ofed_uninstall.sh --force
    $ sudo apt-get autoremove

  3. Install the new package and restart services:

    $ sudo dpkg -i doca-host_3.3.0-088000-26.01-ubuntu2204_amd64.deb
    $ sudo apt-get update
    $ sudo apt-get -y install doca-all
    $ sudo /etc/init.d/openibd restart

Once the reinstallation is complete, confirm that the module is loaded as described in "Verifying the fwctl Driver".

Installing DOCA Telemetry Utils

To install doca_telemetry_utils:

  • On Debian-based distributions, run:

    sudo apt-get install doca-telemetry-utils

  • On RPM-based distributions, run:

    sudo dnf install doca-telemetry-utils
    

Description

The doca_telemetry_utils tool can be used with either a counter name or a counter data ID.

Usage with Counter Name

When given a counter name, doca_telemetry_utils displays the associated data ID and additional details.

Example of running with the counter named global_icmc_hit:

$ doca_telemetry_utils global_icmc_hit
Data ID: 0x1180000200000000
Name: global_icmc_hit
Unit: ICMC

For counters that require arguments, running the tool with only the counter name displays the arguments needed.

Example of running with port_rx_bytes alone:

$ doca_telemetry_utils port_rx_bytes
[fill_data_id] Per-port counter 0x10200001 (port_rx_bytes) needs exactly 1 argument (local_port), 0 given.

In this case, provide the required argument(s) and re-run the command.

Example of specifying the argument local_port:

$ doca_telemetry_utils port_rx_bytes 0
Data ID: 0x1020000100000000
Name: port_rx_bytes
Unit: RX_PORT
local_port: 0
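Because the output shown above consists of "Key: value" lines, a single field such as the data ID can be extracted for use in other DOCA tools. The following sketch assumes that output format; telemetry_field is a hypothetical helper name, not part of the tool.

```shell
# Hypothetical helper: read doca_telemetry_utils output on stdin and print
# the value of one "Key: value" field (for example "Data ID" or "Unit").
telemetry_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, doca_telemetry_utils port_rx_bytes 0 | telemetry_field 'Data ID' would print only the data ID line's value.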

Usage with Data ID

When given a data ID, doca_telemetry_utils displays the counter name and other details.

Example of running with data ID 0x1180000200000000:

$ doca_telemetry_utils 0x1180000200000000
Data ID: 0x1180000200000000
Name: global_icmc_hit
Unit: ICMC

Checking Counter Support on a Device

To check whether a specific counter is supported by a particular device, pass the device's PCIe address before the counter name or data ID.

Example of checking whether the global_icmc_hit counter is supported on device 08:00.0:

$ doca_telemetry_utils 08:00.0 0x1180000200000000
Data ID: 0x1180000200000000
Name: global_icmc_hit
Unit: ICMC
Data ID 0x1180000200000000 is supported on device 08:00.0
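In a script, the confirmation line in the output above can serve as the support signal. This sketch assumes the wording shown in the example; the message printed for an unsupported counter is not documented here, so any other output is treated as "not confirmed". The helper name counter_supported is hypothetical.

```shell
# Hypothetical helper: read doca_telemetry_utils output on stdin and succeed
# only if it contains the "is supported on device" confirmation line.
counter_supported() {
    grep -Eq '^Data ID 0x[0-9a-fA-F]+ is supported on device '
}
```

For example, doca_telemetry_utils 08:00.0 0x1180000200000000 | counter_supported && echo "supported".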

Execution

To display the usage information, run doca_telemetry_utils with the -h option:

$ doca_telemetry_utils -h
Usage:
Name to Data ID:
	 doca_telemetry_utils [<device PCI>] <name> [relevant properties]
	• To get the options for 'relevant properties' run with 'name' alone.
Data ID to name:
	 doca_telemetry_utils [<device PCI>] <DATA_ID>

• If the optional argument <device PCI> is provided, this device will be tested for support of this counter.
• Run with option 'get-counters' to get all the available names.

-h, --help:	 Show this help message.

Examples:

  • Show all available counters:

    doca_telemetry_utils get-counters
    
  • Name to data ID with relevant options:

    doca_telemetry_utils port_rx_bytes 0
    
  • Name to data ID with device PCIe address:

    doca_telemetry_utils 08:00.0 port_rx_bytes 0
    
  • Data ID to name:

    doca_telemetry_utils 0x1020000100000000
    
  • Data ID to name with device PCIe address:

    doca_telemetry_utils 08:00.0 0x1020000100000000
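
The name-to-data-ID examples above can be combined into a small batch lookup. This is a sketch under stated assumptions: resolve_counters is a hypothetical helper, it relies on the "Data ID: <value>" output line shown earlier on this page, and it prints "-" whenever no such line is found (for example, when a counter needs arguments that were not supplied).

```shell
# Hypothetical helper: print "name<TAB>data-id" for each counter name given.
# A "-" is printed when the tool's output contains no "Data ID:" line.
resolve_counters() {
    for name in "$@"; do
        id=$(doca_telemetry_utils "$name" 2>/dev/null |
             awk -F': ' '$1 == "Data ID" { print $2; exit }' || true)
        printf '%s\t%s\n' "$name" "${id:--}"
    done
}
```

For example, resolve_counters global_icmc_hit port_rx_bytes prints one tab-separated line per counter name.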
    
